klaam

Project overview: klaam

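klaam is a project for Arabic speech recognition, speech classification, and text-to-speech (TTS). It builds on several models, including wav2vec and FastSpeech2, and supports both training and prediction.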

Speech classification

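klaam provides a simple way to classify speech: import SpeechClassification from klaam, instantiate a model, and call its classify method on an audio file. In the snippet below, wav_file stands for the audio file to classify.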

from klaam import SpeechClassification
model = SpeechClassification()
model.classify(wav_file)
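
To classify several recordings at once, the call above can be wrapped in a small loop. The following is a minimal sketch, not part of klaam: classify_directory is a hypothetical helper name, and the only thing it assumes about the model is the classify(wav_file) method shown above.

```python
from pathlib import Path


def classify_directory(model, folder):
    """Classify every .wav file in `folder` and return {file name: result}.

    `model` is any object with a classify(wav_file) method, such as the
    SpeechClassification instance shown above. This helper is hypothetical
    and is not provided by klaam itself.
    """
    results = {}
    for wav_path in sorted(Path(folder).glob("*.wav")):
        # Pass the path as a string, mirroring the classify(wav_file) call.
        results[wav_path.name] = model.classify(str(wav_path))
    return results
```

For example, classify_directory(SpeechClassification(), "recordings") would return one result per file, where "recordings" is a placeholder folder name.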

Speech recognition

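klaam also provides speech recognition: a SpeechRecognition instance transcribes an audio file. Recognition is supported for Modern Standard Arabic (MSA) and the Egyptian dialect (EGY).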

from klaam import SpeechRecognition
model = SpeechRecognition()
model.transcribe(wav_file)
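
Before handing a file to transcribe, it can help to confirm that it is a readable WAV and to see how long it is. The sketch below uses only Python's standard wave module. The helper names wav_summary and transcribe_checked, and the max_seconds limit, are illustrative assumptions rather than anything klaam defines or requires.

```python
import wave


def wav_summary(wav_file):
    """Return (sample_rate_hz, duration_seconds) for a PCM WAV file."""
    with wave.open(wav_file, "rb") as wav:
        sample_rate = wav.getframerate()
        duration = wav.getnframes() / sample_rate
    return sample_rate, duration


def transcribe_checked(model, wav_file, max_seconds=30.0):
    """Transcribe `wav_file` only if it is a readable WAV within `max_seconds`.

    `max_seconds` is an illustrative limit, not a klaam requirement.
    `model` is any object with a transcribe(wav_file) method.
    """
    # wav_summary raises if the file cannot be parsed as a WAV file.
    _, duration = wav_summary(wav_file)
    if duration > max_seconds:
        raise ValueError(
            f"{wav_file} is {duration:.1f}s long; limit is {max_seconds}s"
        )
    return model.transcribe(wav_file)
```

Failing early on an unreadable or unexpectedly long file gives a clearer error than one raised from inside the model.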

Text-to-speech

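Text-to-speech in klaam is based on FastSpeech2. Prepare the paths to the configuration files and the pretrained vocoder weights, then instantiate a TextToSpeech object and use it to synthesize the given text.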

from klaam import TextToSpeech
prepare_tts_model_path = "../cfgs/FastSpeech2/config/Arabic/preprocess.yaml"
model_config_path = "../cfgs/FastSpeech2/config/Arabic/model.yaml"
train_config_path = "../cfgs/FastSpeech2/config/Arabic/train.yaml"
vocoder_config_path = "../cfgs/FastSpeech2/model_config/hifigan/config.json"
speaker_pre_trained_path = "../data/model_weights/hifigan/generator_universal.pth.tar"

model = TextToSpeech(
    prepare_tts_model_path,
    model_config_path,
    train_config_path,
    vocoder_config_path,
    speaker_pre_trained_path,
)
model.synthesize(sample_text)
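
Because TextToSpeech takes five separate paths, a mistyped or missing file is an easy mistake to make. The sketch below checks the paths before the model is built. find_missing_paths is a hypothetical helper, not part of klaam, and the labels passed to it are arbitrary.

```python
from pathlib import Path


def find_missing_paths(named_paths):
    """Return the labels whose path does not exist on disk, in input order.

    `named_paths` maps an arbitrary label to a file path, for example
    {"model config": model_config_path}.
    """
    return [
        label
        for label, path in named_paths.items()
        if not Path(path).exists()
    ]
```

With the variables defined above, one might call find_missing_paths({"preprocess": prepare_tts_model_path, "model": model_config_path, "train": train_config_path, "vocoder config": vocoder_config_path, "vocoder weights": speaker_pre_trained_path}) and construct TextToSpeech only if the returned list is empty.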