README.md · IlyaKalinovskiy/multilingual-forced-alignment at main


license: apache-2.0

These are phoneme-level forced alignment models for text-to-speech (TTS) tasks.

They also localize pauses in speech with high accuracy, which can be useful for training voice activity detection (VAD) models.
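As a rough illustration of the VAD use case, the sketch below turns a phoneme-level alignment into frame-wise speech/non-speech labels. It does not use the SpeechFlow API: the `(label, start, end)` segment format, the `<SIL>` pause label, the 10 ms frame shift, and the function name `alignment_to_vad_labels` are all assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical alignment format (an assumption, not the SpeechFlow output format):
# each segment is (label, start_sec, end_sec), and "<SIL>" marks a pause.
Segment = Tuple[str, float, float]


def alignment_to_vad_labels(
    segments: List[Segment],
    frame_shift: float = 0.01,
    pause_label: str = "<SIL>",
) -> List[int]:
    """Convert aligned segments to per-frame labels: 1 = speech, 0 = pause."""
    if not segments:
        return []
    total = max(end for _, _, end in segments)
    n_frames = int(round(total / frame_shift))
    labels = [0] * n_frames  # frames default to non-speech
    for label, start, end in segments:
        if label == pause_label:
            continue  # pauses stay labeled as 0
        first = int(round(start / frame_shift))
        last = min(int(round(end / frame_shift)), n_frames)
        for i in range(first, last):
            labels[i] = 1
    return labels


# Toy alignment with a leading and a trailing pause (made-up values).
example = [
    ("<SIL>", 0.00, 0.05),
    ("HH", 0.05, 0.08),
    ("AY", 0.08, 0.12),
    ("<SIL>", 0.12, 0.15),
]
vad = alignment_to_vad_labels(example)
```

Labels produced this way could serve as training targets for a VAD model, provided the frame shift matches the feature extraction used by that model.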

For documentation and usage examples, please refer to the SpeechFlow project.