IlyaKalinovskiy
license: apache-2.0

Multilingual forced alignment

These models perform phoneme-level forced alignment for text-to-speech (TTS) tasks.

They also localize pauses in speech with high accuracy, which can be useful for training voice activity detection (VAD) models.
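
As a rough illustration of the VAD use case, the sketch below turns a phoneme-level alignment into frame-level speech/non-speech targets. It does not use the SpeechFlow API: the `(label, start, end)` segment format, the `<pause>` marker, the `vad_labels` helper, and the 10 ms frame shift are all assumptions made for this example, so adapt them to whatever the aligner actually outputs.

```python
from typing import List, Tuple

# Hypothetical alignment format (an assumption, not the SpeechFlow output):
# each segment is (label, start_sec, end_sec).
Segment = Tuple[str, float, float]

PAUSE_LABEL = "<pause>"  # assumed marker for pause segments


def vad_labels(
    segments: List[Segment],
    duration: float,
    frame_shift: float = 0.01,
) -> List[int]:
    """Return one label per frame: 1 for speech, 0 for pause or silence."""
    n_frames = int(round(duration / frame_shift))
    labels = [0] * n_frames  # frames not covered by any phoneme stay 0
    for label, start, end in segments:
        if label == PAUSE_LABEL:
            continue  # pauses remain non-speech
        first = max(0, int(round(start / frame_shift)))
        last = min(n_frames, int(round(end / frame_shift)))
        for i in range(first, last):
            labels[i] = 1
    return labels


if __name__ == "__main__":
    example = [("<pause>", 0.00, 0.02), ("a", 0.02, 0.05), ("<pause>", 0.05, 0.07)]
    print(vad_labels(example, duration=0.07))
```

Segment boundaries are rounded to the nearest frame, so the precision of the resulting targets is limited by the chosen frame shift.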

For documentation and usage examples, please refer to the SpeechFlow project.

[Figure: segmentation example]