---
license: mit
datasets:
- doof-ferb/infore1_25hours
---
## Introduction

MeloTTS Vietnamese is a version of MeloTTS optimized for the Vietnamese language. It inherits the high-quality characteristics of the original model and has been adjusted specifically to work well with Vietnamese.

## Technical Features

- Uses [underthesea](https://github.com/undertheseanlp/underthesea) for Vietnamese text segmentation
- Integrates [PhoBERT](https://github.com/VinAIResearch/PhoBERT) (vinai/phobert-base-v2) to extract Vietnamese language features
- Fully supports Vietnamese language characteristics:
  - 45 symbols (phonemes)
  - 8 tones (7 tonal marks and 1 unmarked tone)
  - All defined in `melo/text/symbols.py`
- Text-to-phoneme conversion source:
  - Based on the [Text2PhonemeSequence](https://github.com/thelinhbkhn2014/Text2PhonemeSequence) library
  - An improved version with higher performance has been developed at [Text2PhonemeFast](https://github.com/manhcuong02/Text2PhonemeFast)

## Fine-tuning from Base Model

This model was fine-tuned from the base MeloTTS model by:

- Replacing phonemes not found in English and Vietnamese with Vietnamese phonemes
- Specifically, replacing Korean phonemes with the corresponding Vietnamese phonemes
- Adjusting parameters to match Vietnamese phonetic characteristics

## Training Data

- The model was trained on the Infore dataset, which consists of approximately 25 hours of speech
- Note on data quality: this dataset has several limitations, including poor voice quality, missing punctuation, and inaccurate phonetic transcriptions. Training on internal data gave much better results.

## Downloading the Model

The pre-trained model can be downloaded from Hugging Face:

- [MeloTTS Vietnamese on Hugging Face](https://huggingface.co/nmcuong/MeloTTS_Vietnamese)

## Usage Guide

### Data Preparation

The data preparation process is detailed in `docs/training.md`. In short, you need:

- Audio files (a 44100 Hz sample rate is recommended)
- A metadata file with the format:

```
path/to/audio_001.wav |