---
license: apache-2.0
datasets:
- simon3000/genshin-voice
language:
- zh
base_model:
- SparkAudio/Spark-TTS-0.5B
pipeline_tag: text-to-speech
tags:
- chinese
- spark-tts
- genshin
- float16
---
# Spark-TTS fine-tuned on Genshin character voices
* GitHub code: https://github.com/nonwesjoe/genshin-sparktts
* Kaggle notebook: https://www.kaggle.com/code/suziwsz/genshin-sparktts/

## Available characters
* paimon, hutao, furina, kazuha, xiao, mona, ganyu, xiangling, shotgun, citlali, barbara, zhongli, venti, nahida, kaeya, yaoyao, yoimiya, nilou (one fully fine-tuned model per character)

# Usage
* Python 3.12 is suggested.
* Clone the repository: `git clone https://github.com/nonwesjoe/genshin-sparktts.git && cd genshin-sparktts`
* If CUDA is available, install torch 2.7.1 with CUDA support: `pip install torch torchaudio torchvision -i https://download.pytorch.org/whl/cu118/`
  Otherwise, install the CPU build of torch 2.7.1: `pip install torch torchaudio torchvision -i https://download.pytorch.org/whl/cpu`
* Install the other requirements: `pip install -r requirements.txt`
* Set the following environment variables in the terminal:
```
export CHARACTOR=nahida # or any other available character
export MODEL_PATH=/kaggle/working/genshin/ # your model path
export INPUT_TEXT="楼下发荔枝了吗?那我们快去领取!" # text to be converted
```
* Download the model files: `python3 download.py`. By default, this downloads only the model for the character set in the `CHARACTOR` environment variable. The model is downloaded to `./genshin`.
* Convert the text to audio: `python3 run.py`. The audio is written to `sparktts.wav`.

# Detail
* The model was trained in float32 but is saved in float16 to reduce VRAM and storage usage.

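
To give a rough sense of what saving in float16 buys, the sketch below compares the raw weight size for the two dtypes. The parameter count is an assumption read off the "0.5B" in the base model name, not a measured value, and the estimate ignores file headers, the audio tokenizer, and runtime buffers such as activations.

```python
# Bytes needed to store one parameter in each dtype.
BYTES_PER_PARAM = {"float32": 4, "float16": 2}


def weights_size_gib(num_params: int, dtype: str) -> float:
    """Approximate size of the raw weights in GiB for the given dtype."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


# Assumption: about 0.5e9 parameters, taken from the "0.5B" in the
# base model name. The real count may differ.
NUM_PARAMS = 500_000_000

if __name__ == "__main__":
    for dtype in ("float32", "float16"):
        size = weights_size_gib(NUM_PARAMS, dtype)
        print(f"{dtype}: {size:.2f} GiB")
```

Under that assumption, the weights take about 1.86 GiB in float32 and about 0.93 GiB in float16, which is half the size on disk and when loaded.
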
# Example
[▶ Furina (芙宁娜) play](https://raw.githubusercontent.com/nonwesjoe/genshin-sparktts/main/examples/furina.wav)

[▶ Kazuha (万叶) play](https://raw.githubusercontent.com/nonwesjoe/genshin-sparktts/main/examples/kazuha.wav)

[▶ Paimon (派蒙) play](https://raw.githubusercontent.com/nonwesjoe/genshin-sparktts/main/examples/paimon.wav)

[▶ Hutao (胡桃) play](https://raw.githubusercontent.com/nonwesjoe/genshin-sparktts/main/examples/hutao.wav)

[▶ Xiao (魈) play](https://raw.githubusercontent.com/nonwesjoe/genshin-sparktts/main/examples/xiao.wav)

[▶ Citlali (茜特菈莉) play](https://raw.githubusercontent.com/nonwesjoe/genshin-sparktts/main/examples/citlali.wav)

# Acknowledgement
* [Spark-TTS](https://github.com/SparkAudio/Spark-TTS)
* [Genshin dataset](https://huggingface.co/datasets/simon3000/genshin-voice)
* [Unsloth](https://github.com/unslothai/unsloth)