---
license: apache-2.0
datasets:
- simon3000/genshin-voice
language:
- zh
base_model:
- SparkAudio/Spark-TTS-0.5B
pipeline_tag: text-to-speech
tags:
- chinese
- spark-tts
- genshin
- float16
---
# Spark-TTS fine-tuned on Genshin character voices
* GitHub code: https://github.com/nonwesjoe/genshin-sparktts
* Kaggle notebook: https://www.kaggle.com/code/suziwsz/genshin-sparktts/
## Available characters
* paimon, hutao, furina, kazuha, xiao, mona, ganyu, xiangling, shotgun, citlali, barbara, zhongli, venti, nahida, kaeya, yaoyao, yoimiya, nilou. Each character has its own fully fine-tuned model.
# Usage
* Python 3.12 is recommended.
* Clone the repository: <code>git clone https://github.com/nonwesjoe/genshin-sparktts.git && cd genshin-sparktts</code>
* If CUDA is available, install the CUDA build of torch 2.7.1: <code>pip install torch torchaudio torchvision -i https://download.pytorch.org/whl/cu118/</code>
Otherwise, install the CPU build of torch 2.7.1: <code>pip install torch torchaudio torchvision -i https://download.pytorch.org/whl/cpu</code>
* Install the remaining requirements: <code>pip install -r requirements.txt</code>
* Set the following environment variables in your terminal:
```
export CHARACTOR=nahida # or any other available character
export MODEL_PATH=/kaggle/working/genshin/ # your model path
export INPUT_TEXT="楼下发荔枝了吗?那我们快去领取!" # text to be converted
```
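For readers who want to check their settings before downloading anything, here is a minimal Python sketch of how a script could read and validate these three variables. It is not the repository's code: the function name `read_settings`, the `./genshin` fallback for `MODEL_PATH`, and the idea that `CHARACTOR` must be one of the names listed under "Available characters" are all assumptions made for illustration.

```python
import os

# Names copied from the character list in this README. Treating them as the
# only valid CHARACTOR values is an assumption.
AVAILABLE = {
    "paimon", "hutao", "furina", "kazuha", "xiao", "mona", "ganyu",
    "xiangling", "shotgun", "citlali", "barbara", "zhongli", "venti",
    "nahida", "kaeya", "yaoyao", "yoimiya", "nilou",
}


def read_settings(env=None):
    """Collect CHARACTOR, MODEL_PATH and INPUT_TEXT, failing early on mistakes."""
    env = os.environ if env is None else env

    character = env.get("CHARACTOR", "").strip().lower()
    if character not in AVAILABLE:
        raise ValueError(f"unknown CHARACTOR: {character!r}")

    text = env.get("INPUT_TEXT", "").strip()
    if not text:
        raise ValueError("INPUT_TEXT is empty")

    # Hypothetical fallback: use ./genshin when MODEL_PATH is not set.
    model_path = env.get("MODEL_PATH", "./genshin")
    return {"character": character, "model_path": model_path, "text": text}


# Example with an explicit mapping instead of the real environment.
settings = read_settings({
    "CHARACTOR": "nahida",
    "MODEL_PATH": "/kaggle/working/genshin/",
    "INPUT_TEXT": "楼下发荔枝了吗?那我们快去领取!",
})
print(settings["character"], settings["model_path"])
```

Passing a plain dictionary keeps the check easy to try out; calling `read_settings()` with no argument reads the real environment instead.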
* Download the model files. By default, this downloads only the model for the character named in the CHARACTOR environment variable and saves it to ./genshin.
<code>
python3 download.py
</code>
* Run the script to convert the text to audio. The output is written to sparktts.wav.
<code>
python3 run.py
</code>
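Once a run has finished, a quick sanity check on the output file can catch an empty or truncated result. The sketch below uses only Python's standard `wave` module and assumes the output is an ordinary PCM WAV file, which this README does not state. The helper names `describe_wav` and `write_silence` are hypothetical, and the demo file and its 16000 Hz rate are arbitrary values chosen so the example runs without a model.

```python
import os
import tempfile
import wave


def describe_wav(path):
    """Return (sample_rate_hz, channels, duration_seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        return rate, wav.getnchannels(), wav.getnframes() / rate


def write_silence(path, seconds=1.0, rate=16000):
    """Write a mono 16-bit file of silence, used here only as demo input."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))


# Demo on a throwaway file so the real output is never overwritten.
demo_path = os.path.join(tempfile.mkdtemp(), "demo_silence.wav")
write_silence(demo_path)
print(describe_wav(demo_path))
```

To inspect a real result, call `describe_wav("sparktts.wav")` from the directory where the script was run.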
# Detail
* The model was trained in float32 but saved as float16 to reduce VRAM and storage usage.
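To give a rough sense of the saving, the following sketch estimates the weight size at each precision. The parameter count of 500 million is an assumption taken from the "0.5B" in the base model name, and the figures cover weights only, not activations or file-format overhead.

```python
def weights_size_gib(num_params, bytes_per_param):
    """Size of the raw weights in GiB for a given storage precision."""
    return num_params * bytes_per_param / 1024**3


# Assumption: roughly 0.5 billion parameters, per the base model name.
NUM_PARAMS = 500_000_000

fp32_gib = weights_size_gib(NUM_PARAMS, 4)  # float32 uses 4 bytes per value
fp16_gib = weights_size_gib(NUM_PARAMS, 2)  # float16 uses 2 bytes per value

print(f"float32: {fp32_gib:.2f} GiB")  # float32: 1.86 GiB
print(f"float16: {fp16_gib:.2f} GiB")  # float16: 0.93 GiB
```

Halving the bytes per value halves the weight size, which applies to each of the per-character models.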
# Example
[▶ Furina(芙宁娜) play](https://raw.githubusercontent.com/nonwesjoe/genshin-sparktts/main/examples/furina.wav)
[▶ Kazuha(万叶) play](https://raw.githubusercontent.com/nonwesjoe/genshin-sparktts/main/examples/kazuha.wav)
[▶ Paimon(派蒙) play](https://raw.githubusercontent.com/nonwesjoe/genshin-sparktts/main/examples/paimon.wav)
[▶ Hutao(胡桃) play](https://raw.githubusercontent.com/nonwesjoe/genshin-sparktts/main/examples/hutao.wav)
[▶ Xiao(魈) play](https://raw.githubusercontent.com/nonwesjoe/genshin-sparktts/main/examples/xiao.wav)
[▶ Citlali(茜特菈莉) play](https://raw.githubusercontent.com/nonwesjoe/genshin-sparktts/main/examples/citlali.wav)
# Acknowledgement
* [Spark-TTS](https://github.com/SparkAudio/Spark-TTS)
* [Genshin dataset](https://huggingface.co/datasets/simon3000/genshin-voice)
* [Unsloth](https://github.com/unslothai/unsloth)