MerlinLi's Collections
text-to-speech
updated
FlashSpeech: Efficient Zero-Shot Speech Synthesis
Paper
•
2404.14700
•
Published
•
32
Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale
Paper
•
2306.15687
•
Published
NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models
Paper
•
2403.03100
•
Published
•
36
Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization
Paper
•
2404.09956
•
Published
•
12
Mega-TTS 2: Zero-Shot Text-to-Speech with Arbitrary Length Speech Prompts
Paper
•
2307.07218
•
Published
•
27
Mega-TTS: Zero-Shot Text-to-Speech at Scale with Intrinsic Inductive Bias
Paper
•
2306.03509
•
Published
•
5
parler-tts/dac_44khZ_8kbps
Updated
•
818
•
17
parler-tts/parler_tts_mini_v0.1
Text-to-Speech
•
Updated
•
9.8k
•
349
Wenetspeech4TTS/WenetSpeech4TTS
Updated
•
1.7k
•
70
liuhuadai/AudioLCM
Text-to-Audio
•
Updated
•
7
•
7
kyutai/mimi
Feature Extraction
•
Updated
•
170k
•
108
hexgrad/Kokoro-82M
Text-to-Speech
•
Updated
•
1.54M
•
3.61k
HKUSTAudio/Llasa-3B
Text-to-Speech
•
Updated
•
3.66k
•
470
Zyphra/Zonos-v0.1-hybrid
Text-to-Speech
•
Updated
•
58.1k
•
1.04k
stepfun-ai/Step-Audio-TTS-3B
Text-to-Speech
•
Updated
•
1.97k
•
164