Audio-Aware Large Language Models as Judges for Speaking Styles Paper • 2506.05984 • Published 6 days ago • 14
E2 TTS: Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS Paper • 2406.18009 • Published Jun 26, 2024 • 23
Making Flow-Matching-Based Zero-Shot Text-to-Speech Laugh as You Like Paper • 2402.07383 • Published Feb 12, 2024 • 16
NOTSOFAR-1 Challenge: New Datasets, Baseline, and Tasks for Distant Meeting Transcription Paper • 2401.08887 • Published Jan 16, 2024
ELLA-V: Stable Neural Codec Language Modeling with Alignment-guided Sequence Reordering Paper • 2401.07333 • Published Jan 14, 2024
SpeechX: Neural Codec Language Model as a Versatile Speech Transformer Paper • 2308.06873 • Published Aug 14, 2023 • 27