Sylvain Filoni PRO
fffiloni
AI & ML interests
ML for Animation • Alumni Arts Déco Paris • PSL
Recent Activity
posted an update about 7 hours ago
A clearer demo for TADA (now multilingual)
I improved the public demo for TADA, a generative framework for speech modeling via text-acoustic dual alignment.
TADA models speech as a joint sequence of text tokens and acoustic tokens, using a transformer backbone to keep text and audio synchronized during generation.
The original demo already exposed these mechanisms, but the workflow made the pipeline hard to understand.
This updated demo makes the process clearer:
• load the model
• prepare a reference voice (optionally with a transcript or Whisper auto-transcription)
• generate speech conditioned on that reference
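To make the "prepare a reference voice" step concrete, here is a minimal Python sketch of the optional-transcript logic: use the transcript if one is supplied, otherwise fall back to automatic transcription. This is not the demo's actual code. `ReferenceVoice`, `prepare_reference`, and `fake_transcribe` are hypothetical names, and the transcriber is a stand-in for a real speech-to-text call such as Whisper.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ReferenceVoice:
    """Hypothetical container for a reference clip and its transcript."""
    audio_path: str
    transcript: str
    auto_transcribed: bool


def prepare_reference(
    audio_path: str,
    transcript: Optional[str] = None,
    transcribe: Optional[Callable[[str], str]] = None,
) -> ReferenceVoice:
    """Use the supplied transcript if present, else fall back to transcription."""
    if transcript and transcript.strip():
        return ReferenceVoice(audio_path, transcript.strip(), False)
    if transcribe is None:
        raise ValueError("Provide a transcript or a transcribe callable")
    return ReferenceVoice(audio_path, transcribe(audio_path).strip(), True)


def fake_transcribe(path: str) -> str:
    # Stand-in for a speech-to-text call (assumption: the real demo uses Whisper here).
    return "hello from the reference clip"


# A manually supplied transcript takes priority over auto-transcription.
manual = prepare_reference("voice.wav", transcript="Bonjour tout le monde")
# With no transcript, the transcriber callable is used instead.
auto = prepare_reference("voice.wav", transcribe=fake_transcribe)
print(manual.auto_transcribed, auto.auto_transcribed)
```

Passing the transcriber in as a callable keeps the sketch independent of any particular speech-to-text library.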
It also adds multilingual support.
Presets are included for a few languages, but the model supports more:
English, French, Spanish, German, Arabic, Mandarin Chinese, Italian, Japanese, Polish, Portuguese
Feel free to try different voices, accents, or languages and see how the alignment behaves.
https://huggingface.co/spaces/fffiloni/tada-dual-alignment-tts-demo
Paper
https://huggingface.co/papers/2602.23068
updated a collection about 10 hours ago
My Spaces running on ZeroGPU
updated a Space about 10 hours ago
fffiloni/tada-dual-alignment-tts-demo