LTX 2.3 First-Last Frame
Generate video with audio from text and optional frames
A cutting-edge speech generation model with stereo support
Controllable TTS via instruction prompting (JPN / Anime)
FireRed-Image-Edit × Qwen-Image-Edit-Rapid (Transformers)
FireRed-OCR for Document Recognition
Generate spoken audio from text with custom or cloned voices
Music Generation Foundation Model v1.5
Official Playground of Microsoft VibeVoice-ASR
Transcribe audio to text with multi-language timestamps
Powered by SeedVR2 from ByteDance Seed