MOSS-TTSD: Text to Spoken Dialogue Generation
Transcribe audio from files, microphone, or YouTube
Generate personalized images with face preservation