Synthesize realistic speech from text and a reference audio sample
Generate music from text descriptions