microsoft/Phi-4-multimodal-instruct Automatic Speech Recognition • Updated 1 day ago • 231k • 1.03k
Masked Audio Generation using a Single Non-Autoregressive Transformer Paper • 2401.04577 • Published Jan 9, 2024 • 43 • 14
Running on A10G • 304 • AudioLDM2 Text2Audio Text2Music Generation 🔊 Generate a video waveform from text-based audio descriptions