BigVGAN Collection BigVGAN is a universal neural vocoder that generates audio waveform using mel spectrogram as input. • 11 items • Updated 1 day ago • 11
FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs Paper • 2407.04051 • Published Jul 4, 2024 • 36
Standard-format-preference-dataset Collection We collect the open-source datasets and process them into the standard format. • 14 items • Updated May 8, 2024 • 23
Salesforce/xgen-mm-phi3-mini-instruct-r-v1 Image-Text-to-Text • Updated Sep 18, 2024 • 1.68k • 185