Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization Paper • 2404.09956 • Published Apr 15, 2024 • 12
CLoVe: Encoding Compositional Language in Contrastive Vision-Language Models Paper • 2402.15021 • Published Feb 22, 2024 • 13