ILLUME+: Illuminating Unified MLLM with Dual Visual Tokenization and Diffusion Refinement • Paper 2504.01934 • Published 7 days ago
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions • Paper 2409.18042 • Published Sep 26, 2024