LaViDa: A Large Diffusion Language Model for Multimodal Understanding • Paper • 2505.16839 • Published 3 days ago • 10
LaViDa-1.0 • Collection • LArge VIsion-language Diffusion moDel with mAsking • 9 items • Updated 2 days ago • 4
Reflect-DiT: Inference-Time Scaling for Text-to-Image Diffusion Transformers via In-Context Reflection • Paper • 2503.12271 • Published Mar 15 • 9
OmniFlow: Any-to-Any Generation with Multi-Modal Rectified Flows • Paper • 2412.01169 • Published Dec 2, 2024 • 13