LaViDa: A Large Diffusion Language Model for Multimodal Understanding • Paper 2505.16839 • Published 3 days ago
Reflect-DiT: Inference-Time Scaling for Text-to-Image Diffusion Transformers via In-Context Reflection • Paper 2503.12271 • Published Mar 15
OmniFlow: Any-to-Any Generation with Multi-Modal Rectified Flows • Paper 2412.01169 • Published Dec 2, 2024