LEGION: Learning to Ground and Explain for Synthetic Image Detection • Paper • 2503.15264 • Published 4 days ago • 17
CoRe^2: Collect, Reflect and Refine to Generate Better and Faster • Paper • 2503.09662 • Published 11 days ago • 30
MagicInfinite: Generating Infinite Talking Videos with Your Words and Voice • Paper • 2503.05978 • Published 15 days ago • 32
LDGen: Enhancing Text-to-Image Synthesis via Large Language Model-Driven Language Representation • Paper • 2502.18302 • Published 26 days ago • 4
DICEPTION: A Generalist Diffusion Model for Visual Perceptual Tasks • Paper • 2502.17157 • Published 27 days ago • 51
VideoGrain: Modulating Space-Time Attention for Multi-grained Video Editing • Paper • 2502.17258 • Published 27 days ago • 73
Dynamic Concepts Personalization from Single Videos • Paper • 2502.14844 • Published about 1 month ago • 16
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features • Paper • 2502.14786 • Published about 1 month ago • 132
MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency • Paper • 2502.09621 • Published Feb 13 • 27
Magic 1-For-1: Generating One Minute Video Clips within One Minute • Paper • 2502.07701 • Published Feb 11 • 34
ConceptAttention: Diffusion Transformers Learn Highly Interpretable Features • Paper • 2502.04320 • Published Feb 6 • 35
Generating Multi-Image Synthetic Data for Text-to-Image Customization • Paper • 2502.01720 • Published Feb 3 • 8
VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation in Video Models • Paper • 2502.02492 • Published Feb 4 • 62
PhysBench: Benchmarking and Enhancing Vision-Language Models for Physical World Understanding • Paper • 2501.16411 • Published Jan 27 • 18
Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step • Paper • 2501.13926 • Published Jan 23 • 38
ConceptMaster: Multi-Concept Video Customization on Diffusion Transformer Models Without Test-Time Tuning • Paper • 2501.04698 • Published Jan 8 • 15