Semantics Lead the Way: Harmonizing Semantic and Texture Modeling with Asynchronous Latent Diffusion Paper • 2512.04926 • Published about 1 month ago • 41
Back to Basics: Let Denoising Generative Models Denoise Paper • 2511.13720 • Published Nov 17, 2025 • 67
Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm Paper • 2511.04570 • Published Nov 6, 2025 • 211
R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Annealing and Reinforce Learning Paper • 2508.21113 • Published Aug 28, 2025 • 110
Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens Paper • 2508.01191 • Published Aug 2, 2025 • 238
EDGE-GRPO: Entropy-Driven GRPO with Guided Error Correction for Advantage Diversity Paper • 2507.21848 • Published Jul 29, 2025 • 8
Gaussian Variation Field Diffusion for High-fidelity Video-to-4D Synthesis Paper • 2507.23785 • Published Jul 31, 2025 • 18
ReasonGen-R1 Collection Model and Datasets for the paper "ReasonGen-R1: CoT for Autoregressive Image generation models through SFT and RL • 7 items • Updated Jun 2, 2025 • 6
X-Omni: Reinforcement Learning Makes Discrete Autoregressive Image Generative Models Great Again Paper • 2507.22058 • Published Jul 29, 2025 • 39
Phi-Ground Tech Report: Advancing Perception in GUI Grounding Paper • 2507.23779 • Published Jul 31, 2025 • 44
Pixels, Patterns, but No Poetry: To See The World like Humans Paper • 2507.16863 • Published Jul 21, 2025 • 68
Unified Reward Model for Multimodal Understanding and Generation Paper • 2503.05236 • Published Mar 7, 2025 • 122