Packing Input Frame Context in Next-Frame Prediction Models for Video Generation Paper • 2504.12626 • Published 10 days ago • 48 • 3
MineWorld: a Real-Time and Open-Source Interactive World Model on Minecraft Paper • 2504.08388 • Published 16 days ago • 39 • 3
FlexIP: Dynamic Control of Preservation and Personality for Customized Image Generation Paper • 2504.07405 • Published 17 days ago • 12 • 2
VARGPT-v1.1: Improve Visual Autoregressive Large Unified Model via Iterative Instruction Tuning and Reinforcement Learning Paper • 2504.02949 • Published 24 days ago • 20 • 2
Inference-Time Scaling for Generalist Reward Modeling Paper • 2504.02495 • Published 24 days ago • 54 • 6
GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation Paper • 2504.02782 • Published 24 days ago • 56 • 3
FullDiT: Multi-Task Video Generative Foundation Model with Full Attention Paper • 2503.19907 • Published Mar 25 • 8 • 2
MagicComp: Training-free Dual-Phase Refinement for Compositional Video Generation Paper • 2503.14428 • Published Mar 18 • 8 • 2
MagicID: Hybrid Preference Optimization for ID-Consistent and Dynamic-Preserved Video Customization Paper • 2503.12689 • Published Mar 16 • 5 • 2
Concat-ID: Towards Universal Identity-Preserving Video Synthesis Paper • 2503.14151 • Published Mar 18 • 10 • 2