SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints Paper • 2412.07760 • Published Dec 10, 2024 • 54
Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step Paper • 2501.13926 • Published Jan 23 • 41
VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM Paper • 2501.00599 • Published Dec 31, 2024 • 44
TokenVerse: Versatile Multi-concept Personalization in Token Modulation Space Paper • 2501.12224 • Published Jan 21 • 47
VideoAnydoor: High-fidelity Video Object Insertion with Precise Motion Control Paper • 2501.01427 • Published Jan 2 • 54
OmniManip: Towards General Robotic Manipulation via Object-Centric Interaction Primitives as Spatial Constraints Paper • 2501.03841 • Published Jan 7 • 54
STAR: Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution Paper • 2501.02976 • Published Jan 6 • 55
CineMaster: A 3D-Aware and Controllable Framework for Cinematic Text-to-Video Generation Paper • 2502.08639 • Published Feb 12 • 40
Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model Paper • 2502.10248 • Published Feb 14 • 53