Chain-of-Thought Tokens are Computer Program Variables Paper • 2505.04955 • Published 4 days ago • 6
HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation Paper • 2505.04512 • Published 5 days ago • 32
ReVision: High-Quality, Low-Cost Video Generation with Explicit 3D Physics Modeling for Complex Motion and Interaction Paper • 2504.21855 • Published 11 days ago • 12
Article Introducing Idefics2: A Powerful 8B Vision-Language Model for the community Apr 15, 2024 • 179
KeySync: A Robust Approach for Leakage-free Lip Synchronization in High Resolution Paper • 2505.00497 • Published 11 days ago • 14
T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT Paper • 2505.00703 • Published 10 days ago • 39
Make-A-Character 2: Animatable 3D Character Generation From a Single Image Paper • 2501.07870 • Published Jan 14 • 1