UniMMVSR: A Unified Multi-Modal Framework for Cascaded Video Super-Resolution Paper • 2510.08143 • Published 4 days ago • 20
VideoCanvas: Unified Video Completion from Arbitrary Spatiotemporal Patches via In-Context Conditioning Paper • 2510.08555 • Published 4 days ago • 59
WebGen-Agent: Enhancing Interactive Website Generation with Multi-Level Feedback and Step-Level Reinforcement Learning Paper • 2509.22644 • Published 17 days ago • 19
EditVerse: Unifying Image and Video Editing and Generation with In-Context Learning Paper • 2509.20360 • Published 19 days ago • 17
ScreenCoder: Advancing Visual-to-Code Generation for Front-End Automation via Modular Multimodal Agents Paper • 2507.22827 • Published Jul 30 • 98
Learning Video Generation for Robotic Manipulation with Collaborative Trajectory Control Paper • 2506.01943 • Published Jun 2 • 25
Scaling Image and Video Generation via Test-Time Evolutionary Search Paper • 2505.17618 • Published May 23 • 41
Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLM Paper • 2503.14478 • Published Mar 18 • 48
VideoPainter: Any-length Video Inpainting and Editing with Plug-and-Play Context Control Paper • 2503.05639 • Published Mar 7 • 24
DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation Paper • 2412.18597 • Published Dec 24, 2024 • 20
3DTrajMaster: Mastering 3D Trajectory for Multi-Entity Motion in Video Generation Paper • 2412.07759 • Published Dec 10, 2024 • 18
Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction Paper • 2404.02905 • Published Apr 3, 2024 • 73
Freditor: High-Fidelity and Transferable NeRF Editing by Frequency Decomposition Paper • 2404.02514 • Published Apr 3, 2024 • 11