Minghong Cai

onevfall

https://onevfall.github.io/personal_page/

AI & ML interests

Video generation, Video editing

Recent Activity

authored a paper 4 days ago

VideoCanvas: Unified Video Completion from Arbitrary Spatiotemporal Patches via In-Context Conditioning

Organizations

None yet

upvoted 2 papers 4 days ago

UniMMVSR: A Unified Multi-Modal Framework for Cascaded Video Super-Resolution

Paper • 2510.08143 • Published 4 days ago • 20

VideoCanvas: Unified Video Completion from Arbitrary Spatiotemporal Patches via In-Context Conditioning

Paper • 2510.08555 • Published 4 days ago • 59

upvoted a paper 15 days ago

WebGen-Agent: Enhancing Interactive Website Generation with Multi-Level Feedback and Step-Level Reinforcement Learning

Paper • 2509.22644 • Published 17 days ago • 19

upvoted a paper 16 days ago

EditVerse: Unifying Image and Video Editing and Generation with In-Context Learning

Paper • 2509.20360 • Published 19 days ago • 17

upvoted a paper 2 months ago

ScreenCoder: Advancing Visual-to-Code Generation for Front-End Automation via Modular Multimodal Agents

Paper • 2507.22827 • Published Jul 30 • 98

upvoted a paper 4 months ago

Learning Video Generation for Robotic Manipulation with Collaborative Trajectory Control

Paper • 2506.01943 • Published Jun 2 • 25

upvoted 2 papers 5 months ago

Scaling Image and Video Generation via Test-Time Evolutionary Search

Paper • 2505.17618 • Published May 23 • 41

Flow-GRPO: Training Flow Matching Models via Online RL

Paper • 2505.05470 • Published May 8 • 84

upvoted 2 papers 7 months ago

Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLM

Paper • 2503.14478 • Published Mar 18 • 48

VideoPainter: Any-length Video Inpainting and Editing with Plug-and-Play Context Control

Paper • 2503.05639 • Published Mar 7 • 24

upvoted 2 papers 10 months ago

DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation

Paper • 2412.18597 • Published Dec 24, 2024 • 20

3DTrajMaster: Mastering 3D Trajectory for Multi-Entity Motion in Video Generation

Paper • 2412.07759 • Published Dec 10, 2024 • 18

upvoted 2 papers over 1 year ago

Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction

Paper • 2404.02905 • Published Apr 3, 2024 • 73

Freditor: High-Fidelity and Transferable NeRF Editing by Frequency Decomposition

Paper • 2404.02514 • Published Apr 3, 2024 • 11