MotionPro

πŸ–₯️ GitHub    |    🌐 Project Page    |   πŸ€— Hugging Face   |    πŸ“‘ Paper    |    πŸ“– PDF   

MotionPro: A Precise Motion Controller for Image-to-Video Generation

πŸ”† If you find MotionPro useful, please give this repo a ⭐; stars really help open-source projects. Thanks!

In this repository, we introduce MotionPro, an image-to-video (I2V) generation model built on Stable Video Diffusion (SVD). MotionPro learns object and camera motion control from in-the-wild video datasets (e.g., WebVid-10M) without any special data filtering. The model offers the following key features:

  • User-friendly interaction. Our model requires only simple conditional inputs, allowing users to achieve I2V motion control generation through brushing and dragging.
  • Simultaneous control of object and camera motion. Our trained MotionPro model supports simultaneous object and camera motion control. Moreover, our model can achieve precise camera control driven by pose without requiring training on a specific camera-pose paired dataset. More Details
  • Synchronized video generation. This is an extension of our model. By combining MotionPro and MotionPro-Dense, we can achieve synchronized video generation. More Details

Additionally, our repository provides several tools to support the research community:

  • Memory optimization for training. We provide a training framework based on PyTorch Lightning, optimized for memory efficiency, enabling SVD fine-tuning with a batch size of 8 per NVIDIA A100 GPU.
  • Data construction tools. We offer scripts for constructing training data, and we provide dataset-loading code in two formats, supporting video input from both folders (Dataset) and tar files (WebDataset); see the sketch after this list.
  • MC-Bench and evaluation code. We constructed MC-Bench with 1.1K user-annotated image-trajectory pairs, along with evaluation scripts for comprehensive assessments. All the images showcased on the project page can be found here.
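A minimal sketch of such a folder-based video dataset is shown below (the class name, frame count, resolution, and normalization are assumptions for illustration; the repo's actual Dataset and WebDataset classes additionally return the motion conditions used for training):

# Illustrative folder-based video Dataset; names and sizes are assumptions.
import os
import cv2
import numpy as np
import torch
from torch.utils.data import Dataset

class FolderVideoDataset(Dataset):
    def __init__(self, root, num_frames=14, size=(512, 320)):
        # Collect all .mp4 files under the given folder.
        self.paths = sorted(
            os.path.join(root, f) for f in os.listdir(root) if f.endswith(".mp4")
        )
        self.num_frames = num_frames
        self.size = size  # (width, height)

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        # Read the first num_frames frames (assumes each clip is long enough).
        cap = cv2.VideoCapture(self.paths[idx])
        frames = []
        while len(frames) < self.num_frames:
            ok, frame = cap.read()
            if not ok:
                break
            frame = cv2.resize(frame, self.size)
            frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        cap.release()
        video = np.stack(frames).astype(np.float32) / 127.5 - 1.0  # scale to [-1, 1]
        return torch.from_numpy(video).permute(0, 3, 1, 2)  # (T, C, H, W)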

Video Demos

Examples of different motion control types by our MotionPro.

πŸ”₯ Updates

  • [2025.03.26] Release inference and training code.
  • [2025.03.27] Upload gradio demo usage video.
  • [2025.03.29] Release MC-Bench and evaluation code.
  • [2025.03.30] Upload annotation tool for image-trajectory pair construction.

πŸƒπŸΌ Inference

Environment Requirement

Clone the repo:

git clone https://github.com/HiDream-ai/MotionPro.git

Install dependencies:

conda create -n motionpro python=3.10.0
conda activate motionpro
pip install -r requirements.txt
Model Download
Models and download links:

  • MotionPro (πŸ€— Huggingface): Supports both object and camera control. This is the default model described in the paper.
  • MotionPro-Dense (πŸ€— Huggingface): Supports synchronized video generation when combined with MotionPro. It shares the same architecture as MotionPro, but its input conditions are changed to dense optical flow and per-frame visibility masks relative to the first frame.

Download the models from Hugging Face at high speed (30-75 MB/s):

cd tools/huggingface_down
bash download_hfd.sh
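If the hfd script is slow or unavailable, the official huggingface_hub client is an alternative (a minimal sketch; the local directory is an assumption):

# Minimal alternative download via huggingface_hub (local_dir is an assumption).
from huggingface_hub import snapshot_download

snapshot_download(repo_id="HiDream-ai/MotionPro", local_dir="checkpoints/MotionPro")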
Run Motion Control

This part of the code supports simultaneous object motion and camera motion control. We provide a user-friendly Gradio demo that lets users control motion with simple brushing and dragging operations. An instructional video is provided in assets/demo.mp4 (please note that the demo depends on the installed Gradio version).

python demo_sparse_flex_wh.py

When you expect all pixels to move (e.g., for camera control), you need to use the brush to fully cover the entire area. You can also test the demo using assets/logo.png.

Users can also generate controllable image-to-video results using pre-defined camera trajectories. Note that our model has not been trained on any specific camera-control dataset. Test the demo using assets/sea.png.

python demo_sparse_flex_wh_pure_camera.py
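For intuition, a camera pan corresponds to a dense flow field in which every pixel shares the same per-frame displacement. A small numpy sketch of such a field (illustrative only; the demo's internal trajectory format may differ):

# Illustrative only: a uniform flow field approximating a horizontal camera pan.
import numpy as np

num_frames, height, width = 14, 320, 512  # assumed sizes
pan_pixels_per_frame = 4.0                # horizontal shift per frame

# flow[t, y, x] = (dx, dy) displacement of each pixel from frame 0 to frame t
flow = np.zeros((num_frames, height, width, 2), dtype=np.float32)
for t in range(num_frames):
    flow[t, ..., 0] = pan_pixels_per_frame * t  # every pixel moves identically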
Run synchronized video generation and video recapture

By combining MotionPro and MotionPro-Dense, we can achieve the following functionalities:

  • Synchronized video generation. We assume that two videos, pure_obj_motion.mp4 and pure_camera_motion.mp4, have been generated using the respective demos. By combining their motion flows and using the result as a condition for MotionPro-Dense, we obtain final_video. By pairing the same object motion with different camera motions, we can generate synchronized videos where the object motion remains consistent while the camera motion varies. More Details

Here, you first need to download the CoTracker model weights and place them in the tools/co-tracker/checkpoints directory.

python inference_dense.py --ori_video 'assets/cases/dog_pure_obj_motion.mp4' --camera_video 'assets/cases/dog_pure_camera_motion_1.mp4' --save_name 'syn_video.mp4' --ckpt_path 'MotionPro-Dense CKPT-PATH'
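To pair the same object motion with several camera motions, the script can simply be called once per camera video. A minimal batch driver is sketched below (the second camera file is a hypothetical placeholder, and the checkpoint path must point to your MotionPro-Dense weights):

# Hypothetical batch driver around inference_dense.py; the second camera file
# is a placeholder and the checkpoint path must point to MotionPro-Dense.
import subprocess

camera_videos = [
    "assets/cases/dog_pure_camera_motion_1.mp4",
    "assets/cases/dog_pure_camera_motion_2.mp4",  # placeholder
]
for i, camera in enumerate(camera_videos):
    subprocess.run([
        "python", "inference_dense.py",
        "--ori_video", "assets/cases/dog_pure_obj_motion.mp4",
        "--camera_video", camera,
        "--save_name", f"syn_video_{i}.mp4",
        "--ckpt_path", "MotionPro-Dense CKPT-PATH",
    ], check=True)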

πŸš€ Training

Data Prepare

We have packaged several demo videos to help users debug the training code. Simply download them from πŸ€— Hugging Face, extract the files, and place them in the ./data directory.

Additionally, ./data/dot_single_video contains code for processing raw videos using DOT to generate the necessary conditions for training, making it easier for the community to create training datasets.
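As a rough illustration of what such preprocessing produces, the sketch below derives a sparse-flow condition and mask from a per-frame dense flow map by sampling a few pixels (the repo's actual sampling scheme and condition format may differ):

# Illustrative only: derive a sparse-flow condition from a dense flow map by
# sampling a few pixel locations; the repo's actual format may differ.
import numpy as np

def sample_sparse_flow(dense_flow, num_points=10, seed=0):
    # dense_flow: (H, W, 2) displacement map for one frame
    h, w, _ = dense_flow.shape
    rng = np.random.default_rng(seed)
    ys = rng.integers(0, h, size=num_points)
    xs = rng.integers(0, w, size=num_points)

    sparse_flow = np.zeros_like(dense_flow)
    mask = np.zeros((h, w, 1), dtype=np.float32)
    sparse_flow[ys, xs] = dense_flow[ys, xs]
    mask[ys, xs] = 1.0
    return sparse_flow, mask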

Train

Simply run the following command to train MotionPro:

bash train_server_1.sh

In addition to loading video data from folders, we also support WebDataset, allowing videos to be read directly from tar files for training. This can be enabled by modifying the config file:

train_debug_from_folder.yaml -> train_debug_from_tar.yaml 

Furthermore, to train the MotionPro-Dense model, simply modify the train_debug_from_tar.yaml file by changing VidTar to VidTar_all_flow and updating the ckpt_path.

🌟 Star and Citation

If you find our work helpful for your research, please consider giving a star⭐ on this repository and citing our workπŸ“.

@inproceedings{2025motionpro,
  title={MotionPro: A Precise Motion Controller for Image-to-Video Generation},
  author={Zhongwei Zhang and Fuchen Long and Zhaofan Qiu and Yingwei Pan and Wu Liu and Ting Yao and Tao Mei},
  booktitle={CVPR},
  year={2025}
}

πŸ’– Acknowledgement

Our code is inspired by several works, including SVD, DragNUWA, DOT, and CoTracker. Thanks to all the contributors!
