angtian committed on
Commit
cece5c4
·
verified ·
1 Parent(s): da61661

Add Readme

.gitattributes CHANGED
@@ -33,3 +33,16 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
33
  *.zip filter=lfs diff=lfs merge=lfs -text
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
+ assets/editor1.PNG filter=lfs diff=lfs merge=lfs -text
37
+ assets/editor2.PNG filter=lfs diff=lfs merge=lfs -text
38
+ assets/editor5.PNG filter=lfs diff=lfs merge=lfs -text
39
+ assets/examples/00.gif filter=lfs diff=lfs merge=lfs -text
40
+ assets/examples/00.jpg filter=lfs diff=lfs merge=lfs -text
41
+ assets/examples/01.gif filter=lfs diff=lfs merge=lfs -text
42
+ assets/examples/02.gif filter=lfs diff=lfs merge=lfs -text
43
+ assets/examples/02.jpg filter=lfs diff=lfs merge=lfs -text
44
+ assets/examples/03.gif filter=lfs diff=lfs merge=lfs -text
45
+ assets/examples/04.gif filter=lfs diff=lfs merge=lfs -text
46
+ assets/examples/04.jpg filter=lfs diff=lfs merge=lfs -text
47
+ assets/examples/05.gif filter=lfs diff=lfs merge=lfs -text
48
+ assets/examples/05.jpg filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,3 +1,184 @@
1
- ---
2
- license: apache-2.0
3
- ---
1
+ # ATI: Any Trajectory Instruction for Controllable Video Generation
2
+
3
+ <div align="center">
4
+
5
+ [![arXiv](https://img.shields.io/badge/arXiv%20paper-2505.22944-b31b1b.svg)](https://arxiv.org/pdf/2505.22944)&nbsp;
6
+ [![project page](https://img.shields.io/badge/Project_page-ATI-green)](https://anytraj.github.io/)&nbsp;
7
+ <a href="https://huggingface.co/bytedance-research/ATI/"><img src="https://img.shields.io/static/v1?label=%F0%9F%A4%97%20Hugging%20Face&message=Model&color=orange"></a>
8
+ </div>
9
+
10
+
11
+ > [**ATI: Any Trajectory Instruction for Controllable Video Generation**](https://anytraj.github.io/)<br>
12
+ > [Angtian Wang](https://angtianwang.github.io/), [Haibin Huang](https://brotherhuang.github.io/), Jacob Zhiyuan Fang, [Yiding Yang](https://ihollywhy.github.io/), [Chongyang Ma](http://www.chongyangma.com/),
13
+ > <br>Intelligent Creation Team, ByteDance<br>
14
+
15
+ [![Watch the video](assets/thumbnail.jpg)](https://youtu.be/76jjPT0f8Hs)
16
+
17
+ This is the repo for Wan2.1 ATI (Any Trajectory Instruction for Controllable Video Generation), a trajectory-based motion control framework that unifies object, local, and camera movements in video generation. This repo is based on the [official Wan2.1 implementation](https://github.com/Wan-Video/Wan2.1).
18
+
19
+ ## Install
20
+
21
+ ATI requires the same environment as the official Wan2.1. Follow the instructions in INSTALL.md (Wan2.1).
22
+
23
+ ```
24
+ git clone https://github.com/bytedance/ATI.git
25
+ cd ATI
26
+ ```
27
+
28
+ Install packages
29
+
30
+ ```
31
+ pip install .
32
+ ```
33
+
34
+ First, download the original 14B Wan2.1 model.
35
+
36
+ ```
37
+ huggingface-cli download Wan-AI/Wan2.1-I2V-14B-480P --local-dir ./Wan2.1-I2V-14B-480P
38
+ ```
39
+
40
+ Then download the ATI-Wan model from our Hugging Face repo.
41
+
42
+ ```
43
+ huggingface-cli download bytedance-research/ATI --local-dir ./Wan2.1-ATI-14B-480P
44
+ ```
45
+
46
+ Finally, copy the VAE, T5, and other miscellaneous checkpoints from the original Wan2.1 folder to the ATI checkpoint location:
47
+
48
+ ```
49
+ cp ./Wan2.1-I2V-14B-480P/Wan2.1_VAE.pth ./Wan2.1-ATI-14B-480P/
50
+ cp ./Wan2.1-I2V-14B-480P/models_t5_umt5-xxl-enc-bf16.pth ./Wan2.1-ATI-14B-480P/
51
+ cp ./Wan2.1-I2V-14B-480P/models_clip_open-clip-xlm-roberta-large-vit-huge-14.pth ./Wan2.1-ATI-14B-480P/
52
+ cp -r ./Wan2.1-I2V-14B-480P/xlm-roberta-large ./Wan2.1-ATI-14B-480P/
53
+ cp -r ./Wan2.1-I2V-14B-480P/google ./Wan2.1-ATI-14B-480P/
54
+ ```
55
+
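After the copy step, it can help to confirm that everything landed in the ATI checkpoint folder. The sketch below is not part of the repo: `check_ati_ckpt` is a hypothetical helper name, and the file list is simply the five items copied by the `cp` commands above.

```shell
# Hypothetical helper (not shipped with ATI): report which of the items
# copied in the step above are missing from the checkpoint folder.
check_ati_ckpt() {
  local dir="$1" missing=0 item
  for item in \
    Wan2.1_VAE.pth \
    models_t5_umt5-xxl-enc-bf16.pth \
    models_clip_open-clip-xlm-roberta-large-vit-huge-14.pth \
    xlm-roberta-large \
    google; do
    if [ ! -e "$dir/$item" ]; then
      echo "missing: $dir/$item"
      missing=1
    fi
  done
  # Returns 0 when every item is present, 1 otherwise.
  return "$missing"
}
```

For example, `check_ati_ckpt ./Wan2.1-ATI-14B-480P && echo "checkpoint folder looks complete"`. The check only tests for existence; it does not validate file contents or sizes.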
56
+ ## Run
57
+
58
+ First, download the ATI Wan2.1 model from our Hugging Face page.
59
+
60
+ We provide a demo script to run ATI.
61
+
62
+ ```
63
+ bash run_example.sh -p examples/test.yaml -c ./Wan2.1-ATI-14B-480P -o samples
64
+ ```
65
+ where `-p` is the path to the config file, `-c` is the path to the checkpoint, `-o` is the path to the output directory, and `-g` sets the number of GPUs to use (if unspecified, all available GPUs are used; if `1` is given, the script runs in single-process mode).
66
+
67
+ Once finished, you should find:
68
+ - `samples/outputs` for the raw output videos.
69
+ - `samples/images_tracks` for the input images together with the user-specified trajectories.
70
+ - `samples/outputs_vis` for the output videos together with the user-specified trajectories.
71
+
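The three result folders listed above can be checked in one pass once a run has finished. This is a minimal sketch, not a script from the repo: `check_ati_outputs` is a hypothetical name, and it assumes the folder names are exactly `outputs`, `images_tracks`, and `outputs_vis` under whatever directory was passed to `-o`.

```shell
# Hypothetical helper (not shipped with ATI): confirm that the three result
# folders described above exist under the output directory given to -o.
check_ati_outputs() {
  local out="$1" status=0 sub
  for sub in outputs images_tracks outputs_vis; do
    if [ -d "$out/$sub" ]; then
      echo "found: $out/$sub"
    else
      echo "missing: $out/$sub"
      status=1
    fi
  done
  # Returns 0 only when all three folders exist.
  return "$status"
}
```

With the demo command above, the call would be `check_ati_outputs samples`.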
72
+ Expected results:
73
+
74
+
75
+ <table style="width: 100%; border-collapse: collapse; text-align: center; border: 1px solid #ccc;">
76
+ <tr>
77
+ <th style="text-align: center;">
78
+ <strong>Input Image & Trajectory</strong>
79
+ </th>
80
+ <th style="text-align: center;">
81
+ <strong>Generated Videos (Superimposed Trajectories)</strong>
82
+ </th>
83
+ </tr>
84
+
85
+ <tr>
86
+ <td style="text-align: center; vertical-align: middle;">
87
+ <img src="assets/examples/00.jpg" alt="Image 0" style="height: 240px;">
88
+ </td>
89
+ <td style="text-align: center; vertical-align: middle;">
90
+ <img src="assets/examples/00.gif" alt="Image 0" style="height: 240px;">
91
+ </td>
92
+ </tr>
93
+
94
+ <tr>
95
+ <td style="text-align: center; vertical-align: middle;">
96
+ <img src="assets/examples/01.jpg" alt="Image 1" style="height: 240px;">
97
+ </td>
98
+ <td style="text-align: center; vertical-align: middle;">
99
+ <img src="assets/examples/01.gif" alt="Image 1" style="height: 240px;">
100
+ </td>
101
+ </tr>
102
+
103
+ <tr>
104
+ <td style="text-align: center; vertical-align: middle;">
105
+ <img src="assets/examples/02.jpg" alt="Image 2" style="height: 160px;">
106
+ </td>
107
+ <td style="text-align: center; vertical-align: middle;">
108
+ <img src="assets/examples/02.gif" alt="Image 2" style="height: 160px;">
109
+ </td>
110
+ </tr>
111
+
112
+
113
+ <tr>
114
+ <td style="text-align: center; vertical-align: middle;">
115
+ <img src="assets/examples/03.jpg" alt="Image 3" style="height: 220px;">
116
+ </td>
117
+ <td style="text-align: center; vertical-align: middle;">
118
+ <img src="assets/examples/03.gif" alt="Image 3" style="height: 220px;">
119
+ </td>
120
+ </tr>
121
+
122
+ <tr>
123
+ <td style="text-align: center; vertical-align: middle;">
124
+ <img src="assets/examples/04.jpg" alt="Image 4" style="height: 240px;">
125
+ </td>
126
+ <td style="text-align: center; vertical-align: middle;">
127
+ <img src="assets/examples/04.gif" alt="Image 4" style="height: 240px;">
128
+ </td>
129
+ </tr>
130
+
131
+ <tr>
132
+ <td style="text-align: center; vertical-align: middle;">
133
+ <img src="assets/examples/05.jpg" alt="Image 5" style="height: 160px;">
134
+ </td>
135
+ <td style="text-align: center; vertical-align: middle;">
136
+ <img src="assets/examples/05.gif" alt="Image 5" style="height: 160px;">
137
+ </td>
138
+ </tr>
139
+ </table>
140
+
141
+
142
+ ## Create Your Own Trajectory
143
+
144
+ We provide an interactive tool that allows users to draw and edit trajectories on their images.
145
+
146
+ 1. First run:
147
+ ```
148
+ cd tools/trajectory_editor
149
+ python3 app.py
150
+ ```
151
+ Then open [localhost:5000](http://localhost:5000/) in your browser. Note: if you run the editor on a remote server, replace `localhost` with the server's IP address.
152
+
153
+ 2. The interface shown below appears; click **Choose File** to open a local image.
154
+ ![Interface Screenshot](assets/editor0.PNG)
155
+
156
+ 3. Available trajectory functions:
157
+ ![Trajectory Functions](assets/editor1.PNG)
158
+
159
+ a. **Free Trajectory**: Click and then drag with the mouse directly on the image.
160
+ b. **Circular (Camera Control)**:
161
+ - Place a circle on the image, then drag to set its size for frame 0.
162
+ - Place a few (3–4 recommended) track points on the circle.
163
+ - Drag the radius control to achieve zoom-in/zoom-out effects.
164
+
165
+ c. **Static Point**: A point that remains stationary over time.
166
+
167
+ *Note:* Pay attention to the progress bar in the box to control motion speed.
168
+ ![Progress Control](assets/editor2.PNG)
169
+
170
+ 4. **Trajectory Editing**: Select a trajectory here, then delete, edit, or copy it. In edit mode, drag the trajectory directly on the image. The selected trajectory is highlighted by color.
171
+ ![Trajectory Editing](assets/editor3.PNG)
172
+
173
+ 5. **Camera Pan Control**: Enter horizontal (X) or vertical (Y) speed (pixels per frame). Positive X moves right; negative X moves left. Positive Y moves down; negative Y moves up. Click **Add to Selected** to apply to the current trajectory, or **Add to All** to apply to all trajectories. The selected points will gain a constant pan motion on top of their existing movement.
174
+ ![Camera Pan Control](assets/editor4.PNG)
175
+
176
+ 6. **Important:** After editing, click **Store Tracks** to save. Each image (not each trajectory) must be saved separately after drawing all trajectories.
177
+ ![Store Tracks](assets/editor5.PNG)
178
+
179
+ 7. Once all edits are complete, locate the `videos_example` folder in the **Trajectory Editor**.
180
+
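To make the Camera Pan Control in step 5 concrete: with speeds given in pixels per frame, a point's pan offset grows linearly with the frame index. The sketch below illustrates that arithmetic for a point that is otherwise static. `pan_point` is a hypothetical helper, not the editor's actual code, and it assumes image coordinates with the origin at the top left, so positive Y moves down as described in step 5.

```shell
# Hypothetical illustration (not the editor's code): position of an otherwise
# static track point after t frames of constant pan.
# Arguments: x0 y0 (start position, pixels), vx vy (pixels per frame), t (frame index).
# Integer arithmetic only, since shell $(( )) does not handle fractions.
pan_point() {
  local x0="$1" y0="$2" vx="$3" vy="$4" t="$5"
  echo "$(( x0 + vx * t )) $(( y0 + vy * t ))"
}

# X speed 2 (right) and Y speed -1 (up), evaluated at frame 10.
pan_point 100 50 2 -1 10   # prints "120 40"
```

For a trajectory that already moves, the same per-frame offset would be added to its position at each frame, which matches the "on top of their existing movement" wording in step 5.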
181
+
182
+ ## Citation
183
+ Please cite our paper if you find our work useful:
184
+
assets/editor0.PNG ADDED
assets/editor1.PNG ADDED

Git LFS Details

  • SHA256: 3f01392f632dc5ad7289ee9995ca8d0e6e601d1f841719fbb1850991158eec5d
  • Pointer size: 132 Bytes
  • Size of remote file: 1.93 MB
assets/editor2.PNG ADDED

Git LFS Details

  • SHA256: 2d9fee9414cbc51bffa9396bd3dd5f9849a92b8ae6ecb293c3216d20915135de
  • Pointer size: 131 Bytes
  • Size of remote file: 554 kB
assets/editor3.PNG ADDED
assets/editor4.PNG ADDED
assets/editor5.PNG ADDED

Git LFS Details

  • SHA256: 5664bc02d61972ea637cc2c5a4995ef0ced4852013cd34ebc0d199778681386f
  • Pointer size: 131 Bytes
  • Size of remote file: 554 kB
assets/examples/00.gif ADDED

Git LFS Details

  • SHA256: 6d1120c794abd095f7ee97a1385d98eb5f7a8760bf2c68bab95f4c4eb8391926
  • Pointer size: 133 Bytes
  • Size of remote file: 12.9 MB
assets/examples/00.jpg ADDED

Git LFS Details

  • SHA256: 0f6cf0edbc2915fcb58c5ec845e62c31a4edd8ce2901d2a73fa165d0e05e48d1
  • Pointer size: 131 Bytes
  • Size of remote file: 112 kB
assets/examples/01.gif ADDED

Git LFS Details

  • SHA256: 661dc60dfdf1ab1b26d9a193b89d6a416d039c80a1a2edc4e86c80ca53de4f5e
  • Pointer size: 132 Bytes
  • Size of remote file: 8.55 MB
assets/examples/01.jpg ADDED
assets/examples/02.gif ADDED

Git LFS Details

  • SHA256: 053db8a04980672dfd948d37f891a47a22da63f72fc7da1467c9e12271e7ef49
  • Pointer size: 133 Bytes
  • Size of remote file: 12.6 MB
assets/examples/02.jpg ADDED

Git LFS Details

  • SHA256: fb188f806aee8f4a1d56d866ea3cbbf64cf5c5f8587e5f633b8e582693f383db
  • Pointer size: 131 Bytes
  • Size of remote file: 268 kB
assets/examples/03.gif ADDED

Git LFS Details

  • SHA256: 607aea4127587da331983b40d55d1c61f2dc7037a5e43815583edf3b204f49f6
  • Pointer size: 132 Bytes
  • Size of remote file: 3.76 MB
assets/examples/03.jpg ADDED
assets/examples/04.gif ADDED

Git LFS Details

  • SHA256: c94420acc9a80d7ae518b5de6855eac11bd1d18cb11033d6f9f300cded474d52
  • Pointer size: 132 Bytes
  • Size of remote file: 5.47 MB
assets/examples/04.jpg ADDED

Git LFS Details

  • SHA256: 581845477c0cc1309e4b7a7a5ac748a228478ff5158c6ae4dd5463ff7c949f77
  • Pointer size: 131 Bytes
  • Size of remote file: 501 kB
assets/examples/05.gif ADDED

Git LFS Details

  • SHA256: 5eca33b954dd7cf780f07c82bbf4f23d9333ff756a5ee7c3f9eb345cdf77e722
  • Pointer size: 133 Bytes
  • Size of remote file: 15 MB
assets/examples/05.jpg ADDED

Git LFS Details

  • SHA256: 325e453692aee4aa301d56e5a3746b90dd099c255b43ff0c3306f6d134c7ed27
  • Pointer size: 131 Bytes
  • Size of remote file: 793 kB
assets/thumbnail.jpg ADDED