No changes necessary
No changes are necessary.
README.md
CHANGED
@@ -1,14 +1,15 @@
|
|
1 |
---
|
2 |
-
license: apache-2.0
|
3 |
language:
|
4 |
- en
|
5 |
- zh
|
6 |
-
pipeline_tag: image-to-video
|
7 |
library_name: diffusers
|
|
|
|
|
8 |
tags:
|
9 |
- video
|
10 |
- video-generation
|
11 |
---
|
|
|
12 |
# Wan2.1
|
13 |
|
14 |
<p align="center">
|
@@ -16,12 +17,12 @@ tags:
|
|
16 |
<p>
|
17 |
|
18 |
<p align="center">
|
19 |
-
<a href=""><b>Wan</b></a> | 🖥️ <a href="https://github.com/Wan-Video/Wan2.1">GitHub</a> | 🤗 <a href="https://huggingface.co/Wan-AI/">Hugging Face</a> | 🤖 <a href="https://modelscope.cn/organization/Wan-AI">ModelScope</a> | <a href="">
|
20 |
<br>
|
21 |
|
22 |
-----
|
23 |
|
24 |
-
[**Wan: Open and Advanced Large-Scale Video Generative Models**]() <br>
|
25 |
|
26 |
In this repository, we present **Wan2.1**, a comprehensive and open suite of video foundation models that pushes the boundaries of video generation. **Wan2.1** offers these key features:
|
27 |
- **SOTA Performance**: **Wan2.1** consistently outperforms existing open-source models and state-of-the-art commercial solutions across multiple benchmarks.
|
@@ -45,7 +46,15 @@ This repo contains our I2V-14B model, which is capable of generating 480P videos
|
|
45 |
|
46 |
## 🔥 Latest News!!
|
47 |
|
48 |
-
*
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
49 |
|
50 |
|
51 |
## Todo List
|
@@ -53,27 +62,29 @@ This repo contains our I2V-14B model, which is capable of generating 480P videos
|
|
53 |
- [x] Multi-GPU Inference code of the 14B and 1.3B models
|
54 |
- [x] Checkpoints of the 14B and 1.3B models
|
55 |
- [x] Gradio demo
|
56 |
-
- [
|
57 |
-
- [
|
|
|
58 |
- Wan2.1 Image-to-Video
|
59 |
- [x] Multi-GPU Inference code of the 14B model
|
60 |
- [x] Checkpoints of the 14B model
|
61 |
- [x] Gradio demo
|
62 |
-
- [
|
63 |
-
- [
|
|
|
64 |
|
65 |
|
66 |
## Quickstart
|
67 |
|
68 |
#### Installation
|
69 |
Clone the repo:
|
70 |
-
```
|
71 |
git clone https://github.com/Wan-Video/Wan2.1.git
|
72 |
cd Wan2.1
|
73 |
```
|
74 |
|
75 |
Install dependencies:
|
76 |
-
```
|
77 |
# Ensure torch >= 2.4.0
|
78 |
pip install -r requirements.txt
|
79 |
```
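The `# Ensure torch >= 2.4.0` comment can be turned into an explicit check. A minimal sketch, assuming GNU `sort -V` is available; the helper name `version_ge` is hypothetical and not part of the Wan2.1 repo:

```sh
# Hypothetical helper (not part of Wan2.1): succeeds when version $1 >= version $2.
# Assumes GNU coreutils `sort -V` for version-aware ordering.
version_ge() {
    local have="${1%%+*}"   # drop a local build suffix such as "+cu121"
    local want="$2"
    # The smaller of the two versions sorts first; $1 >= $2 exactly when that one is $2.
    [ "$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)" = "$want" ]
}

# Usage (requires torch to be installed already):
#   version_ge "$(python -c 'import torch; print(torch.__version__)')" 2.4.0 \
#       || echo "torch is older than 2.4.0" >&2
```

Running the check before `pip install -r requirements.txt` is optional; it only reports the installed version and changes nothing.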
|
@@ -91,16 +102,16 @@ pip install -r requirements.txt
|
|
91 |
> 💡Note: The 1.3B model is capable of generating videos at 720P resolution. However, due to limited training at this resolution, the results are generally less stable compared to 480P. For optimal performance, we recommend using 480P resolution.
|
92 |
|
93 |
|
94 |
-
Download models using
|
95 |
-
```
|
96 |
pip install "huggingface_hub[cli]"
|
97 |
-
huggingface-cli download Wan-AI/Wan2.1-
|
98 |
```
|
99 |
|
100 |
-
Download models using
|
101 |
-
```
|
102 |
pip install modelscope
|
103 |
-
modelscope download Wan-AI/Wan2.1-
|
104 |
```
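Once a download finishes, it can help to confirm that the target directory exists and is not empty before pointing `--ckpt_dir` at it. A sketch under that assumption; `ckpt_dir_ready` is a hypothetical helper, and it makes no claim about which files a checkpoint should contain:

```sh
# Hypothetical helper: succeeds when the given path is a non-empty directory.
ckpt_dir_ready() {
    local dir="$1"
    # -d: the path exists and is a directory; ls -A lists entries including dotfiles.
    [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}

# Usage (directory name taken from the commands in this README):
#   ckpt_dir_ready ./Wan2.1-I2V-14B-480P \
#       || echo "checkpoint directory is missing or empty" >&2
```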
|
105 |
|
106 |
#### Run Image-to-Video Generation
|
@@ -135,10 +146,10 @@ Similar to Text-to-Video, Image-to-Video is also divided into processes with and
|
|
135 |
</table>
|
136 |
|
137 |
|
138 |
-
##### (1) Without Prompt
|
139 |
|
140 |
- Single-GPU inference
|
141 |
-
```
|
142 |
python generate.py --task i2v-14B --size 832*480 --ckpt_dir ./Wan2.1-I2V-14B-480P --image examples/i2v_input.JPG --prompt "Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside."
|
143 |
```
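The `--size 832*480` argument contains `*`, which the shell treats as a glob character. It is passed through unchanged only while no file in the working directory matches the pattern, so quoting it is the safer habit. A small self-contained demonstration, using a throwaway directory and a hypothetical file name:

```sh
# Demonstration only: shows how an unquoted 832*480 can be rewritten by the shell.
demo_dir="$(mktemp -d)"
cd "$demo_dir" || exit 1

# Hypothetical file whose name happens to match the pattern 832*480.
touch "832_frames_480"

set -- 832*480          # unquoted: the glob expands to the matching file name
unquoted="$1"
set -- '832*480'        # quoted: the literal string reaches the program
quoted="$1"

echo "unquoted -> $unquoted"
echo "quoted   -> $quoted"

cd / && rm -rf "$demo_dir"
```

For the commands in this README, that would mean writing `--size '832*480'`.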
|
144 |
|
@@ -146,26 +157,26 @@ python generate.py --task i2v-14B --size 832*480 --ckpt_dir ./Wan2.1-I2V-14B-480
|
|
146 |
|
147 |
- Multi-GPU inference using FSDP + xDiT USP
|
148 |
|
149 |
-
```
|
150 |
pip install "xfuser>=0.4.1"
|
151 |
torchrun --nproc_per_node=8 generate.py --task i2v-14B --size 832*480 --ckpt_dir ./Wan2.1-I2V-14B-480P --image examples/i2v_input.JPG --dit_fsdp --t5_fsdp --ulysses_size 8 --prompt "Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside."
|
152 |
```
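The multi-GPU command uses the value 8 twice, for `--nproc_per_node` and for `--ulysses_size`. A sketch that keeps both in sync through one variable, assuming the two are meant to match as they do in the example; `NUM_GPUS` and `build_multi_gpu_cmd` are hypothetical names, and the function only prints the command instead of running it:

```sh
# Hypothetical wrapper: prints the torchrun command for a given GPU count.
# Assumption: --ulysses_size should equal --nproc_per_node, as in the example.
build_multi_gpu_cmd() {
    local num_gpus="$1"
    printf 'torchrun --nproc_per_node=%s generate.py --task i2v-14B --size %s --ckpt_dir %s --image %s --dit_fsdp --t5_fsdp --ulysses_size %s\n' \
        "$num_gpus" "'832*480'" "./Wan2.1-I2V-14B-480P" "examples/i2v_input.JPG" "$num_gpus"
}

NUM_GPUS=8
build_multi_gpu_cmd "$NUM_GPUS"
```

The printed line omits `--prompt`; append it as in the original command before running.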
|
153 |
|
154 |
-
##### (2) Using Prompt
|
155 |
|
156 |
-
Run with local prompt
|
157 |
-
```
|
158 |
python generate.py --task i2v-14B --size 832*480 --ckpt_dir ./Wan2.1-I2V-14B-480P --image examples/i2v_input.JPG --use_prompt_extend --prompt_extend_model Qwen/Qwen2.5-VL-7B-Instruct --prompt "Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside."
|
159 |
```
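Every command in this section repeats the same long prompt. One way to avoid copy-paste drift is to hold it in a shell variable and pass it quoted, so that it still reaches `generate.py` as a single argument. A sketch with a shortened prompt; `PROMPT` and `count_args` are hypothetical names used only for the demonstration:

```sh
# Shortened version of the prompt used in the commands above.
PROMPT="Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard."

# Hypothetical helper: reports how many arguments a command would receive.
count_args() { echo "$#"; }

count_args --prompt "$PROMPT"   # quoted: the flag plus the whole prompt as one argument
count_args --prompt $PROMPT     # unquoted: the prompt is split on whitespace

# Intended use (not executed here):
#   python generate.py --task i2v-14B --size '832*480' \
#       --ckpt_dir ./Wan2.1-I2V-14B-480P --image examples/i2v_input.JPG \
#       --prompt "$PROMPT"
```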
|
160 |
|
161 |
-
Run with remote prompt
|
162 |
-
```
|
163 |
DASH_API_KEY=your_key python generate.py --task i2v-14B --size 832*480 --ckpt_dir ./Wan2.1-I2V-14B-480P --image examples/i2v_input.JPG --use_prompt_extend --prompt_extend_method 'dashscope' --prompt "Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside."
|
164 |
```
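The remote variant passes `DASH_API_KEY` on the command line, so an unset key is an easy mistake. A minimal guard that fails early with a clear message could look like this; `check_dash_api_key` is a hypothetical helper, not something the repo provides:

```sh
# Hypothetical helper: fails when DASH_API_KEY is unset or empty.
check_dash_api_key() {
    if [ -z "${DASH_API_KEY:-}" ]; then
        echo "DASH_API_KEY is not set; the dashscope example expects it." >&2
        return 1
    fi
}

# Usage:
#   check_dash_api_key && python generate.py --use_prompt_extend \
#       --prompt_extend_method 'dashscope' ...   # remaining flags as in the command above
```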
|
165 |
|
166 |
##### (3) Running local Gradio
|
167 |
|
168 |
-
```
|
169 |
cd gradio
|
170 |
# if you only use the 480P model in gradio
|
171 |
DASH_API_KEY=your_key python i2v_14B_singleGPU.py --prompt_extend_method 'dashscope' --ckpt_dir_480p ./Wan2.1-I2V-14B-480P
|
|
|
1 |
---
|
|
|
2 |
language:
|
3 |
- en
|
4 |
- zh
|
|
|
5 |
library_name: diffusers
|
6 |
+
license: apache-2.0
|
7 |
+
pipeline_tag: image-to-video
|
8 |
tags:
|
9 |
- video
|
10 |
- video-generation
|
11 |
---
|
12 |
+
|
13 |
# Wan2.1
|
14 |
|
15 |
<p align="center">
|
|
|
17 |
<p>
|
18 |
|
19 |
<p align="center">
|
20 |
+
<a href="https://wan.video"><b>Wan</b></a> | 🖥️ <a href="https://github.com/Wan-Video/Wan2.1">GitHub</a> | 🤗 <a href="https://huggingface.co/Wan-AI/">Hugging Face</a> | 🤖 <a href="https://modelscope.cn/organization/Wan-AI">ModelScope</a> | <a href="https://files.alicdn.com/tpsservice/5c9de1c74de03972b7aa657e5a54756b.pdf">Technical Report</a> | <a href="https://wan.video/welcome?spm=a2ty_o02.30011076.0.0.6c9ee41eCcluqg">Blog</a> | 💬 <a href="https://gw.alicdn.com/imgextra/i2/O1CN01tqjWFi1ByuyehkTSB_!!6000000000015-0-tps-611-1279.jpg">WeChat Group</a> | <a href="https://discord.gg/AKNgpMK4Yj">Discord</a>
|
21 |
<br>
|
22 |
|
23 |
-----
|
24 |
|
25 |
+
[**Wan: Open and Advanced Large-Scale Video Generative Models**]("") <br>
|
26 |
|
27 |
In this repository, we present **Wan2.1**, a comprehensive and open suite of video foundation models that pushes the boundaries of video generation. **Wan2.1** offers these key features:
|
28 |
- **SOTA Performance**: **Wan2.1** consistently outperforms existing open-source models and state-of-the-art commercial solutions across multiple benchmarks.
|
|
|
46 |
|
47 |
## 🔥 Latest News!!
|
48 |
|
49 |
+
* Mar 21, 2025: We are excited to announce the release of the **Wan2.1** [technical report](https://files.alicdn.com/tpsservice/5c9de1c74de03972b7aa657e5a54756b.pdf). We welcome discussions and feedback!
|
50 |
+
* Mar 3, 2025: **Wan2.1**'s T2V and I2V have been integrated into Diffusers ([T2V](https://huggingface.co/docs/diffusers/main/en/api/pipelines/wan#diffusers.WanPipeline) | [I2V](https://huggingface.co/docs/diffusers/main/en/api/pipelines/wan#diffusers.WanImageToVideoPipeline)). Feel free to give it a try!
|
51 |
+
* Feb 27, 2025: **Wan2.1** has been integrated into [ComfyUI](https://comfyanonymous.github.io/ComfyUI_examples/wan/). Enjoy!
|
52 |
+
* Feb 25, 2025: We've released the inference code and weights of **Wan2.1**.
|
53 |
+
|
54 |
+
## Community Works
|
55 |
+
If your work has improved **Wan2.1** and you would like more people to see it, please inform us.
|
56 |
+
- [TeaCache](https://github.com/ali-vilab/TeaCache) now supports **Wan2.1** acceleration, capable of increasing speed by approximately 2x. Feel free to give it a try!
|
57 |
+
- [DiffSynth-Studio](https://github.com/modelscope/DiffSynth-Studio) provides more support for **Wan2.1**, including video-to-video, FP8 quantization, VRAM optimization, LoRA training, and more. Please refer to [their examples](https://github.com/modelscope/DiffSynth-Studio/tree/main/examples/wanvideo).
|
58 |
|
59 |
|
60 |
## Todo List
|
|
|
62 |
- [x] Multi-GPU Inference code of the 14B and 1.3B models
|
63 |
- [x] Checkpoints of the 14B and 1.3B models
|
64 |
- [x] Gradio demo
|
65 |
+
- [x] ComfyUI integration
|
66 |
+
- [x] Diffusers integration
|
67 |
+
- [ ] Diffusers + Multi-GPU Inference
|
68 |
- Wan2.1 Image-to-Video
|
69 |
- [x] Multi-GPU Inference code of the 14B model
|
70 |
- [x] Checkpoints of the 14B model
|
71 |
- [x] Gradio demo
|
72 |
+
- [x] ComfyUI integration
|
73 |
+
- [x] Diffusers integration
|
74 |
+
- [ ] Diffusers + Multi-GPU Inference
|
75 |
|
76 |
|
77 |
## Quickstart
|
78 |
|
79 |
#### Installation
|
80 |
Clone the repo:
|
81 |
+
```sh
|
82 |
git clone https://github.com/Wan-Video/Wan2.1.git
|
83 |
cd Wan2.1
|
84 |
```
|
85 |
|
86 |
Install dependencies:
|
87 |
+
```sh
|
88 |
# Ensure torch >= 2.4.0
|
89 |
pip install -r requirements.txt
|
90 |
```
|
|
|
102 |
> 💡Note: The 1.3B model is capable of generating videos at 720P resolution. However, due to limited training at this resolution, the results are generally less stable compared to 480P. For optimal performance, we recommend using 480P resolution.
|
103 |
|
104 |
|
105 |
+
Download models using huggingface-cli:
|
106 |
+
``` sh
|
107 |
pip install "huggingface_hub[cli]"
|
108 |
+
huggingface-cli download Wan-AI/Wan2.1-T2V-14B --local-dir ./Wan2.1-T2V-14B
|
109 |
```
|
110 |
|
111 |
+
Download models using modelscope-cli:
|
112 |
+
``` sh
|
113 |
pip install modelscope
|
114 |
+
modelscope download Wan-AI/Wan2.1-T2V-14B --local_dir ./Wan2.1-T2V-14B
|
115 |
```
|
116 |
|
117 |
#### Run Image-to-Video Generation
|
|
|
146 |
</table>
|
147 |
|
148 |
|
149 |
+
##### (1) Without Prompt Extension
|
150 |
|
151 |
- Single-GPU inference
|
152 |
+
```sh
|
153 |
python generate.py --task i2v-14B --size 832*480 --ckpt_dir ./Wan2.1-I2V-14B-480P --image examples/i2v_input.JPG --prompt "Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside."
|
154 |
```
|
155 |
|
|
|
157 |
|
158 |
- Multi-GPU inference using FSDP + xDiT USP
|
159 |
|
160 |
+
```sh
|
161 |
pip install "xfuser>=0.4.1"
|
162 |
torchrun --nproc_per_node=8 generate.py --task i2v-14B --size 832*480 --ckpt_dir ./Wan2.1-I2V-14B-480P --image examples/i2v_input.JPG --dit_fsdp --t5_fsdp --ulysses_size 8 --prompt "Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside."
|
163 |
```
|
164 |
|
165 |
+
##### (2) Using Prompt Extension
|
166 |
|
167 |
+
Run with local prompt extension using `Qwen/Qwen2.5-VL-7B-Instruct`:
|
168 |
+
```sh
|
169 |
python generate.py --task i2v-14B --size 832*480 --ckpt_dir ./Wan2.1-I2V-14B-480P --image examples/i2v_input.JPG --use_prompt_extend --prompt_extend_model Qwen/Qwen2.5-VL-7B-Instruct --prompt "Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside."
|
170 |
```
|
171 |
|
172 |
+
Run with remote prompt extension using `dashscope`:
|
173 |
+
```sh
|
174 |
DASH_API_KEY=your_key python generate.py --task i2v-14B --size 832*480 --ckpt_dir ./Wan2.1-I2V-14B-480P --image examples/i2v_input.JPG --use_prompt_extend --prompt_extend_method 'dashscope' --prompt "Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside."
|
175 |
```
|
176 |
|
177 |
##### (3) Running local Gradio
|
178 |
|
179 |
+
```sh
|
180 |
cd gradio
|
181 |
# if you only use the 480P model in gradio
|
182 |
DASH_API_KEY=your_key python i2v_14B_singleGPU.py --prompt_extend_method 'dashscope' --ckpt_dir_480p ./Wan2.1-I2V-14B-480P
|