Wan-AI/Wan2.1-I2V-14B-480P

@@ -1,14 +1,15 @@
 ---
-license: apache-2.0
 language:
 - en
 - zh
-pipeline_tag: image-to-video
 library_name: diffusers
 tags:
 - video
 - video-generation
 ---
 # Wan2.1
 <p align="center">
@@ -16,12 +17,12 @@ tags:
 <p>
 <p align="center">
-    💜 <a href=""><b>Wan</b></a> &nbsp&nbsp ｜ &nbsp&nbsp 🖥️ <a href="https://github.com/Wan-Video/Wan2.1">GitHub</a> &nbsp&nbsp  | &nbsp&nbsp🤗 <a href="https://huggingface.co/Wan-AI/">Hugging Face</a>&nbsp&nbsp | &nbsp&nbsp🤖 <a href="https://modelscope.cn/organization/Wan-AI">ModelScope</a>&nbsp&nbsp | &nbsp&nbsp 📑 <a href="">Paper (Coming soon)</a> &nbsp&nbsp | &nbsp&nbsp 📑 <a href="https://wanxai.com">Blog</a> &nbsp&nbsp | &nbsp&nbsp💬 <a href="https://gw.alicdn.com/imgextra/i2/O1CN01tqjWFi1ByuyehkTSB_!!6000000000015-0-tps-611-1279.jpg">WeChat Group</a>&nbsp&nbsp | &nbsp&nbsp 📖 <a href="https://discord.gg/p5XbdQV7">Discord</a>&nbsp&nbsp
 <br>
 -----
-[**Wan: Open and Advanced Large-Scale Video Generative Models**]() <be>
 In this repository, we present **Wan2.1**, a comprehensive and open suite of video foundation models that pushes the boundaries of video generation. **Wan2.1** offers these key features:
 - 👍 **SOTA Performance**: **Wan2.1** consistently outperforms existing open-source models and state-of-the-art commercial solutions across multiple benchmarks.
@@ -45,7 +46,15 @@ This repo contains our I2V-14B model, which is capable of generating 480P videos
 ## 🔥 Latest News!!
-* Feb 25, 2025: 👋 We've released the inference code and weights of Wan2.1.
 ## 📑 Todo List
@@ -53,27 +62,29 @@ This repo contains our I2V-14B model, which is capable of generating 480P videos
     - [x] Multi-GPU Inference code of the 14B and 1.3B models
     - [x] Checkpoints of the 14B and 1.3B models
     - [x] Gradio demo
-    - [ ] Diffusers integration
-    - [ ] ComfyUI integration
 - Wan2.1 Image-to-Video
     - [x] Multi-GPU Inference code of the 14B model
     - [x] Checkpoints of the 14B model
     - [x] Gradio demo
-    - [ ] Diffusers integration
-    - [ ] ComfyUI integration
 ## Quickstart
 #### Installation
 Clone the repo:
-```
 git clone https://github.com/Wan-Video/Wan2.1.git
 cd Wan2.1
 ```
 Install dependencies:
-```
 # Ensure torch >= 2.4.0
 pip install -r requirements.txt
 ```
@@ -91,16 +102,16 @@ pip install -r requirements.txt
 > 💡Note: The 1.3B model is capable of generating videos at 720P resolution. However, due to limited training at this resolution, the results are generally less stable compared to 480P. For optimal performance, we recommend using 480P resolution.
-Download models using 🤗 huggingface-cli:
-```
 pip install "huggingface_hub[cli]"
-huggingface-cli download Wan-AI/Wan2.1-I2V-14B-480P --local-dir ./Wan2.1-I2V-14B-480P
 ```
-Download models using 🤖 modelscope-cli:
-```
 pip install modelscope
-modelscope download Wan-AI/Wan2.1-I2V-14B-480P --local_dir ./Wan2.1-I2V-14B-480P
 ```
 #### Run Image-to-Video Generation
@@ -135,10 +146,10 @@ Similar to Text-to-Video, Image-to-Video is also divided into processes with and
 </table>
-##### (1) Without Prompt Extention
 - Single-GPU inference
-```
 python generate.py --task i2v-14B --size 832*480 --ckpt_dir ./Wan2.1-I2V-14B-480P --image examples/i2v_input.JPG --prompt "Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside."
 ```
@@ -146,26 +157,26 @@ python generate.py --task i2v-14B --size 832*480 --ckpt_dir ./Wan2.1-I2V-14B-480
 - Multi-GPU inference using FSDP + xDiT USP
-```
 pip install "xfuser>=0.4.1"
 torchrun --nproc_per_node=8 generate.py --task i2v-14B --size 832*480 --ckpt_dir ./Wan2.1-I2V-14B-480P --image examples/i2v_input.JPG --dit_fsdp --t5_fsdp --ulysses_size 8 --prompt "Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside."
 ```
-##### (2) Using Prompt Extention
-Run with local prompt extention using `Qwen/Qwen2.5-VL-7B-Instruct`:
-```
 python generate.py --task i2v-14B --size 832*480 --ckpt_dir ./Wan2.1-I2V-14B-480P --image examples/i2v_input.JPG --use_prompt_extend --prompt_extend_model Qwen/Qwen2.5-VL-7B-Instruct --prompt "Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside."
 ```
-Run with remote prompt extention using `dashscope`:
-```
 DASH_API_KEY=your_key python generate.py --task i2v-14B --size 832*480 --ckpt_dir ./Wan2.1-I2V-14B-480P --image examples/i2v_input.JPG --use_prompt_extend --prompt_extend_method 'dashscope' --prompt "Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside."
 ```
 ##### (3) Running local gradio
-```
 cd gradio
 # if only the 480P model is used in gradio
 DASH_API_KEY=your_key python i2v_14B_singleGPU.py --prompt_extend_method 'dashscope' --ckpt_dir_480p ./Wan2.1-I2V-14B-480P

 ---
 language:
 - en
 - zh
 library_name: diffusers
+license: apache-2.0
+pipeline_tag: image-to-video
 tags:
 - video
 - video-generation
 ---
 # Wan2.1
 <p align="center">
 <p>
 <p align="center">
+    💜 <a href="https://wan.video"><b>Wan</b></a> &nbsp&nbsp ｜ &nbsp&nbsp 🖥️ <a href="https://github.com/Wan-Video/Wan2.1">GitHub</a> &nbsp&nbsp  | &nbsp&nbsp🤗 <a href="https://huggingface.co/Wan-AI/">Hugging Face</a>&nbsp&nbsp | &nbsp&nbsp🤖 <a href="https://modelscope.cn/organization/Wan-AI">ModelScope</a>&nbsp&nbsp | &nbsp&nbsp 📑 <a href="https://files.alicdn.com/tpsservice/5c9de1c74de03972b7aa657e5a54756b.pdf">Technical Report</a> &nbsp&nbsp | &nbsp&nbsp 📑 <a href="https://wan.video/welcome?spm=a2ty_o02.30011076.0.0.6c9ee41eCcluqg">Blog</a> &nbsp&nbsp | &nbsp&nbsp💬 <a href="https://gw.alicdn.com/imgextra/i2/O1CN01tqjWFi1ByuyehkTSB_!!6000000000015-0-tps-611-1279.jpg">WeChat Group</a>&nbsp&nbsp | &nbsp&nbsp 📖 <a href="https://discord.gg/AKNgpMK4Yj">Discord</a>&nbsp&nbsp
 <br>
 -----
+[**Wan: Open and Advanced Large-Scale Video Generative Models**]("") <br>
 In this repository, we present **Wan2.1**, a comprehensive and open suite of video foundation models that pushes the boundaries of video generation. **Wan2.1** offers these key features:
 - 👍 **SOTA Performance**: **Wan2.1** consistently outperforms existing open-source models and state-of-the-art commercial solutions across multiple benchmarks.
 ## 🔥 Latest News!!
+* Mar 21, 2025: 👋 We are excited to announce the release of the **Wan2.1** [technical report](https://files.alicdn.com/tpsservice/5c9de1c74de03972b7aa657e5a54756b.pdf). We welcome discussions and feedback!
+* Mar 3, 2025: 👋 **Wan2.1**'s T2V and I2V have been integrated into Diffusers ([T2V](https://huggingface.co/docs/diffusers/main/en/api/pipelines/wan#diffusers.WanPipeline) | [I2V](https://huggingface.co/docs/diffusers/main/en/api/pipelines/wan#diffusers.WanImageToVideoPipeline)). Feel free to give it a try!
+* Feb 27, 2025: 👋 **Wan2.1** has been integrated into [ComfyUI](https://comfyanonymous.github.io/ComfyUI_examples/wan/). Enjoy!
+* Feb 25, 2025: 👋 We've released the inference code and weights of **Wan2.1**.
+## Community Works
+If your work has improved **Wan2.1** and you would like more people to see it, please inform us.
+- [TeaCache](https://github.com/ali-vilab/TeaCache) now supports **Wan2.1** acceleration, capable of increasing speed by approximately 2x. Feel free to give it a try!
+- [DiffSynth-Studio](https://github.com/modelscope/DiffSynth-Studio) provides more support for **Wan2.1**, including video-to-video, FP8 quantization, VRAM optimization, LoRA training, and more. Please refer to [their examples](https://github.com/modelscope/DiffSynth-Studio/tree/main/examples/wanvideo).
 ## 📑 Todo List
     - [x] Multi-GPU Inference code of the 14B and 1.3B models
     - [x] Checkpoints of the 14B and 1.3B models
     - [x] Gradio demo
+    - [x] ComfyUI integration
+    - [x] Diffusers integration
+    - [ ] Diffusers + Multi-GPU Inference
 - Wan2.1 Image-to-Video
     - [x] Multi-GPU Inference code of the 14B model
     - [x] Checkpoints of the 14B model
     - [x] Gradio demo
+    - [x] ComfyUI integration
+    - [x] Diffusers integration
+    - [ ] Diffusers + Multi-GPU Inference
 ## Quickstart
 #### Installation
 Clone the repo:
+```sh
 git clone https://github.com/Wan-Video/Wan2.1.git
 cd Wan2.1
 ```
 Install dependencies:
+```sh
 # Ensure torch >= 2.4.0
 pip install -r requirements.txt
 ```
 > 💡Note: The 1.3B model is capable of generating videos at 720P resolution. However, due to limited training at this resolution, the results are generally less stable compared to 480P. For optimal performance, we recommend using 480P resolution.
+Download models using huggingface-cli:
+```sh
 pip install "huggingface_hub[cli]"
+huggingface-cli download Wan-AI/Wan2.1-I2V-14B-480P --local-dir ./Wan2.1-I2V-14B-480P
 ```
+Download models using modelscope-cli:
+```sh
 pip install modelscope
+modelscope download Wan-AI/Wan2.1-I2V-14B-480P --local_dir ./Wan2.1-I2V-14B-480P
 ```
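
Before launching generation, it can help to confirm that the download populated the directory that is later passed to `--ckpt_dir`. The following is a minimal sketch: `check_ckpt_dir` is a hypothetical helper, not part of the Wan2.1 repository, and it only checks that the directory exists and is non-empty, not that the weights are complete or uncorrupted.

```sh
# Hypothetical helper (not part of Wan2.1): succeed only if the
# checkpoint directory exists and contains at least one entry.
check_ckpt_dir() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing checkpoint directory: $dir" >&2
        return 1
    fi
    # `ls -A` prints nothing for an empty directory.
    if [ -z "$(ls -A "$dir")" ]; then
        echo "checkpoint directory is empty: $dir" >&2
        return 1
    fi
    return 0
}
```

Calling `check_ckpt_dir ./Wan2.1-I2V-14B-480P` before `generate.py` fails fast with a readable message instead of an error from deep inside model loading.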
 #### Run Image-to-Video Generation
 </table>
+##### (1) Without Prompt Extension
 - Single-GPU inference
+```sh
 python generate.py --task i2v-14B --size 832*480 --ckpt_dir ./Wan2.1-I2V-14B-480P --image examples/i2v_input.JPG --prompt "Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside."
 ```
 - Multi-GPU inference using FSDP + xDiT USP
+```sh
 pip install "xfuser>=0.4.1"
 torchrun --nproc_per_node=8 generate.py --task i2v-14B --size 832*480 --ckpt_dir ./Wan2.1-I2V-14B-480P --image examples/i2v_input.JPG --dit_fsdp --t5_fsdp --ulysses_size 8 --prompt "Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside."
 ```
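
The multi-GPU command above sets `--nproc_per_node` and `--ulysses_size` to the same value (8). As a sketch only, and assuming the two values should stay equal when the same command is adapted to a different GPU count (an assumption to verify against `generate.py`), a small hypothetical helper can print the command for N GPUs so it can be inspected before running. `wan_i2v_multi_gpu_cmd` and its argument order are illustrative and not part of the Wan2.1 repository; whether a given GPU count is actually supported is not covered here.

```sh
# Hypothetical helper (not part of Wan2.1): print the torchrun command
# for N GPUs, keeping --nproc_per_node and --ulysses_size equal as in
# the 8-GPU example. It only echoes the command; it does not run it.
wan_i2v_multi_gpu_cmd() {
    ngpu="$1"
    ckpt_dir="$2"
    image="$3"
    prompt="$4"
    # '832*480' is quoted so the shell never treats * as a glob.
    printf '%s\n' "torchrun --nproc_per_node=${ngpu} generate.py --task i2v-14B --size '832*480' --ckpt_dir ${ckpt_dir} --image ${image} --dit_fsdp --t5_fsdp --ulysses_size ${ngpu} --prompt \"${prompt}\""
}

# Dry run: show the command that would be used on 4 GPUs.
wan_i2v_multi_gpu_cmd 4 ./Wan2.1-I2V-14B-480P examples/i2v_input.JPG "A white cat on a surfboard"
```

Printing rather than executing keeps the sketch safe to try on a machine without GPUs; the output can be copied into a terminal once it looks right.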
+##### (2) Using Prompt Extension
+Run with local prompt extension using `Qwen/Qwen2.5-VL-7B-Instruct`:
+```sh
 python generate.py --task i2v-14B --size 832*480 --ckpt_dir ./Wan2.1-I2V-14B-480P --image examples/i2v_input.JPG --use_prompt_extend --prompt_extend_model Qwen/Qwen2.5-VL-7B-Instruct --prompt "Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside."
 ```
+Run with remote prompt extension using `dashscope`:
+```sh
 DASH_API_KEY=your_key python generate.py --task i2v-14B --size 832*480 --ckpt_dir ./Wan2.1-I2V-14B-480P --image examples/i2v_input.JPG --use_prompt_extend --prompt_extend_method 'dashscope' --prompt "Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside."
 ```
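
The remote variant above passes the key through the `DASH_API_KEY` environment variable. A minimal sketch of a guard that stops early when the variable is unset or empty follows; `require_dash_api_key` is a hypothetical helper, not something the Wan2.1 repository provides, and it does not check that the key is valid.

```sh
# Hypothetical guard (not part of Wan2.1): fail early when the
# DASH_API_KEY environment variable is unset or empty.
require_dash_api_key() {
    if [ -z "${DASH_API_KEY:-}" ]; then
        echo "DASH_API_KEY is not set" >&2
        return 1
    fi
    return 0
}
```

Running `require_dash_api_key` before the `generate.py` command with `--prompt_extend_method 'dashscope'` surfaces a missing key before any model weights are loaded.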
 ##### (3) Running local gradio
+```sh
 cd gradio
 # if only the 480P model is used in gradio
 DASH_API_KEY=your_key python i2v_14B_singleGPU.py --prompt_extend_method 'dashscope' --ckpt_dir_480p ./Wan2.1-I2V-14B-480P