xiguan97 committed
Commit 7f72b34 · verified · 1 Parent(s): d19ef79

Update README.md (#27)

- Update README.md (a9476a60303425fd230e64d79c6f4f9f2d43c427)

Files changed (1)
  1. README.md +31 -16

README.md CHANGED
@@ -37,11 +37,13 @@ library_name: MAGI-1
 
 # MAGI-1: Autoregressive Video Generation at Scale
 
- This repository contains the code for the MAGI-1 model, pre-trained weights and inference code. You can find more information on our [technical report](https://static.magi.world/static/files/MAGI_1.pdf) or directly create magic with MAGI-1 [here](http://sand.ai) . 🚀✨
 
 
 ## 🔥🔥🔥 Latest News
 
 - Apr 21, 2025: MAGI-1 is here 🎉. We've released the model weights and inference code — check it out!
 
@@ -79,34 +81,41 @@ We adopt a shortcut distillation approach that trains a single velocity-based mo
 
 We provide the pre-trained weights for MAGI-1, including the 24B and 4.5B models, as well as the corresponding distill and distill+quant models. The model weight links are shown in the table.
 
- | Model | Link | Recommend Machine |
- | ----------------------------- | ------------------------------------------------------------ | ------------------------------- |
- | T5 | [T5](https://huggingface.co/sand-ai/MAGI-1/tree/main/ckpt/t5) | - |
- | MAGI-1-VAE | [MAGI-1-VAE](https://huggingface.co/sand-ai/MAGI-1/tree/main/ckpt/vae) | - |
- | MAGI-1-24B | [MAGI-1-24B](https://huggingface.co/sand-ai/MAGI-1/tree/main/ckpt/magi/24B_base) | H100/H800 \* 8 |
- | MAGI-1-24B-distill | [MAGI-1-24B-distill](https://huggingface.co/sand-ai/MAGI-1/tree/main/ckpt/magi/24B_distill) | H100/H800 \* 8 |
- | MAGI-1-24B-distill+fp8_quant | [MAGI-1-24B-distill+quant](https://huggingface.co/sand-ai/MAGI-1/tree/main/ckpt/magi/24B_distill_quant) | H100/H800 \* 4 or RTX 4090 \* 8 |
- | MAGI-1-4.5B | MAGI-1-4.5B | RTX 4090 \* 1 |
 
 ## 4. Evaluation
 
 ### In-house Human Evaluation
 
- MAGI-1 achieves state-of-the-art performance among open-source models (surpassing Wan-2.1 and significantly outperforming Hailuo and HunyuanVideo), particularly excelling in instruction following and motion quality, positioning it as a strong potential competitor to closed-source commercial models such as Kling.
 
 ![inhouse human evaluation](figures/inhouse_human_evaluation.png)
 
 ### Physical Evaluation
 
- Thanks to the natural advantages of autoregressive architecture, Magi achieves far superior precision in predicting physical behavior through video continuation—significantly outperforming all existing models.
 
 | Model | Phys. IQ Score ↑ | Spatial IoU ↑ | Spatio Temporal ↑ | Weighted Spatial IoU ↑ | MSE ↓ |
 |----------------|------------------|---------------|-------------------|-------------------------|--------|
 | **V2V Models** | | | | | |
- | **Magi (V2V)** | **56.02** | **0.367** | **0.270** | **0.304** | **0.005** |
 | VideoPoet (V2V)| 29.50 | 0.204 | 0.164 | 0.137 | 0.010 |
 | **I2V Models** | | | | | |
- | **Magi (I2V)** | **30.23** | **0.203** | **0.151** | **0.154** | **0.012** |
 | Kling1.6 (I2V) | 23.64 | 0.197 | 0.086 | 0.144 | 0.025 |
 | VideoPoet (I2V)| 20.30 | 0.141 | 0.126 | 0.087 | 0.012 |
 | Gen 3 (I2V) | 22.80 | 0.201 | 0.115 | 0.116 | 0.015 |
@@ -144,7 +153,7 @@ pip install -r requirements.txt
 # Install ffmpeg
 conda install -c conda-forge ffmpeg=4.4
 
- # Install MagiAttention, for more information, please refer to https://github.com/SandAI-org/MagiAttention#
 git clone git@github.com:SandAI-org/MagiAttention.git
 cd MagiAttention
 git submodule update --init --recursive
@@ -198,6 +207,12 @@ By adjusting these parameters, you can flexibly control the input and output to
 
 ### Some Useful Configs (for config.json)
 
 | Config | Help |
 | -------------- | ------------------------------------------------------------ |
 | seed | Random seed used for video generation |
@@ -205,7 +220,7 @@ By adjusting these parameters, you can flexibly control the input and output to
 | video_size_w | Width of the video |
 | num_frames | Controls the duration of generated video |
 | fps | Frames per second, 4 video frames correspond to 1 latent_frame |
- | cfg_number | Base model uses cfg_number==2, distill and quant model uses cfg_number=1 |
 | load | Directory containing a model checkpoint. |
 | t5_pretrained | Path to load pretrained T5 model |
 | vae_pretrained | Path to load pretrained VAE model |
@@ -230,4 +245,4 @@ If you find our code or model useful in your research, please cite:
 
 ## 8. Contact
 
- If you have any questions, please feel free to raise an issue or contact us at [support@sand.ai](support@sand.ai) .
 
 
 # MAGI-1: Autoregressive Video Generation at Scale
 
+ This repository contains the [code](https://github.com/SandAI-org/MAGI-1) for the MAGI-1 model, pre-trained weights and inference code. You can find more information in our [technical report](https://static.magi.world/static/files/MAGI_1.pdf) or directly create magic with MAGI-1 [here](http://sand.ai). 🚀✨
 
 
 ## 🔥🔥🔥 Latest News
 
+ - Apr 30, 2025: MAGI-1 4.5B distill and distill+quant models are coming soon 🎉 — we're putting on the final touches, stay tuned!
+ - Apr 30, 2025: The MAGI-1 4.5B model has been released 🎉. We've updated the model weights — check it out!
 - Apr 21, 2025: MAGI-1 is here 🎉. We've released the model weights and inference code — check it out!
 
 
 
 We provide the pre-trained weights for MAGI-1, including the 24B and 4.5B models, as well as the corresponding distill and distill+quant models. The model weight links are shown in the table.
 
+ | Model | Link | Recommended Machine |
+ | ------------------------------ | -------------------------------------------------------------------- | ------------------------------- |
+ | T5 | [T5](https://huggingface.co/sand-ai/MAGI-1/tree/main/ckpt/t5) | - |
+ | MAGI-1-VAE | [MAGI-1-VAE](https://huggingface.co/sand-ai/MAGI-1/tree/main/ckpt/vae) | - |
+ | MAGI-1-24B | [MAGI-1-24B](https://huggingface.co/sand-ai/MAGI-1/tree/main/ckpt/magi/24B_base) | H100/H800 × 8 |
+ | MAGI-1-24B-distill | [MAGI-1-24B-distill](https://huggingface.co/sand-ai/MAGI-1/tree/main/ckpt/magi/24B_distill) | H100/H800 × 8 |
+ | MAGI-1-24B-distill+fp8_quant | [MAGI-1-24B-distill+quant](https://huggingface.co/sand-ai/MAGI-1/tree/main/ckpt/magi/24B_distill_quant) | H100/H800 × 4 or RTX 4090 × 8 |
+ | MAGI-1-4.5B | [MAGI-1-4.5B](https://huggingface.co/sand-ai/MAGI-1/tree/main/ckpt/magi/4.5B_base) | RTX 4090 × 1 |
+ | MAGI-1-4.5B-distill | Coming soon | RTX 4090 × 1 |
+ | MAGI-1-4.5B-distill+fp8_quant | Coming soon | RTX 4090 × 1 |
+
+ > [!NOTE]
+ >
+ > For the 4.5B models, any machine with at least 24GB of GPU memory is sufficient.
 
 ## 4. Evaluation
 
 ### In-house Human Evaluation
 
+ MAGI-1 achieves state-of-the-art performance among open-source models such as Wan-2.1 and HunyuanVideo, as well as closed-source models such as Hailuo (i2v-01), particularly excelling in instruction following and motion quality, positioning it as a strong potential competitor to closed-source commercial models such as Kling.
 
 ![inhouse human evaluation](figures/inhouse_human_evaluation.png)
 
 ### Physical Evaluation
 
+ Thanks to the natural advantages of its autoregressive architecture, MAGI-1 achieves far superior precision in predicting physical behavior through video continuation on the [Physics-IQ benchmark](https://github.com/google-deepmind/physics-IQ-benchmark), significantly outperforming all existing models.
 
 | Model | Phys. IQ Score ↑ | Spatial IoU ↑ | Spatio Temporal ↑ | Weighted Spatial IoU ↑ | MSE ↓ |
 |----------------|------------------|---------------|-------------------|-------------------------|--------|
 | **V2V Models** | | | | | |
+ | **Magi-24B (V2V)** | **56.02** | **0.367** | **0.270** | **0.304** | **0.005** |
+ | **Magi-4.5B (V2V)** | **42.44** | **0.234** | **0.285** | **0.188** | **0.007** |
 | VideoPoet (V2V)| 29.50 | 0.204 | 0.164 | 0.137 | 0.010 |
 | **I2V Models** | | | | | |
+ | **Magi-24B (I2V)** | **30.23** | **0.203** | **0.151** | **0.154** | **0.012** |
 | Kling1.6 (I2V) | 23.64 | 0.197 | 0.086 | 0.144 | 0.025 |
 | VideoPoet (I2V)| 20.30 | 0.141 | 0.126 | 0.087 | 0.012 |
 | Gen 3 (I2V) | 22.80 | 0.201 | 0.115 | 0.116 | 0.015 |
 
 # Install ffmpeg
 conda install -c conda-forge ffmpeg=4.4
 
+ # For GPUs based on the Hopper architecture (e.g., H100/H800), it is recommended to install MagiAttention (https://github.com/SandAI-org/MagiAttention) for acceleration. For non-Hopper GPUs, installing MagiAttention is not necessary.
 git clone git@github.com:SandAI-org/MagiAttention.git
 cd MagiAttention
 git submodule update --init --recursive
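The Hopper-only recommendation in the comment above can be checked from the command line. A minimal sketch, not part of the committed README, assuming `nvidia-smi` is available and the driver is recent enough to report the `compute_cap` field (Hopper parts such as H100/H800 report 9.x):

```shell
# Decide whether MagiAttention is worth installing: Hopper GPUs
# (H100/H800) report CUDA compute capability 9.x.
is_hopper() {
  case "$1" in
    9.*) echo yes ;;   # Hopper: install MagiAttention for acceleration
    *)   echo no  ;;   # pre-Hopper (e.g. RTX 4090 reports 8.9): skip it
  esac
}

# Query the first GPU's compute capability (requires a recent NVIDIA driver):
# cc=$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n1)
is_hopper "9.0"   # H100/H800 -> yes
is_hopper "8.9"   # RTX 4090  -> no
```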
 
 
 ### Some Useful Configs (for config.json)
 
+ > [!NOTE]
+ >
+ > - If you are running the 24B model with RTX 4090 × 8, please set `pp_size: 2` and `cp_size: 4`.
+ >
+ > - Our model supports arbitrary resolutions. To accelerate the inference process, the default resolution for the 4.5B model is set to 720×720 in `4.5B_config.json`.
+
 | Config | Help |
 | -------------- | ------------------------------------------------------------ |
 | seed | Random seed used for video generation |
 
 | video_size_w | Width of the video |
 | num_frames | Controls the duration of generated video |
 | fps | Frames per second, 4 video frames correspond to 1 latent_frame |
+ | cfg_number | The base model uses cfg_number=3; distill and quant models use cfg_number=1 |
 | load | Directory containing a model checkpoint. |
 | t5_pretrained | Path to load pretrained T5 model |
 | vae_pretrained | Path to load pretrained VAE model |
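Taken together, the options in the table above form a config.json. A minimal illustrative sketch (the values are hypothetical placeholders rather than shipped defaults, and the checkpoint paths assume the `ckpt/` layout from the weights table; only keys shown in the table are included):

```json
{
  "seed": 1234,
  "video_size_w": 720,
  "num_frames": 96,
  "fps": 24,
  "cfg_number": 3,
  "load": "ckpt/magi/4.5B_base",
  "t5_pretrained": "ckpt/t5",
  "vae_pretrained": "ckpt/vae"
}
```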
 
 
 ## 8. Contact
 
+ If you have any questions, please feel free to raise an issue or contact us at [research@sand.ai](mailto:research@sand.ai).