---
license: afl-3.0
---

# Hi-MAR
GitHub | Project Page | Hugging Face | Paper | PDF
[**Hierarchical Masked Autoregressive Models with Low-Resolution Token Pivots**](https://Tom-zgt.github.io/Hi-MAR-page/) (ICML 2025)

## Environment Requirement
Clone the repo:
```shell
git clone https://github.com/HiDream-ai/himar.git
cd himar
```
Install dependencies:
```shell
conda env create -f environment.yaml
conda activate himar
```
## Model Download
Download the VAE checkpoint (kl16.ckpt) from this [link](https://www.dropbox.com/scl/fi/hhmuvaiacrarfg28qxhwz/kl16.ckpt?rlkey=l44xipsezc8atcffdp4q7mwmh&dl=0), provided in the [MAR GitHub repo](https://github.com/LTH14/mar/).
You can download our pre-trained Hi-MAR models directly from the links in the table below.
| Models | FID-50K | Inception Score | #params |
| ------------------------------------------------------------ | ------- | --------------- | ------- |
| [Hi-MAR-B](https://huggingface.co/HiDream-ai/Hi-MAR/blob/main/Hi-MAR-B/checkpoint-last.pth) | 1.93 | 293.0 | 244M |
| [Hi-MAR-L](https://huggingface.co/HiDream-ai/Hi-MAR/blob/main/Hi-MAR-L/checkpoint-last.pth) | 1.66 | 322.3 | 529M |
| [Hi-MAR-H](https://huggingface.co/HiDream-ai/Hi-MAR/blob/main/Hi-MAR-H/checkpoint-last.pth) | 1.52 | 322.78 | 1090M |
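The per-file links in the table follow the Hugging Face Hub's standard `resolve/main` URL layout, so all three checkpoints can be fetched in one loop. A minimal sketch, assuming each checkpoint lives at `<model>/checkpoint-last.pth` inside the `HiDream-ai/Hi-MAR` repo (the layout implied by the links above); the actual download line is commented out so the snippet only prints the URLs:

```shell
# Hugging Face Hub "resolve" URLs; file layout taken from the table above.
BASE=https://huggingface.co/HiDream-ai/Hi-MAR/resolve/main
for model in Hi-MAR-B Hi-MAR-L Hi-MAR-H; do
  echo "${BASE}/${model}/checkpoint-last.pth"
  # Uncomment to download:
  # wget -O "${model}-checkpoint-last.pth" "${BASE}/${model}/checkpoint-last.pth"
done
```

Alternatively, if you have `huggingface_hub` installed, `huggingface-cli download HiDream-ai/Hi-MAR` should fetch the whole repo in one step.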
## Evaluation
Evaluate Hi-MAR-B on ImageNet 256x256:
```shell
torchrun --nproc_per_node=8 --nnodes=1 main_himar.py \
    --img_size 256 --vae_path /path/to/vae --vae_embed_dim 16 --vae_stride 16 --patch_size 1 \
    --model himar_base --diffloss_d 6 --diffloss_w 1024 \
    --output_dir ./himar_base_test --resume /path/to/Hi-MAR-B \
    --num_images 50000 --num_iter 4 --cfg 2.5 --re_cfg 2.7 --cfg_schedule linear \
    --cond_scale 8 --cond_dim 16 --two_diffloss --global_dm --gdm_d 6 --gdm_w 512 \
    --eval_bsz 256 --load_epoch -1 --head 8 --ratio 4 --cos --evaluate
```
Evaluate Hi-MAR-L on ImageNet 256x256:
```shell
torchrun --nproc_per_node=8 --nnodes=1 main_himar.py \
    --img_size 256 --vae_path /path/to/vae --vae_embed_dim 16 --vae_stride 16 --patch_size 1 \
    --model himar_large --diffloss_d 8 --diffloss_w 1280 \
    --output_dir ./himar_large_test --resume /path/to/Hi-MAR-L \
    --num_images 50000 --num_iter 4 --cfg 3.5 --re_cfg 3.5 --cfg_schedule linear \
    --cond_scale 8 --cond_dim 16 --two_diffloss --global_dm --gdm_d 8 --gdm_w 512 \
    --eval_bsz 256 --load_epoch -1 --head 8 --ratio 4 --cos --evaluate
```
Evaluate Hi-MAR-H on ImageNet 256x256:
```shell
torchrun --nproc_per_node=8 --nnodes=1 main_himar.py \
    --img_size 256 --vae_path /path/to/vae --vae_embed_dim 16 --vae_stride 16 --patch_size 1 \
    --model himar_huge --diffloss_d 12 --diffloss_w 1536 \
    --output_dir ./himar_huge_test --resume /path/to/Hi-MAR-H \
    --num_images 50000 --num_iter 12 --cfg 3.2 --re_cfg 5.5 --cfg_schedule linear \
    --cond_scale 8 --cond_dim 16 --two_diffloss --global_dm --gdm_d 12 --gdm_w 768 \
    --eval_bsz 256 --load_epoch -1 --head 12 --ratio 4 --cos --evaluate
```
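The three commands differ only in the model name and a handful of hyperparameters (diffusion-loss depth/width, sampling iterations, CFG scales, global-DM size, head count). A hypothetical convenience wrapper, not part of the repo, that collects those per-model settings and echoes the resulting command (remove the `echo` to actually launch it; `/path/to/vae` and `/path/to/checkpoint` are placeholders you must fill in):

```shell
# Select per-model hyperparameters taken from the three commands above.
MODEL=${1:-base}   # base | large | huge
case "$MODEL" in
  base)  ARGS="--model himar_base --diffloss_d 6 --diffloss_w 1024 --num_iter 4 --cfg 2.5 --re_cfg 2.7 --gdm_d 6 --gdm_w 512 --head 8" ;;
  large) ARGS="--model himar_large --diffloss_d 8 --diffloss_w 1280 --num_iter 4 --cfg 3.5 --re_cfg 3.5 --gdm_d 8 --gdm_w 512 --head 8" ;;
  huge)  ARGS="--model himar_huge --diffloss_d 12 --diffloss_w 1536 --num_iter 12 --cfg 3.2 --re_cfg 5.5 --gdm_d 12 --gdm_w 768 --head 12" ;;
  *) echo "unknown model: $MODEL" >&2; exit 1 ;;
esac
# Dry run: print the full command instead of executing it.
echo torchrun --nproc_per_node=8 --nnodes=1 main_himar.py \
  --img_size 256 --vae_path /path/to/vae --vae_embed_dim 16 --vae_stride 16 --patch_size 1 \
  $ARGS --num_images 50000 --cfg_schedule linear --cond_scale 8 --cond_dim 16 \
  --two_diffloss --global_dm --eval_bsz 256 --load_epoch -1 --ratio 4 --cos \
  --resume /path/to/checkpoint --output_dir "./himar_${MODEL}_test" --evaluate
```

Run as, e.g., `bash eval.sh huge` to print the Hi-MAR-H command.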