For all large-scale experiments, we use the `vq_ds16_t2i` tokenizer from [LLaMaGen](https://github.com/FoundationVision/LlamaGen). For small-scale/scaling experiments, we use the MagViTv2 tokenizer from [Show-o](https://github.com/showlab/Show-o). For CUB200 experiments, we use the [TiTok](https://github.com/bytedance/1d-tokenizer) tokenizer. In our experiments, we found the TiTok tokenizer to perform the best; however, it had not been released at the time of our earlier experiments.