ByteDance-Seed/Tar-1.5B


Unifying Visual Understanding and Generation via Text-Aligned Representations

Jiaming Han, Hao Chen^†, Yang Zhao, Hanyu Wang, Qi Zhao, Ziyan Yang, Hao He, Xiangyu Yue^‡, Lu Jiang^‡

^† Project Lead ^‡ Corresponding Authors

Citation

@article{han2025tar,
  title={Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations}, 
  author={Han, Jiaming and Chen, Hao and Zhao, Yang and Wang, Hanyu and Zhao, Qi and Yang, Ziyan and He, Hao and Yue, Xiangyu and Jiang, Lu},
  journal={arXiv preprint arXiv:2506.18898},
  year={2025},
}

License

This project is licensed under the Apache 2.0 License.

Model size: 2.57B params (Safetensors)
Tensor type: BF16

Model tree for ByteDance-Seed/Tar-1.5B

Base model: Qwen/Qwen2.5-1.5B
Finetuned: Qwen/Qwen2.5-1.5B-Instruct
Finetuned: this model
