
X-Omni-En (supports English text rendering)

🏠 Project Page | 📄 Paper | 💻 Code | 🚀 HuggingFace Space

🌟 Highlights

  • Unified Modeling Approach: A discrete autoregressive model handling image and language modalities.
  • Superior Instruction Following: Exceptional capability to follow complex instructions.
  • Superior Text Rendering: Accurately renders English text in generated images.
  • Arbitrary Resolutions: Produces aesthetically pleasing images at arbitrary resolutions.

📖 Citation

If you find this project helpful for your research or use it in your own work, please cite our paper:

@article{geng2025xomni,
      author       = {Zigang Geng and Yibing Wang and Yeyao Ma and Chen Li and Yongming Rao and Shuyang Gu and Zhao Zhong and Qinglin Lu and Han Hu and Xiaosong Zhang and Linus and Di Wang and Jie Jiang},
      title        = {X-Omni: Reinforcement Learning Makes Discrete Autoregressive Image Generative Models Great Again},
      journal      = {CoRR},
      volume       = {abs/2507.22058},
      year         = {2025},
}
Model size: 9.6B params (Safetensors, tensor type BF16)
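
As a rough guide to the hardware needed, the weight footprint follows from the parameter count and tensor type listed above. The sketch below is only a back-of-the-envelope estimate: it assumes every parameter is stored as a 2-byte BF16 value and ignores activations, caches, and framework overhead. The helper name `weight_footprint_bytes` is illustrative and not part of any X-Omni API.

```python
# Back-of-the-envelope estimate of the checkpoint's weight memory.
# Assumption: all 9.6B parameters are stored in BF16 (2 bytes each);
# activations, caches, and framework overhead are NOT included.

BYTES_PER_PARAM = {"BF16": 2, "FP16": 2, "FP32": 4}


def weight_footprint_bytes(num_params: float, tensor_type: str = "BF16") -> int:
    """Return the approximate size of the weights in bytes."""
    return int(round(num_params * BYTES_PER_PARAM[tensor_type]))


if __name__ == "__main__":
    size = weight_footprint_bytes(9.6e9)
    # Report in both decimal gigabytes and binary gibibytes.
    print(f"{size / 1e9:.1f} GB ({size / 2**30:.1f} GiB)")
```

Under these assumptions the weights alone come to about 19.2 GB (roughly 17.9 GiB), so actual memory use during inference will be higher.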