---
license: apache-2.0
base_model:
- black-forest-labs/FLUX.1-dev
pipeline_tag: text-to-image
tags:
- LoRA
- personalization
- multi-subject
---

# XVerse: Consistent Multi-Subject Control of Identity and Semantic Attributes via DiT Modulation

This repository contains the official model of the paper [XVerse: Consistent Multi-Subject Control of Identity and Semantic Attributes via DiT Modulation](https://arxiv.org/abs/2506.21416).

![XVerse's capability in single/multi-subject personalization and semantic attribute control (pose, style, lighting)](https://github.com/bytedance/XVerse/raw/main/sample/first_page.png)

## Introduction

**XVerse** is an approach to multi-subject image synthesis that offers **precise and independent control over individual subjects** without disrupting the overall image latents or features. We achieve this by transforming reference images into offsets for token-specific text-stream modulation. This enables high-fidelity, editable image generation in which you can robustly control both **individual subject characteristics** (identity) and their **semantic attributes** (such as pose, style, and lighting). XVerse is aimed at personalized and complex scene generation.

## How to Use

See https://github.com/bytedance/XVerse for usage instructions.

Where to send questions or comments about the model: https://github.com/bytedance/XVerse/issues

## Citation

If XVerse is helpful, please ⭐ the repo.

If you find this project useful for your research, please consider citing our paper:

```bibtex
@article{chen2025xverse,
  title={XVerse: Consistent Multi-Subject Control of Identity and Semantic Attributes via DiT Modulation},
  author={Chen, Bowen and Zhao, Mengyi and Sun, Haomiao and Chen, Li and Wang, Xu and Du, Kang and Wu, Xinglong},
  journal={arXiv preprint arXiv:2506.21416},
  year={2025}
}
```
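
## Illustrative Sketch: Token-Specific Modulation Offsets

The Introduction describes turning reference images into offsets for token-specific text-stream modulation. The toy example below is only a conceptual sketch of that idea and is not the XVerse implementation. It assumes an adaLN-style modulation of the form `norm(x) * (1 + scale) + shift`, and it assumes the offsets have already been produced by some adapter from a reference image. All names here (`modulate`, `subject_offsets`, the tiny 4-dimensional tokens) are hypothetical and chosen for illustration.

```python
from typing import Dict, List, Optional, Tuple

Vector = List[float]


def layer_norm(x: Vector, eps: float = 1e-6) -> Vector:
    """Normalize one token vector to zero mean and unit variance."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / (var + eps) ** 0.5 for v in x]


def modulate(
    tokens: List[Vector],
    shift: Vector,
    scale: Vector,
    offsets: Optional[Dict[int, Tuple[Vector, Vector]]] = None,
) -> List[Vector]:
    """Apply shared shift/scale to every token, plus per-token offsets.

    `offsets` maps a token index to (shift_offset, scale_offset).
    Tokens without an entry receive only the shared modulation.
    """
    offsets = offsets or {}
    out = []
    for i, tok in enumerate(tokens):
        zeros = [0.0] * len(tok)
        d_shift, d_scale = offsets.get(i, (zeros, zeros))
        normed = layer_norm(tok)
        out.append([
            n * (1.0 + s + ds) + b + db
            for n, s, ds, b, db in zip(normed, scale, d_scale, shift, d_shift)
        ])
    return out


# Three toy text tokens; pretend token 1 is the word naming the subject.
text_tokens = [
    [0.1, 0.2, 0.3, 0.4],
    [1.0, -1.0, 0.5, 0.0],
    [0.0, 0.3, -0.2, 0.9],
]
shift = [0.0, 0.1, 0.0, -0.1]  # shared modulation (assumed values)
scale = [0.2, 0.0, -0.1, 0.0]

# Hypothetical offsets derived from a reference image, for token 1 only.
subject_offsets = {1: ([0.5, 0.0, 0.0, 0.0], [0.0, 0.3, 0.0, 0.0])}

baseline = modulate(text_tokens, shift, scale)
personalized = modulate(text_tokens, shift, scale, subject_offsets)
```

In this sketch only the token with an offset entry changes, while the other tokens receive the same modulation as in the baseline. This mirrors the stated goal of controlling one subject independently without disturbing the rest of the prompt.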