---
license: apache-2.0
base_model:
- black-forest-labs/FLUX.1-dev
pipeline_tag: text-to-image
tags:
- LoRA
- personalization
- multi-subject
---
# XVerse: Consistent Multi-Subject Control of Identity and Semantic Attributes via DiT Modulation

This repository contains the official model for the paper [XVerse: Consistent Multi-Subject Control of Identity and Semantic Attributes via DiT Modulation](https://arxiv.org/abs/2506.21416).

<p align="center">
    <a href="https://arxiv.org/abs/2506.21416">
        <img alt="arXiv Paper" src="https://img.shields.io/badge/arXiv%20paper-2506.21416-b31b1b.svg">
    </a>
    <a href="https://bytedance.github.io/XVerse/">
        <img alt="Project Page" src="https://img.shields.io/badge/Project-Page-blue">
    </a>
    <a href="https://github.com/bytedance/XVerse">
        <img alt="Github" src="https://img.shields.io/badge/GitHub-Code-darkgreen.svg?logo=github">
    </a>
    <a href="https://huggingface.co/ByteDance/XVerse">
        <img alt="Hugging Face Model" src="https://img.shields.io/badge/🤗-HF%20Model-yellow">
    </a>
</p>

![XVerse's capability in single/multi-subject personalization and semantic attribute control (pose, style, lighting)](https://github.com/bytedance/XVerse/raw/main/sample/first_page.png)

## Introduction

**XVerse** introduces a novel approach to multi-subject image synthesis, offering **precise and independent control over individual subjects** without disrupting the overall image latents or features. We achieve this by transforming reference images into offsets for token-specific text-stream modulation.

This innovation enables high-fidelity, editable image generation where you can robustly control both **individual subject characteristics** (identity) and their **semantic attributes**. XVerse significantly enhances capabilities for personalized and complex scene generation.
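
To make the idea of token-specific text-stream modulation more concrete, here is a toy sketch in plain Python. It is not the XVerse implementation: the function names, shapes, and the adaLN-style formula `norm(x) * (1 + scale) + shift` are assumptions chosen for illustration. It only shows how an offset attached to one text token changes that token's modulation while leaving every other token untouched.

```python
# Illustrative only: a toy version of token-specific modulation offsets.
# All names and shapes are hypothetical and are not taken from the XVerse code.

def layer_norm(vec, eps=1e-6):
    """Normalize one feature vector to zero mean and unit variance."""
    mean = sum(vec) / len(vec)
    var = sum((v - mean) ** 2 for v in vec) / len(vec)
    return [(v - mean) / (var + eps) ** 0.5 for v in vec]


def modulate_text_tokens(tokens, shift, scale, offsets):
    """Apply adaLN-style modulation with optional per-token offsets.

    tokens:  list of text-token feature vectors
    shift:   shared shift vector (same for every token)
    scale:   shared scale vector (same for every token)
    offsets: dict mapping token index -> (shift_offset, scale_offset);
             in this sketch the offsets stand in for values derived
             from a reference image
    """
    out = []
    for i, tok in enumerate(tokens):
        zeros = [0.0] * len(tok)
        d_shift, d_scale = offsets.get(i, (zeros, zeros))
        normed = layer_norm(tok)
        out.append([
            n * (1.0 + s + ds) + b + db
            for n, s, ds, b, db in zip(normed, scale, d_scale, shift, d_shift)
        ])
    return out


# Two text tokens; only token 1 (say, the word naming a subject) gets an offset.
tokens = [[1.0, 2.0, 3.0], [0.5, 0.0, -0.5]]
shift = [0.0, 0.0, 0.0]
scale = [0.0, 0.0, 0.0]
offsets = {1: ([0.1, 0.1, 0.1], [0.0, 0.0, 0.0])}

baseline = modulate_text_tokens(tokens, shift, scale, {})
steered = modulate_text_tokens(tokens, shift, scale, offsets)
```

In this sketch `steered[0]` equals `baseline[0]`, because token 0 has no offset, while `steered[1]` differs from `baseline[1]` by the shift offset. This mirrors the per-subject independence described above.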

## How to Use
See the GitHub repository for usage instructions: https://github.com/bytedance/XVerse

Send questions or comments about the model to the issue tracker: https://github.com/bytedance/XVerse/issues
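
If you only need the files from this model repository, one option is to fetch them with the `huggingface_hub` package. The following is a minimal sketch, assuming `huggingface_hub` is installed. The helper name `download_xverse` and the default target directory are hypothetical, and the directory layout that the XVerse code expects is described in the GitHub repository, not here.

```python
# Minimal sketch: download this model repository from the Hugging Face Hub.
# Assumes `pip install huggingface_hub`; the helper name and default
# directory below are illustrative choices, not part of XVerse.

REPO_ID = "ByteDance/XVerse"


def download_xverse(local_dir="./checkpoints/XVerse"):
    """Download all files of the model repo and return the local path."""
    # Imported lazily so that defining this helper does not require the package.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)


if __name__ == "__main__":
    path = download_xverse()
    print(f"Model files downloaded to: {path}")
```

After downloading, follow the GitHub repository's instructions to point the inference code at these files.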

## Citation
If XVerse is helpful, please ⭐ the repo.

If you find this project useful for your research, please consider citing our paper:
```bibtex
@article{chen2025xverse,
  title={XVerse: Consistent Multi-Subject Control of Identity and Semantic Attributes via DiT Modulation},
  author={Chen, Bowen and Zhao, Mengyi and Sun, Haomiao and Chen, Li and Wang, Xu and Du, Kang and Wu, Xinglong},
  journal={arXiv preprint arXiv:2506.21416},
  year={2025}
}
```