UNIC-Adapter: Unified Image-Instruction Adapter for Multimodal Image Generation

UNIC-Adapter is a unified image-instruction adapter that integrates multimodal instructions for controllable image generation. This model card hosts the official models for the CVPR 2025 paper "UNIC-Adapter: Unified Image-instruction Adapter with Multi-modal Transformer for Image Generation".

On this model card, we release a model based on SD3 Medium, which supports the tasks described in our paper. We also provide two additional models: one built on SD3.5 Medium, which can perform traditional computer vision perception tasks, and another built on FLUX.1-dev, which supports both instruction-based image editing and traditional computer vision perception tasks.
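As a rough aid for choosing a variant, the sketch below records which base model is shown handling which task in the sample captions on this card, and adds an optional helper that downloads the repository with `huggingface_hub.snapshot_download`. The task labels, the dictionary, and the function names are illustrative assumptions, not identifiers from the authors' code, and the layout of the files inside the repository is not described here, so the helper simply fetches the whole snapshot.

```python
"""Sketch: task coverage per UNIC-Adapter variant, plus an optional download helper."""

REPO_ID = "AIDC-AI/UNIC-Adapter"  # repository name as shown on this model card

# Generation tasks that all three variants are shown handling in the samples.
_GENERATION = {
    "pixel-level control",
    "subject-driven generation",
    "style-driven generation",
}

# Coverage as read from this card's sample captions (an interpretation,
# not an official compatibility table). Labels are hypothetical names.
SUPPORTED_TASKS = {
    "SD3 Medium": set(_GENERATION),
    "SD3.5 Medium": _GENERATION | {"image understanding"},
    "FLUX.1-dev": _GENERATION | {"image understanding", "image editing"},
}


def variants_for(task):
    """Return the base models whose adapter is shown handling `task`."""
    return [name for name, tasks in SUPPORTED_TASKS.items() if task in tasks]


def download_adapter(local_dir=None):
    """Download the full repository snapshot and return its local path.

    Requires `pip install huggingface_hub` and network access. The import is
    done lazily so that the lookup above works without the package installed.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)
```

For example, `variants_for("image editing")` returns only the FLUX.1-dev variant, which matches the Image Editing samples on this card. Loading the downloaded weights into a pipeline depends on the authors' inference code, which is not covered by this sketch.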

Generated samples

Pixel-level Control

(Left: Condition image, Center left: SD3 Medium with UNIC-Adapter, Center right: SD3.5 Medium with UNIC-Adapter, Right: FLUX.1-dev with UNIC-Adapter)

Subject-driven Generation

(Left: Condition image, Center left: SD3 Medium with UNIC-Adapter, Center right: SD3.5 Medium with UNIC-Adapter, Right: FLUX.1-dev with UNIC-Adapter)

(Left: Condition image, Center: SD3.5 Medium with UNIC-Adapter, Right: FLUX.1-dev with UNIC-Adapter)

Style-driven Generation

(Left: Condition image, Center left: SD3 Medium with UNIC-Adapter, Center right: SD3.5 Medium with UNIC-Adapter, Right: FLUX.1-dev with UNIC-Adapter)

Image Understanding

(Left: Source image, Center: SD3.5 Medium with UNIC-Adapter, Right: FLUX.1-dev with UNIC-Adapter)

Image Editing

(Left: Source image, Right: FLUX.1-dev with UNIC-Adapter)

License

This project is licensed under the MIT License (SPDX-License-Identifier: MIT). The models cannot be used independently. If you use our model in conjunction with the FLUX model, you must review the FLUX.1 [dev] Non-Commercial License of the FLUX model and comply with all of its terms. If you use our model in conjunction with the stable-diffusion-3-medium model, you must review the STABILITY AI COMMUNITY LICENSE AGREEMENT of the SD3 model and comply with all of its terms. If you use our model in conjunction with the stable-diffusion-3.5-medium model, you must review the STABILITY AI COMMUNITY LICENSE AGREEMENT of the SD3.5 model and comply with all of its terms.

Citation

If you find this repo helpful for your research, please cite our paper:

@inproceedings{duan2025unic,
  title={UNIC-Adapter: Unified Image-instruction Adapter with Multi-modal Transformer for Image Generation},
  author={Duan, Lunhao and Zhao, Shanshan and Yan, Wenjun and Li, Yinglun and Chen, Qing-Guo and Xu, Zhao and Luo, Weihua and Zhang, Kaifu and Gong, Mingming and Xia, Gui-Song},
  booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
  pages={7963--7973},
  year={2025}
}

Disclaimer

We used compliance-checking algorithms during training to ensure, to the best of our ability, the compliance of the trained model(s). Because of the complexity of the data and the diversity of language model usage scenarios, we cannot guarantee that the model is completely free of copyright issues or improper content. If you believe anything infringes on your rights or generates improper content, please contact us, and we will promptly address the matter.

Paper: UNIC-Adapter: Unified Image-instruction Adapter with Multi-modal Transformer for Image Generation (2412.18928, published Dec 25, 2024)