Unleashing In-context Learning of Autoregressive Models for Few-shot Image Manipulation

CVPR 2025 (Highlight)

Bolin Lai, Felix Juefei-Xu, Miao Liu, Xiaoliang Dai, Nikhil Mehta, Chenguang Zhu, Zeyi Huang, James M. Rehg, Sangmin Lee, Ning Zhang, Tong Xiao

This repo hosts the model weights for our paper "Unleashing In-context Learning of Autoregressive Models for Few-shot Image Manipulation".

There are four models released in this repo.

InstaManip-17B-1shot: model trained specifically for 1-shot image manipulation.
InstaManip-17B-2shot: model trained specifically for 2-shot image manipulation.
InstaManip-17B-3shot: model trained specifically for 3-shot image manipulation.
InstaManip-17B-dynamic: model trained for an arbitrary number of exemplar image pairs.
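
As a rough illustration of how the four variants relate, the Python sketch below maps a number of exemplar image pairs to a checkpoint name. The helper `select_variant` is hypothetical and not part of the released code. Preferring the shot-specific checkpoints for 1 to 3 exemplars is an assumption made for illustration; the dynamic model could equally be used for any count.

```python
# Hypothetical helper, not part of the InstaManip codebase.
# Checkpoint names are taken from the list above.
VARIANTS = {
    1: "InstaManip-17B-1shot",
    2: "InstaManip-17B-2shot",
    3: "InstaManip-17B-3shot",
}
DYNAMIC_VARIANT = "InstaManip-17B-dynamic"


def select_variant(num_exemplar_pairs: int) -> str:
    """Return the checkpoint name for a given number of exemplar pairs.

    Assumed policy: use the shot-specific model when one exists,
    otherwise fall back to the dynamic model.
    """
    if num_exemplar_pairs < 1:
        raise ValueError("at least one exemplar image pair is required")
    return VARIANTS.get(num_exemplar_pairs, DYNAMIC_VARIANT)


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(n, select_variant(n))
```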

Please refer to the code on GitHub for detailed instructions on how to use these models.

If you find our paper helpful to your work, please cite it using the following BibTeX entry.

@article{lai2024unleashing,
  title={Unleashing In-context Learning of Autoregressive Models for Few-shot Image Manipulation},
  author={Lai, Bolin and Juefei-Xu, Felix and Liu, Miao and Dai, Xiaoliang and Mehta, Nikhil and Zhu, Chenguang and Huang, Zeyi and Rehg, James M and Lee, Sangmin and Zhang, Ning and others},
  journal={arXiv preprint arXiv:2412.01027},
  year={2024}
}
