metadata
base_model: Qwen/Qwen2-VL-7B-Instruct
library_name: peft
pipeline_tag: image-text-to-text
Model Details
Pretrained adapter for ABC: Achieving Better Control of Multimodal Embeddings using VLMs.
Model Sources
This adapter is trained on top of Qwen2-VL-7B-Instruct (Qwen/Qwen2-VL-7B-Instruct).
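Because the metadata lists `peft` as the library and `Qwen/Qwen2-VL-7B-Instruct` as the base model, loading presumably follows the usual PEFT pattern: load the base model, then attach the adapter weights. The following is a minimal sketch under that assumption, not the authors' official loading code. `ADAPTER_ID` is a hypothetical placeholder and `load_abc_adapter` is an illustrative helper name; neither comes from this card. The sketch only loads the weights and does not show how embeddings are extracted, for which the ABC repository is the reference.

```python
# Minimal sketch: attach a PEFT adapter to the base model named in the metadata.
BASE_MODEL_ID = "Qwen/Qwen2-VL-7B-Instruct"  # taken from this card's metadata
ADAPTER_ID = "path/to/abc-adapter"  # hypothetical placeholder: use this adapter's repo id or a local path


def load_abc_adapter(adapter_id: str = ADAPTER_ID, base_model_id: str = BASE_MODEL_ID):
    """Load the base VLM and wrap it with the adapter weights (assumed PEFT workflow)."""
    # Imported lazily so that defining this helper does not require the heavy dependencies.
    from peft import PeftModel
    from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

    # The processor handles both image preprocessing and text tokenization.
    processor = AutoProcessor.from_pretrained(base_model_id)
    base = Qwen2VLForConditionalGeneration.from_pretrained(base_model_id, torch_dtype="auto")

    # Attach the adapter on top of the frozen base model and switch to inference mode.
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()
    return model, processor
```

Calling `model, processor = load_abc_adapter("<adapter repo id or path>")` downloads the base weights on first use, so it needs enough disk space and memory for a 7B-parameter model.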
Paper and Website
For more information, please refer to the paper (https://arxiv.org/abs/2503.00329) and the project website.
Code: https://github.com/TIGER-AI-Lab/ABC
Citation
@misc{schneider2025abcachievingbettercontrol,
title={ABC: Achieving Better Control of Multimodal Embeddings using VLMs},
author={Benjamin Schneider and Florian Kerschbaum and Wenhu Chen},
year={2025},
eprint={2503.00329},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2503.00329},
}