TIGER-Lab/ABC-Qwen2VL-Pretrain

metadata

base_model: Qwen/Qwen2-VL-7B-Instruct
library_name: peft
pipeline_tag: image-text-to-text

Model Details

Pretrained adapter for ABC: Achieving Better Control of Multimodal Embeddings using VLMs.

Model Sources

This adapter is trained on top of Qwen/Qwen2-VL-7B-Instruct.
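
Since the card lists `peft` as the library and Qwen/Qwen2-VL-7B-Instruct as the base model, one plausible way to attach the adapter is sketched below. This is a minimal sketch, not the authors' documented procedure: it assumes the repository `TIGER-Lab/ABC-Qwen2VL-Pretrain` hosts standard PEFT adapter weights, and the helper name `load_abc_pretrain_adapter` is hypothetical. The official code linked under "Paper and Website" remains the reference for how embeddings are actually produced.

```python
# Assumption: the repo id below hosts weights in the standard PEFT adapter layout.
BASE_MODEL_ID = "Qwen/Qwen2-VL-7B-Instruct"    # taken from the card's base_model field
ADAPTER_ID = "TIGER-Lab/ABC-Qwen2VL-Pretrain"  # this model repository


def load_abc_pretrain_adapter(base_model_id=BASE_MODEL_ID, adapter_id=ADAPTER_ID):
    """Hypothetical helper: load the base VLM and attach the pretrained adapter."""
    # Imported lazily so that defining this helper does not require the
    # heavy dependencies (transformers, peft, torch) to be installed.
    from peft import PeftModel
    from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

    # The processor handles both image preprocessing and text tokenization.
    processor = AutoProcessor.from_pretrained(base_model_id)

    # Load the unmodified base model in the dtype stored in its checkpoint.
    base_model = Qwen2VLForConditionalGeneration.from_pretrained(
        base_model_id, torch_dtype="auto"
    )

    # Wrap the base model with the adapter weights and switch to inference mode.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()
    return model, processor
```

Calling `model, processor = load_abc_pretrain_adapter()` downloads both the base model and the adapter, so it needs enough disk space and memory for a 7B-parameter model.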

Paper and Website

For more information, please refer to the paper (https://arxiv.org/abs/2503.00329) and the project website.

Code: https://github.com/TIGER-AI-Lab/ABC

Citation

@misc{schneider2025abcachievingbettercontrol,
      title={ABC: Achieving Better Control of Multimodal Embeddings using VLMs}, 
      author={Benjamin Schneider and Florian Kerschbaum and Wenhu Chen},
      year={2025},
      eprint={2503.00329},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2503.00329}, 
}