---
license: other
license_name: qwen
license_link: https://huggingface.co/Qwen/Qwen2.5-VL-72B-Instruct/blob/main/LICENSE
language:
- en
pipeline_tag: image-text-to-text
tags:
- multimodal
base_model:
- Qwen/Qwen2.5-VL-72B-Instruct
---

# Qwen2.5-VL-72B-Instruct

Converted and quantized with [HimariO's fork](https://github.com/HimariO/llama.cpp.qwen2vl/tree/qwen25-vl) of llama.cpp, following [this procedure](https://github.com/ggml-org/llama.cpp/issues/11483#issuecomment-2727577078). No IMatrix.

The fork is currently required to run inference, and there is no guarantee these checkpoints will work with future builds. Temporary builds are available [here](https://github.com/green-s/llama.cpp.qwen2vl/releases). The latest tested build as of writing is `qwen25-vl-b4899-bc4163b`.

Edit: As of 1 April 2025, inference support has also been added to [koboldcpp](https://github.com/LostRuins/koboldcpp).

[Original model](https://huggingface.co/Qwen/Qwen2.5-VL-72B-Instruct)

## Usage

```bash
./llama-qwen2vl-cli -m Qwen2.5-VL-72B-Instruct-Q4_K_M.gguf --mmproj qwen2.5-vl-72b-instruct-vision-f16.gguf -p "Please describe this image." --image ./image.jpg
```
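
## Reproducing the quantization (sketch)

The quantization itself follows the standard llama.cpp flow. The sketch below is a rough outline only, assuming a local copy of the original Hugging Face checkpoint and a build of the fork; paths and filenames are placeholders, and the linked procedure above remains the authoritative reference (including how the vision mmproj is produced).

```bash
# Rough sketch only -- see the linked procedure for the authoritative steps.
# Assumes the original HF checkpoint is in ./Qwen2.5-VL-72B-Instruct and the
# fork has been built (placeholder paths).

# 1. Convert the language-model weights to an F16 GGUF.
python convert_hf_to_gguf.py ./Qwen2.5-VL-72B-Instruct \
  --outtype f16 \
  --outfile Qwen2.5-VL-72B-Instruct-F16.gguf

# 2. Quantize to Q4_K_M (no importance matrix was used for these files).
./llama-quantize Qwen2.5-VL-72B-Instruct-F16.gguf \
  Qwen2.5-VL-72B-Instruct-Q4_K_M.gguf Q4_K_M

# The vision-encoder mmproj (qwen2.5-vl-72b-instruct-vision-f16.gguf) is
# produced separately by the fork's surgery script; see the linked procedure.
```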