---
license: other
license_name: qwen
license_link: https://huggingface.co/Qwen/Qwen2.5-VL-72B-Instruct/blob/main/LICENSE
language:
- en
pipeline_tag: image-text-to-text
tags:
- multimodal
base_model:
- Qwen/Qwen2.5-VL-72B-Instruct
---

# Qwen2.5-VL-72B-Instruct

Converted and quantized with [HimariO's fork](https://github.com/HimariO/llama.cpp.qwen2vl/tree/qwen25-vl) of llama.cpp, following [this procedure](https://github.com/ggml-org/llama.cpp/issues/11483#issuecomment-2727577078). No IMatrix.

The fork is currently required to run inference, and there is no guarantee these checkpoints will work with future builds. Temporary builds are available [here](https://github.com/green-s/llama.cpp.qwen2vl/releases). The latest tested build as of writing is `qwen25-vl-b4899-bc4163b`.

Edit: As of 1 April 2025, inference support has also been added to [koboldcpp](https://github.com/LostRuins/koboldcpp).

[Original model](https://huggingface.co/Qwen/Qwen2.5-VL-72B-Instruct)

## Usage

```bash
./llama-qwen2vl-cli -m Qwen2.5-VL-72B-Instruct-Q4_K_M.gguf --mmproj qwen2.5-vl-72b-instruct-vision-f16.gguf -p "Please describe this image." --image ./image.jpg
```
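
## Reproducing the quantization (sketch)

The quantization itself follows the standard llama.cpp flow. The sketch below is a rough outline only, assuming a local copy of the original Hugging Face checkpoint and a build of the fork; paths and filenames are placeholders, and the linked procedure above remains the authoritative reference (including how the vision mmproj is produced).

```bash
# Rough sketch only -- see the linked procedure for the authoritative steps.
# Assumes the original HF checkpoint is in ./Qwen2.5-VL-72B-Instruct and the
# fork has been built (placeholder paths).

# 1. Convert the language-model weights to an F16 GGUF.
python convert_hf_to_gguf.py ./Qwen2.5-VL-72B-Instruct \
  --outtype f16 \
  --outfile Qwen2.5-VL-72B-Instruct-F16.gguf

# 2. Quantize to Q4_K_M (no importance matrix was used for these files).
./llama-quantize Qwen2.5-VL-72B-Instruct-F16.gguf \
  Qwen2.5-VL-72B-Instruct-Q4_K_M.gguf Q4_K_M

# The vision-encoder mmproj (qwen2.5-vl-72b-instruct-vision-f16.gguf) is
# produced separately by the fork's surgery script; see the linked procedure.
```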