Image support

#1
by Nasa1423 - opened

Hello! Doesn't this gguf support images input? Gave it a try in ollama, and it is not working.

Yes, this model does support vision. For vision you need to provide both the GGUF of the LLM layers, quantized however you like, and the vision layers as an mmproj file, for which you can use one of the following files:

If ollama doesn't work just use llama.cpp directly and it will work for sure.
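A minimal sketch of running the model with vision directly in llama.cpp, using its multimodal CLI (`llama-mtmd-cli`). The file names `model.gguf`, `mmproj.gguf`, and `photo.jpg` are placeholders for your downloaded quant, the mmproj file mentioned above, and an input image:

```shell
# Download/build llama.cpp, then run the multimodal CLI with
# the LLM GGUF plus the vision projector (mmproj) file.
./llama-mtmd-cli \
  -m model.gguf \          # hypothetical path to the quantized LLM layers
  --mmproj mmproj.gguf \   # hypothetical path to the vision/mmproj file
  --image photo.jpg \      # hypothetical input image
  -p "Describe this image."
```

Without `--mmproj` the model loads as text-only, which is the most common reason image input appears "not working".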

Thanks! Any info how to use that with vllm?

GGUF support in vLLM is experimental and currently doesn't cover vision. Generally, using GGUFs with vLLM is a bit of a waste: other quant formats such as AWQ are over 20 times faster on vLLM with a GPU if you have 64 parallel prompts. If you want to use GGUFs, the way to go is llama.cpp. llama.cpp is the reference implementation of the GGUF format, so it usually has the most recent features and the fewest issues. With llama-server, llama.cpp also offers a nice GUI.
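A quick sketch of serving the model with llama-server, which exposes both the web GUI and an OpenAI-compatible API. Again, `model.gguf` and `mmproj.gguf` are placeholder paths for your own files, and the port is arbitrary:

```shell
# Start llama-server with the LLM GGUF and the vision projector.
./llama-server \
  -m model.gguf \          # hypothetical path to the quantized LLM layers
  --mmproj mmproj.gguf \   # hypothetical path to the vision/mmproj file
  --port 8080

# Then open the built-in GUI in a browser:
#   http://localhost:8080
```

The GUI lets you attach images to prompts directly, so it's an easy way to verify that vision works before wiring anything up programmatically.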

That makes sense, thank you for your help!

Nasa1423 changed discussion status to closed
