Any way to make Vision work with ollama and openwebui?

#4 opened by freeaimodels

Hello dear friends,

I'm kinda new to self-hosting LLMs and I have a question about the vision capability of this model.

I'm running Ollama with Open WebUI inside Docker on a Linux server. This model supports vision, but it doesn't seem to work when I upload an image.
I've read that I need the mmproj GGUF for vision to work. The mmproj seems to be included here and I have downloaded it. Now the question: how can I add it to this model in Ollama/Open WebUI? The tutorials seem a little confusing. I have a "models" folder in my Ollama instance; it contains two subfolders, "blobs" and "manifests". How can I deploy the mmproj GGUF?

Just use llama.cpp; it has a web interface where you can upload images. I've only tested it with Qwen2.5-VL-Instruct.
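
Something along these lines should start the built-in server with vision enabled (a rough sketch; it assumes a recent llama.cpp build with multimodal support, and the file names match the GGUFs from this repo):

# start llama.cpp's built-in server and load the vision projector
llama-server \
  -m mlabonne_gemma-3-27b-it-abliterated.q4_k_m.gguf \
  --mmproj mmproj-model-f16.gguf \
  -ngl 99 -c 8192 \
  --host 0.0.0.0 --port 8080
# then open http://localhost:8080 in a browser and attach an image in the chat box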


Can you please share how you launch this model with vision? I'm using llama-cpp-python like this:

python -m llama_cpp.server --model app/mlabonne_gemma-3-27b-it-abliterated.q4_k_m.gguf --n_gpu_layers -1 --n_ctx 8192 --host 0.0.0.0 --clip_model_path app/mmproj-model-f16.gguf
Then I'm using Open WebUI. The browser devtools show that the message is sent like this:

{
  "role": "user",
  "content": [
    {
      "type": "text",
      "text": "What can you see here?"
    },
    {
      "type": "image_url",
      "image_url": {
        "url": "data:image/png;base64,iVBORw0..."
      }
    }
  ]
}

But the model "can't see any image"; it just answers the text...
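
To rule Open WebUI out, the same request can be sent straight to the server (a sketch; it assumes the llama-cpp-python server on its default port 8000 and a local file test.png):

# send a vision request to the OpenAI-compatible chat endpoint
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d "{
    \"messages\": [{
      \"role\": \"user\",
      \"content\": [
        {\"type\": \"text\", \"text\": \"What can you see here?\"},
        {\"type\": \"image_url\", \"image_url\": {\"url\": \"data:image/png;base64,$(base64 -w0 test.png)\"}}
      ]
    }]
  }"

If this also comes back as a text-only answer, the problem is probably not Open WebUI: as far as I understand, llama-cpp-python only uses --clip_model_path together with a multimodal --chat_format, and I'm not sure a matching one exists for Gemma 3.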

You need to download both files (the model GGUF and the mmproj GGUF) and put them in a new, empty folder. Then create a Modelfile, which can contain just this:

FROM .

Then open a terminal in that folder and run: ollama create YOUR_NAME_OF_MODEL:latest -f Modelfile
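
Put together, the whole thing looks roughly like this (a sketch; the folder and model names are just examples, and test.png is a placeholder image of your own):

# new empty folder that will hold only the two GGUFs and the Modelfile
mkdir gemma3-vision && cd gemma3-vision
# copy the model GGUF and the mmproj GGUF into this folder first, then:
echo "FROM ." > Modelfile
ollama create gemma3-vision:latest -f Modelfile
# quick test from the terminal; Open WebUI lists models served by Ollama, so it should show up there too
ollama run gemma3-vision:latest "What can you see here? ./test.png"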

Unfortunately, even though I merged the mmproj file and the q4_k_m model with Ollama, the ComfyUI Ollama Vision loaders say that they do not recognize the image. :(
