How to Use olmOCR GGUF Model with Ollama?

by koala8104

Hi,

I've downloaded the olmOCR GGUF model and added it to Ollama (running on localhost:11434), but I'm struggling to get it working properly.

Could someone share:

  1. The correct prompt format for olmOCR with Ollama
  2. How to convert PDFs to images and send them to the model
  3. A simple code example showing how to use it

I've read the GitHub repo but still haven't managed to make it work.
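For reference, the kind of call I've been attempting looks roughly like this (a rough sketch only: it assumes pdf2image for rendering the PDF page and Ollama's standard /api/generate endpoint with a base64-encoded image; sample.pdf and the model name olmocr are just placeholders):

import base64
import io

import requests
from pdf2image import convert_from_path  # requires poppler to be installed

# Render the first page of the PDF to a PIL image
page = convert_from_path("sample.pdf", dpi=200)[0]

# Encode the page as a base64 PNG for Ollama's API
buf = io.BytesIO()
page.save(buf, format="PNG")
image_b64 = base64.b64encode(buf.getvalue()).decode()

# Send the image plus a prompt to the local Ollama server
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "olmocr",
        "prompt": "Extract the text of this page as markdown.",
        "images": [image_b64],
        "stream": False,
    },
)
print(resp.json()["response"])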

Thanks!

I have the same question. Can someone share how to use olmOCR with Ollama?

I also could not get it to OCR images, and for some reason it did not work for me even as a plain LLM (it returned random text, which indicates a wrong prompt structure).
To fix the chat behavior I ran ollama create olmocr -f Modelfile.
The Modelfile I used:

FROM olmOCR-7B-0225-preview-Q5_K_M.gguf
TEMPLATE """{{- if .Messages }}
{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1 -}}
<|im_start|>{{ .Role }}
{{ .Content }}
{{- if $last }}
{{- if (ne .Role "assistant") }}<|im_end|>
<|im_start|>assistant
{{ end }}
{{- else }}<|im_end|>
{{ end }}
{{- end }}
{{- else }}
{{- if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}{{ if .Prompt }}<|im_start|>user
{{ .Prompt }}<|im_end|>
{{ end }}<|im_start|>assistant
{{ end }}{{ .Response }}{{ if .Response }}<|im_end|>{{ end }}"""

SYSTEM You are a helpful assistant.
PARAMETER temperature 0.1

but it still says that it does not see images.
I figured out that vision models in the Ollama repos use a projector model as a second GGUF. I tried using the projector GGUF from Qwen2-VL 7B, but the Ollama CLI said "Error: invalid file magic".
Could you release the projector separately, deploy your model to Ollama, or provide a better solution?
Thanks for the great OCR model.
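For context, what I was hoping to reproduce is the usual projector pattern, e.g. the way llama-cpp-python loads LLaVA-style models with a separate mmproj GGUF (a sketch only; mmproj-olmocr.gguf is a hypothetical file that doesn't exist yet, and a Qwen2-VL model would likely need a different chat handler):

from llama_cpp import Llama
from llama_cpp.llama_chat_format import Llava15ChatHandler

# The projector GGUF maps image features into the language model's embedding space
chat_handler = Llava15ChatHandler(clip_model_path="mmproj-olmocr.gguf")  # hypothetical file

llm = Llama(
    model_path="olmOCR-7B-0225-preview-Q5_K_M.gguf",
    chat_handler=chat_handler,
    n_ctx=8192,
)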

Has anyone found a solution?

This would be immensely useful if it could natively work with Ollama and Open WebUI for testing.

Hey guys, the olmOCR toolkit currently uses SGLang, which doesn't support GGUF models. If you're loading the model with transformers, try a different dtype (unlikely to work, because Qwen2VLForConditionalGeneration doesn't allow int8 or float8_* dtypes).
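For example, something along these lines (a sketch; bfloat16 shown, since the int8/float8 variants are rejected by that class, and the Hub model id is assumed to be allenai/olmOCR-7B-0225-preview):

import torch
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

# Load the model in a dtype the class accepts; swap the dtype here to experiment
model = Qwen2VLForConditionalGeneration.from_pretrained(
    "allenai/olmOCR-7B-0225-preview",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
processor = AutoProcessor.from_pretrained("allenai/olmOCR-7B-0225-preview")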

FROM /root/.ollama/olmOCR-7B-0225-preview-Q5_0.gguf

TEMPLATE """<|im_start|>system
{{ .System }}<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""

PARAMETER stop <|im_start|>
PARAMETER stop <|im_end|>

SYSTEM """You are an olmOCR expert that extracts content from documents into Markdown format."""

{"id": "2741730785ca425ede1db569970114f7c843988a", "text": "I'm sorry, but I can't assist with that.", "source": "olmocr", "added": "2025-06-11", "created": "2025-06-11", "metadata": {"Source-File": "/local_files/\u6210\u90fd\u94f6\u884c2024\u5e74\u5ea6\u5e74\u62a5252.pdf", "olmocr-version": "0.1.71", "pdf-total-pages": 1, "total-input-tokens": 0, "total-output-tokens": 8, "total-fallback-pages": 0}, "attributes": {"pdf_page_numbers": [[0, 40, 1]]}}

The Ollama GGUF doesn't work either. Anything to improve? Or what is the original olmOCR Modelfile?

Hey @willshanghai, GGUF is not yet supported. But please visit the GitHub repo and switch to the jake/vllm_perf branch, where you can run vLLM instead of SGLang and pass a dtype inside the pipeline.py file for quantized inference.
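The change inside pipeline.py basically amounts to passing a dtype where the vLLM engine is constructed, roughly like this (a sketch; the actual argument names used on that branch may differ):

from vllm import LLM

# Lowering dtype (e.g. "float16" or "bfloat16") reduces memory use for the full-precision weights
llm = LLM(model="allenai/olmOCR-7B-0225-preview", dtype="bfloat16")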
