2025-01-09 does not run on the CPU with the example config

#50

by KeilahElla - opened Jan 12

Discussion

KeilahElla

Jan 12

If I try to run the example code in the README on the CPU, I get the following error:

RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'

It looks like it tries to run fp16 on the CPU, which PyTorch still does not support. Overriding the dtype in the following way did not solve the issue:

model = AutoModelForCausalLM.from_pretrained(
    model_id, trust_remote_code=True, revision=revision, torch_dtype=torch.float32
)

Any suggestions?

KeilahElla

Jan 17

Update: there were a lot of float16 tensors hardcoded in the inference code, which made it impossible to run the latest version of moondream2 on the CPU.

I have replaced them with float32 types and it now runs great on the CPU. It's really fast even on my low-spec laptop.
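For anyone who wants to try the same workaround, here is a rough sketch of how the hardcoded half-precision spots could be located before editing them by hand. This is not the code from the post above. The regex patterns, the function names and the idea of scanning a local copy of the model's Python files are all assumptions made for illustration.

```python
import re
from pathlib import Path

# Assumed patterns: common ways PyTorch code hardcodes half precision.
HALF_PATTERN = re.compile(r"torch\.float16|torch\.half\b|\.half\(\)")


def find_half_precision(source: str) -> list[tuple[int, str]]:
    """Return (line_number, stripped_line) for each line that mentions fp16."""
    return [
        (number, line.strip())
        for number, line in enumerate(source.splitlines(), start=1)
        if HALF_PATTERN.search(line)
    ]


def scan_directory(root: str) -> dict[str, list[tuple[int, str]]]:
    """Scan every .py file under root (a hypothetical local checkout)."""
    hits = {}
    for path in Path(root).rglob("*.py"):
        found = find_half_precision(path.read_text(encoding="utf-8"))
        if found:
            hits[str(path)] = found
    return hits
```

Calling `scan_directory(".")` from the directory that holds the downloaded model code should list the lines to review. Each hit still needs a manual check before switching it to `torch.float32`.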

If @vikhyatk is interested, I can send my code.

KeilahElla

Feb 4

•

edited Feb 4

Final update on this. The problem was that the local version of PyTorch I'm using (2.1) was ancient and did not support fp16 math on the CPU. Modern versions (like 2.6) support fp16 arithmetic on the CPU. If you run into this problem, just update your PyTorch to 2.6+.
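A quick way to check whether an install is older than that is to compare the version string against 2.6. The sketch below is only illustrative: the (2, 6) threshold comes from this thread, which confirms only that 2.1 fails and 2.6 works, and the helper names are made up. In practice the version string would come from `torch.__version__`.

```python
import re

# Assumption: threshold taken from the "2.6+" advice in this thread,
# not from an official PyTorch compatibility statement.
MIN_VERSION = (2, 6)


def parse_major_minor(version: str) -> tuple[int, int]:
    """Extract (major, minor) from a string such as '2.1.0+cpu'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version: str, minimum: tuple[int, int] = MIN_VERSION) -> bool:
    """True if the version is at least the assumed minimum."""
    return parse_major_minor(version) >= minimum
```

For example, `meets_minimum(torch.__version__)` would be false on a 2.1 install, in which case upgrading is the simplest fix. Comparing integer tuples rather than raw strings keeps versions such as 2.10 ordered correctly.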

KeilahElla changed discussion status to closed Feb 4
