Molmo-7B-D NF4 Quant

Only the LLM portion was quantized to NF4; the CLIP vision encoder is left unquantized.

Checkpoint size: 30 GB -> 7 GB

Approximately 12 GB of VRAM is required for inference.
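As a rough sketch, this kind of split quantization (NF4 for the LLM, CLIP kept in full precision) can be expressed with Hugging Face transformers' `BitsAndBytesConfig`; note that the vision-tower module name passed to `llm_int8_skip_modules` below is an assumption for illustration, not taken from this repo.

```python
import torch
from transformers import BitsAndBytesConfig

# NF4 quantization config for the LLM weights only; modules listed in
# llm_int8_skip_modules are kept in their original precision.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # NF4 data type for 4-bit weights
    bnb_4bit_compute_dtype=torch.bfloat16,  # dtype used during matmuls
    bnb_4bit_use_double_quant=True,         # optional: also quantize the scales
    llm_int8_skip_modules=["vision_backbone"],  # assumed name of the CLIP encoder
)
```

A config like this would be passed as `quantization_config=bnb_config` to `from_pretrained` when quantizing the base model; this repo already ships the quantized weights, so it can be loaded directly without it.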

See the base model card for more information:

https://huggingface.co/allenai/Molmo-7B-D-0924

Format: Safetensors
Model size: 8.02B params
Tensor types: F32, F16, U8

Model tree for reubk/Molmo_7B_D_0924_NF4

Base model: Qwen/Qwen2-7B (quantized -> this model)