Load error - safetensors file name

#1
by PsychoLogic - opened

Thanks for releasing the quantized model, and especially for the instructions on how it was made.

I get a minor error when attempting to load it:

checkpoint = "AIDC-AI/Ovis1.6-Gemma2-9B-GPTQ-Int4"
quantize_config = BaseQuantizeConfig(
    bits=4,          # 4 or 8
    group_size=128,
    damp_percent=0.1,
    desc_act=False,  # set to False can significantly speed up inference but the perplexity may slightly bad
    static_groups=False,
    sym=True,
    true_sequential=True,
)
model = OvisGemma2GPTQForCausalLM.from_pretrained(
    checkpoint,
    quantize_config,
    torch_dtype=torch.bfloat16,
    multimodal_max_length=8192,
    trust_remote_code=True
).cuda()

Gives the error:

OSError: AIDC-AI/Ovis1.6-Gemma2-9B-GPTQ-Int4 does not appear to have a file named pytorch_model.bin, model.safetensors, tf_model.h5, model.ckpt or flax_model.msgpack.

I believe the loader is simply expecting the .safetensors file in the repository to have a different name than it currently does.
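One way to double-check is to list the files actually present in the repository with huggingface_hub (a quick sketch, assuming huggingface_hub is installed):

from huggingface_hub import list_repo_files

# Show how the weight files in the repo are actually named
print(list_repo_files("AIDC-AI/Ovis1.6-Gemma2-9B-GPTQ-Int4"))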

Just for clarity: I added the quantize_config argument because the inference sample given in the README fails with:

TypeError: BaseGPTQForCausalLM.from_pretrained() missing 1 required positional argument: 'quantize_config'

@PsychoLogic
Changing OvisGemma2GPTQForCausalLM.from_pretrained(...) to OvisGemma2GPTQForCausalLM.from_quantized(...) will fix this. We've updated the model cards. There is no need to add the quantize_config argument; just follow the instruction code in the updated model card.
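For reference, the corrected call looks roughly like this — a minimal sketch assuming the same checkpoint variable as above and that from_quantized accepts the same keyword arguments:

import torch

# Load the already-quantized weights; from_quantized picks up the
# quantization config shipped with the repository, so no BaseQuantizeConfig
# needs to be constructed by hand.
model = OvisGemma2GPTQForCausalLM.from_quantized(
    checkpoint,  # "AIDC-AI/Ovis1.6-Gemma2-9B-GPTQ-Int4"
    torch_dtype=torch.bfloat16,
    multimodal_max_length=8192,  # assumed to be forwarded to the model config
    trust_remote_code=True
).cuda()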

Thanks, working for me :)

PsychoLogic changed discussion status to closed