Load error - safetensors file name

#1
by PsychoLogic - opened

Thanks for releasing the quantized model, and especially for the instructions on how it was made.

I get a minor error when attempting to load it:

checkpoint = "AIDC-AI/Ovis1.6-Gemma2-9B-GPTQ-Int4"
quantize_config = BaseQuantizeConfig(
    bits=4,          # 4 or 8
    group_size=128,
    damp_percent=0.1,
    desc_act=False,  # set to False can significantly speed up inference but the perplexity may slightly bad
    static_groups=False,
    sym=True,
    true_sequential=True,
)
model = OvisGemma2GPTQForCausalLM.from_pretrained(
    checkpoint,
    quantize_config,
    torch_dtype=torch.bfloat16,
    multimodal_max_length=8192,
    trust_remote_code=True
).cuda()

Gives the error:

OSError: AIDC-AI/Ovis1.6-Gemma2-9B-GPTQ-Int4 does not appear to have a file named pytorch_model.bin, model.safetensors, tf_model.h5, model.ckpt or flax_model.msgpack.

I believe the loader is simply expecting the .safetensors file in the repository to have a different name than it currently does.
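One way to double-check is to list the files actually present in the repository with huggingface_hub (a quick sketch, assuming huggingface_hub is installed):

from huggingface_hub import list_repo_files

# Show how the weight files in the repo are actually named
print(list_repo_files("AIDC-AI/Ovis1.6-Gemma2-9B-GPTQ-Int4"))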

Just for clarity: I added the quantize_config argument because the inference sample given in the README fails with:

TypeError: BaseGPTQForCausalLM.from_pretrained() missing 1 required positional argument: 'quantize_config'

@PsychoLogic
Changing OvisGemma2GPTQForCausalLM.from_pretrained(...) to OvisGemma2GPTQForCausalLM.from_quantized(...) will fix this. We've updated the model cards. There is no need to add the quantize_config argument; just follow the instruction code in the updated model card.
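For reference, the corrected call looks roughly like this — a minimal sketch assuming the same checkpoint variable as above and that from_quantized accepts the same keyword arguments:

import torch

# Load the already-quantized weights; from_quantized picks up the
# quantization config shipped with the repository, so no BaseQuantizeConfig
# needs to be constructed by hand.
model = OvisGemma2GPTQForCausalLM.from_quantized(
    checkpoint,  # "AIDC-AI/Ovis1.6-Gemma2-9B-GPTQ-Int4"
    torch_dtype=torch.bfloat16,
    multimodal_max_length=8192,  # assumed to be forwarded to the model config
    trust_remote_code=True
).cuda()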

Thanks, working for me :)

PsychoLogic changed discussion status to closed