Load error - safetensors file name
#1 by PsychoLogic - opened
Thanks for releasing the quantized model, and especially for the instructions on how it was made.
I hit a minor error when attempting to load it:
```python
import torch

from auto_gptq import BaseQuantizeConfig
# OvisGemma2GPTQForCausalLM is the wrapper class from the model card's
# instruction code (a subclass of auto_gptq's BaseGPTQForCausalLM).

checkpoint = "AIDC-AI/Ovis1.6-Gemma2-9B-GPTQ-Int4"

quantize_config = BaseQuantizeConfig(
    bits=4,  # 4 or 8
    group_size=128,
    damp_percent=0.1,
    desc_act=False,  # False can significantly speed up inference, but perplexity may be slightly worse
    static_groups=False,
    sym=True,
    true_sequential=True,
)

model = OvisGemma2GPTQForCausalLM.from_pretrained(
    checkpoint,
    quantize_config,
    torch_dtype=torch.bfloat16,
    multimodal_max_length=8192,
    trust_remote_code=True,
).cuda()
```
This gives the error:

```
OSError: AIDC-AI/Ovis1.6-Gemma2-9B-GPTQ-Int4 does not appear to have a file named pytorch_model.bin, model.safetensors, tf_model.h5, model.ckpt or flax_model.msgpack.
```
I believe it is just expecting the .safetensors file to have a different name in the repository.
Just for clarity: I added the quantize_config argument above because the inference sample given in the README fails with:

```
TypeError: BaseGPTQForCausalLM.from_pretrained() missing 1 required positional argument: 'quantize_config'
```
@PsychoLogic
Changing OvisGemma2GPTQForCausalLM.from_pretrained(...) to OvisGemma2GPTQForCausalLM.from_quantized(...) will fix this. We've updated the model cards; there is no need to add the quantize_config argument. Follow the instruction code in the updated model card.
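For reference, a minimal sketch of the corrected call, assuming (per the updated model card) that the same keyword arguments carry over to from_quantized:

```python
import torch

# OvisGemma2GPTQForCausalLM is the wrapper class from the model card's
# instruction code; from_quantized reads the quantization config and the
# quantized .safetensors weights directly from the repo, so no
# quantize_config argument is needed.
model = OvisGemma2GPTQForCausalLM.from_quantized(
    "AIDC-AI/Ovis1.6-Gemma2-9B-GPTQ-Int4",
    torch_dtype=torch.bfloat16,
    multimodal_max_length=8192,
    trust_remote_code=True,
).cuda()
```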
Thanks, working for me :)
PsychoLogic changed discussion status to closed