Bug report

#6
by Alaronelf - opened

Good evening! Thank you for releasing the quantized model. Unfortunately, I could not manage to make it work.
I downloaded your project and encountered several issues with the file modeling_cogvlm.py. In the class CogVLMForCausalLM, a method named "_extract_past_from_model_output" is called but is neither declared in the class nor imported from anywhere else (the error message is: "'CogVLMForCausalLM' object has no attribute '_extract_past_from_model_output'"). This causes the model to fail to respond to the user.

Another issue is that a tensor dtype appears to be hard-coded somewhere, and I get the following warning from bitsandbytes: "Input type into Linear4bit is torch.float16, but bnb_4bit_compute_dtype=torch.float32 (default). This will lead to slow inference or training speed."
It would be great to hear back about these problems.
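
In case it helps, is something like the sketch below the intended way to work around both issues? The repo id is a placeholder, the BitsAndBytesConfig is only my attempt to force the compute dtype to float16, and the re-added helper is just my guess at what modeling_cogvlm.py expects. (Or is pinning transformers to an older release where GenerationMixin still has that helper the recommended fix?)

```python
import types

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_ID = "path/to/quantized-cogvlm"  # placeholder, not the real repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)

# Attempt at the second issue: set the bitsandbytes compute dtype to float16
# so it matches the float16 inputs (float32 is only the default). If the
# quantization config is already baked into the repo, this may need adjusting.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=quant_config,
    torch_dtype=torch.float16,
    trust_remote_code=True,
    device_map="auto",
)

# Attempt at the first issue: I think newer transformers releases removed
# GenerationMixin._extract_past_from_model_output, which modeling_cogvlm.py
# still calls, so re-add a minimal version on the loaded instance. The body
# is a guess: it only pulls past_key_values out of the model output.
def _extract_past_from_model_output(self, outputs, standardize_cache_format=False):
    return getattr(outputs, "past_key_values", None)

if not hasattr(model, "_extract_past_from_model_output"):
    model._extract_past_from_model_output = types.MethodType(
        _extract_past_from_model_output, model
    )
```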

I have the exact same issue. Bump

Hi! I ran into the same problem. Have you found a way to solve it?
