3bit GPTQ quants for Mistral-Large-Instruct-2411

#2
by dazipe - opened

Do you by any chance have 3-bit GPTQ quants of Mistral-Large-Instruct-2411?
I have 2 x MI100 GPUs with 64 GB VRAM total, and your Mistral-Large-Instruct-2411-IQ3_XXS.gguf works great.
But it is slow with vLLM, especially in batched mode.
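For reference, a minimal sketch of how a 3-bit GPTQ quant could be served across both GPUs with vLLM, assuming such a quant existed. The repo id below is a hypothetical placeholder, and GPTQ kernel support on ROCm/MI100 would need to be verified for the vLLM version in use:

```shell
# Hypothetical repo id -- substitute the real GPTQ quant once one is published.
vllm serve some-user/Mistral-Large-Instruct-2411-GPTQ-3bit \
    --quantization gptq \
    --tensor-parallel-size 2 \
    --max-model-len 8192
```

`--tensor-parallel-size 2` splits the model across the two MI100s; `--max-model-len` is capped here to keep the KV cache within the remaining VRAM.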

dazipe changed discussion title from 3bit QPTQ quants for Mistral-Large-Instruct-2411 to 3bit GPTQ quants for Mistral-Large-Instruct-2411
