3bit GPTQ quants for Mistral-Large-Instruct-2411

#2
by dazipe - opened

Do you by any chance have 3-bit GPTQ quants of Mistral-Large-Instruct-2411?
I have 2 x MI100 GPUs with 64 GB VRAM total, and your Mistral-Large-Instruct-2411-IQ3_XXS.gguf works great.
But it is slow with vLLM, especially in batched mode.
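For reference, a minimal sketch of how a 3-bit GPTQ quant could be served across both GPUs with vLLM, assuming such a quant existed. The repo id below is a hypothetical placeholder, and GPTQ kernel support on ROCm/MI100 would need to be verified for the vLLM version in use:

```shell
# Hypothetical repo id -- substitute the real GPTQ quant once one is published.
vllm serve some-user/Mistral-Large-Instruct-2411-GPTQ-3bit \
    --quantization gptq \
    --tensor-parallel-size 2 \
    --max-model-len 8192
```

`--tensor-parallel-size 2` splits the model across the two MI100s; `--max-model-len` is capped here to keep the KV cache within the remaining VRAM.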

dazipe changed discussion title from 3bit QPTQ quants for Mistral-Large-Instruct-2411 to 3bit GPTQ quants for Mistral-Large-Instruct-2411
