This is Mistral-Small-3.1-24B-Instruct-2503 quantized with a hacked-up build of GPTQModel that adds preliminary Mistral3ForConditionalGeneration support; several unusual changes were needed to make it work. Calibration was run against the flickr30k dataset (with too few samples; a version with more thorough calibration may be uploaded soon), so this should be a true vision-aware quant of the Mistral Small 3.1 HF checkpoint.
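For reference, a minimal sketch of the GPTQModel flow that produces a 4-bit, group-size-128 quant like this one. It follows GPTQModel's standard text-only API; the base-model path and the placeholder captions are assumptions, and the actual run used the hacked-up branch that also feeds flickr30k images through the vision tower, which stock GPTQModel does not support.

```python
from gptqmodel import GPTQModel, QuantizeConfig

# 4 bits, group size 128 -- matching the "4b-128g" in this repo's name.
quant_config = QuantizeConfig(bits=4, group_size=128)

# Placeholder captions; the real calibration used flickr30k image-text pairs.
calibration = [
    "A man in a red shirt is climbing a rock face.",
    "Two dogs chase a ball across a grassy field.",
]

# Assumption: the official HF checkpoint as the source model.
model = GPTQModel.load(
    "mistralai/Mistral-Small-3.1-24B-Instruct-2503", quant_config
)
model.quantize(calibration, batch_size=1)
model.save("Mistral-Small-3.1-24B-Instruct-2503-HF-gptqmodel-4b-128g")
```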
To run it, you need this branch of vLLM: https://github.com/sjuxax/vllm/tree/Mistral3.1
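Once that branch is installed, loading should look like standard vLLM usage. A minimal sketch (the context length and sampling values are illustrative, not tested settings):

```python
from vllm import LLM, SamplingParams

llm = LLM(
    model="jeffcookio/Mistral-Small-3.1-24B-Instruct-2503-HF-gptqmodel-4b-128g",
    quantization="gptq",   # 4-bit, group size 128 per the repo name
    max_model_len=8192,    # assumption: shortened context to fit 24 GB VRAM
)
params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["Describe the Eiffel Tower in one sentence."], params)
print(outputs[0].outputs[0].text)
```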
Another "feature" of this version is that it was quantized with a preliminary implementation of block-diagonal Hessians (which was authored entirely by Grok3). This allowed me to compute the quantization without OOM in my 24G VRAM.
Base model: mistralai/Mistral-Small-3.1-24B-Base-2503