GGUF IQ3_M quantization of cognitivecomputations/dolphin-2.7-mixtral-8x7b, provided in both non-imatrix and imatrix variants.
It fits into 24 GiB of VRAM with a 32768-token context when the KV cache is quantized to 8 bits.
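
As a rough sanity check on the 24 GiB figure, the sketch below adds up the quantized weights and the KV cache. The numbers it uses are assumptions rather than values from this card: the Mixtral 8x7B attention shape (32 layers, 8 KV heads, head dimension 128), roughly 3.66 bits per weight for IQ3_M, and the llama.cpp q8_0 layout of 34 bytes per 32 values. The function names are made up for illustration.

```python
# Back-of-the-envelope VRAM estimate: quantized weights plus KV cache.
# Every constant here is an assumption, not a measurement of this model.

GIB = 1024 ** 3


def kv_cache_bytes(n_ctx, n_layers, n_kv_heads, head_dim, bytes_per_elem):
    """Bytes needed to hold the K and V caches for n_ctx tokens."""
    elems_per_token = 2 * n_layers * n_kv_heads * head_dim  # factor 2: K and V
    return n_ctx * elems_per_token * bytes_per_elem


def weight_bytes(n_params, bits_per_weight):
    """Bytes needed for the quantized weights."""
    return n_params * bits_per_weight / 8


N_CTX = 32768
# Assumed Mixtral 8x7B shape: 32 layers, 8 KV heads, head_dim 128.
kv_q8 = kv_cache_bytes(N_CTX, 32, 8, 128, 34 / 32)  # q8_0: 34 bytes per 32 values
kv_f16 = kv_cache_bytes(N_CTX, 32, 8, 128, 2.0)     # unquantized f16 cache
weights = weight_bytes(46.7e9, 3.66)                # assumed ~3.66 bpw for IQ3_M

print(f"weights:        {weights / GIB:.2f} GiB")
print(f"KV cache q8_0:  {kv_q8 / GIB:.2f} GiB")
print(f"KV cache f16:   {kv_f16 / GIB:.2f} GiB")
print(f"total (q8_0):   {(weights + kv_q8) / GIB:.2f} GiB")
```

The estimate leaves out compute buffers and other runtime overhead, so the real headroom on a 24 GiB card is smaller than the total suggests.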

Format: GGUF
Model size: 46.7B params
Architecture: llama
Quantization: 3-bit

Model tree for NeoChen1024/dolphin-2.7-mixtral-8x7b-GGUF-IQ3_M

Base model: dphn/dolphin-2.7-mixtral-8x7b
Quantized (8): this model