GGUF IQ3_M quant of cognitivecomputations/dolphin-2.7-mixtral-8x7b (both non-imatrix and imatrix)
It fits in 24 GiB of VRAM with a 32768-token context when the KV cache is quantized to 8 bits.
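To see why 8-bit KV cache quantization matters for the 24 GiB budget, here is a rough sketch of the KV cache arithmetic. The shape values (32 layers, 8 KV heads, head dimension 128) are assumptions taken from the published Mixtral 8x7B configuration, not from this card, and the 8.5 bits per element assumes llama.cpp's `q8_0` layout (32 int8 values plus one fp16 scale per block). Model weights and compute buffers are not included.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_ctx, bits_per_element):
    """Estimate KV cache size: one K and one V vector per layer, per KV head, per token."""
    elements = 2 * n_layers * n_kv_heads * head_dim * n_ctx
    return elements * bits_per_element / 8

GIB = 1024 ** 3

# Assumed Mixtral 8x7B attention shape (from the upstream model config).
MIXTRAL = dict(n_layers=32, n_kv_heads=8, head_dim=128)

# Assumed q8_0 layout: 34 bytes per block of 32 values = 8.5 bits per element.
q8 = kv_cache_bytes(**MIXTRAL, n_ctx=32768, bits_per_element=8.5)
f16 = kv_cache_bytes(**MIXTRAL, n_ctx=32768, bits_per_element=16)

print(f"q8_0 KV cache: {q8 / GIB:.3f} GiB")
print(f"f16  KV cache: {f16 / GIB:.3f} GiB")
```

Under these assumptions the 8-bit cache takes a little over half the memory of an f16 cache at the same context length, which is the headroom that lets the full 32768-token context sit alongside the weights.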
Model tree for NeoChen1024/dolphin-2.7-mixtral-8x7b-GGUF-IQ3_M
Base model: dphn/dolphin-2.7-mixtral-8x7b