huihui-ai/MAI-DS-R1-GGUF

This model was converted from microsoft/MAI-DS-R1 to GGUF format. Even GPUs with as little as 8 GB of memory can try it.

GGUF: Q2_K, Q3_K, Q4_K_M, and Q8_0 are all supported.

BF16 to f16.gguf

  1. Download the microsoft/MAI-DS-R1 model; this requires approximately 1.21 TB of space.
cd /home/admin/models
huggingface-cli download microsoft/MAI-DS-R1 --local-dir ./microsoft/MAI-DS-R1
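Given the size, it is worth confirming free space on the target filesystem first (a minimal check using the example path above):
df -h /home/admin/models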
  2. Use the llama.cpp conversion script to convert MAI-DS-R1 to GGUF format; this requires approximately an additional 1.22 TB of space.
python convert_hf_to_gguf.py /home/admin/models/microsoft/MAI-DS-R1 --outfile /home/admin/models/microsoft/MAI-DS-R1/ggml-model-f16.gguf --outtype f16
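As a side note, convert_hf_to_gguf.py also accepts other --outtype values (e.g. bf16, q8_0). A sketch that produces a Q8_0 file directly, skipping the separate quantization step for that type:
python convert_hf_to_gguf.py /home/admin/models/microsoft/MAI-DS-R1 --outfile /home/admin/models/microsoft/MAI-DS-R1/ggml-model-Q8_0.gguf --outtype q8_0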
  3. Use the llama.cpp quantization program to quantize the model (llama-quantize needs to be compiled first; a build sketch follows the command). Other quantization options are available; convert to Q2_K first, which requires approximately an additional 227 GB of space.
llama-quantize /home/admin/models/microsoft/MAI-DS-R1/ggml-model-f16.gguf  /home/admin/models/microsoft/MAI-DS-R1/ggml-model-Q2_K.gguf Q2_K
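If llama-quantize has not been built yet, one common way to compile it from the llama.cpp sources looks like this (a minimal sketch; add CMake flags for your GPU backend as needed):
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release -j
The resulting binaries (llama-quantize, llama-cli, llama-gguf-split) land under build/bin by default.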
  4. Use llama-cli to test; a quick scripted smoke test is sketched after the command.
    -ngl, --gpu-layers, --n-gpu-layers N: number of layers to store in VRAM
    With -ngl 1, even GPUs with as little as 8 GB of memory can run it.
llama-cli -ngl 1 -m /home/admin/models/microsoft/MAI-DS-R1/ggml-model-Q2_K.gguf
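For a non-interactive smoke test, a prompt and a token limit can be added (a sketch using llama-cli's standard -p and -n flags; the prompt text is arbitrary):
llama-cli -ngl 1 -m /home/admin/models/microsoft/MAI-DS-R1/ggml-model-Q2_K.gguf -p "Hello, who are you?" -n 128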
  5. Q2_K has already been uploaded. Merge the downloaded split files:
llama-gguf-split --merge Q2_K-GGUF/MAI-DS-R1-Q2_K-00001-of-00059.gguf MAI-DS-R1-Q2_K.gguf
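To fetch only the Q2_K shards from this repository before merging, one option is huggingface-cli with an include pattern (a sketch; the pattern follows the Q2_K-GGUF/ layout used above):
huggingface-cli download huihui-ai/MAI-DS-R1-GGUF --include "Q2_K-GGUF/*" --local-dir ./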

Q3_K has already been uploaded.

llama-gguf-split --merge Q3_K-GGUF/MAI-DS-R1-Q3_K-00001-of-00088.gguf MAI-DS-R1-Q3_K.gguf

Q4_K_M has already been uploaded.

llama-gguf-split --merge Q4_K_M-GGUF/MAI-DS-R1-Q4_K_M-00001-of-00102.gguf MAI-DS-R1-Q4_K_M.gguf
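Note: recent llama.cpp builds can also load split GGUF files directly, so merging is optional; pointing -m at the first shard picks up the remaining parts automatically (assuming all shards sit in the same directory):
llama-cli -ngl 1 -m Q4_K_M-GGUF/MAI-DS-R1-Q4_K_M-00001-of-00102.gguf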

Donation

If you like it, please click 'like' and follow us for more updates.
You can follow x.com/support_huihui to get the latest model information from huihui.ai.

Your donation helps us continue development and improvement; even the price of a cup of coffee makes a difference.
  • bitcoin(BTC):
  bc1qqnkhuchxw0zqjh2ku3lu4hq45hc6gy84uk70ge
Model size: 671B params. Architecture: deepseek2.