huihui-ai/MAI-DS-R1-GGUF
This model was converted from microsoft/MAI-DS-R1 to GGUF format, so even GPUs with as little as 8 GB of memory can try it.
GGUF: Q2_K, Q3_K, Q4_K_M, and Q8_0 are all supported.
Converting from BF16 to f16.gguf
- Download the microsoft/MAI-DS-R1 model; this requires approximately 1.21 TB of space (a quick disk-space check is sketched after the download command below).
cd /home/admin/models
huggingface-cli download microsoft/MAI-DS-R1 --local-dir ./microsoft/MAI-DS-R1
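Since the checkpoint is so large, it can help to confirm free space before downloading and to verify the size afterwards; a minimal sketch using standard tools (paths taken from the steps above):
df -h /home/admin/models                        # free space; needs roughly 1.21 TB
du -sh /home/admin/models/microsoft/MAI-DS-R1   # verify the download size afterwards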
- Use the llama.cpp conversion script to convert MAI-DS-R1 to GGUF format; this requires an additional approximately 1.22 TB of space (a sketch for obtaining the script and its dependencies follows the command below).
python convert_hf_to_gguf.py /home/admin/models/microsoft/MAI-DS-R1 --outfile /home/admin/models/microsoft/MAI-DS-R1/ggml-model-f16.gguf --outtype f16
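The convert_hf_to_gguf.py script ships with llama.cpp; if no checkout exists yet, a minimal sketch for fetching the repository and installing the converter's Python dependencies (standard llama.cpp layout assumed):
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
pip install -r requirements.txt   # pulls in gguf and the other packages the converter needs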
- Use the llama.cpp quantization tool to quantize the model (llama-quantize must be compiled first; a build sketch follows the command below). Other quant options are also available; convert to Q2_K first, which requires an additional approximately 227 GB of space.
llama-quantize /home/admin/models/microsoft/MAI-DS-R1/ggml-model-f16.gguf /home/admin/models/microsoft/MAI-DS-R1/ggml-model-Q2_K.gguf Q2_K
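If llama-quantize is not built yet, the standard llama.cpp CMake build produces it alongside the other tools, and the same command pattern covers the remaining quant types; a sketch (default build options, output paths assumed):
cmake -B build                            # run inside the llama.cpp checkout
cmake --build build --config Release -j   # builds llama-quantize, llama-cli, llama-gguf-split, ...
llama-quantize /home/admin/models/microsoft/MAI-DS-R1/ggml-model-f16.gguf /home/admin/models/microsoft/MAI-DS-R1/ggml-model-Q4_K_M.gguf Q4_K_M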
- Use llama-cli to test.
-ngl, --gpu-layers, --n-gpu-layers N: number of layers to store in VRAM
With -ngl 1, even a GPU with only 8 GB of memory can run the model; a fuller test invocation is sketched after the command below.
llama-cli -ngl 1 -m /home/admin/models/microsoft/MAI-DS-R1/ggml-model-Q2_K.gguf
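For a quick non-interactive check, a prompt can be passed directly on the command line; a sketch using standard llama-cli flags (prompt and context size are example values):
llama-cli -ngl 1 -c 4096 -m /home/admin/models/microsoft/MAI-DS-R1/ggml-model-Q2_K.gguf -p "Why is the sky blue?"
The llama-server binary from the same build can instead expose the model over an HTTP API, e.g. llama-server -m <model.gguf> -ngl 1 --port 8080 (the port is an example value).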
- Q2_K has already been uploaded (a sketch for downloading the split files follows this list).
llama-gguf-split --merge Q2_K-GGUF/MAI-DS-R1-Q2_K-00001-of-00059.gguf MAI-DS-R1-Q2_K.gguf
- Q3_K has already been uploaded.
llama-gguf-split --merge Q3_K-GGUF/MAI-DS-R1-Q3_K-00001-of-00088.gguf MAI-DS-R1-Q3_K.gguf
- Q4_K_M has already been uploaded.
llama-gguf-split --merge Q4_K_M-GGUF/MAI-DS-R1-Q4_K_M-00001-of-00102.gguf MAI-DS-R1-Q4_K_M.gguf
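To fetch only one quant's split files from this repo before merging, huggingface-cli accepts an include pattern; a sketch (the local directory name is an example):
huggingface-cli download huihui-ai/MAI-DS-R1-GGUF --include "Q2_K-GGUF/*" --local-dir ./MAI-DS-R1-GGUF
llama-gguf-split --merge ./MAI-DS-R1-GGUF/Q2_K-GGUF/MAI-DS-R1-Q2_K-00001-of-00059.gguf MAI-DS-R1-Q2_K.gguf
Note that recent llama.cpp builds can also load split GGUF files directly when -m points at the first shard, so merging is optional there.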
Donation
If you like it, please click 'like' and follow us for more updates.
You can follow x.com/support_huihui to get the latest model information from huihui.ai.
Your donation helps us continue further development and improvement; even the price of a cup of coffee makes a difference.
- bitcoin(BTC):
bc1qqnkhuchxw0zqjh2ku3lu4hq45hc6gy84uk70ge