ThomasBaruzier/DeepScaleR-1.5B-Preview-GGUF

GGUF

conversational


ThomasBaruzier committed on Feb 27

Commit c194c4a (verified) · 1 Parent(s): 2655aa0

Create README.md

Files changed (1)

README.md +132 -0

README.md ADDED

	@@ -0,0 +1,132 @@

+---
+license: mit
+train: false
+inference: true
+pipeline_tag: text-generation
+base_model:
+- deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
+---
+<br><img src="https://cdn-uploads.huggingface.co/production/uploads/646410e04bf9122922289dc7/FHc3IG1KAJn6N3s1TJLrS.webp" width="720"><br>
+# Llama.cpp imatrix quantizations of [mobiuslabsgmbh/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1](https://huggingface.co/mobiuslabsgmbh/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1)
+Using llama.cpp commit [3ad5451](https://github.com/ggerganov/llama.cpp/commit/3ad5451) for quantization.
+All quants were made using the imatrix option and Bartowski's [calibration file](https://gist.github.com/bartowski1182/eb213dccb3571f863da82e99418f81e8).
+<hr>
+# Perplexity table (lower is better)
+| Quant                                                                                                                                                  | Size (MB) | PPL     | Size (%) | Accuracy (%) | PPL error rate |
+| ------------------------------------------------------------------------------------------------------------------------------------------------------ | --------- | ------- | -------- | ------------ | -------------- |
+| [IQ1_S](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-IQ1_S.gguf)     | 489       | 88.4250 | 14.40    | 23.35        | 1.76           |
+| [IQ1_M](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-IQ1_M.gguf)     | 516       | 53.8278 | 15.19    | 38.35        | 1.03           |
+| [IQ2_XXS](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-IQ2_XXS.gguf) | 560       | 45.5693 | 16.49    | 45.31        | 0.93           |
+| [IQ2_XS](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-IQ2_XS.gguf)   | 598       | 32.6813 | 17.61    | 63.17        | 0.62           |
+| [IQ2_S](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-IQ2_S.gguf)     | 633       | 28.5477 | 18.64    | 72.32        | 0.54           |
+| [IQ2_M](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-IQ2_M.gguf)     | 669       | 31.8272 | 19.70    | 64.87        | 0.63           |
+| [Q2_K_S](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-Q2_K_S.gguf)   | 683       | 28.7707 | 20.11    | 71.76        | 0.54           |
+| [Q2_K](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-Q2_K.gguf)       | 718       | 27.6342 | 21.14    | 74.71        | 0.51           |
+| [IQ3_XXS](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-IQ3_XXS.gguf) | 733       | 23.5511 | 21.58    | 87.66        | 0.44           |
+| [IQ3_XS](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-IQ3_XS.gguf)   | 793       | 22.9887 | 23.35    | 89.81        | 0.42           |
+| [Q3_K_S](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-Q3_K_S.gguf)   | 821       | 28.0462 | 24.17    | 73.61        | 0.53           |
+| [IQ3_S](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-IQ3_S.gguf)     | 822       | 22.9268 | 24.20    | 90.05        | 0.42           |
+| [IQ3_M](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-IQ3_M.gguf)     | 836       | 22.3167 | 24.62    | 92.51        | 0.41           |
+| [Q3_K_M](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-Q3_K_M.gguf)   | 881       | 22.5727 | 25.94    | 91.46        | 0.41           |
+| [Q3_K_L](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-Q3_K_L.gguf)   | 935       | 22.3758 | 27.53    | 92.27        | 0.41           |
+| [IQ4_XS](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-IQ4_XS.gguf)   | 972       | 21.3273 | 28.62    | 96.80        | 0.38           |
+| [IQ4_NL](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-IQ4_NL.gguf)   | 1018      | 21.3234 | 29.98    | 96.82        | 0.38           |
+| [Q4_0](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-Q4_0.gguf)       | 1019      | 22.5210 | 30.00    | 91.67        | 0.41           |
+| [Q4_K_S](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-Q4_K_S.gguf)   | 1022      | 21.1717 | 30.09    | 97.51        | 0.38           |
+| [Q4_K_M](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-Q4_K_M.gguf)   | 1065      | 21.0532 | 31.36    | 98.06        | 0.38           |
+| [Q4_1](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-Q4_1.gguf)       | 1109      | 21.1492 | 32.66    | 97.62        | 0.38           |
+| [Q5_K_S](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-Q5_K_S.gguf)   | 1201      | 20.7883 | 35.37    | 99.31        | 0.37           |
+| [Q5_0](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-Q5_0.gguf)       | 1203      | 20.8643 | 35.42    | 98.95        | 0.37           |
+| [Q5_K_M](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-Q5_K_M.gguf)   | 1226      | 20.7488 | 36.10    | 99.50        | 0.37           |
+| [Q5_1](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-Q5_1.gguf)       | 1293      | 20.7773 | 38.07    | 99.37        | 0.37           |
+| [Q6_K](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-Q6_K.gguf)       | 1396      | 20.6994 | 41.11    | 99.74        | 0.37           |
+| [Q8_0](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-Q8_0.gguf)       | 1807      | 20.6659 | 53.21    | 99.90        | 0.37           |
+| [F16](https://huggingface.co/ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF/blob/main/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-F16.gguf)         | 3396      | 20.6457 | 100      | 100          | 0.37           |
+<hr>
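The `Size (%)` and `Accuracy (%)` columns in the table above appear to be derived from the F16 row: file size relative to the F16 file, and the F16 perplexity divided by each quant's perplexity. The card does not state this; it is inferred from the numbers. The sketch below uses a hypothetical helper, `relative_metrics`, and checks that reading against two rows copied from the table. Other rows can differ in the last digit, since the published PPL values are themselves rounded.

```python
# Reference values copied from the F16 row of the perplexity table.
F16_SIZE_MB = 3396
F16_PPL = 20.6457


def relative_metrics(size_mb, ppl):
    """Return (size %, accuracy %) relative to F16, rounded to 2 decimals.

    Assumption: "accuracy" is F16 perplexity / quant perplexity, which
    matches the published rows checked below.
    """
    size_pct = round(100 * size_mb / F16_SIZE_MB, 2)
    accuracy_pct = round(100 * F16_PPL / ppl, 2)
    return size_pct, accuracy_pct


print(relative_metrics(489, 88.4250))   # IQ1_S row
print(relative_metrics(1807, 20.6659))  # Q8_0 row
```

Read this way, `Accuracy (%)` is a perplexity ratio rather than a task accuracy, so it is best used to compare quants of this model with each other.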
+This is a version of the <a href="https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B">DeepSeek-R1-Distill-Qwen-1.5B</a> model re-distilled for better performance.
+## Performance
+| Models            | <a href="https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B">DeepSeek-R1-Distill-Qwen-1.5B</a> | <a href="https://huggingface.co/mobiuslabsgmbh/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1">DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1</a> |
+|:-------------------:|:--------:|:----------------:|
+| ARC (25-shot)      | 40.96 | <b>41.55</b>  |
+| HellaSwag (10-shot)| 44    | <b>45.88</b> |
+| MMLU (5-shot)      | 39.27 | <b>41.82</b> |
+| TruthfulQA-MC2     | 45.17 | <b>46.63</b> |
+| Winogrande (5-shot)| 55.49 | <b>57.7</b> |
+| GSM8K (5-shot)     | 69.9  | <b>74.3</b> |
+| Average            | 49.13 | <b>51.31</b> |
+
+| Models            | <a href="https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B">DeepSeek-R1-Distill-Qwen-1.5B</a> | <a href="https://huggingface.co/mobiuslabsgmbh/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1">DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1</a>  |
+|:-------------------:|:--------:|:----------------:|
+| GPQA (0-shot)     | 26.96 | <b>26.99</b>  |
+| MMLU-Pro (5-shot) | 16.74 | <b>19.86</b> |
+| MuSR (0-shot)     | 35.93 | <b>36.6</b> |
+| BBH (3-shot)      | 35.12 | <b>37.23</b> |
+| IFEval (0-shot)   | 24.94 | <b>27.22</b> |
+## Usage
+```python
+import torch
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+compute_dtype = torch.bfloat16
+device = "cuda"
+model_id = "mobiuslabsgmbh/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1"
+
+# Load the model in bfloat16 on the GPU, using PyTorch's scaled-dot-product attention.
+model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=compute_dtype, attn_implementation="sdpa", device_map=device)
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+
+# Wrap the prompt in the model's chat template and tokenize it.
+prompt = "What is 1.5+102.2?"
+chat = tokenizer.apply_chat_template([{"role": "user", "content": prompt}], tokenize=True, add_generation_prompt=True, return_tensors="pt")
+
+# Sampling is enabled, so the generated text can differ between runs.
+outputs = model.generate(chat.to(device), max_new_tokens=1024, do_sample=True)
+print(tokenizer.decode(outputs[0]))
+```
+Example output (sampling is enabled, so results vary between runs):
+```
+<｜begin▁of▁sentence｜><｜User｜>What is 1.5+102.2?<｜Assistant｜><think>
+First, I identify the numbers involved in the addition: 1.5 and 102.2.
+Next, I add the whole numbers: 1 + 102 equals 103.
+Then, I add the decimal parts: 0.5 + 0.2 equals 0.7.
+Finally, I combine the results: 103 + 0.7 equals 103.7.
+</think>
+To solve the addition \(1.5 + 102.2\), follow these steps:
+1. **Add the whole numbers:**
+   \[
+   1 + 102 = 103
+   \]
+2. **Add the decimal parts:**
+   \[
+   0.5 + 0.2 = 0.7
+   \]
+3. **Combine the results:**
+   \[
+   103 + 0.7 = 103.7
+   \]
+So, the final answer is \(\boxed{103.7}\).<｜end▁of▁sentence｜>
+```
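The usage snippet above loads the original model through `transformers`, whereas the files linked in the perplexity table are GGUF quants. The following is a minimal sketch of one way to pick a quant under a file-size budget and run it, assuming the `llama-cpp-python` package (with `huggingface_hub`) is installed. The helper names `pick_quant` and `run_gguf` are hypothetical. The repo id and file naming are taken from the table links, and only a few table rows are copied in.

```python
# A few (name, size in MB, perplexity) rows copied from the perplexity table.
QUANTS = [
    ("IQ3_M", 836, 22.3167),
    ("Q3_K_M", 881, 22.5727),
    ("Q4_K_M", 1065, 21.0532),
    ("Q5_K_M", 1226, 20.7488),
    ("Q8_0", 1807, 20.6659),
]

# Repo id as it appears in the table links (assumed to host the files).
REPO_ID = "ThomasBaruzier/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-GGUF"


def pick_quant(budget_mb):
    """Return the lowest-perplexity quant whose file fits the budget, or None."""
    fitting = [q for q in QUANTS if q[1] <= budget_mb]
    if not fitting:
        return None
    return min(fitting, key=lambda q: q[2])[0]


def run_gguf(quant, prompt):
    """Download the chosen GGUF file and generate a reply (needs llama-cpp-python)."""
    from llama_cpp import Llama  # imported lazily so pick_quant works without it

    llm = Llama.from_pretrained(
        repo_id=REPO_ID,
        filename=f"DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1-{quant}.gguf",
        n_ctx=4096,
    )
    reply = llm.create_chat_completion(
        messages=[{"role": "user", "content": prompt}],
        max_tokens=1024,
    )
    return reply["choices"][0]["message"]["content"]


print(pick_quant(1100))
```

Calling `run_gguf(pick_quant(1100), "What is 1.5+102.2?")` would then download and query the selected file. The budget here is the file size only; actual memory use also depends on context length and runtime overhead.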