shimmyshimmer committed · Commit 0743e79 · verified · 1 Parent(s): 1fcabdc

Update README.md

Files changed (1)
  1. README.md +42 -4
README.md CHANGED
@@ -9,9 +9,13 @@ tags:
  - chat
  - qwen
  ---
+ > [!NOTE]
+ > To fix endless generations and for instructions on how to run QwQ-32B, view our [Tutorial here](https://docs.unsloth.ai/basics/tutorial-how-to-run-qwq-32b-effectively).
+ >
+
  <div>
  <p style="margin-bottom: 0; margin-top: 0;">
- <strong>This is Qwen-QwQ-32B with our bug fixes. <br> See <a href="https://huggingface.co/collections/unsloth/qwen-qwq-32b-collection-676b3b29c20c09a8c71a6235">our collection</a> for versions of QwQ-32B with our bug fixes including GGUF & 4-bit formats.</strong>
+ <strong>Qwen-QwQ-32B with our bug fixes. <br> See <a href="https://huggingface.co/collections/unsloth/qwen-qwq-32b-collection-676b3b29c20c09a8c71a6235">our collection</a> for versions of QwQ-32B with our bug fixes including GGUF & 4-bit formats.</strong>
  </p>
  <p style="margin-bottom: 0;">
  <em>Unsloth's QwQ-32B <a href="https://unsloth.ai/blog/dynamic-4bit">Dynamic Quants</a> is selectively quantized, greatly improving accuracy over standard 4-bit.</em>
@@ -23,17 +27,51 @@ tags:
  <a href="https://discord.gg/unsloth">
  <img src="https://github.com/unslothai/unsloth/raw/main/images/Discord%20button.png" width="173">
  </a>
- <a href="https://docs.unsloth.ai/">
+ <a href="https://docs.unsloth.ai/basics/tutorial-how-to-run-qwq-32b-effectively">
  <img src="https://raw.githubusercontent.com/unslothai/unsloth/refs/heads/main/images/documentation%20green%20button.png" width="143">
  </a>
  </div>
  <h1 style="margin-top: 0rem;">Finetune your own Reasoning model like R1 with Unsloth!</h2>
  </div>
 
- We have a free Google Colab notebook for turning Qwen2.5 (3B) into a reasoning model: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Qwen2.5_(3B)-GRPO.ipynb
+ To run this model, try:
+ ```python
+ import os
+ os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"
+ from huggingface_hub import snapshot_download
+ snapshot_download(
+ repo_id = "unsloth/QwQ-32B-GGUF",
+ local_dir = "unsloth-QwQ-32B-GGUF",
+ allow_patterns = ["*Q4_K_M*"], # For Q4_K_M
+ )
+ ```
+ ```bash
+ ./llama.cpp/llama-cli \
+ --model unsloth-QwQ-32B-GGUF/QwQ-32B-Q4_K_M.gguf \
+ --threads 32 \
+ --ctx-size 16384 \
+ --n-gpu-layers 99 \
+ --seed 3407 \
+ --prio 2 \
+ --temp 0.6 \
+ --repeat-penalty 1.1 \
+ --dry-multiplier 0.5 \
+ --min-p 0.1 \
+ --top-k 40 \
+ --top-p 0.95 \
+ -no-cnv \
+ --samplers "top_k;top_p;min_p;temperature;dry;typ_p;xtc" \
+ --prompt "<|im_start|>user\nCreate a Flappy Bird game in Python."
+ ```
+ See https://docs.unsloth.ai/basics/tutorial-how-to-run-qwq-32b-without-bugs for more details!
 
+ > [!NOTE]
+ > To stop infinite generations - add `--samplers "top_k;top_p;min_p;temperature;dry;typ_p;xtc"`
+ >
 
- ## ✨ Finetune for Free
+ # ✨ Finetune for Free
+
+ We have a free Google Colab notebook for turning Qwen2.5 (3B) into a reasoning model: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Qwen2.5_(3B)-GRPO.ipynb
 
  All notebooks are **beginner friendly**! Add your dataset, click "Run All", and you'll get a 2x faster finetuned model which can be exported to GGUF, vLLM or uploaded to Hugging Face.
 
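
The `llama-cli` invocation added to the README in this commit can also be assembled programmatically, which makes it harder to drop the `--samplers` order that the second note relies on to stop infinite generations. The sketch below is an illustration only and not part of the commit: `build_llama_cli_args` and `SAMPLER_ORDER` are hypothetical names, every flag value is copied from the README lines in the diff, and the binary and model paths assume the same directory layout the README uses.

```python
import shlex

# Sampler order copied from the README note on stopping infinite generations.
SAMPLER_ORDER = "top_k;top_p;min_p;temperature;dry;typ_p;xtc"


def build_llama_cli_args(
    prompt: str,
    model_path: str = "unsloth-QwQ-32B-GGUF/QwQ-32B-Q4_K_M.gguf",
    binary: str = "./llama.cpp/llama-cli",
) -> list[str]:
    """Return the argv list for the README's llama-cli call (hypothetical helper)."""
    return [
        binary,
        "--model", model_path,
        "--threads", "32",
        "--ctx-size", "16384",
        "--n-gpu-layers", "99",
        "--seed", "3407",
        "--prio", "2",
        "--temp", "0.6",
        "--repeat-penalty", "1.1",
        "--dry-multiplier", "0.5",
        "--min-p", "0.1",
        "--top-k", "40",
        "--top-p", "0.95",
        "-no-cnv",
        "--samplers", SAMPLER_ORDER,
        "--prompt", prompt,
    ]


if __name__ == "__main__":
    # Raw string keeps the literal \n exactly as it is written in the README command.
    args = build_llama_cli_args(r"<|im_start|>user\nCreate a Flappy Bird game in Python.")
    # shlex.join quotes each argument so the printed line can be pasted into a shell.
    print(shlex.join(args))
```

Passing the returned list to `subprocess.run(args, check=True)` would launch the binary without going through a shell, so the prompt needs no extra quoting.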