Triangle104 committed on
Commit 1ec6254 · verified · 1 Parent(s): bde437b

Update README.md

Files changed (1)
  1. README.md +22 -0
README.md CHANGED
@@ -16,6 +16,28 @@ tags:
  This model was converted to GGUF format from [`Qwen/QwQ-32B`](https://huggingface.co/Qwen/QwQ-32B) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
  Refer to the [original model card](https://huggingface.co/Qwen/QwQ-32B) for more details on the model.
 
+ ---
+ QwQ is the reasoning model of the Qwen series. Compared with conventional instruction-tuned models, QwQ is capable of thinking and reasoning, which significantly improves performance on downstream tasks, especially hard problems. QwQ-32B is the medium-sized reasoning model in the series and is competitive with state-of-the-art reasoning models such as DeepSeek-R1 and o1-mini.
+
+ This repo contains the QwQ 32B model, which has the following features:
+
+ - Type: Causal Language Model
+
+ - Training Stage: Pretraining & Post-training (Supervised Finetuning and Reinforcement Learning)
+
+ - Architecture: Transformer with RoPE, SwiGLU, RMSNorm, and attention QKV bias
+
+ - Number of Parameters: 32.5B
+
+ - Number of Parameters (Non-Embedding): 31.0B
+
+ - Number of Layers: 64
+
+ - Number of Attention Heads (GQA): 40 for Q and 8 for KV
+
+ - Context Length: Full 131,072 tokens
+
+ ---
  ## Use with llama.cpp
  Install llama.cpp through brew (works on Mac and Linux)
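
The feature list added in this commit (64 layers, 40 query heads, 8 KV heads, 131,072-token context) largely determines how much memory the KV cache needs at full context. The sketch below estimates that size and compares grouped-query attention (GQA) with a hypothetical layout in which every query head had its own KV head. The head dimension of 128 and the 16-bit cache element size are assumptions; neither is stated in the README, and a llama.cpp run with a quantized KV cache or a shorter context would need less.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_tokens, bytes_per_elem=2):
    """Bytes needed to cache keys and values for n_tokens across all layers."""
    # Factor of 2: each layer stores one key tensor and one value tensor.
    return 2 * n_layers * n_kv_heads * head_dim * n_tokens * bytes_per_elem


# Values taken from the README feature list.
N_LAYERS = 64
N_Q_HEADS = 40
N_KV_HEADS = 8
CONTEXT = 131_072

# ASSUMPTION: per-head dimension is not given in the README.
HEAD_DIM = 128

GIB = 1024 ** 3

# GQA: only the 8 KV heads are cached.
gqa = kv_cache_bytes(N_LAYERS, N_KV_HEADS, HEAD_DIM, CONTEXT)
# Hypothetical comparison: one KV head per query head (no grouping).
mha = kv_cache_bytes(N_LAYERS, N_Q_HEADS, HEAD_DIM, CONTEXT)

print(f"GQA (8 KV heads):  {gqa / GIB:.1f} GiB")
print(f"No grouping (40):  {mha / GIB:.1f} GiB")
```

Under these assumptions the cache is 5x smaller with GQA than without grouping, since the cache size scales with the number of KV heads rather than the number of query heads.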