keisawada committed (verified)
Commit 0cb1b32 · 1 parent: 2d61857

Update README.md

Files changed (1): README.md (+13 −1)

README.md CHANGED
@@ -26,6 +26,14 @@ By leveraging the advanced instruction-following capability derived from [rinna/
 demonstrating performance comparable to a reasoning model on Japanese MT-Bench—**without** requiring additional reasoning processes.
 It follows the Qwen2.5 chat format.
 
+| Model Type | Model Name
+| :- | :-
+| Japanese Continual Pre-Training Model | Qwen2.5 Bakeneko 32B [[HF]](https://huggingface.co/rinna/qwen2.5-bakeneko-32b)
+| Instruction-Tuning Model | Qwen2.5 Bakeneko 32B Instruct [[HF]](https://huggingface.co/rinna/qwen2.5-bakeneko-32b-instruct) [[AWQ]](https://huggingface.co/rinna/qwen2.5-bakeneko-32b-instruct-awq) [[GGUF]](https://huggingface.co/rinna/qwen2.5-bakeneko-32b-instruct-gguf) [[GPTQ int8]](https://huggingface.co/rinna/qwen2.5-bakeneko-32b-instruct-gptq-int8) [[GPTQ int4]](https://huggingface.co/rinna/qwen2.5-bakeneko-32b-instruct-gptq-int4)
+| DeepSeek R1 Distill Qwen2.5 Merged Reasoning Model | DeepSeek R1 Distill Qwen2.5 Bakeneko 32B [[HF]](https://huggingface.co/rinna/deepseek-r1-distill-qwen2.5-bakeneko-32b) [[AWQ]](https://huggingface.co/rinna/deepseek-r1-distill-qwen2.5-bakeneko-32b-awq) [[GGUF]](https://huggingface.co/rinna/deepseek-r1-distill-qwen2.5-bakeneko-32b-gguf) [[GPTQ int8]](https://huggingface.co/rinna/deepseek-r1-distill-qwen2.5-bakeneko-32b-gptq-int8) [[GPTQ int4]](https://huggingface.co/rinna/deepseek-r1-distill-qwen2.5-bakeneko-32b-gptq-int4)
+| QwQ Merged Reasoning Model | QwQ Bakeneko 32B [[HF]](https://huggingface.co/rinna/qwq-bakeneko-32b) [[AWQ]](https://huggingface.co/rinna/qwq-bakeneko-32b-awq) [[GGUF]](https://huggingface.co/rinna/qwq-bakeneko-32b-gguf) [[GPTQ int8]](https://huggingface.co/rinna/qwq-bakeneko-32b-gptq-int8) [[GPTQ int4]](https://huggingface.co/rinna/qwq-bakeneko-32b-gptq-int4)
+| QwQ Bakeneko Merged Instruction-Tuning Model | Qwen2.5 Bakeneko 32B Instruct V2 [[HF]](https://huggingface.co/rinna/qwen2.5-bakeneko-32b-instruct-v2) [[AWQ]](https://huggingface.co/rinna/qwen2.5-bakeneko-32b-instruct-v2-awq) [[GGUF]](https://huggingface.co/rinna/qwen2.5-bakeneko-32b-instruct-v2-gguf) [[GPTQ int8]](https://huggingface.co/rinna/qwen2.5-bakeneko-32b-instruct-v2-gptq-int8) [[GPTQ int4]](https://huggingface.co/rinna/qwen2.5-bakeneko-32b-instruct-v2-gptq-int4)
+
 * **Model architecture**
 
 A 64-layer, 5120-hidden-size transformer-based language model. For a comprehensive understanding of the architecture, please refer to the [Qwen2.5 Technical Report](https://arxiv.org/abs/2412.15115).
@@ -49,6 +57,10 @@ It follows the Qwen2.5 chat format.
 - [Toshiaki Wakatsuki](https://huggingface.co/t-w)
 - [Kei Sawada](https://huggingface.co/keisawada)
 
+* **Release date**
+
+  February 19, 2025
+
 ---
 
 # Benchmarking
@@ -65,7 +77,7 @@ It follows the Qwen2.5 chat format.
 | [Qwen/QwQ-32B](https://huggingface.co/Qwen/QwQ-32B) | 76.12 | 8.58 | 8.25
 | [rinna/qwq-bakeneko-32b](https://huggingface.co/rinna/qwq-bakeneko-32b) | 78.31 | 8.81 | 8.52
 
-For detailed benchmarking results, please refer to [rinna's LM benchmark page](https://rinnakk.github.io/research/benchmarks/lm/index.html).
+For detailed benchmarking results, please refer to [rinna's LM benchmark page (Sheet 20250319)](https://rinnakk.github.io/research/benchmarks/lm/index.html).
 
 ---
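The README states that the model follows the Qwen2.5 chat format, which is a ChatML-style layout wrapping each turn in `<|im_start|>`/`<|im_end|>` markers. A minimal sketch of that layout is below; in practice you would let the model's own tokenizer render it via `tokenizer.apply_chat_template(...)` from the `transformers` library, and the helper name `build_qwen25_prompt` here is illustrative, not part of any API:

```python
def build_qwen25_prompt(messages):
    """Render a list of {"role", "content"} dicts in the ChatML-style
    layout used by Qwen2.5. Prefer tokenizer.apply_chat_template in
    real code; this sketch only shows the wire format."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    # Open an assistant turn so generation continues from here
    # (the equivalent of add_generation_prompt=True).
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = build_qwen25_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "西田幾多郎とはどんな人物ですか？"},
])
print(prompt)
```

The same prompt string is what `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` is expected to produce for a Qwen2.5-format tokenizer, modulo any default system message the template may insert.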