Update README.md
README.md
CHANGED
@@ -126,6 +126,22 @@ model-index:
     source:
       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=Pinkstack%2FSuperThoughts-CoT-14B-16k-o1-QwQ
       name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: Llmexplorer lmsys elo
+      type: elo-score
+      config: main
+      split: test
+    metrics:
+    - type: elo
+      value: 1203
+      name: elo
+    source:
+      url: https://llm.extractum.io/list/?benchmark=score_elo
+      name: LLMexplorer lmsys elo score
+
 ---
 Renamed to parm-2
 Please note, the low IFEVAL results is due to this model always reasoning, instruction following is limited, which caused it to have very low ifeval results, this should not matter for most use cases.
@@ -179,6 +195,8 @@ Summarized results can be found [here](https://huggingface.co/datasets/open-llm-
 |GPQA (0-shot) | 19.02|
 |MuSR (0-shot) | 21.79|
 |MMLU-PRO (5-shot) | 47.43|
+# other leaderboard
+According to https://llm.extractum.io/list/?benchmark=score_elo, this model is in the top 20 on their LMSys ELO score leaderboard.
 
 # 🧀 Examples:
 (q4_k_m, 10GB rtx 3080, 64GB memory, running inside of MSTY, all use "You are a friendly ai assistant." as the System prompt.)