Pinkstack committed on
Commit eb41caa · verified · 1 Parent(s): 6b96a41

Update README.md

Files changed (1)
  1. README.md +18 -0
README.md CHANGED
@@ -126,6 +126,22 @@ model-index:
126
  source:
127
  url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=Pinkstack%2FSuperThoughts-CoT-14B-16k-o1-QwQ
128
  name: Open LLM Leaderboard
129
  ---
130
  Renamed to parm-2
131
  Please note, the low IFEVAL results is due to this model always reasoning, instruction following is limited, which caused it to have very low ifeval results, this should not matter for most use cases.
@@ -179,6 +195,8 @@ Summarized results can be found [here](https://huggingface.co/datasets/open-llm-
179
  |GPQA (0-shot) | 19.02|
180
  |MuSR (0-shot) | 21.79|
181
  |MMLU-PRO (5-shot) | 47.43|
182
 
183
  # 🧀 Examples:
184
  (q4_k_m, 10GB rtx 3080, 64GB memory, running inside of MSTY, all use "You are a friendly ai assistant." as the System prompt.)
 
126
  source:
127
  url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=Pinkstack%2FSuperThoughts-CoT-14B-16k-o1-QwQ
128
  name: Open LLM Leaderboard
129
+ - task:
130
+ type: text-generation
131
+ name: Text Generation
132
+ dataset:
133
+ name: Llmexplorer lmsys elo
134
+ type: elo-score
135
+ config: main
136
+ split: test
137
+ metrics:
138
+ - type: elo
139
+ value: 1203
140
+ name: elo
141
+ source:
142
+ url: https://llm.extractum.io/list/?benchmark=score_elo
143
+ name: LLMexplorer lmsys elo score
144
+
145
  ---
146
  Renamed to parm-2
147
  Please note, the low IFEVAL results is due to this model always reasoning, instruction following is limited, which caused it to have very low ifeval results, this should not matter for most use cases.
 
195
  |GPQA (0-shot) | 19.02|
196
  |MuSR (0-shot) | 21.79|
197
  |MMLU-PRO (5-shot) | 47.43|
198
+ # other leaderboard
199
+ According to https://llm.extractum.io/list/?benchmark=score_elo, this model is in the top 20 on their LMSys ELO score leaderboard.
200
 
201
  # 🧀 Examples:
202
  (q4_k_m, 10GB rtx 3080, 64GB memory, running inside of MSTY, all use "You are a friendly ai assistant." as the System prompt.)
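
The diff view above flattens the indentation of the YAML front matter, so the nesting of the added result entry is not visible. A sketch of how the new entry would likely be indented, assuming it follows the usual Hugging Face `model-index` layout (one item in the `results` list, with `task`, `dataset`, `metrics`, and `source` as sibling keys). The keys and values are taken from the added lines; the indentation and the comments are assumptions, not part of the commit:

```yaml
# Assumed position: one item of the `results` list under `model-index`.
- task:
    type: text-generation
    name: Text Generation
  dataset:
    name: Llmexplorer lmsys elo
    type: elo-score
    config: main
    split: test
  metrics:
  - type: elo
    value: 1203
    name: elo
  source:
    url: https://llm.extractum.io/list/?benchmark=score_elo
    name: LLMexplorer lmsys elo score
```

If the entry is indented differently in the actual file, the YAML may still parse but the Hub may not render the score in the model card's evaluation widget.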