Add Model Evals
README.md
ADDED
@@ -0,0 +1,18 @@
Wandb Run: https://wandb.ai/eleutherai/pythia-rlhf/runs/kj29wswk
Eval Results:
|     Task     |Version|Filter|  Metric  |Value |   |Stderr|
|--------------|-------|------|----------|-----:|---|-----:|
|arc_challenge |Yaml   |none  |acc       |0.2995|±  |0.0134|
|              |       |none  |acc_norm  |0.3251|±  |0.0137|
|arc_easy      |Yaml   |none  |acc       |0.6486|±  |0.0098|
|              |       |none  |acc_norm  |0.5673|±  |0.0102|
|lambada_openai|Yaml   |none  |perplexity|4.7801|±  |0.1197|
|              |       |none  |acc       |0.6412|±  |0.0067|
|logiqa        |Yaml   |none  |acc       |0.2120|±  |0.0160|
|              |       |none  |acc_norm  |0.2873|±  |0.0177|
|piqa          |Yaml   |none  |acc       |0.7524|±  |0.0101|
|              |       |none  |acc_norm  |0.7530|±  |0.0101|
|sciq          |Yaml   |none  |acc       |0.8820|±  |0.0102|
|              |       |none  |acc_norm  |0.8160|±  |0.0123|
|winogrande    |Yaml   |none  |acc       |0.6077|±  |0.0137|
|wsc           |Yaml   |none  |acc       |0.3654|±  |0.0474|
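
The Stderr column gives a standard error for each metric. As a rough reading aid, the sketch below turns a value and its standard error into an approximate 95% interval. It assumes a normal approximation (value ± 1.96 × stderr), which is an assumption made here and not something the results state. The helper name `approx_ci95` is hypothetical, and the two sample rows are copied from the results for illustration.

```python
from typing import Dict, Tuple

# Assumption: a normal approximation, so a 95% interval is value +/- 1.96 * stderr.
Z_95 = 1.96


def approx_ci95(value: float, stderr: float) -> Tuple[float, float]:
    """Hypothetical helper: return the (low, high) bounds of an approximate 95% interval."""
    half_width = Z_95 * stderr
    return (value - half_width, value + half_width)


# Two rows copied from the results for illustration: (value, stderr).
rows: Dict[str, Tuple[float, float]] = {
    "arc_challenge/acc": (0.2995, 0.0134),
    "wsc/acc": (0.3654, 0.0474),
}

for name, (value, stderr) in rows.items():
    low, high = approx_ci95(value, stderr)
    print(f"{name}: {value:.4f} (approx. 95% interval {low:.4f} to {high:.4f})")
```

A larger standard error, as for wsc, gives a wider interval, so small differences on that task are harder to distinguish from noise.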