Wandb Run: https://wandb.ai/eleutherai/pythia-rlhf/runs/kj29wswk
Eval Results:
| Task | Version | Filter | Metric | Value | | Stderr |
|---|---|---|---|---|---|---|
| arc_challenge | Yaml | none | acc | 0.2995 | ± | 0.0134 |
| | | none | acc_norm | 0.3251 | ± | 0.0137 |
| arc_easy | Yaml | none | acc | 0.6486 | ± | 0.0098 |
| | | none | acc_norm | 0.5673 | ± | 0.0102 |
| lambada_openai | Yaml | none | perplexity | 4.7801 | ± | 0.1197 |
| | | none | acc | 0.6412 | ± | 0.0067 |
| logiqa | Yaml | none | acc | 0.2120 | ± | 0.0160 |
| | | none | acc_norm | 0.2873 | ± | 0.0177 |
| piqa | Yaml | none | acc | 0.7524 | ± | 0.0101 |
| | | none | acc_norm | 0.7530 | ± | 0.0101 |
| sciq | Yaml | none | acc | 0.8820 | ± | 0.0102 |
| | | none | acc_norm | 0.8160 | ± | 0.0123 |
| winogrande | Yaml | none | acc | 0.6077 | ± | 0.0137 |
| wsc | Yaml | none | acc | 0.3654 | ± | 0.0474 |
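
To make the Stderr column easier to read, here is a minimal sketch that turns a few of the reported values into approximate 95% intervals. It assumes a normal approximation (value ± 1.96 × stderr), which the table itself does not state. The `results` dictionary and the `confidence_interval` helper are illustrative names, not part of the evaluation harness.

```python
# Values copied from the table above: (task, metric) -> (value, stderr).
results = {
    ("arc_challenge", "acc"): (0.2995, 0.0134),
    ("winogrande", "acc"): (0.6077, 0.0137),
    ("wsc", "acc"): (0.3654, 0.0474),
}

Z_95 = 1.96  # assumption: normal approximation, two-sided 95% interval


def confidence_interval(value, stderr, z=Z_95):
    """Return (low, high) for value ± z * stderr."""
    return value - z * stderr, value + z * stderr


for (task, metric), (value, stderr) in results.items():
    low, high = confidence_interval(value, stderr)
    print(f"{task:14s} {metric}: {value:.4f}  [{low:.4f}, {high:.4f}]")
```

Under this assumption, tasks with a larger standard error, such as wsc, have much wider intervals, so small differences on them should be read with caution.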
