task,metric,value,err,version
anli_r1,acc,0.328,0.014853842487270334,0
anli_r2,acc,0.316,0.01470919305605713,0
anli_r3,acc,0.3591666666666667,0.013855141559780364,0
arc_challenge,acc,0.3054607508532423,0.0134600804780025,0
arc_challenge,acc_norm,0.3319112627986348,0.013760988200880538,0
arc_easy,acc,0.6422558922558923,0.009835772757343361,0
arc_easy,acc_norm,0.6035353535353535,0.010037412763064529,0
boolq,acc,0.6477064220183486,0.00835476049390613,1
cb,acc,0.26785714285714285,0.05971290310957635,1
cb,f1,0.21294539321104786,,1
copa,acc,0.79,0.040936018074033256,0
hellaswag,acc,0.5201155148376817,0.004985741706385719,0
hellaswag,acc_norm,0.6825333598884684,0.004645393477680675,0
piqa,acc,0.764961915125136,0.009893146688805326,0
piqa,acc_norm,0.7725788900979326,0.009779850767847232,0
rte,acc,0.4657039711191336,0.030025579819366426,0
sciq,acc,0.905,0.009276910103103329,0
sciq,acc_norm,0.88,0.0102813280127474,0
storycloze_2016,acc,0.7365045430251203,0.010187168219156485,0
winogrande,acc,0.5872138910812944,0.013837060648682103,0