File size: 12,591 Bytes
afa61ba
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
common_init_from_params: setting dry_penalty_last_n to ctx_size = 768
common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)

system_info: n_threads = 6 (n_threads_batch = 6) / 12 | Metal : EMBED_LIBRARY = 1 | CPU : NEON = 1 | ARM_FMA = 1 | FP16_VA = 1 | DOTPROD = 1 | LLAMAFILE = 1 | ACCELERATE = 1 | AARCH64_REPACK = 1 |
multiple_choice_score: there are 817 tasks in prompt
multiple_choice_score: selecting 750 random tasks from 817 tasks available
multiple_choice_score: preparing task data...done
multiple_choice_score : calculating TruthfulQA score over 750 tasks.

task	acc_norm
1	0.00000000
2	0.00000000
3	0.00000000
4	0.00000000
5	0.00000000
6	16.66666667
7	14.28571429
8	12.50000000
9	11.11111111
10	10.00000000
11	9.09090909
12	16.66666667
13	15.38461538
14	14.28571429
15	13.33333333
16	18.75000000
17	17.64705882
18	16.66666667
19	15.78947368
20	15.00000000
21	19.04761905
22	18.18181818
23	17.39130435
24	20.83333333
25	20.00000000
26	19.23076923
27	18.51851852
28	17.85714286
29	20.68965517
30	20.00000000
31	19.35483871
32	21.87500000
33	24.24242424
34	23.52941176
35	22.85714286
36	22.22222222
37	24.32432432
38	23.68421053
39	23.07692308
40	25.00000000
41	24.39024390
42	23.80952381
43	23.25581395
44	22.72727273
45	22.22222222
46	23.91304348
47	25.53191489
48	27.08333333
49	26.53061224
50	28.00000000
51	29.41176471
52	28.84615385
53	30.18867925
54	29.62962963
55	30.90909091
56	30.35714286
57	29.82456140
58	29.31034483
59	28.81355932
60	30.00000000
61	29.50819672
62	30.64516129
63	30.15873016
64	29.68750000
65	29.23076923
66	30.30303030
67	29.85074627
68	29.41176471
69	30.43478261
70	30.00000000
71	30.98591549
72	31.94444444
73	31.50684932
74	31.08108108
75	30.66666667
76	30.26315789
77	29.87012987
78	29.48717949
79	29.11392405
80	30.00000000
81	29.62962963
82	30.48780488
83	30.12048193
84	29.76190476
85	29.41176471
86	29.06976744
87	28.73563218
88	28.40909091
89	29.21348315
90	28.88888889
91	29.67032967
92	29.34782609
93	29.03225806
94	28.72340426
95	28.42105263
96	29.16666667
97	29.89690722
98	29.59183673
99	30.30303030
100	30.00000000
101	30.69306931
102	30.39215686
103	30.09708738
104	29.80769231
105	29.52380952
106	29.24528302
107	29.90654206
108	30.55555556
109	30.27522936
110	30.90909091
111	30.63063063
112	31.25000000
113	30.97345133
114	30.70175439
115	30.43478261
116	31.03448276
117	30.76923077
118	30.50847458
119	30.25210084
120	30.00000000
121	30.57851240
122	30.32786885
123	30.89430894
124	30.64516129
125	30.40000000
126	30.95238095
127	30.70866142
128	30.46875000
129	30.23255814
130	30.76923077
131	30.53435115
132	30.30303030
133	30.07518797
134	30.59701493
135	31.11111111
136	30.88235294
137	31.38686131
138	31.88405797
139	32.37410072
140	32.14285714
141	31.91489362
142	31.69014085
143	31.46853147
144	31.94444444
145	31.72413793
146	31.50684932
147	31.29251701
148	31.08108108
149	30.87248322
150	30.66666667
151	31.12582781
152	30.92105263
153	30.71895425
154	30.51948052
155	30.32258065
156	30.76923077
157	30.57324841
158	31.01265823
159	31.44654088
160	31.25000000
161	31.05590062
162	30.86419753
163	30.67484663
164	30.48780488
165	30.30303030
166	30.72289157
167	30.53892216
168	30.35714286
169	30.17751479
170	30.00000000
171	29.82456140
172	29.65116279
173	29.47976879
174	29.88505747
175	29.71428571
176	30.11363636
177	29.94350282
178	29.77528090
179	30.16759777
180	30.00000000
181	29.83425414
182	29.67032967
183	29.50819672
184	29.34782609
185	29.18918919
186	29.03225806
187	28.87700535
188	28.72340426
189	29.10052910
190	28.94736842
191	28.79581152
192	28.64583333
193	28.49740933
194	28.86597938
195	28.71794872
196	28.57142857
197	28.93401015
198	29.29292929
199	29.14572864
200	29.00000000
201	28.85572139
202	29.20792079
203	29.06403941
204	28.92156863
205	29.26829268
206	29.12621359
207	29.46859903
208	29.32692308
209	29.18660287
210	29.04761905
211	28.90995261
212	29.24528302
213	29.10798122
214	28.97196262
215	28.83720930
216	29.16666667
217	29.49308756
218	29.35779817
219	29.22374429
220	29.09090909
221	28.95927602
222	28.82882883
223	28.69955157
224	29.01785714
225	29.33333333
226	29.20353982
227	29.51541850
228	29.38596491
229	29.25764192
230	29.13043478
231	29.00432900
232	28.87931034
233	28.75536481
234	29.05982906
235	28.93617021
236	29.23728814
237	29.53586498
238	29.83193277
239	29.70711297
240	29.58333333
241	29.46058091
242	29.33884298
243	29.21810700
244	29.09836066
245	28.97959184
246	28.86178862
247	29.14979757
248	29.03225806
249	28.91566265
250	28.80000000
251	29.08366534
252	28.96825397
253	28.85375494
254	29.13385827
255	29.01960784
256	28.90625000
257	28.79377432
258	28.68217054
259	28.57142857
260	28.46153846
261	28.73563218
262	28.62595420
263	28.51711027
264	28.78787879
265	29.05660377
266	28.94736842
267	29.21348315
268	29.47761194
269	29.36802974
270	29.25925926
271	29.15129151
272	29.04411765
273	29.30402930
274	29.19708029
275	29.09090909
276	28.98550725
277	28.88086643
278	28.77697842
279	28.67383513
280	28.92857143
281	29.18149466
282	29.07801418
283	28.97526502
284	28.87323944
285	28.77192982
286	29.02097902
287	29.26829268
288	29.51388889
289	29.41176471
290	29.31034483
291	29.55326460
292	29.45205479
293	29.35153584
294	29.25170068
295	29.49152542
296	29.72972973
297	29.62962963
298	29.53020134
299	29.43143813
300	29.33333333
301	29.23588040
302	29.13907285
303	29.37293729
304	29.27631579
305	29.18032787
306	29.08496732
307	28.99022801
308	29.22077922
309	29.12621359
310	29.03225806
311	29.26045016
312	29.48717949
313	29.39297125
314	29.29936306
315	29.20634921
316	29.11392405
317	29.02208202
318	28.93081761
319	28.84012539
320	28.75000000
321	28.66043614
322	28.88198758
323	29.10216718
324	29.01234568
325	28.92307692
326	28.83435583
327	29.05198777
328	29.26829268
329	29.17933131
330	29.39393939
331	29.30513595
332	29.51807229
333	29.42942943
334	29.34131737
335	29.55223881
336	29.46428571
337	29.37685460
338	29.28994083
339	29.49852507
340	29.41176471
341	29.32551320
342	29.53216374
343	29.44606414
344	29.65116279
345	29.85507246
346	29.76878613
347	29.97118156
348	29.88505747
349	29.79942693
350	29.71428571
351	29.62962963
352	29.54545455
353	29.46175637
354	29.37853107
355	29.57746479
356	29.49438202
357	29.69187675
358	29.60893855
359	29.52646240
360	29.44444444
361	29.63988920
362	29.55801105
363	29.75206612
364	29.67032967
365	29.58904110
366	29.50819672
367	29.70027248
368	29.61956522
369	29.53929539
370	29.45945946
371	29.38005391
372	29.56989247
373	29.49061662
374	29.67914439
375	29.60000000
376	29.52127660
377	29.70822281
378	29.62962963
379	29.55145119
380	29.73684211
381	29.65879265
382	29.58115183
383	29.76501305
384	29.68750000
385	29.61038961
386	29.53367876
387	29.45736434
388	29.38144330
389	29.30591260
390	29.23076923
391	29.41176471
392	29.33673469
393	29.51653944
394	29.69543147
395	29.62025316
396	29.54545455
397	29.72292191
398	29.64824121
399	29.57393484
400	29.50000000
401	29.42643392
402	29.60199005
403	29.77667494
404	29.70297030
405	29.62962963
406	29.55665025
407	29.48402948
408	29.41176471
409	29.33985330
410	29.26829268
411	29.19708029
412	29.36893204
413	29.53995157
414	29.71014493
415	29.63855422
416	29.56730769
417	29.49640288
418	29.42583732
419	29.35560859
420	29.52380952
421	29.69121140
422	29.62085308
423	29.55082742
424	29.48113208
425	29.41176471
426	29.34272300
427	29.27400468
428	29.20560748
429	29.13752914
430	29.06976744
431	29.00232019
432	29.16666667
433	29.33025404
434	29.49308756
435	29.42528736
436	29.35779817
437	29.29061785
438	29.22374429
439	29.15717540
440	29.31818182
441	29.25170068
442	29.18552036
443	29.11963883
444	29.27927928
445	29.21348315
446	29.14798206
447	29.08277405
448	29.01785714
449	29.17594655
450	29.33333333
451	29.26829268
452	29.20353982
453	29.35982340
454	29.29515419
455	29.23076923
456	29.16666667
457	29.32166302
458	29.25764192
459	29.19389978
460	29.13043478
461	29.06724512
462	29.22077922
463	29.37365011
464	29.52586207
465	29.67741935
466	29.61373391
467	29.55032120
468	29.70085470
469	29.63752665
470	29.57446809
471	29.72399151
472	29.66101695
473	29.59830867
474	29.53586498
475	29.68421053
476	29.62184874
477	29.76939203
478	29.70711297
479	29.64509395
480	29.58333333
481	29.52182952
482	29.66804979
483	29.60662526
484	29.75206612
485	29.89690722
486	29.83539095
487	29.97946612
488	29.91803279
489	30.06134969
490	30.20408163
491	30.34623218
492	30.48780488
493	30.42596349
494	30.56680162
495	30.70707071
496	30.64516129
497	30.58350101
498	30.52208835
499	30.46092184
500	30.40000000
501	30.53892216
502	30.47808765
503	30.41749503
504	30.55555556
505	30.49504950
506	30.43478261
507	30.57199211
508	30.51181102
509	30.64833006
510	30.78431373
511	30.91976517
512	30.85937500
513	30.79922027
514	30.73929961
515	30.87378641
516	30.81395349
517	30.94777563
518	30.88803089
519	30.82851638
520	30.76923077
521	30.90211132
522	30.84291188
523	30.78393881
524	30.72519084
525	30.85714286
526	30.79847909
527	30.74003795
528	30.68181818
529	30.81285444
530	30.75471698
531	30.88512241
532	31.01503759
533	31.14446529
534	31.08614232
535	31.02803738
536	31.15671642
537	31.28491620
538	31.22676580
539	31.16883117
540	31.29629630
541	31.23844732
542	31.18081181
543	31.12338858
544	31.06617647
545	31.00917431
546	30.95238095
547	31.07861060
548	31.20437956
549	31.14754098
550	31.09090909
551	31.03448276
552	31.15942029
553	31.28390597
554	31.22743682
555	31.35135135
556	31.47482014
557	31.41831239
558	31.36200717
559	31.30590340
560	31.42857143
561	31.37254902
562	31.31672598
563	31.26110124
564	31.38297872
565	31.50442478
566	31.62544170
567	31.56966490
568	31.69014085
569	31.81019332
570	31.75438596
571	31.87390543
572	31.81818182
573	31.76265271
574	31.88153310
575	32.00000000
576	32.11805556
577	32.23570191
578	32.35294118
579	32.46977547
580	32.41379310
581	32.35800344
582	32.47422680
583	32.41852487
584	32.53424658
585	32.47863248
586	32.42320819
587	32.36797274
588	32.31292517
589	32.25806452
590	32.20338983
591	32.14890017
592	32.09459459
593	32.04047218
594	32.15488215
595	32.10084034
596	32.04697987
597	32.16080402
598	32.10702341
599	32.05342237
600	32.00000000
601	32.11314476
602	32.22591362
603	32.33830846
604	32.28476821
605	32.23140496
606	32.34323432
607	32.28995058
608	32.23684211
609	32.34811166
610	32.29508197
611	32.24222586
612	32.35294118
613	32.30016313
614	32.24755700
615	32.19512195
616	32.30519481
617	32.41491086
618	32.36245955
619	32.47172859
620	32.41935484
621	32.36714976
622	32.47588424
623	32.42375602
624	32.37179487
625	32.32000000
626	32.26837061
627	32.21690590
628	32.16560510
629	32.11446741
630	32.06349206
631	32.01267829
632	32.12025316
633	32.06951027
634	32.01892744
635	31.96850394
636	31.91823899
637	31.86813187
638	31.81818182
639	31.76838811
640	31.87500000
641	31.82527301
642	31.77570093
643	31.72628305
644	31.83229814
645	31.93798450
646	31.88854489
647	31.83925811
648	31.79012346
649	31.74114022
650	31.84615385
651	31.95084485
652	32.05521472
653	32.15926493
654	32.11009174
655	32.06106870
656	32.01219512
657	32.11567732
658	32.06686930
659	32.16995448
660	32.12121212
661	32.07261725
662	32.02416918
663	31.97586727
664	31.92771084
665	32.03007519
666	32.13213213
667	32.08395802
668	32.03592814
669	32.13751868
670	32.23880597
671	32.19076006
672	32.14285714
673	32.09509658
674	32.04747774
675	32.14814815
676	32.24852071
677	32.20088626
678	32.15339233
679	32.10603829
680	32.05882353
681	32.15859031
682	32.11143695
683	32.06442167
684	32.16374269
685	32.11678832
686	32.06997085
687	32.02328967
688	32.12209302
689	32.22060958
690	32.17391304
691	32.27206946
692	32.22543353
693	32.32323232
694	32.27665706
695	32.23021583
696	32.32758621
697	32.28120516
698	32.23495702
699	32.18884120
700	32.28571429
701	32.23965763
702	32.19373219
703	32.14793741
704	32.10227273
705	32.05673759
706	32.15297450
707	32.24893918
708	32.20338983
709	32.15796897
710	32.11267606
711	32.06751055
712	32.02247191
713	31.97755961
714	31.93277311
715	31.88811189
716	31.84357542
717	31.79916318
718	31.75487465
719	31.84979138
720	31.80555556
721	31.76144244
722	31.71745152
723	31.67358230
724	31.62983425
725	31.58620690
726	31.68044077
727	31.63686382
728	31.59340659
729	31.68724280
730	31.78082192
731	31.73734610
732	31.83060109
733	31.78717599
734	31.74386921
735	31.70068027
736	31.65760870
737	31.75033921
738	31.70731707
739	31.79972936
740	31.75675676
741	31.71390013
742	31.67115903
743	31.62853297
744	31.72043011
745	31.67785235
746	31.76943700
747	31.86077644
748	31.81818182
749	31.90921228
750	32.00000000

Final result: 32.0000 +/- 1.7045
Random chance: 19.8992 +/- 1.4588