File size: 12,596 Bytes
afa61ba
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
common_init_from_params: setting dry_penalty_last_n to ctx_size = 768
common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)

system_info: n_threads = 6 (n_threads_batch = 6) / 12 | Metal : EMBED_LIBRARY = 1 | CPU : NEON = 1 | ARM_FMA = 1 | FP16_VA = 1 | DOTPROD = 1 | LLAMAFILE = 1 | ACCELERATE = 1 | AARCH64_REPACK = 1 |
multiple_choice_score: there are 1548 tasks in prompt
multiple_choice_score: selecting 750 random tasks from 1548 tasks available
multiple_choice_score: preparing task data...done
multiple_choice_score : calculating TruthfulQA score over 750 tasks.

task	acc_norm
1	0.00000000
2	0.00000000
3	0.00000000
4	25.00000000
5	20.00000000
6	16.66666667
7	28.57142857
8	37.50000000
9	33.33333333
10	30.00000000
11	27.27272727
12	25.00000000
13	30.76923077
14	28.57142857
15	26.66666667
16	25.00000000
17	29.41176471
18	27.77777778
19	31.57894737
20	35.00000000
21	33.33333333
22	31.81818182
23	30.43478261
24	29.16666667
25	32.00000000
26	30.76923077
27	33.33333333
28	32.14285714
29	34.48275862
30	36.66666667
31	38.70967742
32	40.62500000
33	39.39393939
34	38.23529412
35	40.00000000
36	38.88888889
37	37.83783784
38	36.84210526
39	35.89743590
40	37.50000000
41	36.58536585
42	35.71428571
43	37.20930233
44	36.36363636
45	35.55555556
46	34.78260870
47	34.04255319
48	35.41666667
49	36.73469388
50	36.00000000
51	35.29411765
52	36.53846154
53	35.84905660
54	35.18518519
55	36.36363636
56	37.50000000
57	38.59649123
58	39.65517241
59	38.98305085
60	38.33333333
61	39.34426230
62	38.70967742
63	38.09523810
64	37.50000000
65	36.92307692
66	37.87878788
67	37.31343284
68	36.76470588
69	36.23188406
70	37.14285714
71	38.02816901
72	37.50000000
73	36.98630137
74	37.83783784
75	38.66666667
76	39.47368421
77	40.25974026
78	41.02564103
79	40.50632911
80	40.00000000
81	40.74074074
82	40.24390244
83	40.96385542
84	40.47619048
85	40.00000000
86	39.53488372
87	40.22988506
88	40.90909091
89	40.44943820
90	40.00000000
91	39.56043956
92	39.13043478
93	38.70967742
94	38.29787234
95	38.94736842
96	39.58333333
97	39.17525773
98	38.77551020
99	38.38383838
100	38.00000000
101	38.61386139
102	38.23529412
103	37.86407767
104	38.46153846
105	39.04761905
106	38.67924528
107	38.31775701
108	38.88888889
109	38.53211009
110	38.18181818
111	37.83783784
112	37.50000000
113	38.05309735
114	38.59649123
115	39.13043478
116	39.65517241
117	39.31623932
118	39.83050847
119	39.49579832
120	40.00000000
121	40.49586777
122	40.16393443
123	39.83739837
124	39.51612903
125	40.00000000
126	39.68253968
127	39.37007874
128	39.06250000
129	38.75968992
130	38.46153846
131	38.16793893
132	38.63636364
133	38.34586466
134	38.05970149
135	37.77777778
136	37.50000000
137	37.22627737
138	36.95652174
139	36.69064748
140	37.14285714
141	37.58865248
142	38.02816901
143	38.46153846
144	38.19444444
145	38.62068966
146	38.35616438
147	38.09523810
148	38.51351351
149	38.25503356
150	38.66666667
151	38.41059603
152	38.15789474
153	37.90849673
154	37.66233766
155	37.41935484
156	37.17948718
157	37.57961783
158	37.34177215
159	37.10691824
160	37.50000000
161	37.26708075
162	37.03703704
163	37.42331288
164	37.80487805
165	38.18181818
166	37.95180723
167	38.32335329
168	38.09523810
169	38.46153846
170	38.23529412
171	38.01169591
172	37.79069767
173	37.57225434
174	37.35632184
175	37.14285714
176	36.93181818
177	36.72316384
178	37.07865169
179	36.87150838
180	36.66666667
181	36.46408840
182	36.26373626
183	36.61202186
184	36.41304348
185	36.75675676
186	37.09677419
187	36.89839572
188	37.23404255
189	37.03703704
190	36.84210526
191	37.17277487
192	36.97916667
193	36.78756477
194	37.11340206
195	37.43589744
196	37.75510204
197	37.56345178
198	37.87878788
199	37.68844221
200	37.50000000
201	37.31343284
202	37.12871287
203	36.94581281
204	36.76470588
205	37.07317073
206	37.37864078
207	37.68115942
208	37.50000000
209	37.32057416
210	37.14285714
211	37.44075829
212	37.26415094
213	37.08920188
214	36.91588785
215	37.20930233
216	37.03703704
217	37.32718894
218	37.15596330
219	37.44292237
220	37.72727273
221	37.55656109
222	37.38738739
223	37.66816143
224	37.50000000
225	37.77777778
226	38.05309735
227	37.88546256
228	38.15789474
229	38.42794760
230	38.26086957
231	38.09523810
232	38.36206897
233	38.62660944
234	38.88888889
235	38.72340426
236	38.55932203
237	38.39662447
238	38.23529412
239	38.07531381
240	37.91666667
241	37.75933610
242	38.01652893
243	38.27160494
244	38.11475410
245	38.36734694
246	38.21138211
247	38.46153846
248	38.70967742
249	38.55421687
250	38.40000000
251	38.24701195
252	38.09523810
253	37.94466403
254	37.79527559
255	37.64705882
256	37.89062500
257	38.13229572
258	37.98449612
259	37.83783784
260	38.07692308
261	37.93103448
262	37.78625954
263	38.02281369
264	38.25757576
265	38.11320755
266	37.96992481
267	37.82771536
268	37.68656716
269	37.91821561
270	37.77777778
271	37.63837638
272	37.86764706
273	37.72893773
274	37.59124088
275	37.45454545
276	37.68115942
277	37.90613718
278	37.76978417
279	37.99283154
280	38.21428571
281	38.07829181
282	37.94326241
283	37.80918728
284	37.67605634
285	37.54385965
286	37.41258741
287	37.28222997
288	37.15277778
289	37.02422145
290	37.24137931
291	37.45704467
292	37.67123288
293	37.88395904
294	37.75510204
295	37.62711864
296	37.50000000
297	37.37373737
298	37.58389262
299	37.45819398
300	37.66666667
301	37.54152824
302	37.74834437
303	37.62376238
304	37.82894737
305	38.03278689
306	37.90849673
307	38.11074919
308	37.98701299
309	37.86407767
310	37.74193548
311	37.94212219
312	38.14102564
313	38.33865815
314	38.21656051
315	38.09523810
316	37.97468354
317	38.17034700
318	38.05031447
319	38.24451411
320	38.12500000
321	38.00623053
322	37.88819876
323	37.77089783
324	37.65432099
325	37.84615385
326	37.73006135
327	37.92048930
328	38.10975610
329	38.29787234
330	38.18181818
331	38.06646526
332	38.25301205
333	38.43843844
334	38.32335329
335	38.50746269
336	38.39285714
337	38.27893175
338	38.16568047
339	38.05309735
340	38.23529412
341	38.41642229
342	38.59649123
343	38.48396501
344	38.37209302
345	38.26086957
346	38.43930636
347	38.61671470
348	38.50574713
349	38.39541547
350	38.28571429
351	38.17663818
352	38.06818182
353	38.24362606
354	38.13559322
355	38.02816901
356	38.20224719
357	38.37535014
358	38.26815642
359	38.16155989
360	38.33333333
361	38.22714681
362	38.39779006
363	38.56749311
364	38.46153846
365	38.35616438
366	38.52459016
367	38.41961853
368	38.31521739
369	38.21138211
370	38.37837838
371	38.27493261
372	38.17204301
373	38.06970509
374	38.23529412
375	38.40000000
376	38.29787234
377	38.46153846
378	38.62433862
379	38.52242744
380	38.42105263
381	38.32020997
382	38.21989529
383	38.38120104
384	38.28125000
385	38.18181818
386	38.08290155
387	37.98449612
388	37.88659794
389	38.04627249
390	38.20512821
391	38.36317136
392	38.52040816
393	38.42239186
394	38.32487310
395	38.48101266
396	38.63636364
397	38.79093199
398	38.69346734
399	38.84711779
400	38.75000000
401	38.90274314
402	39.05472637
403	38.95781638
404	38.86138614
405	39.01234568
406	38.91625616
407	38.82063882
408	38.72549020
409	38.87530562
410	38.78048780
411	38.68613139
412	38.59223301
413	38.49878935
414	38.40579710
415	38.31325301
416	38.46153846
417	38.36930456
418	38.51674641
419	38.66348449
420	38.57142857
421	38.47980998
422	38.38862559
423	38.29787234
424	38.20754717
425	38.11764706
426	38.02816901
427	37.93911007
428	37.85046729
429	37.99533800
430	37.90697674
431	37.81902552
432	37.96296296
433	37.87528868
434	37.78801843
435	37.70114943
436	37.61467890
437	37.75743707
438	37.89954338
439	38.04100228
440	37.95454545
441	38.09523810
442	38.00904977
443	37.92325056
444	37.83783784
445	37.75280899
446	37.66816143
447	37.58389262
448	37.50000000
449	37.41648107
450	37.33333333
451	37.25055432
452	37.16814159
453	37.30684327
454	37.22466960
455	37.14285714
456	37.28070175
457	37.19912473
458	37.11790393
459	37.03703704
460	37.17391304
461	37.31019523
462	37.22943723
463	37.36501080
464	37.50000000
465	37.41935484
466	37.55364807
467	37.68736617
468	37.60683761
469	37.73987207
470	37.65957447
471	37.57961783
472	37.50000000
473	37.63213531
474	37.55274262
475	37.47368421
476	37.60504202
477	37.52620545
478	37.65690377
479	37.57828810
480	37.50000000
481	37.42203742
482	37.34439834
483	37.47412008
484	37.60330579
485	37.52577320
486	37.44855967
487	37.57700205
488	37.50000000
489	37.42331288
490	37.34693878
491	37.27087576
492	37.19512195
493	37.11967546
494	37.24696356
495	37.37373737
496	37.50000000
497	37.42454728
498	37.34939759
499	37.47494990
500	37.60000000
501	37.52495010
502	37.45019920
503	37.37574553
504	37.30158730
505	37.22772277
506	37.35177866
507	37.27810651
508	37.40157480
509	37.32809430
510	37.45098039
511	37.37769080
512	37.30468750
513	37.23196881
514	37.15953307
515	37.28155340
516	37.40310078
517	37.52417795
518	37.64478764
519	37.76493256
520	37.69230769
521	37.61996161
522	37.54789272
523	37.47609943
524	37.59541985
525	37.52380952
526	37.45247148
527	37.38140417
528	37.31060606
529	37.24007561
530	37.35849057
531	37.28813559
532	37.21804511
533	37.33583490
534	37.45318352
535	37.57009346
536	37.50000000
537	37.61638734
538	37.73234201
539	37.66233766
540	37.59259259
541	37.52310536
542	37.63837638
543	37.56906077
544	37.50000000
545	37.43119266
546	37.54578755
547	37.47714808
548	37.59124088
549	37.52276867
550	37.45454545
551	37.56805808
552	37.50000000
553	37.43218807
554	37.36462094
555	37.29729730
556	37.41007194
557	37.52244165
558	37.63440860
559	37.74597496
560	37.85714286
561	37.78966132
562	37.90035587
563	38.01065719
564	37.94326241
565	38.05309735
566	38.16254417
567	38.09523810
568	38.20422535
569	38.13708260
570	38.07017544
571	38.00350263
572	37.93706294
573	37.87085515
574	37.80487805
575	37.73913043
576	37.84722222
577	37.95493934
578	37.88927336
579	37.82383420
580	37.93103448
581	38.03786575
582	37.97250859
583	38.07890223
584	38.18493151
585	38.29059829
586	38.39590444
587	38.33049404
588	38.26530612
589	38.20033956
590	38.13559322
591	38.07106599
592	38.00675676
593	38.11129848
594	38.21548822
595	38.15126050
596	38.25503356
597	38.19095477
598	38.12709030
599	38.23038397
600	38.16666667
601	38.26955075
602	38.20598007
603	38.30845771
604	38.24503311
605	38.18181818
606	38.11881188
607	38.22075783
608	38.15789474
609	38.09523810
610	38.03278689
611	37.97054010
612	37.90849673
613	37.84665579
614	37.78501629
615	37.72357724
616	37.66233766
617	37.76337115
618	37.70226537
619	37.64135703
620	37.58064516
621	37.68115942
622	37.62057878
623	37.56019262
624	37.66025641
625	37.60000000
626	37.69968051
627	37.63955343
628	37.57961783
629	37.67885533
630	37.77777778
631	37.71790808
632	37.81645570
633	37.75671406
634	37.69716088
635	37.63779528
636	37.57861635
637	37.51962323
638	37.46081505
639	37.40219092
640	37.50000000
641	37.44149766
642	37.38317757
643	37.32503888
644	37.42236025
645	37.36434109
646	37.30650155
647	37.24884080
648	37.19135802
649	37.13405239
650	37.07692308
651	37.01996928
652	36.96319018
653	36.90658499
654	37.00305810
655	36.94656489
656	37.04268293
657	36.98630137
658	37.08206687
659	37.17754173
660	37.12121212
661	37.06505295
662	37.16012085
663	37.10407240
664	37.04819277
665	36.99248120
666	36.93693694
667	36.88155922
668	36.97604790
669	37.07025411
670	37.01492537
671	36.95976155
672	36.90476190
673	36.84992571
674	36.79525223
675	36.74074074
676	36.68639053
677	36.63220089
678	36.57817109
679	36.52430044
680	36.61764706
681	36.71071953
682	36.65689150
683	36.60322108
684	36.69590643
685	36.64233577
686	36.58892128
687	36.53566230
688	36.48255814
689	36.57474601
690	36.52173913
691	36.46888567
692	36.41618497
693	36.36363636
694	36.31123919
695	36.25899281
696	36.35057471
697	36.29842181
698	36.24641834
699	36.33762518
700	36.28571429
701	36.37660485
702	36.46723647
703	36.41536273
704	36.36363636
705	36.31205674
706	36.26062323
707	36.20933522
708	36.15819209
709	36.10719323
710	36.05633803
711	36.14627286
712	36.23595506
713	36.32538569
714	36.41456583
715	36.36363636
716	36.45251397
717	36.54114365
718	36.49025070
719	36.57858136
720	36.52777778
721	36.47711512
722	36.42659280
723	36.51452282
724	36.60220994
725	36.68965517
726	36.63911846
727	36.58872077
728	36.67582418
729	36.62551440
730	36.57534247
731	36.52530780
732	36.47540984
733	36.56207367
734	36.64850136
735	36.59863946
736	36.54891304
737	36.63500678
738	36.58536585
739	36.67117727
740	36.62162162
741	36.57219973
742	36.52291105
743	36.60834455
744	36.69354839
745	36.64429530
746	36.59517426
747	36.68005355
748	36.63101604
749	36.58210948
750	36.66666667

Final result: 36.6667 +/- 1.7608
Random chance: 25.0000 +/- 1.5822