File size: 12,598 Bytes
6583e65
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
common_init_from_params: setting dry_penalty_last_n to ctx_size = 768
common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)

system_info: n_threads = 6 (n_threads_batch = 6) / 12 | Metal : EMBED_LIBRARY = 1 | CPU : NEON = 1 | ARM_FMA = 1 | FP16_VA = 1 | DOTPROD = 1 | LLAMAFILE = 1 | ACCELERATE = 1 | AARCH64_REPACK = 1 |
multiple_choice_score: there are 817 tasks in prompt
multiple_choice_score: selecting 750 random tasks from 817 tasks available
multiple_choice_score: preparing task data...done
multiple_choice_score : calculating TruthfulQA score over 750 tasks.

task	acc_norm
1	100.00000000
2	50.00000000
3	33.33333333
4	25.00000000
5	20.00000000
6	16.66666667
7	14.28571429
8	25.00000000
9	22.22222222
10	20.00000000
11	18.18181818
12	25.00000000
13	30.76923077
14	28.57142857
15	26.66666667
16	31.25000000
17	29.41176471
18	27.77777778
19	26.31578947
20	25.00000000
21	23.80952381
22	22.72727273
23	21.73913043
24	25.00000000
25	24.00000000
26	23.07692308
27	22.22222222
28	21.42857143
29	24.13793103
30	23.33333333
31	22.58064516
32	25.00000000
33	27.27272727
34	26.47058824
35	25.71428571
36	25.00000000
37	27.02702703
38	26.31578947
39	28.20512821
40	27.50000000
41	26.82926829
42	26.19047619
43	27.90697674
44	27.27272727
45	28.88888889
46	28.26086957
47	29.78723404
48	31.25000000
49	30.61224490
50	32.00000000
51	31.37254902
52	30.76923077
53	30.18867925
54	29.62962963
55	30.90909091
56	30.35714286
57	29.82456140
58	29.31034483
59	30.50847458
60	31.66666667
61	31.14754098
62	32.25806452
63	31.74603175
64	31.25000000
65	30.76923077
66	31.81818182
67	31.34328358
68	30.88235294
69	31.88405797
70	31.42857143
71	30.98591549
72	31.94444444
73	31.50684932
74	31.08108108
75	30.66666667
76	31.57894737
77	31.16883117
78	30.76923077
79	30.37974684
80	31.25000000
81	30.86419753
82	31.70731707
83	31.32530120
84	30.95238095
85	30.58823529
86	30.23255814
87	29.88505747
88	29.54545455
89	30.33707865
90	30.00000000
91	29.67032967
92	29.34782609
93	29.03225806
94	28.72340426
95	29.47368421
96	30.20833333
97	30.92783505
98	30.61224490
99	31.31313131
100	32.00000000
101	32.67326733
102	32.35294118
103	32.03883495
104	31.73076923
105	31.42857143
106	31.13207547
107	31.77570093
108	32.40740741
109	32.11009174
110	32.72727273
111	32.43243243
112	33.03571429
113	33.62831858
114	34.21052632
115	33.91304348
116	33.62068966
117	33.33333333
118	33.05084746
119	32.77310924
120	33.33333333
121	33.88429752
122	33.60655738
123	33.33333333
124	33.87096774
125	34.40000000
126	34.92063492
127	34.64566929
128	34.37500000
129	34.88372093
130	35.38461538
131	35.11450382
132	34.84848485
133	34.58646617
134	34.32835821
135	34.81481481
136	34.55882353
137	35.03649635
138	35.50724638
139	35.97122302
140	35.71428571
141	35.46099291
142	35.21126761
143	34.96503497
144	35.41666667
145	35.17241379
146	34.93150685
147	35.37414966
148	35.13513514
149	34.89932886
150	34.66666667
151	35.09933775
152	34.86842105
153	34.64052288
154	34.41558442
155	34.83870968
156	35.25641026
157	35.03184713
158	35.44303797
159	35.22012579
160	35.00000000
161	34.78260870
162	35.18518519
163	34.96932515
164	34.75609756
165	35.15151515
166	35.54216867
167	35.32934132
168	35.71428571
169	35.50295858
170	35.29411765
171	35.08771930
172	34.88372093
173	34.68208092
174	35.05747126
175	34.85714286
176	35.22727273
177	35.02824859
178	34.83146067
179	35.19553073
180	35.55555556
181	35.35911602
182	35.16483516
183	34.97267760
184	34.78260870
185	34.59459459
186	34.94623656
187	34.75935829
188	34.57446809
189	34.92063492
190	34.73684211
191	34.55497382
192	34.37500000
193	34.19689119
194	34.53608247
195	34.35897436
196	34.18367347
197	34.51776650
198	34.84848485
199	34.67336683
200	35.00000000
201	34.82587065
202	35.14851485
203	34.97536946
204	34.80392157
205	34.63414634
206	34.46601942
207	34.78260870
208	34.61538462
209	34.44976077
210	34.28571429
211	34.12322275
212	33.96226415
213	33.80281690
214	33.64485981
215	33.48837209
216	33.79629630
217	34.10138249
218	34.40366972
219	34.24657534
220	34.09090909
221	33.93665158
222	33.78378378
223	33.63228700
224	33.92857143
225	33.77777778
226	34.07079646
227	34.36123348
228	34.21052632
229	34.06113537
230	33.91304348
231	33.76623377
232	33.62068966
233	33.47639485
234	33.76068376
235	33.61702128
236	33.89830508
237	33.75527426
238	34.03361345
239	33.89121339
240	33.75000000
241	33.60995851
242	33.47107438
243	33.74485597
244	33.60655738
245	33.46938776
246	33.33333333
247	33.60323887
248	33.46774194
249	33.73493976
250	33.60000000
251	33.86454183
252	34.12698413
253	33.99209486
254	33.85826772
255	33.72549020
256	33.59375000
257	33.46303502
258	33.33333333
259	33.20463320
260	33.07692308
261	33.33333333
262	33.20610687
263	33.07984791
264	33.33333333
265	33.58490566
266	33.45864662
267	33.33333333
268	33.20895522
269	33.08550186
270	32.96296296
271	32.84132841
272	32.72058824
273	32.96703297
274	32.84671533
275	32.72727273
276	32.60869565
277	32.49097473
278	32.37410072
279	32.25806452
280	32.50000000
281	32.74021352
282	32.62411348
283	32.86219081
284	32.74647887
285	32.63157895
286	32.86713287
287	33.10104530
288	32.98611111
289	32.87197232
290	32.75862069
291	32.98969072
292	32.87671233
293	33.10580205
294	32.99319728
295	33.22033898
296	33.44594595
297	33.67003367
298	33.55704698
299	33.44481605
300	33.33333333
301	33.22259136
302	33.11258278
303	33.00330033
304	32.89473684
305	32.78688525
306	32.67973856
307	32.57328990
308	32.79220779
309	32.68608414
310	32.58064516
311	32.47588424
312	32.69230769
313	32.58785942
314	32.48407643
315	32.38095238
316	32.27848101
317	32.17665615
318	32.07547170
319	31.97492163
320	31.87500000
321	31.77570093
322	31.98757764
323	31.88854489
324	31.79012346
325	31.69230769
326	31.59509202
327	31.49847095
328	31.70731707
329	31.61094225
330	31.81818182
331	31.72205438
332	31.92771084
333	31.83183183
334	31.73652695
335	31.94029851
336	32.14285714
337	32.04747774
338	31.95266272
339	32.15339233
340	32.05882353
341	32.25806452
342	32.45614035
343	32.36151603
344	32.55813953
345	32.75362319
346	32.65895954
347	32.85302594
348	32.75862069
349	32.66475645
350	32.57142857
351	32.47863248
352	32.38636364
353	32.29461756
354	32.20338983
355	32.39436620
356	32.30337079
357	32.21288515
358	32.12290503
359	32.03342618
360	31.94444444
361	32.13296399
362	32.04419890
363	32.23140496
364	32.14285714
365	32.05479452
366	31.96721311
367	32.15258856
368	32.06521739
369	31.97831978
370	31.89189189
371	31.80592992
372	31.98924731
373	31.90348525
374	32.08556150
375	32.00000000
376	31.91489362
377	32.09549072
378	32.01058201
379	31.92612137
380	31.84210526
381	31.75853018
382	31.67539267
383	31.59268930
384	31.51041667
385	31.42857143
386	31.34715026
387	31.26614987
388	31.44329897
389	31.36246787
390	31.28205128
391	31.45780051
392	31.37755102
393	31.55216285
394	31.72588832
395	31.64556962
396	31.81818182
397	31.98992443
398	31.90954774
399	31.82957393
400	31.75000000
401	31.67082294
402	31.84079602
403	31.76178660
404	31.68316832
405	31.60493827
406	31.52709360
407	31.44963145
408	31.37254902
409	31.29584352
410	31.21951220
411	31.14355231
412	31.31067961
413	31.47699758
414	31.64251208
415	31.80722892
416	31.73076923
417	31.65467626
418	31.57894737
419	31.50357995
420	31.66666667
421	31.82897862
422	31.75355450
423	31.91489362
424	31.83962264
425	31.76470588
426	31.69014085
427	31.61592506
428	31.54205607
429	31.46853147
430	31.39534884
431	31.32250580
432	31.48148148
433	31.63972286
434	31.79723502
435	31.72413793
436	31.65137615
437	31.57894737
438	31.73515982
439	31.89066059
440	32.04545455
441	31.97278912
442	31.90045249
443	31.82844244
444	31.98198198
445	31.91011236
446	31.83856502
447	31.76733781
448	31.69642857
449	31.84855234
450	31.77777778
451	31.70731707
452	31.63716814
453	31.78807947
454	31.71806167
455	31.64835165
456	31.57894737
457	31.50984683
458	31.44104803
459	31.37254902
460	31.30434783
461	31.23644252
462	31.38528139
463	31.53347732
464	31.68103448
465	31.82795699
466	31.75965665
467	31.69164882
468	31.83760684
469	31.76972281
470	31.70212766
471	31.84713376
472	31.99152542
473	31.92389006
474	31.85654008
475	32.00000000
476	31.93277311
477	31.86582809
478	31.79916318
479	31.73277662
480	31.66666667
481	31.60083160
482	31.74273859
483	31.67701863
484	31.81818182
485	31.95876289
486	31.89300412
487	31.82751540
488	31.76229508
489	31.90184049
490	32.04081633
491	32.17922607
492	32.31707317
493	32.25152130
494	32.38866397
495	32.52525253
496	32.45967742
497	32.39436620
498	32.32931727
499	32.26452906
500	32.20000000
501	32.13572854
502	32.07171315
503	32.00795229
504	32.14285714
505	32.07920792
506	32.01581028
507	32.14990138
508	32.28346457
509	32.41650295
510	32.54901961
511	32.48532290
512	32.42187500
513	32.35867446
514	32.29571984
515	32.42718447
516	32.36434109
517	32.49516441
518	32.43243243
519	32.36994220
520	32.30769231
521	32.24568138
522	32.18390805
523	32.12237094
524	32.06106870
525	32.00000000
526	31.93916350
527	32.06831120
528	32.00757576
529	32.13610586
530	32.07547170
531	32.01506591
532	31.95488722
533	32.08255159
534	32.02247191
535	31.96261682
536	31.90298507
537	32.02979516
538	31.97026022
539	31.91094620
540	32.03703704
541	31.97781885
542	31.91881919
543	31.86003683
544	31.80147059
545	31.74311927
546	31.68498168
547	31.80987203
548	31.93430657
549	31.87613843
550	31.81818182
551	31.76043557
552	31.88405797
553	32.00723327
554	31.94945848
555	31.89189189
556	31.83453237
557	31.77737882
558	31.72043011
559	31.66368515
560	31.78571429
561	31.72905526
562	31.67259786
563	31.61634103
564	31.73758865
565	31.85840708
566	31.97879859
567	31.92239859
568	32.04225352
569	32.16168717
570	32.10526316
571	32.22416813
572	32.16783217
573	32.11169284
574	32.22996516
575	32.34782609
576	32.46527778
577	32.58232236
578	32.69896194
579	32.81519862
580	32.93103448
581	32.87435456
582	32.98969072
583	32.93310463
584	33.04794521
585	32.99145299
586	33.10580205
587	33.04940375
588	32.99319728
589	32.93718166
590	33.05084746
591	32.99492386
592	32.93918919
593	32.88364250
594	32.99663300
595	32.94117647
596	32.88590604
597	32.83082077
598	32.77591973
599	32.72120200
600	32.66666667
601	32.77870216
602	32.89036545
603	33.00165837
604	32.94701987
605	32.89256198
606	33.00330033
607	32.94892916
608	32.89473684
609	32.84072250
610	32.78688525
611	32.73322422
612	32.84313725
613	32.78955954
614	32.73615635
615	32.68292683
616	32.79220779
617	32.90113452
618	32.84789644
619	32.95638126
620	32.90322581
621	32.85024155
622	32.95819936
623	32.90529695
624	32.85256410
625	32.80000000
626	32.74760383
627	32.69537480
628	32.80254777
629	32.75039746
630	32.69841270
631	32.80507132
632	32.75316456
633	32.70142180
634	32.80757098
635	32.75590551
636	32.70440252
637	32.65306122
638	32.60188088
639	32.55086072
640	32.65625000
641	32.60530421
642	32.71028037
643	32.65940902
644	32.60869565
645	32.71317829
646	32.66253870
647	32.61205564
648	32.71604938
649	32.66563945
650	32.76923077
651	32.87250384
652	32.97546012
653	33.07810107
654	33.02752294
655	32.97709924
656	33.07926829
657	33.18112633
658	33.13069909
659	33.23216995
660	33.18181818
661	33.13161876
662	33.08157100
663	33.03167421
664	32.98192771
665	32.93233083
666	33.03303303
667	32.98350825
668	32.93413174
669	32.88490284
670	32.98507463
671	32.93591654
672	32.88690476
673	32.83803863
674	32.78931751
675	32.88888889
676	32.98816568
677	32.93943870
678	32.89085546
679	32.84241532
680	32.79411765
681	32.74596182
682	32.84457478
683	32.79648609
684	32.89473684
685	32.84671533
686	32.79883382
687	32.75109170
688	32.70348837
689	32.80116110
690	32.75362319
691	32.85094067
692	32.94797688
693	33.04473304
694	32.99711816
695	32.94964029
696	33.04597701
697	32.99856528
698	32.95128940
699	32.90414878
700	33.00000000
701	32.95292439
702	32.90598291
703	32.85917496
704	32.81250000
705	32.76595745
706	32.86118980
707	32.95615276
708	32.90960452
709	32.86318759
710	32.95774648
711	32.91139241
712	33.00561798
713	32.95932679
714	32.91316527
715	33.00699301
716	32.96089385
717	33.05439331
718	33.14763231
719	33.24061196
720	33.19444444
721	33.14840499
722	33.10249307
723	33.05670816
724	33.14917127
725	33.10344828
726	33.19559229
727	33.14993122
728	33.10439560
729	33.19615912
730	33.15068493
731	33.10533516
732	33.19672131
733	33.15143247
734	33.10626703
735	33.06122449
736	33.01630435
737	33.10719132
738	33.06233062
739	33.01759134
740	32.97297297
741	33.06342780
742	33.01886792
743	32.97442799
744	33.06451613
745	33.02013423
746	33.10991957
747	33.19946452
748	33.15508021
749	33.24432577
750	33.33333333

Final result: 33.3333 +/- 1.7225
Random chance: 19.8992 +/- 1.4588