File size: 12,599 Bytes
afa61ba
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
common_init_from_params: setting dry_penalty_last_n to ctx_size = 768
common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)

system_info: n_threads = 6 (n_threads_batch = 6) / 12 | Metal : EMBED_LIBRARY = 1 | CPU : NEON = 1 | ARM_FMA = 1 | FP16_VA = 1 | DOTPROD = 1 | LLAMAFILE = 1 | ACCELERATE = 1 | AARCH64_REPACK = 1 |
multiple_choice_score: there are 869 tasks in prompt
multiple_choice_score: selecting 750 random tasks from 869 tasks available
multiple_choice_score: preparing task data...done
multiple_choice_score : calculating TruthfulQA score over 750 tasks.

task	acc_norm
1	100.00000000
2	100.00000000
3	66.66666667
4	50.00000000
5	60.00000000
6	66.66666667
7	71.42857143
8	75.00000000
9	66.66666667
10	70.00000000
11	63.63636364
12	66.66666667
13	61.53846154
14	64.28571429
15	66.66666667
16	68.75000000
17	70.58823529
18	66.66666667
19	68.42105263
20	65.00000000
21	66.66666667
22	68.18181818
23	65.21739130
24	66.66666667
25	64.00000000
26	65.38461538
27	66.66666667
28	64.28571429
29	65.51724138
30	66.66666667
31	64.51612903
32	65.62500000
33	63.63636364
34	61.76470588
35	60.00000000
36	58.33333333
37	59.45945946
38	57.89473684
39	58.97435897
40	60.00000000
41	60.97560976
42	59.52380952
43	58.13953488
44	56.81818182
45	57.77777778
46	58.69565217
47	59.57446809
48	58.33333333
49	57.14285714
50	58.00000000
51	58.82352941
52	59.61538462
53	60.37735849
54	61.11111111
55	60.00000000
56	60.71428571
57	61.40350877
58	62.06896552
59	62.71186441
60	63.33333333
61	63.93442623
62	62.90322581
63	63.49206349
64	64.06250000
65	63.07692308
66	63.63636364
67	64.17910448
68	64.70588235
69	65.21739130
70	64.28571429
71	63.38028169
72	62.50000000
73	63.01369863
74	63.51351351
75	64.00000000
76	63.15789474
77	62.33766234
78	62.82051282
79	62.02531646
80	61.25000000
81	60.49382716
82	60.97560976
83	60.24096386
84	60.71428571
85	61.17647059
86	60.46511628
87	59.77011494
88	60.22727273
89	60.67415730
90	61.11111111
91	61.53846154
92	60.86956522
93	60.21505376
94	59.57446809
95	60.00000000
96	60.41666667
97	59.79381443
98	59.18367347
99	59.59595960
100	60.00000000
101	59.40594059
102	59.80392157
103	59.22330097
104	59.61538462
105	60.00000000
106	59.43396226
107	59.81308411
108	60.18518519
109	59.63302752
110	60.00000000
111	60.36036036
112	60.71428571
113	60.17699115
114	60.52631579
115	60.00000000
116	60.34482759
117	60.68376068
118	61.01694915
119	60.50420168
120	60.83333333
121	60.33057851
122	60.65573770
123	60.97560976
124	60.48387097
125	60.80000000
126	61.11111111
127	60.62992126
128	60.93750000
129	60.46511628
130	60.00000000
131	60.30534351
132	60.60606061
133	60.90225564
134	60.44776119
135	60.00000000
136	59.55882353
137	59.85401460
138	60.14492754
139	59.71223022
140	59.28571429
141	58.86524823
142	59.15492958
143	59.44055944
144	59.02777778
145	59.31034483
146	58.90410959
147	59.18367347
148	58.78378378
149	59.06040268
150	59.33333333
151	59.60264901
152	59.86842105
153	59.47712418
154	59.09090909
155	58.70967742
156	58.97435897
157	59.23566879
158	58.86075949
159	59.11949686
160	58.75000000
161	58.38509317
162	58.02469136
163	58.28220859
164	57.92682927
165	58.18181818
166	58.43373494
167	58.08383234
168	58.33333333
169	58.57988166
170	58.23529412
171	58.47953216
172	58.72093023
173	58.95953757
174	59.19540230
175	58.85714286
176	58.52272727
177	58.75706215
178	58.42696629
179	58.10055866
180	57.77777778
181	57.45856354
182	57.69230769
183	57.37704918
184	57.06521739
185	57.29729730
186	57.52688172
187	57.21925134
188	56.91489362
189	56.61375661
190	56.31578947
191	56.54450262
192	56.25000000
193	55.95854922
194	56.18556701
195	56.41025641
196	56.63265306
197	56.85279188
198	56.56565657
199	56.78391960
200	57.00000000
201	57.21393035
202	56.93069307
203	56.65024631
204	56.86274510
205	56.58536585
206	56.31067961
207	56.03864734
208	55.76923077
209	55.50239234
210	55.71428571
211	55.92417062
212	55.66037736
213	55.86854460
214	56.07476636
215	56.27906977
216	56.01851852
217	56.22119816
218	55.96330275
219	55.70776256
220	55.90909091
221	56.10859729
222	55.85585586
223	55.60538117
224	55.80357143
225	55.55555556
226	55.75221239
227	55.94713656
228	56.14035088
229	56.33187773
230	56.52173913
231	56.27705628
232	56.03448276
233	55.79399142
234	55.55555556
235	55.31914894
236	55.08474576
237	54.85232068
238	55.04201681
239	55.23012552
240	55.41666667
241	55.60165975
242	55.37190083
243	55.55555556
244	55.32786885
245	55.51020408
246	55.69105691
247	55.87044534
248	56.04838710
249	55.82329317
250	56.00000000
251	56.17529880
252	56.34920635
253	56.52173913
254	56.29921260
255	56.47058824
256	56.25000000
257	56.03112840
258	55.81395349
259	55.59845560
260	55.38461538
261	55.17241379
262	55.34351145
263	55.51330798
264	55.68181818
265	55.47169811
266	55.26315789
267	55.43071161
268	55.59701493
269	55.39033457
270	55.55555556
271	55.71955720
272	55.51470588
273	55.31135531
274	55.10948905
275	54.90909091
276	55.07246377
277	55.23465704
278	55.03597122
279	55.19713262
280	55.35714286
281	55.51601423
282	55.67375887
283	55.47703180
284	55.63380282
285	55.78947368
286	55.94405594
287	56.09756098
288	56.25000000
289	56.05536332
290	56.20689655
291	56.01374570
292	55.82191781
293	55.97269625
294	55.78231293
295	55.59322034
296	55.74324324
297	55.89225589
298	56.04026846
299	55.85284281
300	56.00000000
301	55.81395349
302	55.96026490
303	56.10561056
304	56.25000000
305	56.39344262
306	56.20915033
307	56.35179153
308	56.49350649
309	56.63430421
310	56.77419355
311	56.91318328
312	56.73076923
313	56.86900958
314	56.68789809
315	56.50793651
316	56.64556962
317	56.78233438
318	56.60377358
319	56.42633229
320	56.25000000
321	56.07476636
322	56.21118012
323	56.03715170
324	55.86419753
325	56.00000000
326	56.13496933
327	55.96330275
328	56.09756098
329	56.23100304
330	56.06060606
331	56.19335347
332	56.32530120
333	56.45645646
334	56.58682635
335	56.41791045
336	56.25000000
337	56.08308605
338	56.21301775
339	56.34218289
340	56.47058824
341	56.30498534
342	56.43274854
343	56.55976676
344	56.68604651
345	56.52173913
346	56.64739884
347	56.48414986
348	56.32183908
349	56.16045845
350	56.28571429
351	56.41025641
352	56.53409091
353	56.37393768
354	56.21468927
355	56.05633803
356	56.17977528
357	56.02240896
358	56.14525140
359	55.98885794
360	56.11111111
361	56.23268698
362	56.35359116
363	56.19834711
364	56.04395604
365	55.89041096
366	55.73770492
367	55.85831063
368	55.97826087
369	55.82655827
370	55.94594595
371	55.79514825
372	55.64516129
373	55.49597855
374	55.61497326
375	55.46666667
376	55.31914894
377	55.43766578
378	55.55555556
379	55.67282322
380	55.78947368
381	55.64304462
382	55.49738220
383	55.35248042
384	55.46875000
385	55.58441558
386	55.69948187
387	55.55555556
388	55.41237113
389	55.26992288
390	55.12820513
391	54.98721228
392	55.10204082
393	54.96183206
394	55.07614213
395	55.18987342
396	55.05050505
397	54.91183879
398	54.77386935
399	54.88721805
400	55.00000000
401	54.86284289
402	54.97512438
403	54.83870968
404	54.95049505
405	55.06172840
406	55.17241379
407	55.28255528
408	55.39215686
409	55.25672372
410	55.12195122
411	55.23114355
412	55.33980583
413	55.20581114
414	55.31400966
415	55.18072289
416	55.04807692
417	54.91606715
418	54.78468900
419	54.65393795
420	54.76190476
421	54.63182898
422	54.73933649
423	54.84633570
424	54.71698113
425	54.58823529
426	54.46009390
427	54.56674473
428	54.43925234
429	54.54545455
430	54.41860465
431	54.29234339
432	54.39814815
433	54.27251732
434	54.37788018
435	54.48275862
436	54.35779817
437	54.23340961
438	54.10958904
439	54.21412301
440	54.09090909
441	54.19501134
442	54.07239819
443	53.95033860
444	53.82882883
445	53.70786517
446	53.81165919
447	53.69127517
448	53.57142857
449	53.67483296
450	53.55555556
451	53.43680710
452	53.31858407
453	53.42163355
454	53.30396476
455	53.18681319
456	53.07017544
457	52.95404814
458	53.05676856
459	53.15904139
460	53.04347826
461	52.92841649
462	53.03030303
463	52.91576674
464	52.80172414
465	52.68817204
466	52.57510730
467	52.67665953
468	52.77777778
469	52.66524520
470	52.55319149
471	52.65392781
472	52.75423729
473	52.64270613
474	52.53164557
475	52.42105263
476	52.31092437
477	52.41090147
478	52.51046025
479	52.60960334
480	52.70833333
481	52.59875260
482	52.69709544
483	52.79503106
484	52.89256198
485	52.98969072
486	53.08641975
487	52.97741273
488	53.07377049
489	52.96523517
490	52.85714286
491	52.95315682
492	52.84552846
493	52.73833671
494	52.83400810
495	52.72727273
496	52.62096774
497	52.71629779
498	52.81124498
499	52.90581162
500	53.00000000
501	53.09381238
502	53.18725100
503	53.08151093
504	52.97619048
505	53.06930693
506	52.96442688
507	52.85996055
508	52.95275591
509	52.84872299
510	52.74509804
511	52.64187867
512	52.73437500
513	52.63157895
514	52.72373541
515	52.81553398
516	52.71317829
517	52.61121857
518	52.70270270
519	52.60115607
520	52.50000000
521	52.59117083
522	52.49042146
523	52.39005736
524	52.29007634
525	52.19047619
526	52.09125475
527	51.99240987
528	51.89393939
529	51.79584121
530	51.88679245
531	51.97740113
532	52.06766917
533	52.15759850
534	52.05992509
535	51.96261682
536	52.05223881
537	52.14152700
538	52.04460967
539	52.13358071
540	52.03703704
541	51.94085028
542	52.02952030
543	51.93370166
544	51.83823529
545	51.92660550
546	52.01465201
547	51.91956124
548	52.00729927
549	52.09471767
550	52.18181818
551	52.08711434
552	51.99275362
553	52.07956600
554	51.98555957
555	52.07207207
556	51.97841727
557	52.06463196
558	51.97132616
559	52.05724508
560	52.14285714
561	52.22816399
562	52.31316726
563	52.39786856
564	52.30496454
565	52.21238938
566	52.29681979
567	52.20458554
568	52.28873239
569	52.19683656
570	52.10526316
571	52.18914186
572	52.09790210
573	52.00698080
574	52.09059233
575	52.00000000
576	52.08333333
577	52.16637782
578	52.24913495
579	52.33160622
580	52.24137931
581	52.15146299
582	52.06185567
583	51.97255575
584	52.05479452
585	52.13675214
586	52.04778157
587	51.95911414
588	52.04081633
589	52.12224109
590	52.20338983
591	52.28426396
592	52.36486486
593	52.27655987
594	52.35690236
595	52.43697479
596	52.34899329
597	52.26130653
598	52.17391304
599	52.08681135
600	52.00000000
601	51.91347754
602	51.82724252
603	51.74129353
604	51.65562914
605	51.57024793
606	51.65016502
607	51.72981878
608	51.80921053
609	51.72413793
610	51.80327869
611	51.71849427
612	51.79738562
613	51.71288744
614	51.62866450
615	51.70731707
616	51.62337662
617	51.70178282
618	51.61812298
619	51.53473344
620	51.61290323
621	51.69082126
622	51.60771704
623	51.68539326
624	51.60256410
625	51.52000000
626	51.43769968
627	51.51515152
628	51.59235669
629	51.51033386
630	51.58730159
631	51.50554675
632	51.42405063
633	51.34281201
634	51.41955836
635	51.49606299
636	51.57232704
637	51.49136578
638	51.56739812
639	51.64319249
640	51.56250000
641	51.63806552
642	51.55763240
643	51.47744946
644	51.55279503
645	51.47286822
646	51.54798762
647	51.46831530
648	51.54320988
649	51.46379045
650	51.53846154
651	51.61290323
652	51.53374233
653	51.60796325
654	51.52905199
655	51.45038168
656	51.37195122
657	51.29375951
658	51.36778116
659	51.28983308
660	51.36363636
661	51.28593041
662	51.35951662
663	51.28205128
664	51.35542169
665	51.42857143
666	51.35135135
667	51.42428786
668	51.34730539
669	51.27055306
670	51.34328358
671	51.41579732
672	51.33928571
673	51.26300149
674	51.18694362
675	51.11111111
676	51.03550296
677	51.10782866
678	51.17994100
679	51.10456554
680	51.17647059
681	51.10132159
682	51.17302053
683	51.09809663
684	51.16959064
685	51.24087591
686	51.16618076
687	51.09170306
688	51.16279070
689	51.23367199
690	51.15942029
691	51.23010130
692	51.30057803
693	51.22655123
694	51.15273775
695	51.07913669
696	51.00574713
697	50.93256815
698	50.85959885
699	50.78683834
700	50.85714286
701	50.78459344
702	50.85470085
703	50.78236131
704	50.85227273
705	50.92198582
706	50.99150142
707	51.06082037
708	51.12994350
709	51.19887165
710	51.26760563
711	51.33614627
712	51.40449438
713	51.33239832
714	51.26050420
715	51.18881119
716	51.11731844
717	51.04602510
718	50.97493036
719	51.04311544
720	51.11111111
721	51.17891817
722	51.24653740
723	51.31396957
724	51.38121547
725	51.44827586
726	51.37741047
727	51.30674003
728	51.37362637
729	51.30315501
730	51.36986301
731	51.43638851
732	51.50273224
733	51.56889495
734	51.63487738
735	51.56462585
736	51.49456522
737	51.42469471
738	51.35501355
739	51.28552097
740	51.21621622
741	51.28205128
742	51.34770889
743	51.41318977
744	51.34408602
745	51.27516779
746	51.20643432
747	51.13788487
748	51.06951872
749	51.00133511
750	51.06666667

Final result: 51.0667 +/- 1.8265
Random chance: 25.0083 +/- 1.5824