File size: 12,600 Bytes
6583e65
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
common_init_from_params: setting dry_penalty_last_n to ctx_size = 768
common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)

system_info: n_threads = 6 (n_threads_batch = 6) / 12 | Metal : EMBED_LIBRARY = 1 | CPU : NEON = 1 | ARM_FMA = 1 | FP16_VA = 1 | DOTPROD = 1 | LLAMAFILE = 1 | ACCELERATE = 1 | AARCH64_REPACK = 1 |
multiple_choice_score: there are 1548 tasks in prompt
multiple_choice_score: selecting 750 random tasks from 1548 tasks available
multiple_choice_score: preparing task data...done
multiple_choice_score : calculating TruthfulQA score over 750 tasks.

task	acc_norm
1	100.00000000
2	50.00000000
3	33.33333333
4	50.00000000
5	40.00000000
6	33.33333333
7	42.85714286
8	50.00000000
9	44.44444444
10	50.00000000
11	45.45454545
12	41.66666667
13	38.46153846
14	35.71428571
15	40.00000000
16	43.75000000
17	47.05882353
18	44.44444444
19	47.36842105
20	50.00000000
21	52.38095238
22	50.00000000
23	47.82608696
24	50.00000000
25	52.00000000
26	50.00000000
27	51.85185185
28	53.57142857
29	55.17241379
30	56.66666667
31	58.06451613
32	59.37500000
33	57.57575758
34	58.82352941
35	60.00000000
36	61.11111111
37	59.45945946
38	60.52631579
39	58.97435897
40	60.00000000
41	58.53658537
42	57.14285714
43	58.13953488
44	56.81818182
45	55.55555556
46	54.34782609
47	53.19148936
48	54.16666667
49	53.06122449
50	52.00000000
51	50.98039216
52	50.00000000
53	49.05660377
54	48.14814815
55	49.09090909
56	50.00000000
57	50.87719298
58	51.72413793
59	52.54237288
60	51.66666667
61	52.45901639
62	51.61290323
63	50.79365079
64	50.00000000
65	49.23076923
66	50.00000000
67	49.25373134
68	48.52941176
69	47.82608696
70	48.57142857
71	49.29577465
72	50.00000000
73	49.31506849
74	50.00000000
75	50.66666667
76	51.31578947
77	51.94805195
78	52.56410256
79	51.89873418
80	52.50000000
81	53.08641975
82	52.43902439
83	53.01204819
84	53.57142857
85	52.94117647
86	52.32558140
87	52.87356322
88	53.40909091
89	52.80898876
90	52.22222222
91	51.64835165
92	51.08695652
93	50.53763441
94	50.00000000
95	49.47368421
96	50.00000000
97	50.51546392
98	50.00000000
99	50.50505051
100	50.00000000
101	49.50495050
102	49.01960784
103	48.54368932
104	49.03846154
105	49.52380952
106	50.00000000
107	49.53271028
108	49.07407407
109	48.62385321
110	48.18181818
111	48.64864865
112	49.10714286
113	48.67256637
114	49.12280702
115	49.56521739
116	50.00000000
117	50.42735043
118	50.00000000
119	50.42016807
120	50.00000000
121	50.41322314
122	50.00000000
123	49.59349593
124	50.00000000
125	50.40000000
126	50.00000000
127	49.60629921
128	49.21875000
129	48.83720930
130	48.46153846
131	48.85496183
132	49.24242424
133	48.87218045
134	48.50746269
135	48.14814815
136	48.52941176
137	48.17518248
138	47.82608696
139	48.20143885
140	48.57142857
141	48.93617021
142	49.29577465
143	49.65034965
144	49.30555556
145	48.96551724
146	49.31506849
147	48.97959184
148	49.32432432
149	49.66442953
150	49.33333333
151	49.00662252
152	49.34210526
153	49.67320261
154	49.35064935
155	49.03225806
156	48.71794872
157	49.04458599
158	49.36708861
159	49.05660377
160	49.37500000
161	49.06832298
162	48.76543210
163	48.46625767
164	48.17073171
165	48.48484848
166	48.19277108
167	48.50299401
168	48.21428571
169	47.92899408
170	47.64705882
171	47.36842105
172	47.09302326
173	46.82080925
174	46.55172414
175	46.28571429
176	46.59090909
177	46.32768362
178	46.06741573
179	46.36871508
180	46.11111111
181	45.85635359
182	45.60439560
183	45.90163934
184	45.65217391
185	45.94594595
186	46.23655914
187	45.98930481
188	46.27659574
189	46.56084656
190	46.31578947
191	46.59685864
192	46.87500000
193	46.63212435
194	46.90721649
195	47.17948718
196	47.44897959
197	47.20812183
198	47.47474747
199	47.23618090
200	47.00000000
201	46.76616915
202	46.53465347
203	46.30541872
204	46.07843137
205	46.34146341
206	46.11650485
207	46.37681159
208	46.15384615
209	45.93301435
210	45.71428571
211	45.97156398
212	45.75471698
213	46.00938967
214	45.79439252
215	46.04651163
216	46.29629630
217	46.08294931
218	46.33027523
219	46.57534247
220	46.81818182
221	46.60633484
222	46.39639640
223	46.63677130
224	46.42857143
225	46.66666667
226	46.46017699
227	46.25550661
228	46.49122807
229	46.28820961
230	46.08695652
231	45.88744589
232	45.68965517
233	45.49356223
234	45.72649573
235	45.53191489
236	45.33898305
237	45.14767932
238	44.95798319
239	44.76987448
240	45.00000000
241	44.81327801
242	45.04132231
243	45.26748971
244	45.08196721
245	45.30612245
246	45.12195122
247	44.93927126
248	45.16129032
249	44.97991968
250	44.80000000
251	44.62151394
252	44.44444444
253	44.26877470
254	44.09448819
255	43.92156863
256	44.14062500
257	44.35797665
258	44.18604651
259	44.01544402
260	44.23076923
261	44.06130268
262	43.89312977
263	43.72623574
264	43.56060606
265	43.77358491
266	43.98496241
267	43.82022472
268	43.65671642
269	43.86617100
270	44.07407407
271	43.91143911
272	43.75000000
273	43.58974359
274	43.43065693
275	43.27272727
276	43.47826087
277	43.68231047
278	43.52517986
279	43.36917563
280	43.21428571
281	43.06049822
282	42.90780142
283	42.75618375
284	42.60563380
285	42.45614035
286	42.30769231
287	42.50871080
288	42.36111111
289	42.21453287
290	42.41379310
291	42.61168385
292	42.80821918
293	42.66211604
294	42.51700680
295	42.37288136
296	42.22972973
297	42.08754209
298	42.28187919
299	42.47491639
300	42.66666667
301	42.52491694
302	42.38410596
303	42.24422442
304	42.10526316
305	42.29508197
306	42.15686275
307	42.34527687
308	42.20779221
309	42.07119741
310	41.93548387
311	41.80064309
312	41.98717949
313	42.17252396
314	42.03821656
315	41.90476190
316	41.77215190
317	41.95583596
318	41.82389937
319	42.00626959
320	41.87500000
321	41.74454829
322	41.92546584
323	41.79566563
324	41.66666667
325	41.84615385
326	41.71779141
327	41.89602446
328	41.76829268
329	41.64133739
330	41.51515152
331	41.38972810
332	41.26506024
333	41.44144144
334	41.31736527
335	41.49253731
336	41.36904762
337	41.24629080
338	41.12426036
339	41.00294985
340	41.17647059
341	41.05571848
342	41.22807018
343	41.10787172
344	40.98837209
345	40.86956522
346	41.04046243
347	41.21037464
348	41.09195402
349	41.26074499
350	41.42857143
351	41.31054131
352	41.47727273
353	41.35977337
354	41.24293785
355	41.12676056
356	41.29213483
357	41.17647059
358	41.06145251
359	40.94707521
360	41.11111111
361	40.99722992
362	41.16022099
363	41.04683196
364	40.93406593
365	40.82191781
366	40.98360656
367	40.87193460
368	40.76086957
369	40.65040650
370	40.81081081
371	40.70080863
372	40.59139785
373	40.48257373
374	40.37433155
375	40.53333333
376	40.42553191
377	40.58355438
378	40.74074074
379	40.63324538
380	40.52631579
381	40.68241470
382	40.83769634
383	40.99216710
384	40.88541667
385	41.03896104
386	40.93264249
387	40.82687339
388	40.72164948
389	40.87403599
390	41.02564103
391	41.17647059
392	41.32653061
393	41.22137405
394	41.11675127
395	41.26582278
396	41.41414141
397	41.30982368
398	41.20603015
399	41.35338346
400	41.25000000
401	41.39650873
402	41.29353234
403	41.19106700
404	41.33663366
405	41.23456790
406	41.13300493
407	41.03194103
408	41.17647059
409	41.32029340
410	41.21951220
411	41.11922141
412	41.01941748
413	40.92009685
414	40.82125604
415	40.72289157
416	40.86538462
417	41.00719424
418	40.90909091
419	41.05011933
420	40.95238095
421	41.09263658
422	40.99526066
423	40.89834515
424	41.03773585
425	40.94117647
426	40.84507042
427	40.74941452
428	40.65420561
429	40.79254079
430	40.69767442
431	40.83526682
432	40.97222222
433	40.87759815
434	40.78341014
435	40.68965517
436	40.59633028
437	40.50343249
438	40.63926941
439	40.77448747
440	40.68181818
441	40.58956916
442	40.49773756
443	40.40632054
444	40.31531532
445	40.22471910
446	40.13452915
447	40.04474273
448	39.95535714
449	39.86636971
450	39.77777778
451	39.68957871
452	39.60176991
453	39.73509934
454	39.64757709
455	39.78021978
456	39.91228070
457	40.04376368
458	39.95633188
459	39.86928105
460	40.00000000
461	39.91323210
462	39.82683983
463	39.95680346
464	40.08620690
465	40.00000000
466	39.91416309
467	40.04282655
468	40.17094017
469	40.29850746
470	40.42553191
471	40.33970276
472	40.25423729
473	40.38054968
474	40.29535865
475	40.21052632
476	40.33613445
477	40.25157233
478	40.37656904
479	40.50104384
480	40.62500000
481	40.54054054
482	40.45643154
483	40.37267081
484	40.28925620
485	40.20618557
486	40.12345679
487	40.24640657
488	40.16393443
489	40.08179959
490	40.20408163
491	40.12219959
492	40.04065041
493	39.95943205
494	40.08097166
495	40.20202020
496	40.32258065
497	40.24144869
498	40.16064257
499	40.08016032
500	40.00000000
501	39.92015968
502	39.84063745
503	39.76143141
504	39.68253968
505	39.60396040
506	39.72332016
507	39.64497041
508	39.76377953
509	39.88212181
510	40.00000000
511	39.92172211
512	39.84375000
513	39.76608187
514	39.68871595
515	39.61165049
516	39.72868217
517	39.84526112
518	39.96138996
519	39.88439306
520	39.80769231
521	39.73128599
522	39.84674330
523	39.77055449
524	39.69465649
525	39.61904762
526	39.54372624
527	39.46869070
528	39.39393939
529	39.50850662
530	39.62264151
531	39.54802260
532	39.47368421
533	39.39962477
534	39.51310861
535	39.62616822
536	39.55223881
537	39.66480447
538	39.77695167
539	39.70315399
540	39.62962963
541	39.55637708
542	39.66789668
543	39.59484346
544	39.70588235
545	39.63302752
546	39.56043956
547	39.48811700
548	39.59854015
549	39.52641166
550	39.45454545
551	39.56442831
552	39.49275362
553	39.42133816
554	39.35018051
555	39.27927928
556	39.38848921
557	39.49730700
558	39.60573477
559	39.53488372
560	39.46428571
561	39.39393939
562	39.50177936
563	39.60923623
564	39.53900709
565	39.64601770
566	39.57597173
567	39.50617284
568	39.61267606
569	39.54305800
570	39.47368421
571	39.40455342
572	39.51048951
573	39.44153578
574	39.37282230
575	39.30434783
576	39.40972222
577	39.51473137
578	39.44636678
579	39.37823834
580	39.48275862
581	39.41480207
582	39.34707904
583	39.45111492
584	39.55479452
585	39.65811966
586	39.76109215
587	39.69335605
588	39.62585034
589	39.55857385
590	39.49152542
591	39.42470389
592	39.35810811
593	39.46037099
594	39.56228956
595	39.49579832
596	39.59731544
597	39.53098827
598	39.46488294
599	39.56594324
600	39.50000000
601	39.43427621
602	39.36877076
603	39.46932007
604	39.40397351
605	39.33884298
606	39.27392739
607	39.37397035
608	39.30921053
609	39.24466338
610	39.18032787
611	39.27986907
612	39.21568627
613	39.31484502
614	39.25081433
615	39.18699187
616	39.12337662
617	39.22204214
618	39.15857605
619	39.09531502
620	39.03225806
621	38.96940419
622	38.90675241
623	38.84430177
624	38.78205128
625	38.72000000
626	38.81789137
627	38.75598086
628	38.69426752
629	38.79173291
630	38.88888889
631	38.98573693
632	39.08227848
633	39.02053712
634	38.95899054
635	38.89763780
636	38.83647799
637	38.77551020
638	38.71473354
639	38.81064163
640	38.90625000
641	38.84555382
642	38.78504673
643	38.72472784
644	38.81987578
645	38.75968992
646	38.69969040
647	38.63987635
648	38.58024691
649	38.52080123
650	38.46153846
651	38.40245776
652	38.49693252
653	38.43797856
654	38.53211009
655	38.47328244
656	38.41463415
657	38.35616438
658	38.44984802
659	38.39150228
660	38.33333333
661	38.27534039
662	38.21752266
663	38.15987934
664	38.10240964
665	38.04511278
666	37.98798799
667	38.08095952
668	38.17365269
669	38.26606876
670	38.35820896
671	38.30104322
672	38.24404762
673	38.18722140
674	38.13056380
675	38.07407407
676	38.16568047
677	38.10930576
678	38.20058997
679	38.14432990
680	38.23529412
681	38.32599119
682	38.26979472
683	38.21376281
684	38.30409357
685	38.24817518
686	38.33819242
687	38.28238719
688	38.22674419
689	38.17126270
690	38.11594203
691	38.20549928
692	38.15028902
693	38.23953824
694	38.18443804
695	38.12949640
696	38.21839080
697	38.16355811
698	38.25214900
699	38.34048641
700	38.28571429
701	38.37375178
702	38.46153846
703	38.54907539
704	38.49431818
705	38.43971631
706	38.38526912
707	38.33097595
708	38.27683616
709	38.22284908
710	38.16901408
711	38.25597750
712	38.34269663
713	38.42917251
714	38.51540616
715	38.46153846
716	38.40782123
717	38.35425384
718	38.30083565
719	38.38664812
720	38.33333333
721	38.28016644
722	38.22714681
723	38.31258645
724	38.25966851
725	38.34482759
726	38.29201102
727	38.23933975
728	38.18681319
729	38.13443073
730	38.08219178
731	38.03009576
732	37.97814208
733	38.06275580
734	38.01089918
735	37.95918367
736	37.90760870
737	37.99185889
738	38.07588076
739	38.15967524
740	38.10810811
741	38.05668016
742	38.14016173
743	38.22341857
744	38.30645161
745	38.25503356
746	38.20375335
747	38.15261044
748	38.10160428
749	38.05073431
750	38.13333333

Final result: 38.1333 +/- 1.7748
Random chance: 25.0000 +/- 1.5822