====== Perplexity statistics ======
Mean PPL(Q) : 16.876273 ± 0.145643
Mean PPL(base) : 14.009216 ± 0.118474
Cor(ln(PPL(Q)), ln(PPL(base))): 94.72%
Mean ln(PPL(Q)/PPL(base)) : 0.186193 ± 0.002783
Mean PPL(Q)/PPL(base) : 1.204655 ± 0.003352
Mean PPL(Q)-PPL(base) : 2.867057 ± 0.050614
====== KL divergence statistics ======
Mean KLD: 0.352343 ± 0.001451
Maximum KLD: 14.400368
99.9% KLD: 5.340890
99.0% KLD: 2.640644
Median KLD: 0.186969
10.0% KLD: 0.004785
5.0% KLD: 0.000984
1.0% KLD: 0.000077
Minimum KLD: -0.000004
====== Token probability statistics ======
Mean Δp: -3.054 ± 0.039 %
Maximum Δp: 98.346%
99.9% Δp: 69.303%
99.0% Δp: 36.619%
95.0% Δp: 15.962%
90.0% Δp: 7.716%
75.0% Δp: 0.527%
Median Δp: -0.104%
25.0% Δp: -4.910%
10.0% Δp: -18.843%
5.0% Δp: -30.616%
1.0% Δp: -58.700%
0.1% Δp: -88.795%
Minimum Δp: -99.259%
RMS Δp : 15.201 ± 0.063 %
Same top p: 73.153 ± 0.117 %
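
The perplexity figures in the first section can be cross-checked against each other. The sketch below is only a sanity check on the reported numbers, not a description of how the tool computes them: it assumes that the reported ratio and difference are derived from the two mean perplexities, and that the mean log ratio exponentiates to the mean ratio. The variable names are illustrative.

```python
import math

# Values copied from the "Perplexity statistics" section of the log.
ppl_q = 16.876273          # Mean PPL(Q)
ppl_base = 14.009216       # Mean PPL(base)
mean_log_ratio = 0.186193  # Mean ln(PPL(Q)/PPL(base))

# Ratio and difference of the two mean perplexities.
ratio = ppl_q / ppl_base
diff = ppl_q - ppl_base

# Exponentiating the mean log ratio should land on the same ratio
# if the assumption stated above holds.
ratio_from_log = math.exp(mean_log_ratio)

print(f"PPL(Q)/PPL(base)    = {ratio:.6f}")
print(f"PPL(Q)-PPL(base)    = {diff:.6f}")
print(f"exp(mean ln ratio)  = {ratio_from_log:.6f}")
```

All three values should agree with the reported 1.204655 and 2.867057 to within rounding, which suggests the quantized model's perplexity is roughly 20% higher than the base model's on this data.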