====== Perplexity statistics ======
Mean PPL(Q)                   :   8.459379 ±   0.053550
Mean PPL(base)                :   7.237090 ±   0.045539
Cor(ln(PPL(Q)), ln(PPL(base))):  97.26%
Mean ln(PPL(Q)/PPL(base))     :   0.156057 ±   0.001477
Mean PPL(Q)/PPL(base)         :   1.168892 ±   0.001727
Mean PPL(Q)-PPL(base)         :   1.222289 ±   0.014061

====== KL divergence statistics ======
Mean    KLD:   0.131196 ±   0.000539
Maximum KLD:   7.898368
99.9%   KLD:   2.475934
99.0%   KLD:   0.894390
Median  KLD:   0.089346
10.0%   KLD:   0.006670
 5.0%   KLD:   0.002250
 1.0%   KLD:   0.000392
Minimum KLD:   0.000001

====== Token probability statistics ======
Mean    Δp:  -3.913 ± 0.027 %
Maximum Δp:  64.023%
99.9%   Δp:  33.075%
99.0%   Δp:  17.214%
95.0%   Δp:   7.245%
90.0%   Δp:   3.301%
75.0%   Δp:   0.096%
Median  Δp:  -0.875%
25.0%   Δp:  -6.342%
10.0%   Δp: -15.578%
 5.0%   Δp: -22.649%
 1.0%   Δp: -41.943%
 0.1%   Δp: -75.852%
Minimum Δp: -98.926%
RMS Δp    :  10.892 ± 0.050 %
Same top p:  82.438 ± 0.100 %
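
The three comparison rows in the perplexity statistics above appear to be mutually consistent, and a short check makes the relationship explicit. The sketch below is not part of the tool that produced this output. It only re-derives the ratio and difference from the two reported mean perplexities, assuming the reported ratio is the exponential of the mean log-ratio. The variable names are illustrative.

```python
import math

# Values copied from the perplexity statistics above.
ppl_q = 8.459379          # Mean PPL(Q)
ppl_base = 7.237090       # Mean PPL(base)
mean_log_ratio = 0.156057 # Mean ln(PPL(Q)/PPL(base))

# Ratio computed directly from the two means.
ratio = ppl_q / ppl_base

# Ratio recovered from the mean log-ratio (assumed relationship).
ratio_from_log = math.exp(mean_log_ratio)

# Absolute perplexity increase of the quantized model over the base model.
diff = ppl_q - ppl_base

print(f"PPL(Q)/PPL(base)         = {ratio:.6f}")
print(f"exp(mean ln ratio)       = {ratio_from_log:.6f}")
print(f"PPL(Q)-PPL(base)         = {diff:.6f}")
```

Both ratio estimates should land close to the reported 1.168892, and the difference close to the reported 1.222289, up to rounding in the printed figures.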