| ====== Perplexity statistics ====== | |
| Mean PPL(Q) : 17.240743 ± 0.144856 | |
| Mean PPL(base) : 14.009216 ± 0.118474 | |
| Cor(ln(PPL(Q)), ln(PPL(base))): 93.48% | |
| Mean ln(PPL(Q)/PPL(base)) : 0.207560 ± 0.003045 | |
| Mean PPL(Q)/PPL(base) : 1.230671 ± 0.003747 | |
| Mean PPL(Q)-PPL(base) : 3.231527 ± 0.054174 | |
| ====== KL divergence statistics ====== | |
| Mean KLD: 0.435823 ± 0.001620 | |
| Maximum KLD: 15.272707 | |
| 99.9% KLD: 5.783259 | |
| 99.0% KLD: 2.930636 | |
| Median KLD: 0.250875 | |
| 10.0% KLD: 0.009227 | |
| 5.0% KLD: 0.002172 | |
| 1.0% KLD: 0.000199 | |
| Minimum KLD: 0.000000 | |
| ====== Token probability statistics ====== | |
| Mean Δp: -4.956 ± 0.044 % | |
| Maximum Δp: 99.249% | |
| 99.9% Δp: 68.937% | |
| 99.0% Δp: 36.749% | |
| 95.0% Δp: 14.895% | |
| 90.0% Δp: 6.477% | |
| 75.0% Δp: 0.235% | |
| Median Δp: -0.373% | |
| 25.0% Δp: -7.748% | |
| 10.0% Δp: -24.681% | |
| 5.0% Δp: -37.877% | |
| 1.0% Δp: -66.804% | |
| 0.1% Δp: -92.213% | |
| Minimum Δp: -99.880% | |
| RMS Δp : 17.425 ± 0.066 % | |
| Same top p: 70.195 ± 0.121 % | |
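
As a rough illustration of what the three groups of statistics above measure, here is a minimal Python sketch. It assumes the conventional definitions: KLD is the KL divergence from the base model's token distribution to the quantized model's, Δp is the change in probability assigned to the correct token, and "same top p" is the share of positions where both models rank the same token first. The helper names `softmax` and `token_stats` are hypothetical, and the tool that produced this report may differ in details. The last lines only check that the reported perplexity figures agree with each other.

```python
import math

def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def token_stats(base_logits, q_logits, correct):
    """Compare the two models at one token position (hypothetical helper).

    Returns (KL(base || Q), delta_p of the correct token, top-1 agreement).
    """
    p = softmax(base_logits)  # base model distribution
    q = softmax(q_logits)     # quantized model distribution
    kld = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    delta_p = q[correct] - p[correct]
    same_top = p.index(max(p)) == q.index(max(q))
    return kld, delta_p, same_top

# Consistency check on the reported means: the log of the ratio of the two
# mean perplexities should be close to the reported mean log ratio (0.207560).
ppl_q, ppl_base = 17.240743, 14.009216
log_ratio = math.log(ppl_q / ppl_base)
```

Averaging `kld` and `delta_p` over all evaluated positions, and taking percentiles of the same per-position values, would give figures of the kind listed in the report.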