EXL3 quants of ERNIE-4.5-300B-A47B-PT
2.00 bits per weight
2.10 bits per weight (optimized)
2.25 bits per weight (optimized)
2.50 bits per weight (optimized)
3.00 bits per weight
3.25 bits per weight (optimized)
4.00 bits per weight
Quant | Weights/VRAM³ | Perplexity | KL-div | MMLU |
---|---|---|---|---|
2.00 bpw | 70.2 GB | 7.4131 | 0.5283 | |
2.10 bpw | 73.4 GB | 6.7507 | 0.2202 | 83.40% ±1.13%¹ |
2.25 bpw | 78.6 GB | 6.5576 | 0.2074 | 83.70% ±1.13%¹ |
2.50 bpw | 87.8 GB | 6.3504 | 0.1899 | 83.96% |
3.00 bpw | 104.9 GB | 5.8913 | 0.1547 | 84.61% |
3.25 bpw | 113.3 GB | 5.8941 | 0.0806 | 86.80% ±1.03%¹ |
4.00 bpw | 139.5 GB | 5.8132 | 0.0717 | 86.50% ±1.04%¹ |
2.50 bpw⁴ CCQ | 87.4 GB | | | 82.58%² |
4.30 bpw⁴ CCQ | 147.3 GB | | | 86.16%² |
8.13 bpw⁴ CCQ | 279.3 GB | | | 86.50%² |
Original | 597.1 GB | 5.4131 | | 86.50%² |
¹ 1000 random samples, 95% CI
² From CCQ paper
³ Size of .safetensors files excluding embedding layer
⁴ Average from CCQ layer mix
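
As a rough aid for choosing among the EXL3 quants above, the sketch below picks the highest-bpw quant whose weights fit a given VRAM budget. The sizes are copied from the table. The function name `largest_quant_that_fits` and the `headroom_gb` default are illustrative assumptions, not part of any library. Because the listed sizes exclude the embedding layer (footnote ³) and do not account for the KV cache or activations, the headroom value is only a placeholder and should be adjusted to the actual context length and setup.

```python
# (bits per weight, weights size in GB), copied from the table above
EXL3_QUANTS = [
    (2.00, 70.2),
    (2.10, 73.4),
    (2.25, 78.6),
    (2.50, 87.8),
    (3.00, 104.9),
    (3.25, 113.3),
    (4.00, 139.5),
]


def largest_quant_that_fits(vram_gb, headroom_gb=8.0):
    """Return the (bpw, size_gb) pair with the highest bpw that fits, or None.

    headroom_gb is an assumed reserve for the embedding layer, KV cache and
    activations; it is a placeholder, not a measured value.
    """
    budget = vram_gb - headroom_gb
    fitting = [quant for quant in EXL3_QUANTS if quant[1] <= budget]
    # Tuples compare by bpw first, so max() returns the highest-bpw entry.
    return max(fitting, default=None)


if __name__ == "__main__":
    for total_vram in (80.0, 96.0, 144.0):
        print(total_vram, largest_quant_that_fits(total_vram))
```

Size alone does not settle the choice: the perplexity, KL-div and MMLU columns show how much quality each step down in bpw costs.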