EXL3 quants of ERNIE-4.5-300B-A47B-PT

2.00 bits per weight
2.10 bits per weight (optimized)
2.25 bits per weight (optimized)
2.50 bits per weight (optimized)
3.00 bits per weight
3.25 bits per weight (optimized)
4.00 bits per weight

| Quant | Weights/VRAM³ | Perplexity | KL-div | MMLU |
|---|---|---|---|---|
| 2.00 bpw | 70.2 GB | 7.4131 | 0.5283 | |
| 2.10 bpw | 73.4 GB | 6.7507 | 0.2202 | 83.40% ±1.13%¹ |
| 2.25 bpw | 78.6 GB | 6.5576 | 0.2074 | 83.70% ±1.13%¹ |
| 2.50 bpw | 87.8 GB | 6.3504 | 0.1899 | 83.96% |
| 3.00 bpw | 104.9 GB | 5.8913 | 0.1547 | 84.61% |
| 3.25 bpw | 113.3 GB | 5.8941 | 0.0806 | 86.80% ±1.03%¹ |
| 4.00 bpw | 139.5 GB | 5.8132 | 0.0717 | 86.50% ±1.04%¹ |
| 2.50 bpw⁴ CCQ | 87.4 GB | | | 82.58%² |
| 4.30 bpw⁴ CCQ | 147.3 GB | | | 86.16%² |
| 8.13 bpw⁴ CCQ | 279.3 GB | | | 86.50%² |
| Original | 597.1 GB | 5.4131 | | 86.50%² |

¹ 1000 random samples, 95% CI
² From CCQ paper
³ Size of .safetensors files excluding embedding layer
⁴ Average from CCQ layer mix
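
As a rough aid for choosing a quant, the sketch below picks the largest EXL3 quant whose weight size from the table fits within a given VRAM budget. The sizes are copied from the table and exclude the embedding layer (footnote ³). The `reserve_gb` parameter and its default are hypothetical placeholders for cache, activations and other overhead, not measured values, so treat the result as a starting point only.

```python
# Weight sizes in GB, copied from the table above (embedding layer excluded).
QUANT_SIZES_GB = {
    "2.00 bpw": 70.2,
    "2.10 bpw": 73.4,
    "2.25 bpw": 78.6,
    "2.50 bpw": 87.8,
    "3.00 bpw": 104.9,
    "3.25 bpw": 113.3,
    "4.00 bpw": 139.5,
}


def largest_fitting_quant(vram_gb, reserve_gb=8.0):
    """Return the largest quant whose weights fit in vram_gb minus reserve_gb.

    reserve_gb is an assumed allowance for cache and runtime overhead;
    adjust it for your context length and setup. Returns None if nothing fits.
    """
    budget = vram_gb - reserve_gb
    fitting = {name: size for name, size in QUANT_SIZES_GB.items() if size <= budget}
    if not fitting:
        return None
    # Pick the entry with the largest weight size among those that fit.
    return max(fitting, key=fitting.get)


if __name__ == "__main__":
    for total in (80.0, 96.0, 192.0):
        print(total, "GB ->", largest_fitting_quant(total))
```

The function only compares weight sizes against the budget; actual memory use also depends on context length and cache settings.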
