# EXL3 quants of Qwen3-30B-A3B

- 2.25 bits per weight
- 3.00 bits per weight
- 4.00 bits per weight
- 5.00 bits per weight
- 6.00 bits per weight

While I work out a way to meaningfully measure perplexity for such a sparse model, here are some other tests:

| Model | HumanEval pass@1 | KL divergence vs. FP16 (wiki2, 20k tokens) | Top-1 agreement vs. FP16 |
|----------|--------|--------|--------|
| 2.25 bpw | 88.41% | 0.1416 | 84.78% |
| 3.00 bpw | 89.63% | 0.0688 | 89.44% |
| 4.00 bpw | 92.07% | 0.0215 | 94.33% |
| 5.00 bpw | 93.29% | 0.0094 | 96.24% |
| 6.00 bpw | 92.68% | 0.0054 | 97.45% |
| FP16     | 91.46% | -      | -      |
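For reference, the KL-divergence and top-1-agreement columns can be computed from per-token logits collected on the same text with both the FP16 and quantized models. The sketch below is illustrative, not the exact evaluation script: the function names and the assumption that logits arrive as `(num_tokens, vocab_size)` arrays are mine.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the vocabulary axis
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl_and_top1(fp16_logits, quant_logits):
    """Mean per-token KL(FP16 || quant) and top-1 agreement rate.

    Both inputs are (num_tokens, vocab_size) float arrays of raw logits
    from the same token positions. Names and layout are assumptions.
    """
    p = softmax(fp16_logits)
    q = softmax(quant_logits)
    # Small epsilon guards against log(0) on near-zero probabilities
    kl = (p * (np.log(p + 1e-10) - np.log(q + 1e-10))).sum(axis=-1).mean()
    top1 = (fp16_logits.argmax(axis=-1) == quant_logits.argmax(axis=-1)).mean()
    return float(kl), float(top1)
```

A higher-bitrate quant should produce logits closer to FP16, which shows up as KL divergence falling toward 0 and top-1 agreement rising toward 100%, as in the table.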