EXL3 quants of Qwen3-30B-A3B
- 2.25 bits per weight
- 3.00 bits per weight
- 4.00 bits per weight
- 5.00 bits per weight
- 6.00 bits per weight
While I work out a way to meaningfully measure perplexity for such a sparse MoE model, here are some other tests:
| Model | HumanEval pass@1 | KL divergence vs. FP16 (wiki2, 20k tokens) | Top-1 agreement vs. FP16 |
|---|---|---|---|
| 2.25 bpw | 88.41% | 0.1416 | 84.78% |
| 3.00 bpw | 89.63% | 0.0688 | 89.44% |
| 4.00 bpw | 92.07% | 0.0215 | 94.33% |
| 5.00 bpw | 93.29% | 0.0094 | 96.24% |
| 6.00 bpw | 92.68% | 0.0054 | 97.45% |
| FP16 | 91.46% | - | - |
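
For context on the last two columns: metrics of this kind are typically computed by running the FP16 reference and the quantized model over the same token stream, collecting per-position logits from both, and comparing them. Below is a minimal PyTorch sketch of that idea; the function name and tensor shapes are illustrative assumptions, not part of the EXL3/exllamav3 API, and the exact measurement used for the table above may differ in detail.

```python
import torch
import torch.nn.functional as F

def compare_to_reference(ref_logits: torch.Tensor, quant_logits: torch.Tensor):
    """Compare quantized-model logits against an FP16 reference.

    Both tensors are assumed to be [num_tokens, vocab_size] logits
    collected over the same evaluation text (e.g. ~20k wiki2 tokens).
    Returns (mean KL divergence, top-1 agreement rate).
    """
    ref_logprobs = F.log_softmax(ref_logits.float(), dim=-1)
    quant_logprobs = F.log_softmax(quant_logits.float(), dim=-1)

    # KL(ref || quant) per token position, averaged over the stream
    kl = (ref_logprobs.exp() * (ref_logprobs - quant_logprobs)).sum(dim=-1).mean()

    # Fraction of positions where both models pick the same top token
    agreement = (ref_logits.argmax(dim=-1) == quant_logits.argmax(dim=-1)).float().mean()

    return kl.item(), agreement.item()
```

Read this way, the table shows the expected trend: KL divergence shrinks and top-1 agreement climbs monotonically with bitrate, while HumanEval pass@1 is noisier (the 5.00 bpw quant happens to edge out FP16 on that benchmark).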