codelion committed on
Commit
bef5be4
·
verified ·
1 Parent(s): 8aadccf

v0.1.0: 6-metric Capability Score + OptIQ vs U4 deltas

Files changed (1)
  1. README.md +28 -16
README.md CHANGED
@@ -17,7 +17,7 @@ tags:
17
 
18
  # mlx-community/gemma-4-e4b-it-OptiQ-4bit
19
 
20
- A 4-bit mixed-precision MLX quant produced by [mlx-optiq](https://mlx-optiq.com/), the sensitivity-aware quantization toolkit for Apple Silicon.
21
 
22
  A 4-bit mixed-precision MLX quant of [google/gemma-4-e4b-it](https://huggingface.co/google/gemma-4-e4b-it). Per-layer bit-widths come from a KL-divergence sensitivity pass on a [six-domain calibration mix](https://mlx-optiq.com/blog/calibration-mix) (prose · reasoning · code · agent · tool-call · constraint-bearing instructions). Sensitive layers go to 8-bit; robust ones stay at 4-bit. The on-disk size is within ~5 % of a stock uniform 4-bit MLX quant.
23
 
@@ -32,8 +32,9 @@ A 4-bit mixed-precision MLX quant of [google/gemma-4-e4b-it](https://huggingface
32
  | Group size | 64 |
33
  | Calibration mix | [six-domain mix](https://mlx-optiq.com/blog/calibration-mix) (40 samples × 6 domains) |
34
  | Reference for sensitivity | bf16 (auto-resolved; falls back to uniform-4-bit if bf16 doesn't fit) |
 
35
 
36
- We follow the same naming convention `llama.cpp` uses for Q4_K_M and similar mixed-precision quants: the "4-bit" label is for the predominant precision, not the weighted average. The mixed allocation is what lets this build beat stock uniform-4-bit at the same disk size. Benchmark deltas are below.
37
 
38
  ## Usage
39
 
@@ -60,24 +61,35 @@ For more (mixed-precision KV-cache serving, sensitivity-aware LoRA fine-tuning,
60
  pip install mlx-optiq
61
  ```
62
 
63
- See the [Gemma-4 family guide](https://mlx-optiq.com/docs/gemma-4) on [mlx-optiq.com](https://mlx-optiq.com/) for sampling defaults, training recipes, and family-specific caveats.
64
 
65
- ## Benchmarks
66
 
67
- Five-metric suite that drives the [Capability Score](https://mlx-optiq.com/blog/eval-framework):
68
 
69
- | Metric | Score |
70
- |---|---:|
71
- | MMLU (5-shot, 1000 samples) | 58.8% |
72
- | GSM8K (1000 samples, 3-shot CoT) | 77.8% |
73
- | IFEval (full set, strict) | 70.6% |
74
- | BFCL-V3 simple (200 single-turn calls) | 69.0% |
75
- | HumanEval (164 problems, pass@1) | 76.8% |
76
- | **Capability Score** (mean of the 5 benchmarks above) | **70.6** |
77
- | KL vs bf16 reference (mean / p95) | 0.2755 / 1.3460 |
78
- | On-disk size | 6.1 GB |
79
 
80
- The Capability Score is the simple unweighted mean of the five benchmarks. Every metric gets one equal vote. Disk size is reported next to it as an honest second axis instead of being folded into the score. See the [eval-framework writeup](https://mlx-optiq.com/blog/eval-framework) for the full methodology.
81
 
82
  ## Links
83
 
 
17
 
18
  # mlx-community/gemma-4-e4b-it-OptiQ-4bit
19
 
20
+ A 4-bit mixed-precision MLX quant produced by [mlx-optiq](https://mlx-optiq.com/), the sensitivity-aware quantization toolkit for Apple Silicon. Beats stock uniform 4-bit on every benchmark in the six-metric Capability Score.
21
 
22
  A 4-bit mixed-precision MLX quant of [google/gemma-4-e4b-it](https://huggingface.co/google/gemma-4-e4b-it). Per-layer bit-widths come from a KL-divergence sensitivity pass on a [six-domain calibration mix](https://mlx-optiq.com/blog/calibration-mix) (prose · reasoning · code · agent · tool-call · constraint-bearing instructions). Sensitive layers go to 8-bit; robust ones stay at 4-bit. The on-disk size is within ~5 % of a stock uniform 4-bit MLX quant.
23
 
 
32
  | Group size | 64 |
33
  | Calibration mix | [six-domain mix](https://mlx-optiq.com/blog/calibration-mix) (40 samples × 6 domains) |
34
  | Reference for sensitivity | bf16 (auto-resolved; falls back to uniform-4-bit if bf16 doesn't fit) |
35
+ | Speculative drafter | served with [`mlx-community/gemma-4-e4b-it-assistant-bf16`](https://huggingface.co/mlx-community/gemma-4-e4b-it-assistant-bf16) via `optiq serve --drafter` |
36
 
37
+ We follow the same naming convention `llama.cpp` uses for Q4_K_M and similar mixed-precision quants: the "4-bit" label is for the predominant precision, not the weighted average. The mixed allocation is what lets this build beat stock uniform-4-bit on every benchmark below at the same disk size.
38
 
39
  ## Usage
40
 
 
61
  pip install mlx-optiq
62
  ```
63
 
64
+ ### Speculative decoding (assistant drafter)
65
+
66
+ Gemma-4 ships a separate small drafter for speculative decoding. Pair this quant with [`mlx-community/gemma-4-e4b-it-assistant-bf16`](https://huggingface.co/mlx-community/gemma-4-e4b-it-assistant-bf16) for faster decode:
67
+
68
+ ```bash
69
+ optiq serve --model mlx-community/gemma-4-e4b-it-OptiQ-4bit \
70
+ --drafter mlx-community/gemma-4-e4b-it-assistant-bf16
71
+ ```
72
 
 
73
 
74
+ See the [Gemma-4 family guide](https://mlx-optiq.com/docs/gemma-4) on [mlx-optiq.com](https://mlx-optiq.com/) for sampling defaults, training recipes, and family-specific caveats.
75
 
76
+ ## Benchmarks
77
 
78
+ Six-metric Capability Score (mean of MMLU + GSM8K + IFEval + BFCL + HumanEval + HashHop). Apples-to-apples comparison against stock uniform 4-bit:
79
+
80
+ | Metric | OptiQ | Uniform 4-bit | Δ |
81
+ |---|---:|---:|---:|
82
+ | MMLU (5-shot, 1000 samples) | **58.8%** | 52.9% | +5.9 |
83
+ | GSM8K (1000 samples, 3-shot CoT) | **77.8%** | 46.1% | +31.7 |
84
+ | IFEval (full set, strict) | **70.6%** | 68.6% | +2.0 |
85
+ | BFCL-V3 simple (200 calls) | **69.0%** | 67.5% | +1.5 |
86
+ | HumanEval (164 problems, pass@1) | **76.8%** | 58.5% | +18.3 |
87
+ | HashHop (long-context retrieval) | **42.0%** | 20.0% | +22.0 |
88
+ | **Capability Score** (mean of 6) | **65.84** | 52.28 | **+13.57** |
89
+ | KL vs bf16 reference (mean / p95) | 0.2755 / 1.3460 | — | — |
90
+ | On-disk size | 6.1 GB | 4.9 GB | +1.2 GB |
91
+
92
+ Every metric gets one equal vote. Disk size is reported next to the score as an honest second axis instead of being folded into the score. See the [eval-framework writeup](https://mlx-optiq.com/blog/eval-framework) for the full methodology.
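
Reviewer sketch (not part of the commit): the unweighted mean described above can be recomputed from the table as a quick sanity check. The dictionary names and the `capability_score` helper are hypothetical, introduced only for illustration, and the inputs are the rounded percentages shown in the table. With those rounded inputs the means come out at roughly 65.83 and 52.27, one hundredth below the reported 65.84 and 52.28; a plausible explanation, which is an assumption here, is that the reported scores were averaged from unrounded per-benchmark results.

```python
from statistics import mean

# Rounded percentages copied from the benchmark table above.
optiq = {"MMLU": 58.8, "GSM8K": 77.8, "IFEval": 70.6,
         "BFCL": 69.0, "HumanEval": 76.8, "HashHop": 42.0}
uniform_4bit = {"MMLU": 52.9, "GSM8K": 46.1, "IFEval": 68.6,
                "BFCL": 67.5, "HumanEval": 58.5, "HashHop": 20.0}


def capability_score(scores):
    """Unweighted mean: every benchmark gets one equal vote."""
    return mean(scores.values())


# Per-benchmark deltas in percentage points, rounded like the table.
deltas = {name: round(optiq[name] - uniform_4bit[name], 1) for name in optiq}

optiq_score = capability_score(optiq)
uniform_score = capability_score(uniform_4bit)

# Relative on-disk overhead implied by the size row (6.1 GB vs 4.9 GB).
size_overhead = 6.1 / 4.9 - 1

print(f"OptiQ: {optiq_score:.2f}  uniform 4-bit: {uniform_score:.2f}  "
      f"delta: {optiq_score - uniform_score:+.2f}")
print(f"size overhead: {size_overhead:.1%}")
```

Disk size stays outside the mean, matching the second-axis treatment in the paragraph above.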
93
 
94
  ## Links
95