LFM2-350M-Math-q5-hi-mlx
Comparative Analysis: LFM2-350M-Math Quantized Variants
| Model | arc_challenge | arc_easy | boolq | hellaswag | openbookqa | piqa | winogrande |
|---|---|---|---|---|---|---|---|
| LFM2-350M-Math-mxfp4 | 0.262 | 0.372 | 0.382 | 0.301 | 0.304 | 0.530 | 0.489 |
| LFM2-350M-Math-q5-hi | 0.265 | 0.367 | 0.379 | 0.307 | 0.312 | 0.532 | 0.490 |
| LFM2-350M-Math-q5 | 0.268 | 0.372 | 0.379 | 0.307 | 0.314 | 0.530 | 0.504 |
| LFM2-350M-Math-q6-hi | 0.270 | 0.365 | 0.379 | 0.307 | 0.318 | 0.532 | 0.504 |
| LFM2-350M-Math-q8-hi | 0.270 | 0.369 | 0.379 | 0.308 | 0.314 | 0.532 | 0.486 |
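For a quick, rough comparison of the variants, the snippet below computes an unweighted mean of the seven scores per variant. This is only an illustrative summary of the table above, not part of the original evaluation.

```python
# Rough, unweighted mean of the seven benchmark scores per variant.
# Purely illustrative; an unweighted average is a coarse summary.
scores = {
    "mxfp4": [0.262, 0.372, 0.382, 0.301, 0.304, 0.530, 0.489],
    "q5-hi": [0.265, 0.367, 0.379, 0.307, 0.312, 0.532, 0.490],
    "q5":    [0.268, 0.372, 0.379, 0.307, 0.314, 0.530, 0.504],
    "q6-hi": [0.270, 0.365, 0.379, 0.307, 0.318, 0.532, 0.504],
    "q8-hi": [0.270, 0.369, 0.379, 0.308, 0.314, 0.532, 0.486],
}

# Print variants from highest to lowest average score
for name, vals in sorted(scores.items(), key=lambda kv: -sum(kv[1]) / len(kv[1])):
    print(f"{name:6s} {sum(vals) / len(vals):.4f}")
```

On this crude average, q6-hi and q5 come out essentially tied at the top, which is consistent with the ranking given further down.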
The q5-hi quantization appears to offer the most balanced performance profile across all metrics, with slight advantages on some of the more complex tasks.
The winogrande metric shows the largest variation across variants, with the q5 and q6-hi versions outperforming mxfp4 and q8-hi.
All variants perform nearly identically on piqa, with scores in a narrow 0.530-0.532 range, indicating strong preservation of reasoning abilities across quantization levels.
Quantization level impacts simpler tasks more than complex ones:
- For basic pattern recognition (the ARC metrics), the higher-precision variants score slightly better
- For more complex task understanding, the differences between variants are much smaller
The math-specialized model shows notable advantages over general-purpose LFM2 variants at other sizes:
- On boolq, the LFM2-350M-Math variants score approximately 18.7% higher than the LFM2-1.2B model
- They are also more consistent across metrics, indicating better task specialization
Performance Ranking for Clear Selection:
- Top performer overall: LFM2-350M-Math-q6-hi (highest or tied-highest scores on the most metrics)
- Best for complex reasoning: LFM2-350M-Math-q8-hi (tied for the top piqa score)
- Best for simple pattern recognition: LFM2-350M-Math-q6-hi (tied for the best arc_challenge score)
- Most balanced: LFM2-350M-Math-q5-hi (consistently mid-range on every metric)
- Most resource-efficient: LFM2-350M-Math-mxfp4
Practical Implications for Deployment
For an organization needing a specialized math-reasoning model with a high performance-to-resource ratio:
- If memory constraints are the primary concern, LFM2-350M-Math-mxfp4 offers the best balance between size and output quality
- For applications requiring precise mathematical reasoning, LFM2-350M-Math-q8-hi delivers the strongest logical capabilities
- The difference between quantization variants is minimal for most math-relevant applications, suggesting that lower-precision variants may be sufficient
This specialized model shows how task-oriented fine-tuning can dramatically improve performance in specific domains compared to general-purpose models. The 350M parameter count makes it particularly suitable for edge-deployment scenarios while maintaining solid performance across the different quantization formats.
--Analyzed by Qwen3-Deckard-Large-Almost-Human-6B-qx86-hi
This model LFM2-350M-Math-q5-hi-mlx was converted to MLX format from LiquidAI/LFM2-350M-Math using mlx-lm version 0.28.1.
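A conversion along these lines can be reproduced with the mlx-lm converter. The sketch below is an assumption-laden reconstruction: the actual settings behind the q5-hi variant (in particular the quantization group size implied by the "hi" suffix) are not published here.

```python
from mlx_lm import convert

# Hypothetical re-creation of this conversion (not the exact command used).
# "q5" is taken to mean 5-bit weights; the "hi" suffix is assumed to mean a
# smaller quantization group size (32 instead of the default 64).
convert(
    hf_path="LiquidAI/LFM2-350M-Math",
    mlx_path="LFM2-350M-Math-q5-hi-mlx",
    quantize=True,
    q_bits=5,         # 5-bit weights, per the "q5" in the name
    q_group_size=32,  # assumed; the mlx-lm default is 64
)
```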
Use with mlx
```shell
pip install mlx-lm
```

```python
from mlx_lm import load, generate

# Load the quantized model and its tokenizer
model, tokenizer = load("LFM2-350M-Math-q5-hi-mlx")

prompt = "hello"

# Apply the chat template when the tokenizer provides one
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```
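Since this checkpoint is math-tuned, a more representative prompt is a short math question. The sketch below follows the same pattern as above; the prompt text and the max_tokens value are arbitrary choices, not recommendations from the model authors.

```python
# Example with a math-flavored prompt (illustrative only)
prompt = "What is the derivative of x^3 + 2x with respect to x?"

if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

# max_tokens bounds the length of the generated answer
response = generate(model, tokenizer, prompt=prompt, max_tokens=512, verbose=True)
```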