---
pipeline_tag: text-generation
inference: false
license: apache-2.0
library_name: transformers
base_model: meta-llama/Llama-3.2-3B
tags:
- language
- aquif
- text-generation-inference
- math
- coding
- small
language:
- en
- de
- it
- pt
- fr
- hi
- es
- th
- zh
- ja
---

# aquif-3-mini
A high-performance 3.2B-parameter language model based on Meta's Llama 3.2 architecture, optimized for efficiency while maintaining strong capabilities across general knowledge, science, mathematics, coding, and multilingual tasks.
## Model Details

- **Base Model**: meta-llama/Llama-3.2-3B
- **Architecture**: Llama
- **Parameter Count**: 3.2 billion
- **Languages**: English, German, Italian, Portuguese, French, Hindi, Spanish, Thai, Chinese, Japanese
## Performance Benchmarks

### Detailed Benchmark Results

| Metric | aquif-3-mini (3.2B) | Llama 3.2 (3.2B) | Qwen3 (4B) | Gemma 3n E4B (8.4B) | SmolLM3 (3.1B) | Phi-4 mini (3.8B) | Granite 3.3 (2.5B) |
|---|---|---|---|---|---|---|---|
| MMLU (General Knowledge) | 67.5 | 63.4 | 67.0 | 64.9 | 59.5 | 67.3 | 55.9 |
| GPQA Diamond (Science) | 36.1 | 29.4 | 40.7 | 29.6 | 35.7 | 36.9 | 25.3 |
| AIME 2025 (Competition Math) | 9.6 | 0.3 | 17.1 | 11.6 | 9.3 | 10.0 | 2.5 |
| LiveCodeBench (Coding) | 15.4 | 8.3 | 23.3 | 14.6 | 15.2 | 12.6 | 9.4 |
| Global MMLU (Multilingual) | 58.0 | 46.8 | 65.1 | 53.1 | 53.5 | 49.3 | 49.7 |
| IFEval (Instruction Following) | 78.9 | 71.6 | 68.9 | 56.8 | 76.7 | 70.1 | 65.8 |
| BFCL Simple (Tool Calling) | 92.3 | 78.6 | 81.3 | 71.8 | 88.8 | 70.3 | 72.2 |
### Key Strengths

- **Exceptional Tool Calling**: Achieves 92.3% on the BFCL Simple benchmark, outperforming all comparison models (see the sketch after this list)
- **Strong Instruction Following**: 78.9% on IFEval, the best score in the comparison, demonstrating reliable adherence to complex instructions
- **Comprehensive Knowledge**: 67.5% on MMLU, matching or exceeding larger models
- **Solid Scientific Reasoning**: 36.1% on GPQA Diamond, competitive with similarly sized models
- **Multilingual Competency**: Supports 10 languages with competitive Global MMLU performance
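
Given the BFCL score above, tool calling is a likely headline use case. Below is a minimal sketch of how tool use typically works with transformers chat templates, assuming aquif-3-mini ships a chat template that accepts the `tools` argument (as the Llama 3.1/3.2 instruct templates do); `get_weather` is a hypothetical function used only for illustration.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

def get_weather(city: str) -> str:
    """Get the current weather for a city.

    Args:
        city: Name of the city to look up.
    """
    return "sunny, 24°C"  # hypothetical stub; the model only sees the schema

model_name = "aquiffoo/aquif-3-mini"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

messages = [{"role": "user", "content": "What's the weather in Lisbon right now?"}]

# The chat template renders the tool schema into the prompt; the model is
# expected to respond with a JSON tool call that your code parses and executes.
input_ids = tokenizer.apply_chat_template(
    messages,
    tools=[get_weather],
    add_generation_prompt=True,
    return_tensors="pt",
)
outputs = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```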
## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "aquiffoo/aquif-3-mini"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Generate text (do_sample=True is required for temperature to take effect)
inputs = tokenizer("Explain quantum computing:", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
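
For quick experiments, the high-level `pipeline` API wraps the tokenizer and model in a single call:

```python
from transformers import pipeline

generator = pipeline("text-generation", model="aquiffoo/aquif-3-mini")
result = generator("Explain quantum computing:", max_new_tokens=200, do_sample=True, temperature=0.7)
print(result[0]["generated_text"])
```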
## License

Apache 2.0

## Acknowledgements
We gratefully acknowledge:
- **Meta AI** for the foundational Llama 3.2 architecture and pre-trained weights
- **Hugging Face** for the transformers library and the model hosting platform that enables easy access and deployment
For questions, issues, or collaboration opportunities, please reach out through the Hugging Face model page.