# Gemma 3 4B Instruct Quantized Models

This repo offers quantized versions of google/gemma-3-4b-it for use with llama.cpp. Quantization was performed with an unofficial Docker image, using an importance matrix calibrated on 100 rows of the agentlans/LinguaNova dataset to preserve coherence and multilingual ability. The importance matrix file is included in this repo.
## Limitations
- Optimized for multilingual natural language tasks.
- May underperform on math and coding; multimodal features were not tested after quantization.
- Shares all limitations and biases of the original Gemma 3 models.
## Notes
- Ideal for resource-constrained environments.
- Test on your data for best results.
- See the original google/gemma-3-4b-it page for full details and usage guidelines; this card covers only the quantized models.
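Since these files target llama.cpp, a typical local invocation looks like the sketch below. The filename `gemma-3-4b-it-Q4_K_M.gguf` is illustrative only; substitute the actual GGUF file you downloaded from this repo.

```shell
# Minimal llama.cpp usage sketch (hypothetical filename).
# -m: path to the quantized GGUF model
# -p: prompt text
# -n: maximum number of tokens to generate
./llama-cli \
  -m gemma-3-4b-it-Q4_K_M.gguf \
  -p "Translate to French: Hello, world." \
  -n 64
```

Lower-bit quantizations trade some output quality for a smaller memory footprint, so test each level on your own data before committing to one.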
## Available quantization levels
- 4-bit
- 5-bit
- 8-bit