Gemma 3 4B Instruct Quantized Models

This repo offers quantized versions of google/gemma-3-4b-it for use with llama.cpp. Quantization was performed with an unofficial Docker image and calibrated on 100 rows of the agentlans/LinguaNova dataset to preserve coherence and multilingual support. The importance matrix file is included in the repo.
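For reference, an importance matrix like the one included here is typically produced and consumed with the llama.cpp tools. The sketch below uses placeholder file names (they are not the exact files in this repo) and assumes the standard `llama-imatrix` and `llama-quantize` binaries:

```shell
# Build an importance matrix from calibration text (file names are placeholders)
./llama-imatrix -m gemma-3-4b-it-f16.gguf -f calibration.txt -o imatrix.dat

# Quantize to 4-bit, guided by the importance matrix
./llama-quantize --imatrix imatrix.dat gemma-3-4b-it-f16.gguf gemma-3-4b-it-Q4_K_M.gguf Q4_K_M
```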

Limitations

  • Optimized for multilingual natural language tasks.
  • May underperform on math, coding, and untested multimodal features.
  • Shares all limitations and biases of the original Gemma 3 models.

Notes

  • Ideal for resource-constrained environments.
  • Test on your data for best results.
  • See the original google/gemma-3-4b-it page for full details and guidelines.

This card covers only the quantized models.
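A minimal way to try one of the quantized files with llama.cpp's CLI (the GGUF file name below is a placeholder for whichever quantization you download):

```shell
# Quick interactive test of a downloaded quant (placeholder file name)
./llama-cli -m gemma-3-4b-it-Q4_K_M.gguf -p "Translate 'good morning' into French." -n 64
```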

Model details

  • Format: GGUF
  • Model size: 3.88B params
  • Architecture: gemma3
  • Available quantizations: 4-bit, 5-bit, 8-bit
