---
base_model: google/gemma-2-9b-it
library_name: transformers
license: gemma
pipeline_tag: text-generation
tags:
  - conversational
  - llama-cpp
  - matrixportal
extra_gated_heading: Access Gemma on Hugging Face
extra_gated_prompt: >-
  To access Gemma on Hugging Face, you’re required to review and agree to
  Google’s usage license. To do this, please ensure you’re logged in to Hugging
  Face and click below. Requests are processed immediately.
extra_gated_button_content: Acknowledge license
---

# matrixportal/gemma-2-9b-it-GGUF

This model was converted to GGUF format from google/gemma-2-9b-it using llama.cpp, via ggml.ai's all-gguf-same-where space. Refer to the original model card for more details on the model.
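
A quick way to try one of these quantizations is through the llama-cpp-python bindings, which can fetch a GGUF file from this repo directly. The snippet below is a minimal sketch: the quantized filename (here assumed to be `gemma-2-9b-it-q4_k_m.gguf`) should be checked against this repo's file list.

```python
# Sketch: run a quantized file from this repo with llama-cpp-python
# (pip install llama-cpp-python huggingface-hub).
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="matrixportal/gemma-2-9b-it-GGUF",
    filename="gemma-2-9b-it-q4_k_m.gguf",  # assumed name of the Q4_K_M file
    n_ctx=4096,  # context window; Gemma 2 supports up to 8192 tokens
)

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain GGUF in one sentence."}]
)
print(response["choices"][0]["message"]["content"])
```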

## ✅ Quantized Models Download List

### 🔍 Recommended Quantizations

- ✨ General CPU Use: Q4_K_M (best balance of speed and quality)
- 📱 ARM Devices: Q4_0 (optimized for ARM CPUs)
- 🏆 Maximum Quality: Q8_0 (near-original quality)
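
If you'd rather download a file first and point your runtime at it, `huggingface_hub` can fetch a single quantization from this repo. This is a sketch: the filename is an assumption and should be checked against the repo's file list.

```python
# Sketch: download one quantization with huggingface_hub
# (pip install huggingface-hub). The filename is an assumption.
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(
    repo_id="matrixportal/gemma-2-9b-it-GGUF",
    filename="gemma-2-9b-it-q4_k_m.gguf",  # the Q4_K_M file recommended above
)
print(model_path)  # local path to the cached .gguf file
```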

### 📦 Full Quantization Options

| 🚀 Download | 🔢 Type | 📝 Notes |
|:---|:---|:---|
| Download | Q2_K | Basic quantization |
| Download | Q3_K_S | Small size |
| Download | Q3_K_M | Balanced quality |
| Download | Q3_K_L | Better quality |
| Download | Q4_0 | Fast on ARM |
| Download | Q4_K_S | Fast, recommended |
| Download | Q4_K_M | Best balance |
| Download | Q5_0 | Good quality |
| Download | Q5_K_S | Balanced |
| Download | Q5_K_M | High quality |
| Download | Q6_K | 🏆 Very good quality |
| Download | Q8_0 | Fast, best quality |
| Download | F16 | Maximum accuracy |

💡 **Tip:** Use F16 for maximum precision when quality is critical.
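
For a file you have already downloaded (such as the F16 variant mentioned in the tip), llama-cpp-python can also load a local path directly. This is a sketch: the path and offload settings below are placeholders.

```python
# Sketch: run a locally downloaded GGUF (e.g. the F16 file) with llama-cpp-python.
# model_path is a placeholder -- use the path returned by hf_hub_download above.
from llama_cpp import Llama

llm = Llama(
    model_path="gemma-2-9b-it-f16.gguf",  # assumed filename; F16 trades speed for precision
    n_ctx=8192,       # Gemma 2's maximum context length
    n_gpu_layers=-1,  # offload all layers if a GPU-enabled build is installed
)

out = llm("Q: What is quantization? A:", max_tokens=64)
print(out["choices"][0]["text"])
```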