bartowski committed
Commit 22d8d55 · verified · Parent: 78f43e2

Update README.md

Files changed (1): README.md (+10 -5)
README.md CHANGED
@@ -1,6 +1,6 @@
 ---
 quantized_by: bartowski
-pipeline_tag: text-generation
+pipeline_tag: image-text-to-text
 tags:
 - gemma3
 - gemma
@@ -9,14 +9,19 @@ license: gemma
 extra_gated_button_content: Acknowledge license
 base_model_relation: quantized
 extra_gated_heading: Access Gemma on Hugging Face
-extra_gated_prompt: To access Gemma on Hugging Face, you’re required to review and
-  agree to Google’s usage license. To do this, please ensure you’re logged in to Hugging
+extra_gated_prompt: >-
+  To access Gemma on Hugging Face, you’re required to review and agree to
+  Google’s usage license. To do this, please ensure you’re logged in to Hugging
   Face and click below. Requests are processed immediately.
 base_model: google/gemma-3-27b-it-qat-q4_0-unquantized
 ---
 
 ## Llamacpp imatrix Quantizations of gemma-3-27b-it-qat by google
 
+These are derived from the QAT (quantization-aware training) weights provided by Google.
+
+*ONLY* Q4_0 is expected to improve, but I figured that while I'm at it I might as well make the others and see what happens.
+
 Using <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a> release <a href="https://github.com/ggerganov/llama.cpp/releases/tag/b5147">b5147</a> for quantization.
 
 Original model: https://huggingface.co/google/gemma-3-27b-it-qat-q4_0-unquantized
@@ -53,7 +58,7 @@ Run them directly with [llama.cpp](https://github.com/ggerganov/llama.cpp), or a
 | [gemma-3-27b-it-qat-Q4_K_L.gguf](https://huggingface.co/bartowski/google_gemma-3-27b-it-qat-GGUF/blob/main/google_gemma-3-27b-it-qat-Q4_K_L.gguf) | Q4_K_L | 16.89GB | false | Uses Q8_0 for embed and output weights. Good quality, *recommended*. |
 | [gemma-3-27b-it-qat-Q4_K_M.gguf](https://huggingface.co/bartowski/google_gemma-3-27b-it-qat-GGUF/blob/main/google_gemma-3-27b-it-qat-Q4_K_M.gguf) | Q4_K_M | 16.55GB | false | Good quality, default size for most use cases, *recommended*. |
 | [gemma-3-27b-it-qat-Q4_K_S.gguf](https://huggingface.co/bartowski/google_gemma-3-27b-it-qat-GGUF/blob/main/google_gemma-3-27b-it-qat-Q4_K_S.gguf) | Q4_K_S | 15.67GB | false | Slightly lower quality with more space savings, *recommended*. |
-| [gemma-3-27b-it-qat-Q4_0.gguf](https://huggingface.co/bartowski/google_gemma-3-27b-it-qat-GGUF/blob/main/google_gemma-3-27b-it-qat-Q4_0.gguf) | Q4_0 | 15.62GB | false | Legacy format, offers online repacking for ARM and AVX CPU inference. |
+| [gemma-3-27b-it-qat-Q4_0.gguf](https://huggingface.co/bartowski/google_gemma-3-27b-it-qat-GGUF/blob/main/google_gemma-3-27b-it-qat-Q4_0.gguf) | Q4_0 | 15.62GB | false | Should be improved due to QAT; offers online repacking for ARM and AVX CPU inference. |
 | [gemma-3-27b-it-qat-IQ4_NL.gguf](https://huggingface.co/bartowski/google_gemma-3-27b-it-qat-GGUF/blob/main/google_gemma-3-27b-it-qat-IQ4_NL.gguf) | IQ4_NL | 15.57GB | false | Similar to IQ4_XS, but slightly larger. Offers online repacking for ARM CPU inference. |
 | [gemma-3-27b-it-qat-Q3_K_XL.gguf](https://huggingface.co/bartowski/google_gemma-3-27b-it-qat-GGUF/blob/main/google_gemma-3-27b-it-qat-Q3_K_XL.gguf) | Q3_K_XL | 14.88GB | false | Uses Q8_0 for embed and output weights. Lower quality but usable, good for low RAM availability. |
 | [gemma-3-27b-it-qat-IQ4_XS.gguf](https://huggingface.co/bartowski/google_gemma-3-27b-it-qat-GGUF/blob/main/google_gemma-3-27b-it-qat-IQ4_XS.gguf) | IQ4_XS | 14.77GB | false | Decent quality, smaller than Q4_K_S with similar performance, *recommended*. |
@@ -172,4 +177,4 @@ Thank you ZeroWw for the inspiration to experiment with embed/output.
 
 Thank you to LM Studio for sponsoring my work.
 
-Want to support my work? Visit my ko-fi page here: https://ko-fi.com/bartowski
+Want to support my work? Visit my ko-fi page here: https://ko-fi.com/bartowski
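
For context on the "imatrix" in the title: below is a minimal sketch, assuming llama.cpp's bundled `llama-imatrix` and `llama-quantize` tools, of how importance-matrix quants like these are generally produced. The file names and calibration text are placeholders, not the actual inputs used for this repo.

```bash
# Sketch only: placeholder file names; the actual calibration data used for
# this repo's importance matrix is not shown in this diff.

# 1. Compute an importance matrix from calibration text against the f16 GGUF.
llama-imatrix -m gemma-3-27b-it-qat-f16.gguf -f calibration.txt -o imatrix.dat

# 2. Quantize to a target type, guided by the importance matrix.
llama-quantize --imatrix imatrix.dat \
  gemma-3-27b-it-qat-f16.gguf google_gemma-3-27b-it-qat-Q4_K_M.gguf Q4_K_M
```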
 
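The table's hunk context notes these files run directly with [llama.cpp](https://github.com/ggerganov/llama.cpp) or any llama.cpp-based project. A minimal download-and-run sketch, assuming the llama.cpp binaries are on your PATH; the quant choice, prompt, and flags are illustrative:

```bash
# Fetch one quant from the repo (needs the Hugging Face CLI).
pip install -U "huggingface_hub[cli]"
huggingface-cli download bartowski/google_gemma-3-27b-it-qat-GGUF \
  --include "google_gemma-3-27b-it-qat-Q4_0.gguf" --local-dir ./

# Run it: -ngl 99 offloads all layers to the GPU if one is available,
# -c sets the context size.
llama-cli -m ./google_gemma-3-27b-it-qat-Q4_0.gguf -ngl 99 -c 4096 \
  -p "Explain quantization-aware training in one paragraph."
```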