bartowski committed
Commit 22d8d55 · verified · Parent: 78f43e2

Update README.md

Files changed (1): README.md (+10 -5)
README.md CHANGED
@@ -1,6 +1,6 @@
 ---
 quantized_by: bartowski
-pipeline_tag: text-generation
+pipeline_tag: image-text-to-text
 tags:
 - gemma3
 - gemma
@@ -9,14 +9,19 @@ license: gemma
 extra_gated_button_content: Acknowledge license
 base_model_relation: quantized
 extra_gated_heading: Access Gemma on Hugging Face
-extra_gated_prompt: To access Gemma on Hugging Face, you’re required to review and
-  agree to Google’s usage license. To do this, please ensure you’re logged in to Hugging
+extra_gated_prompt: >-
+  To access Gemma on Hugging Face, you’re required to review and agree to
+  Google’s usage license. To do this, please ensure you’re logged in to Hugging
   Face and click below. Requests are processed immediately.
 base_model: google/gemma-3-27b-it-qat-q4_0-unquantized
 ---
 
 ## Llamacpp imatrix Quantizations of gemma-3-27b-it-qat by google
 
+These are derived from the QAT (quantization-aware training) weights provided by Google.
+
+*ONLY* Q4_0 is expected to improve, but I figured that while I'm at it I might as well make the others and see what happens.
+
 Using <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a> release <a href="https://github.com/ggerganov/llama.cpp/releases/tag/b5147">b5147</a> for quantization.
 
 Original model: https://huggingface.co/google/gemma-3-27b-it-qat-q4_0-unquantized
@@ -53,7 +58,7 @@ Run them directly with [llama.cpp](https://github.com/ggerganov/llama.cpp), or a
 | [gemma-3-27b-it-qat-Q4_K_L.gguf](https://huggingface.co/bartowski/google_gemma-3-27b-it-qat-GGUF/blob/main/google_gemma-3-27b-it-qat-Q4_K_L.gguf) | Q4_K_L | 16.89GB | false | Uses Q8_0 for embed and output weights. Good quality, *recommended*. |
 | [gemma-3-27b-it-qat-Q4_K_M.gguf](https://huggingface.co/bartowski/google_gemma-3-27b-it-qat-GGUF/blob/main/google_gemma-3-27b-it-qat-Q4_K_M.gguf) | Q4_K_M | 16.55GB | false | Good quality, default size for most use cases, *recommended*. |
 | [gemma-3-27b-it-qat-Q4_K_S.gguf](https://huggingface.co/bartowski/google_gemma-3-27b-it-qat-GGUF/blob/main/google_gemma-3-27b-it-qat-Q4_K_S.gguf) | Q4_K_S | 15.67GB | false | Slightly lower quality with more space savings, *recommended*. |
-| [gemma-3-27b-it-qat-Q4_0.gguf](https://huggingface.co/bartowski/google_gemma-3-27b-it-qat-GGUF/blob/main/google_gemma-3-27b-it-qat-Q4_0.gguf) | Q4_0 | 15.62GB | false | Legacy format, offers online repacking for ARM and AVX CPU inference. |
+| [gemma-3-27b-it-qat-Q4_0.gguf](https://huggingface.co/bartowski/google_gemma-3-27b-it-qat-GGUF/blob/main/google_gemma-3-27b-it-qat-Q4_0.gguf) | Q4_0 | 15.62GB | false | Should be improved due to QAT; offers online repacking for ARM and AVX CPU inference. |
 | [gemma-3-27b-it-qat-IQ4_NL.gguf](https://huggingface.co/bartowski/google_gemma-3-27b-it-qat-GGUF/blob/main/google_gemma-3-27b-it-qat-IQ4_NL.gguf) | IQ4_NL | 15.57GB | false | Similar to IQ4_XS, but slightly larger. Offers online repacking for ARM CPU inference. |
 | [gemma-3-27b-it-qat-Q3_K_XL.gguf](https://huggingface.co/bartowski/google_gemma-3-27b-it-qat-GGUF/blob/main/google_gemma-3-27b-it-qat-Q3_K_XL.gguf) | Q3_K_XL | 14.88GB | false | Uses Q8_0 for embed and output weights. Lower quality but usable, good for low RAM availability. |
 | [gemma-3-27b-it-qat-IQ4_XS.gguf](https://huggingface.co/bartowski/google_gemma-3-27b-it-qat-GGUF/blob/main/google_gemma-3-27b-it-qat-IQ4_XS.gguf) | IQ4_XS | 14.77GB | false | Decent quality, smaller than Q4_K_S with similar performance, *recommended*. |
@@ -172,4 +177,4 @@ Thank you ZeroWw for the inspiration to experiment with embed/output.
 
 Thank you to LM Studio for sponsoring my work.
 
-Want to support my work? Visit my ko-fi page here: https://ko-fi.com/bartowski
+Want to support my work? Visit my ko-fi page here: https://ko-fi.com/bartowski
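
For context on the "imatrix" in the title: below is a minimal sketch, assuming llama.cpp's bundled `llama-imatrix` and `llama-quantize` tools, of how importance-matrix quants like these are generally produced. The file names and calibration text are placeholders, not the actual inputs used for this repo.

```bash
# Sketch only: placeholder file names; the actual calibration data used for
# this repo's importance matrix is not shown in this diff.

# 1. Compute an importance matrix from calibration text against the f16 GGUF.
llama-imatrix -m gemma-3-27b-it-qat-f16.gguf -f calibration.txt -o imatrix.dat

# 2. Quantize to a target type, guided by the importance matrix.
llama-quantize --imatrix imatrix.dat \
  gemma-3-27b-it-qat-f16.gguf google_gemma-3-27b-it-qat-Q4_K_M.gguf Q4_K_M
```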
 
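The table's hunk context notes these files run directly with [llama.cpp](https://github.com/ggerganov/llama.cpp) or any llama.cpp-based project. A minimal download-and-run sketch, assuming the llama.cpp binaries are on your PATH; the quant choice, prompt, and flags are illustrative:

```bash
# Fetch one quant from the repo (needs the Hugging Face CLI).
pip install -U "huggingface_hub[cli]"
huggingface-cli download bartowski/google_gemma-3-27b-it-qat-GGUF \
  --include "google_gemma-3-27b-it-qat-Q4_0.gguf" --local-dir ./

# Run it: -ngl 99 offloads all layers to the GPU if one is available,
# -c sets the context size.
llama-cli -m ./google_gemma-3-27b-it-qat-Q4_0.gguf -ngl 99 -c 4096 \
  -p "Explain quantization-aware training in one paragraph."
```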