bartowski committed · verified
Commit 4a99534 · 1 parent: 22d8d55

Update README.md

Files changed (1):
  1. README.md (+2, -0)
README.md CHANGED
@@ -22,6 +22,8 @@ These are derived from the QAT (quantized aware training) weights provided by Go
 
 *ONLY* Q4_0 is expected to be better, but figured while I'm at it I might as well make others to see what happens?
 
+[gemma-3-27b-it-qat-Q4_0.gguf](https://huggingface.co/bartowski/google_gemma-3-27b-it-qat-GGUF/blob/main/google_gemma-3-27b-it-qat-Q4_0.gguf) | Q4_0 | 15.62GB | false | Should be improved due to QAT, offers online repacking for ARM and AVX CPU inference.
+
 Using <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a> release <a href="https://github.com/ggerganov/llama.cpp/releases/tag/b5147">b5147</a> for quantization.
 
 Original model: https://huggingface.co/google/gemma-3-27b-it-qat-q4_0-unquantized