Update README.md
README.md CHANGED
@@ -5,7 +5,7 @@ metrics:
 base_model:
 - google/gemma-3-1b-it-qat-q4_0-gguf
 ---
-This is a requantized version of https://huggingface.co/google/gemma-3-1b-it-qat-q4_0-
+This is a requantized version of https://huggingface.co/google/gemma-3-1b-it-qat-q4_0-gguf.
 
 The official QAT weights released by google use fp16 (instead of Q6_K) for the embeddings table, which makes this model take a significant extra amount of memory (and storage) compared to what Q4_0 quants are supposed to take.
 ~~Instead of quantizing the table myself, I extracted it from Bartowski's quantized models, because those were already calibrated with imatrix, which should squeeze some extra performance out of it.~~
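The size claim in the README (fp16 vs Q6_K embeddings) is easy to check yourself: the `gguf` Python package that ships with llama.cpp can read a model's tensor metadata without loading the weights. A minimal sketch, assuming a local copy of the model; the file path is a placeholder, not a name from this repo:

```python
# Minimal sketch: report the quantization type and size of the embeddings
# table in a GGUF file, using the `gguf` package (pip install gguf).
from gguf import GGUFReader

reader = GGUFReader("gemma-3-1b-it-qat-q4_0.gguf")  # hypothetical local path

for tensor in reader.tensors:
    # `token_embd.weight` is the embeddings table; in the official QAT
    # release it reads as F16, while typical Q4_0 quants use Q6_K for it.
    if tensor.name == "token_embd.weight":
        print(tensor.name, tensor.tensor_type.name, tensor.n_bytes, "bytes")
```

To reproduce a requantization like this one, a reasonably recent build of llama.cpp's `llama-quantize` tool accepts `--allow-requantize` and a `--token-embedding-type` override (e.g. `q6_k`), which target exactly this tensor.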