stduhpf committed on
Commit 5c52406 · verified · 1 Parent(s): 2d97c1e

Update README.md

Files changed (1):
  1. README.md +1 -1
README.md CHANGED
@@ -5,7 +5,7 @@ metrics:
 base_model:
 - google/gemma-3-1b-it-qat-q4_0-gguf
 ---
-This is a requantized version of https://huggingface.co/google/gemma-3-1b-it-qat-q4_0-gguff.
+This is a requantized version of https://huggingface.co/google/gemma-3-1b-it-qat-q4_0-gguf.
 
 The official QAT weights released by google use fp16 (instead of Q6_K) for the embeddings table, which makes this model take a significant extra amount of memory (and storage) compared to what Q4_0 quants are supposed to take.
 ~~Instead of quantizing the table myself, I extracted it from Bartowski's quantized models, because those were already calibrated with imatrix, which should squeeze some extra performance out of it.~~
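The README's point about the fp16 embedding table can be sanity-checked with a back-of-the-envelope size comparison. This is a minimal sketch, not part of the repo: the vocabulary and hidden sizes below are assumptions for Gemma 3 1B (262144 tokens, 1152-dim embeddings), and 6.5625 bits/weight is the effective rate of llama.cpp's Q6_K super-blocks (210 bytes per 256 weights); adjust both to the actual model config.

```python
# Hypothetical sizes for Gemma 3 1B's embedding table; not read from the model.
VOCAB_SIZE = 262_144
HIDDEN_SIZE = 1_152

BITS_FP16 = 16.0
BITS_Q6_K = 6.5625  # effective bits/weight of llama.cpp's Q6_K (210 B / 256 weights)

def table_bytes(bits_per_weight: float) -> float:
    """Size in bytes of a VOCAB_SIZE x HIDDEN_SIZE table at a given bit width."""
    return VOCAB_SIZE * HIDDEN_SIZE * bits_per_weight / 8

fp16_mib = table_bytes(BITS_FP16) / 1024**2
q6k_mib = table_bytes(BITS_Q6_K) / 1024**2
print(f"fp16: {fp16_mib:.0f} MiB, Q6_K: {q6k_mib:.0f} MiB, "
      f"saved: {fp16_mib - q6k_mib:.0f} MiB")
```

Under these assumed dimensions the fp16 table alone is several hundred MiB, which is why requantizing it to Q6_K noticeably shrinks a model whose other tensors are already Q4_0.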