Update README.md
README.md
CHANGED
@@ -62,3 +62,5 @@ TODO:
 
 **INT_N isn't the equivalent or a match for fair comparison. It is 16.3% faster and 13% smaller in this scenario.**
 - [Issue Link](https://github.com/microsoft/T-MAC/issues/79)
+
+AutoGPTQ is used; by default it uses a group size of 128, which gives it fewer bits per weight (bpw) and a smaller file than llama.cpp. See https://qwen.readthedocs.io/en/latest/quantization/gptq.html
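
The group-size point in the added README line comes down to simple arithmetic, which the following minimal sketch tries to illustrate. It is not taken from the commit: the helper `bits_per_weight` is hypothetical, and the per-group layouts (4-bit weights with an fp16 scale plus a 4-bit zero point per group of 128 for GPTQ, and 4-bit weights with one fp16 scale per block of 32 for llama.cpp's Q4_0) are assumptions chosen for illustration. The actual bit width and llama.cpp quantization type used in the comparison are not stated in the README.

```python
def bits_per_weight(weight_bits: int, group_size: int, metadata_bits: int) -> float:
    """Effective bits per weight when each group shares `metadata_bits` of overhead."""
    return weight_bits + metadata_bits / group_size


# Assumption: GPTQ with 4-bit weights, group size 128,
# and an fp16 scale (16 bits) plus a 4-bit zero point per group.
gptq_bpw = bits_per_weight(weight_bits=4, group_size=128, metadata_bits=16 + 4)

# Assumption: llama.cpp Q4_0 with 4-bit weights, blocks of 32,
# and one fp16 scale (16 bits) per block.
q4_0_bpw = bits_per_weight(weight_bits=4, group_size=32, metadata_bits=16)

print(f"GPTQ, group size 128: {gptq_bpw:.3f} bpw")
print(f"llama.cpp Q4_0:       {q4_0_bpw:.3f} bpw")
print(f"size ratio (GPTQ / Q4_0): {gptq_bpw / q4_0_bpw:.3f}")
```

Under these assumptions the larger group amortizes the per-group metadata over more weights, so the quantized tensors come out smaller even though both formats store 4-bit weights. Layers kept in higher precision (embeddings, output head) are ignored here, so real file sizes will differ from this ratio.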