---
license: apache-2.0
---

This is a 4-bit GPTQ version of [Guanaco-65B](https://huggingface.co/timdettmers/guanaco-65b-merged).

It was quantized with GPTQ-for-LLaMA, using group size 32 with act-order enabled, to keep perplexity as close as possible to the FP16 model.
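
For reference, those settings map roughly onto AutoGPTQ's quantization config as in the sketch below. This is only illustrative, not how this repo was actually produced (that was GPTQ-for-LLaMA); the output directory name and the single calibration sentence are placeholders.

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

fp16_model = "timdettmers/guanaco-65b-merged"   # FP16 source model

quantize_config = BaseQuantizeConfig(
    bits=4,         # 4-bit weights
    group_size=32,  # group size 32
    desc_act=True,  # act-order ("act order true")
)

tokenizer = AutoTokenizer.from_pretrained(fp16_model, use_fast=False)
model = AutoGPTQForCausalLM.from_pretrained(fp16_model, quantize_config)

# A real run needs a proper set of tokenized calibration examples;
# this single sentence is only a placeholder.
examples = [tokenizer("The quick brown fox jumps over the lazy dog.")]

model.quantize(examples)
model.save_quantized("guanaco-65b-gptq-4bit-32g", use_safetensors=True)
```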

It may have trouble fitting on systems with 2x24 GB VRAM cards when using GPTQ-for-LLaMA or AutoGPTQ at maximum context. It works fine on a single 48 GB VRAM card (RTX A6000).
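
A minimal AutoGPTQ loading sketch for the single-card setup is below. The local directory name is a placeholder for wherever you downloaded this repo, and older AutoGPTQ releases may also need a `model_basename` argument.

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

model_dir = "guanaco-65B-gptq-4bit"  # placeholder: local copy of this repo

tokenizer = AutoTokenizer.from_pretrained(model_dir, use_fast=False)
model = AutoGPTQForCausalLM.from_quantized(
    model_dir,
    device="cuda:0",        # single 48 GB card, e.g. RTX A6000
    use_safetensors=True,   # adjust if the weights are not stored as safetensors
)

inputs = tokenizer("The capital of France is", return_tensors="pt").to("cuda:0")
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```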

It works fine with 2x24 GB VRAM cards when using exllama/exllama_HF at 2048 context.
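
A rough ExLlama sketch for the two-card case follows. Everything here is an assumption: the imports require running from a clone of the exllama repo, the directory name is a placeholder, and the 17/24 GB split is illustrative rather than tuned.

```python
import glob
import os

# Assumes this script runs from a clone of https://github.com/turboderp/exllama
from model import ExLlama, ExLlamaCache, ExLlamaConfig
from tokenizer import ExLlamaTokenizer
from generator import ExLlamaGenerator

model_dir = "guanaco-65B-gptq-4bit"  # placeholder: local copy of this repo

config = ExLlamaConfig(os.path.join(model_dir, "config.json"))
config.model_path = glob.glob(os.path.join(model_dir, "*.safetensors"))[0]
config.max_seq_len = 2048      # the 2048 context mentioned above
config.set_auto_map("17,24")   # illustrative VRAM split (GB) across the two 24 GB cards

model = ExLlama(config)
tokenizer = ExLlamaTokenizer(os.path.join(model_dir, "tokenizer.model"))
cache = ExLlamaCache(model)
generator = ExLlamaGenerator(model, tokenizer, cache)

print(generator.generate_simple("The capital of France is", max_new_tokens=20))
```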