Panchovix committed on
Commit 6a545ab · 1 Parent(s): c8255ae

Update README.md

Files changed (1): README.md (+7 −0)
README.md CHANGED
@@ -1,3 +1,10 @@
  ---
  license: apache-2.0
  ---
+ This is a 4-bit GPTQ version of [Guanaco-65B](https://huggingface.co/timdettmers/guanaco-65b-merged).
+
+ It was quantized with GPTQ-for-LLaMA using group size 32 and act-order enabled, to keep perplexity as close as possible to the FP16 model.
+
+ It may not fit on a system with 2x24 GB VRAM cards when using GPTQ-for-LLaMA or AutoGPTQ at max context; it works fine on a single 48 GB VRAM card (RTX A6000).
+
+ It works fine on 2x24 GB VRAM cards when using exllama/exllama_HF at 2048 context.
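
For reference, below is a minimal loading sketch using AutoGPTQ, one of the loaders mentioned in the added README text. The repo id is a placeholder (substitute this model's actual Hugging Face id), and the prompt format is assumed from the base Guanaco model card.

```python
# Minimal AutoGPTQ loading sketch for a 4-bit GPTQ Guanaco-65B checkpoint.
# The repo id below is a placeholder; replace it with this model's actual repo id.
# A single 48 GB card (e.g. RTX A6000) is enough for full 2048-token context.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

repo_id = "your-username/guanaco-65B-GPTQ-4bit-32g-actorder"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(repo_id, use_fast=False)
model = AutoGPTQForCausalLM.from_quantized(
    repo_id,
    device="cuda:0",       # single 48 GB card; 2x24 GB setups may hit the limits noted above
    use_safetensors=True,  # assumes the quantized weights are stored as .safetensors
)

# Prompt format assumed from the base Guanaco model card.
prompt = "### Human: What is the Guanaco model?\n### Assistant:"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda:0")
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```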