---
license: apache-2.0
---

This is a 4-bit GPTQ version of [Guanaco-65B](https://huggingface.co/timdettmers/guanaco-65b-merged).

It was quantized with GPTQ-for-LLaMA using group size 32 and act-order enabled, to keep perplexity as close as possible to the FP16 model.
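To illustrate what the group size parameter controls, here is a minimal sketch of plain 4-bit group-wise quantization in Python. This is illustrative only: real GPTQ additionally applies Hessian-based error correction when rounding, and act-order changes the order in which weight columns are quantized; the function names here are hypothetical.

```python
GROUP_SIZE = 32  # each group of 32 weights shares one scale/offset pair


def quantize_group(weights):
    """Asymmetric 4-bit quantization of one group: 16 levels (0..15)."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 15 or 1.0            # avoid div-by-zero for flat groups
    q = [round((w - lo) / scale) for w in weights]   # ints in 0..15
    dq = [lo + qi * scale for qi in q]               # dequantized values
    return q, dq


def quantize_row(row, group_size=GROUP_SIZE):
    """Quantize/dequantize a weight row group by group."""
    out = []
    for i in range(0, len(row), group_size):
        _, dq = quantize_group(row[i : i + group_size])
        out.extend(dq)
    return out
```

A smaller group size means each scale tracks a narrower local range of weights, which lowers quantization error, at the cost of storing one scale per 32 weights instead of, say, per 128.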

It may not fit on systems with 2x24 GB VRAM cards when using GPTQ-for-LLaMA or AutoGPTQ at maximum context. It works fine on a single 48 GB VRAM card (RTX A6000).

It also works fine with 2x24 GB VRAM cards when using exllama/exllama_HF at 2048 context.