---
license: apache-2.0
---

This is a 4-bit GPTQ version of [Guanaco-65B](https://huggingface.co/timdettmers/guanaco-65b-merged).

It was quantized with GPTQ-for-LLaMA using group size 32 and act-order enabled, to keep perplexity as close as possible to the FP16 model.
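To illustrate what the group size parameter controls, here is a minimal sketch of plain 4-bit group-wise quantization in Python. This is illustrative only: real GPTQ additionally applies Hessian-based error correction when rounding, and act-order changes the order in which weight columns are quantized; the function names here are hypothetical.

```python
GROUP_SIZE = 32  # each group of 32 weights shares one scale/offset pair


def quantize_group(weights):
    """Asymmetric 4-bit quantization of one group: 16 levels (0..15)."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 15 or 1.0            # avoid div-by-zero for flat groups
    q = [round((w - lo) / scale) for w in weights]   # ints in 0..15
    dq = [lo + qi * scale for qi in q]               # dequantized values
    return q, dq


def quantize_row(row, group_size=GROUP_SIZE):
    """Quantize/dequantize a weight row group by group."""
    out = []
    for i in range(0, len(row), group_size):
        _, dq = quantize_group(row[i : i + group_size])
        out.extend(dq)
    return out
```

A smaller group size means each scale tracks a narrower local range of weights, which lowers quantization error, at the cost of storing one scale per 32 weights instead of, say, per 128.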

It may not fit on systems with 2x24 GB VRAM cards when using GPTQ-for-LLaMA or AutoGPTQ at maximum context. It works fine on a single 48 GB VRAM card (RTX A6000).

It also works fine with 2x24 GB VRAM cards when using exllama/exllama_HF at 2048 context.