---
base_model:
- cognitivecomputations/Llama-3-8B-Instruct-abliterated-v2
---
This is a converted weight of the [Llama-3-8B-Instruct-abliterated-v2](https://huggingface.co/cognitivecomputations/Llama-3-8B-Instruct-abliterated-v2) model in [unsloth 4-bit dynamic quant](https://archive.is/EFz7P) format, produced with this [Colab notebook](https://colab.research.google.com/drive/1P23C66j3ga49kBRnDNlmRce7R_l_-L5l?usp=sharing).
## About this Conversion
This conversion uses **Unsloth** to load the model in **4-bit** format and force-save it in the same **4-bit** format.
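The flow described above can be sketched roughly as follows. This is a minimal sketch assuming Unsloth's `FastLanguageModel` API and its `save_pretrained_merged` helper; the output directory name is a hypothetical choice, and running it requires a GPU and the model download:

```python
# Sketch: load in 4-bit with Unsloth, then force-save in the same 4-bit format.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="cognitivecomputations/Llama-3-8B-Instruct-abliterated-v2",
    max_seq_length=2048,
    load_in_4bit=True,  # quantized on load via bitsandbytes
)

# Force-save the already-quantized 4-bit weights instead of
# dequantizing back to 16-bit first.
model.save_pretrained_merged(
    "Llama-3-8B-Instruct-abliterated-v2-bnb-4bit",  # hypothetical output dir
    tokenizer,
    save_method="merged_4bit_forced",
)
```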
### How 4-bit Quantization Works
- The actual **4-bit quantization** is handled by **bitsandbytes (bnb)**, which runs on top of **PyTorch** and integrates with Hugging Face **Transformers**.
- **Unsloth** acts as a wrapper that simplifies loading, quantizing, and saving, and adds its own memory and speed optimizations.
This reduces memory usage and speeds up inference while keeping the model compact on disk.