huggingkot committed
Commit 66adcf5 · 1 Parent(s): 24b6561
Files changed (2)
  1. .gitattributes +1 -0
  2. README.md +17 -0
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ *.json filter=lfs diff=lfs merge=lfs -text
README.md ADDED
---
base_model:
- NeverSleep/Lumimaid-v0.2-8B
---

This is a converted weight of the [Lumimaid-v0.2-8B](https://huggingface.co/NeverSleep/Lumimaid-v0.2-8B) model in [Unsloth 4-bit dynamic quant](https://archive.is/EFz7P) format, produced with this [Colab notebook](https://colab.research.google.com/drive/1P23C66j3ga49kBRnDNlmRce7R_l_-L5l?usp=sharing).

## About this Conversion

This conversion uses **Unsloth** to load the model in **4-bit** format and force-save it in the same **4-bit** format.
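A minimal sketch of that load-and-force-save flow, assuming Unsloth's `FastLanguageModel` API; the `max_seq_length` value and the output directory name are illustrative, not taken from the linked notebook:

```python
from unsloth import FastLanguageModel

# Load the base model directly in 4-bit; BitsAndBytes performs the
# quantization as the weights are loaded.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="NeverSleep/Lumimaid-v0.2-8B",
    max_seq_length=2048,  # illustrative; pick to match your use case
    load_in_4bit=True,
)

# Force-save in the same 4-bit format rather than de-quantizing back
# to 16-bit on save ("merged_4bit_forced" is Unsloth's flag for this).
model.save_pretrained_merged(
    "Lumimaid-v0.2-8B-bnb-4bit",  # illustrative output directory
    tokenizer,
    save_method="merged_4bit_forced",
)
```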

### How 4-bit Quantization Works

- The actual **4-bit quantization** is handled by **BitsAndBytes (bnb)**, one of the quantization backends (alongside **AutoGPTQ**) that plug into **PyTorch**; a sketch of the bnb setup follows this list.
- **Unsloth** acts as a wrapper around this stack, simplifying and optimizing the process for better efficiency.
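For context on the first bullet, this is roughly what the bnb 4-bit setup looks like at the plain Transformers level; the quantization parameters shown are common NF4 defaults, not settings confirmed from the notebook:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# BitsAndBytes does the actual 4-bit quantization; Transformers wires
# it into the PyTorch model at load time.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",       # NormalFloat4 weight format
    bnb_4bit_use_double_quant=True,  # also quantize the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "NeverSleep/Lumimaid-v0.2-8B",
    quantization_config=bnb_config,
    device_map="auto",
)
```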

This allows for reduced memory usage and faster inference while keeping the model compact.