Commit 41f890f by dgomes03 (verified) · 1 parent: 703b5ee

Update README.md

Files changed (1)
  1. README.md +2 -0
README.md CHANGED
@@ -10,3 +10,5 @@ tags:
 - mlx
 pipeline_tag: text-generation
 ---
+Mistral-7B-Instruct-v0.3 quantized with mixed precision:
+This is a Mistral-7B-Instruct model where the embedding layer and output (head) layer are quantized to 6-bit precision, while the rest of the model uses 4-bit quantization. This mixed-precision approach aims to balance model size and inference speed with improved representational fidelity in critical layers.
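A minimal sketch of how such a mixed-precision conversion might be produced with mlx-lm's Python API, assuming a `quant_predicate` hook that returns per-layer quantization settings; the exact parameter name and callback signature vary between mlx-lm versions, and the output path below is purely illustrative:

```python
# Hypothetical sketch: mixed-precision quantization with mlx-lm.
# The quant_predicate hook and its (path, module, config) signature are
# assumptions based on recent mlx-lm releases; check your installed version.
from mlx_lm import convert


def mixed_precision(path, module, config):
    # 6-bit for the embedding and output (head) layers, 4-bit everywhere else.
    if "embed_tokens" in path or "lm_head" in path:
        return {"bits": 6, "group_size": 64}
    return {"bits": 4, "group_size": 64}


convert(
    hf_path="mistralai/Mistral-7B-Instruct-v0.3",
    mlx_path="./mistral-7b-instruct-v0.3-mixed-4-6bit",  # illustrative output name
    quantize=True,
    quant_predicate=mixed_precision,
)
```

Keeping the embedding and head layers at higher precision is a common compromise: those matrices map directly to and from the vocabulary, where quantization error tends to be most visible, while the bulk of the transformer blocks stay at 4-bit to preserve the size and speed benefits.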