dgomes03/Mistral-7B-Instruct-v0.3-mixed-4-6-bit

	---
	library_name: mlx
	license: apache-2.0
	base_model: mistralai/Mistral-7B-Instruct-v0.3
	extra_gated_description: If you want to learn more about how we process your personal
	data, please read our <a href="https://mistral.ai/terms/">Privacy Policy</a>.
	tags:
	- vllm
	- mistral-common
	- mlx
	pipeline_tag: text-generation
	---
	Mistral-7B-Instruct-v0.3, quantized with mixed precision for MLX.
	The embedding layer and the output (head) layer are quantized to 6-bit precision, while all other layers use 4-bit quantization. This mixed-precision approach aims to keep model size and inference speed close to those of a 4-bit model while retaining more precision in these critical layers.
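
The layer-selection rule described above can be sketched in plain Python. This is only an illustration, not the recipe actually used to produce this checkpoint: the path fragments `embed_tokens` and `lm_head`, the helper names, and the toy parameter counts are all assumptions made for the example. The average also ignores the per-group scale and bias overhead that real quantized weights carry.

```python
# Sketch of a per-layer bit-width rule for a mixed 4/6-bit scheme.
# Layer names and helper names below are assumptions, not taken from this repo.

HIGH_PRECISION_BITS = 6   # embedding and output (head) layers
DEFAULT_BITS = 4          # every other layer
HIGH_PRECISION_LAYERS = ("embed_tokens", "lm_head")  # assumed path fragments


def bits_for_layer(path: str) -> int:
    """Return the quantization bit width for the layer at `path`."""
    if any(name in path for name in HIGH_PRECISION_LAYERS):
        return HIGH_PRECISION_BITS
    return DEFAULT_BITS


def average_bits(param_counts: dict[str, int]) -> float:
    """Parameter-weighted average bit width over a {path: n_params} mapping."""
    total = sum(param_counts.values())
    weighted = sum(n * bits_for_layer(p) for p, n in param_counts.items())
    return weighted / total


if __name__ == "__main__":
    # Toy parameter counts, chosen only to make the arithmetic easy to follow.
    toy_model = {
        "model.embed_tokens": 100,
        "model.layers.0.self_attn.q_proj": 800,
        "lm_head": 100,
    }
    for path in toy_model:
        print(path, bits_for_layer(path))
    print("average bits per weight:", average_bits(toy_model))
```

Because the embedding and head layers hold only a small share of the weights, raising them to 6 bits moves the average only slightly above 4 bits per weight, which is the trade-off this model is aiming for.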