---
library_name: mlx
license: apache-2.0
base_model: mistralai/Mistral-7B-Instruct-v0.3
extra_gated_description: If you want to learn more about how we process your personal
  data, please read our <a href="https://mistral.ai/terms/">Privacy Policy</a>.
tags:
- vllm
- mistral-common
- mlx
pipeline_tag: text-generation
---
|
Mistral-7B-Instruct-v0.3 quantized with mixed precision:
|
This is a Mistral-7B-Instruct-v0.3 model in which the embedding layer and the output (head) layer are quantized to 6-bit precision, while all other layers use 4-bit quantization. This mixed-precision approach aims to balance model size and inference speed against accuracy by keeping higher precision in the layers most sensitive to quantization error.
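
The model can be loaded and run with the `mlx-lm` package (`pip install mlx-lm`). Below is a minimal usage sketch; the repository id is a placeholder, so substitute the actual Hub id of this model.

```python
from mlx_lm import load, generate

# "<this-repo>/Mistral-7B-Instruct-v0.3-mixed-4-6bit" is a placeholder:
# replace it with the actual Hugging Face Hub id of this repository
model, tokenizer = load("<this-repo>/Mistral-7B-Instruct-v0.3-mixed-4-6bit")

prompt = "Explain mixed-precision quantization in one sentence."

# apply the chat template, since this is an instruct-tuned model
messages = [{"role": "user", "content": prompt}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

response = generate(model, tokenizer, prompt=prompt, max_tokens=256, verbose=True)
```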
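
For reference, a mix like this can be produced with `mlx-lm`'s `convert()` and its `quant_predicate` hook (available in recent `mlx-lm` releases). The sketch below is not necessarily how this checkpoint was created; the layer-path checks (`embed_tokens`, `lm_head`) and the output path are assumptions for illustration.

```python
from mlx_lm import convert

def mixed_precision(path, module, config):
    # keep more bits in the embedding and output head layers (assumed
    # MLX weight paths: "model.embed_tokens" and "lm_head")
    if "embed_tokens" in path or "lm_head" in path:
        return {"bits": 6, "group_size": 64}
    # quantize everything else to 4-bit
    return {"bits": 4, "group_size": 64}

convert(
    "mistralai/Mistral-7B-Instruct-v0.3",
    mlx_path="mistral-7b-instruct-v0.3-mixed-4-6bit",  # hypothetical output dir
    quantize=True,
    quant_predicate=mixed_precision,
)
```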