Mistral-7B-Instruct-v0.2: Local LLM Model Repository
This repository provides quantized GGUF and ONNX exports of Mistral-7B-Instruct-v0.2, optimized for efficient local inference, especially on resource-constrained devices such as the Raspberry Pi.
📦 GGUF Model (Q8_0)
Filename: mistral-7b-instruct-v0.2.Q8_0-Q8_0.gguf
Format: GGUF (Q8_0)
Best for: llama.cpp, koboldcpp, LM Studio, and similar tools.
Quick Start
```bash
./main -m mistral-7b-instruct-v0.2.Q8_0-Q8_0.gguf -p "Hello, world!"
```
This quantized GGUF model is designed for fast, memory-efficient inference on local hardware, including Raspberry Pi and other edge devices.
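The same file can also be loaded from Python through the llama-cpp-python bindings. The sketch below is a minimal example: the model filename is the one above, while the context size, sampling settings, and the `[INST] ... [/INST]` prompt wrapper (the Mistral-Instruct chat format) are illustrative assumptions you can adjust.

```python
from llama_cpp import Llama

# Load the quantized GGUF file; n_ctx is an illustrative context-window choice.
llm = Llama(
    model_path="mistral-7b-instruct-v0.2.Q8_0-Q8_0.gguf",
    n_ctx=2048,
)

# Mistral-Instruct models expect prompts wrapped in [INST] ... [/INST].
output = llm("[INST] Hello, world! [/INST]", max_tokens=128, temperature=0.7)
print(output["choices"][0]["text"])
```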
📦 ONNX Model
Filename: mistral-7b-instruct-v0.2.onnx
Format: ONNX
Best for: ONNX Runtime, KleidiAI, and compatible frameworks.
Quick Start
```python
import onnxruntime as ort

session = ort.InferenceSession("mistral-7b-instruct-v0.2.onnx")
# Inspect the graph's expected inputs (token IDs, attention mask, past key/values, ...)
print([inp.name for inp in session.get_inputs()])
```
The ONNX export enables efficient inference on CPUs, GPUs, and accelerators, ideal for local deployment.
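If the export was produced with Optimum (as the credits suggest), a higher-level option is to load it through `optimum.onnxruntime` together with the upstream tokenizer. The sketch below assumes a hypothetical local directory containing the ONNX file alongside its `config.json`; adjust the paths to your layout.

```python
from optimum.onnxruntime import ORTModelForCausalLM
from transformers import AutoTokenizer

# Hypothetical local directory holding the ONNX export and its config files.
model_dir = "./mistral-7b-instruct-v0.2-onnx"

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")
model = ORTModelForCausalLM.from_pretrained(model_dir)

# Mistral-Instruct expects the [INST] ... [/INST] prompt format.
inputs = tokenizer("[INST] Hello, world! [/INST]", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```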
Credits
- Base model: Mistral AI
- Quantization: llama.cpp
- ONNX export: Optimum, ONNX Runtime
- Maintainer: Makatia