Mistral-7B-Instruct-v0.2: Local LLM Model Repository

...

This repository provides quantized GGUF and ONNX exports of Mistral-7B-Instruct-v0.2, optimized for efficient local inference, especially on resource-constrained devices like Raspberry Pi.


🦙 GGUF Model (Q8_0)

Filename: mistral-7b-instruct-v0.2.Q8_0-Q8_0.gguf
Format: GGUF (Q8_0)
Best for: llama.cpp, koboldcpp, LM Studio, and similar tools.

Quick Start

./main -m mistral-7b-instruct-v0.2.Q8_0-Q8_0.gguf -p "Hello, world!"

This quantized GGUF model is designed for fast, memory-efficient inference on local hardware, including Raspberry Pi and other edge devices.
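The same GGUF file can also be used from Python via the llama-cpp-python bindings. The snippet below is a minimal sketch, not part of this repository; the context size, thread count, and generation parameters are illustrative assumptions you should adapt to your hardware.

from llama_cpp import Llama

# Load the quantized model; n_ctx and n_threads are illustrative values.
llm = Llama(
    model_path="mistral-7b-instruct-v0.2.Q8_0-Q8_0.gguf",
    n_ctx=2048,
    n_threads=4,
)

# Mistral-Instruct expects the [INST] ... [/INST] prompt format.
output = llm("[INST] Hello, world! [/INST]", max_tokens=128)
print(output["choices"][0]["text"])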


🟦 ONNX Model

Filename: mistral-7b-instruct-v0.2.onnx
Format: ONNX
Best for: ONNX Runtime, Kleidi AI, and compatible frameworks.

Quick Start

import onnxruntime as ort

session = ort.InferenceSession("mistral-7b-instruct-v0.2.onnx")
# ... inference code here ...

The ONNX export enables efficient inference on CPUs, GPUs, and accelerators, making it well suited for local deployment.
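An end-to-end run also needs a tokenizer and a generation loop. The sketch below is an assumption-heavy example: it presumes the export exposes input_ids and attention_mask inputs and returns logits as its first output, and it borrows the Hugging Face tokenizer for Mistral-7B-Instruct-v0.2. Adjust the names to match your actual export.

import numpy as np
import onnxruntime as ort
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")
session = ort.InferenceSession("mistral-7b-instruct-v0.2.onnx")

# Encode a prompt in the Mistral instruction format.
prompt = "[INST] Hello, world! [/INST]"
inputs = tokenizer(prompt, return_tensors="np")

# Greedy decoding loop; assumes the graph takes "input_ids" and
# "attention_mask" and returns logits as its first output.
input_ids = inputs["input_ids"]
attention_mask = inputs["attention_mask"]
for _ in range(64):
    logits = session.run(
        None,
        {"input_ids": input_ids, "attention_mask": attention_mask},
    )[0]
    next_token = int(np.argmax(logits[0, -1]))
    if next_token == tokenizer.eos_token_id:
        break
    input_ids = np.concatenate([input_ids, [[next_token]]], axis=1)
    attention_mask = np.concatenate([attention_mask, [[1]]], axis=1)

print(tokenizer.decode(input_ids[0], skip_special_tokens=True))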


📋 Credits


Maintainer: Makatia

📊 Model Details

Parameters: 7.24B
Architecture: llama
Quantization: 8-bit (Q8_0)