metadata
language:
  - en
base_model:
  - mistralai/Devstral-Small-2507
pipeline_tag: text-generation
tags:
  - mistral
  - neuralmagic
  - redhat
  - llmcompressor
  - quantized
  - INT8
  - compressed-tensors
license: mit
license_name: mit
name: RedHatAI/Devstral-Small-2507
description: >-
  This model was obtained by quantizing weights and activations of
  Devstral-Small-2507 to INT8 data type.
readme: >-
  https://huggingface.co/RedHatAI/Devstral-Small-2507-quantized.w8a8/main/README.md
tasks:
  - text-to-text
provider: mistralai

Devstral-Small-2507-quantized.w8a8

Model Overview

  • Model Architecture: MistralForCausalLM
    • Input: Text
    • Output: Text
  • Model Optimizations:
    • Activation quantization: INT8
    • Weight quantization: INT8
  • Release Date: 08/29/2025
  • Version: 1.0
  • Model Developers: Red Hat (Neural Magic)

Model Optimizations

This model was obtained by quantizing the weights and activations of Devstral-Small-2507 to the INT8 data type. This optimization reduces the number of bits used to represent weights and activations from 16 to 8, cutting GPU memory requirements by approximately 50%. Weight quantization also reduces disk size requirements by approximately 50%.
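
As an illustration only (not the exact recipe used for this release), a W8A8 checkpoint like this one is typically produced with llm-compressor. The calibration dataset, sample counts, and modifier settings below are assumptions, shown as a minimal sketch:

from transformers import AutoModelForCausalLM, AutoTokenizer
from llmcompressor.transformers import oneshot
from llmcompressor.modifiers.quantization import GPTQModifier
from llmcompressor.modifiers.smoothquant import SmoothQuantModifier

model_id = "mistralai/Devstral-Small-2507"
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Hypothetical recipe: SmoothQuant followed by GPTQ to INT8 weights and
# activations (W8A8), keeping the output head in higher precision.
recipe = [
    SmoothQuantModifier(smoothing_strength=0.8),
    GPTQModifier(targets="Linear", scheme="W8A8", ignore=["lm_head"]),
]

# Calibration dataset and lengths are placeholders, not the values used for this model.
oneshot(
    model=model,
    dataset="open_platypus",
    recipe=recipe,
    max_seq_length=2048,
    num_calibration_samples=512,
)

model.save_pretrained("Devstral-Small-2507-quantized.w8a8", save_compressed=True)
tokenizer.save_pretrained("Devstral-Small-2507-quantized.w8a8")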

Deployment

This model can be deployed efficiently using the vLLM backend, as shown in the example below.

vllm serve RedHatAI/Devstral-Small-2507-quantized.w8a8 --tensor-parallel-size 1 --tokenizer_mode mistral
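
Once the server is running, it exposes an OpenAI-compatible API. A minimal sketch of querying it, assuming the default local endpoint on port 8000 (the prompt is only an example):

from openai import OpenAI

# Point the client at the local vLLM server started above.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

completion = client.chat.completions.create(
    model="RedHatAI/Devstral-Small-2507-quantized.w8a8",
    messages=[{"role": "user", "content": "Write a Python function that checks whether a string is a palindrome."}],
    temperature=0.0,
    max_tokens=256,
)
print(completion.choices[0].message.content)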

Evaluation

The model was evaluated on popular coding tasks (HumanEval, HumanEval+, MBPP, MBPP+) via EvalPlus with the vLLM backend (v0.10.1.1). Evaluations use greedy sampling and report pass@1. The command to reproduce the evaluations:

evalplus.evaluate --model "RedHatAI/Devstral-Small-2507-quantized.w8a8" \
                  --dataset [humaneval|mbpp] \
                  --base-url http://localhost:8000/v1 \
                  --backend openai --greedy
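
For reference, with greedy decoding there is a single completion per problem, so pass@1 reduces to the fraction of problems whose generated solution passes all tests. A minimal sketch (the result structure is an assumption, not EvalPlus's actual output format):

# results: one boolean per problem, True if the single greedy completion passed all tests.
def pass_at_1(results: list[bool]) -> float:
    return sum(results) / len(results)

# Example: 89 of 100 problems solved -> pass@1 = 0.89
print(pass_at_1([True] * 89 + [False] * 11))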

Accuracy

Benchmark       Recovery (%)   mistralai/Devstral-Small-2507   RedHatAI/Devstral-Small-2507-quantized.w8a8 (this model)
HumanEval       100.67         89.0                            89.6
HumanEval+      101.48         81.1                            82.3
MBPP             98.71         77.5                            76.5
MBPP+           102.42         66.1                            67.7
Average Score   100.77         78.43                           79.03