# Mistral-Small-3.1-DRAFT-0.5B
This model is meant to be used as a draft model for speculative decoding with mistralai/Mistral-Small-3.1-24B-Instruct-2503 or mistralai/Mistral-Small-24B-Instruct-2501.
## Data info

The data consists of Mistral's outputs and includes all kinds of tasks from various datasets in English, French, German, Spanish, Italian and Portuguese. The model was trained for 2 epochs on 20k unique examples, for a total of 12 million tokens per epoch.
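As a minimal sketch, pairing the target model with this draft model via vLLM's speculative decoding support might look like the command below. The `--speculative-config` flag, its JSON keys, and the value of `num_speculative_tokens` are assumptions about a recent vLLM release (older versions exposed separate `--speculative-model`/`--num-speculative-tokens` flags); check the documentation for your installed version.

```shell
# Hypothetical launch: serve the 24B target model while this 0.5B draft
# model proposes 5 tokens per decoding step (flag names vary by vLLM version).
vllm serve mistralai/Mistral-Small-3.1-24B-Instruct-2503 \
  --speculative-config '{"model": "kavin1337/Mistral-Small-3.1-DRAFT-0.5B-FP8-Dynamic", "num_speculative_tokens": 5}'
```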
## Model tree for kavin1337/Mistral-Small-3.1-DRAFT-0.5B-FP8-Dynamic

- Base model: Qwen/Qwen2.5-0.5B
- Finetuned: alamios/Qwenstral-Small-3.1-0.5B
- Finetuned: alamios/Mistral-Small-3.1-DRAFT-0.5B