A 0.5B parameter draft (speculative decoding) model for use with deepseek-ai/DeepSeek-V3-0324.
See jukofyork/DeepSeek-V3-0324-DRAFT-0.5B-v1.0 for the non-GGUF version, and a detailed explanation of how the model was created.
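For readers new to speculative decoding, the following is a minimal sketch of the greedy variant, showing why a small draft model can reduce the number of passes through the large target model without changing its output. Everything here is an illustrative assumption: `speculative_decode`, `toy_target`, and `toy_draft` are hypothetical names, the "models" are toy functions over integer tokens, and this is not how llama.cpp or any other inference engine implements the technique.

```python
from typing import Callable, List, Tuple

Token = int
# A "model" here is just a function returning the greedy next token (toy assumption).
NextToken = Callable[[List[Token]], Token]


def speculative_decode(target: NextToken, draft: NextToken, prompt: List[Token],
                       max_new: int, k: int = 4) -> Tuple[List[Token], int]:
    """Greedy speculative decoding. Returns (tokens, number of target passes)."""
    out = list(prompt)
    passes = 0
    while len(out) - len(prompt) < max_new:
        # 1. The cheap draft model proposes k tokens autoregressively.
        proposal: List[Token] = []
        for _ in range(k):
            proposal.append(draft(out + proposal))

        # 2. The target checks every proposed position. A real engine would do
        #    this in one batched forward pass, so it is counted as one pass.
        passes += 1
        accepted: List[Token] = []
        for i in range(k):
            expected = target(out + proposal[:i])
            accepted.append(expected)  # always keep the target's own token
            if proposal[i] != expected:
                break  # first mismatch: discard the rest of the proposal
        else:
            # Every proposal matched, so the same pass also yields a bonus token.
            accepted.append(target(out + proposal))
        out.extend(accepted)
    return out[:len(prompt) + max_new], passes


def toy_target(tokens: List[Token]) -> Token:
    # Toy "large" model: counts upward modulo 10.
    return (tokens[-1] + 1) % 10


def toy_draft(tokens: List[Token]) -> Token:
    # Toy "small" model: agrees with the target except after token 7.
    return 0 if tokens[-1] == 7 else (tokens[-1] + 1) % 10


tokens, passes = speculative_decode(toy_target, toy_draft, prompt=[0], max_new=12)
print(tokens, passes)
```

In this toy run the output is identical to plain greedy decoding with the target alone, but it takes 3 target passes instead of 12. The saving depends entirely on how often the draft agrees with the target, which is why a draft model is trained to match a specific target model.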
imatrix
See DeepSeek-R1-DRAFT-0.5B-v1.0-GGUF for detailed PPL statistics and recommendations on which quant to use, etc.
I have included the imatrix file used to generate the Q4_0 through Q6_K quants, along with the 1MB sample of the fine-tuning data used to create it.
Available quantizations: 4-bit, 5-bit, 6-bit, 8-bit, 16-bit.