jukofyork/DeepSeek-V3-0324-DRAFT-0.5B-v1.0-GGUF

A 0.5B-parameter draft model for speculative decoding with deepseek-ai/DeepSeek-V3-0324.

See jukofyork/DeepSeek-V3-0324-DRAFT-0.5B-v1.0 for the non-GGUF version, and a detailed explanation of how the model was created.
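As a rough illustration of how a draft model is typically paired with its target, the sketch below builds a command line for llama.cpp's `llama-server`, passing the target with `-m` and the draft with `-md`. It assumes llama.cpp is the runtime (this card does not say so) and that both GGUF files are already on disk; the target path is a hypothetical placeholder, and the helper name `speculative_server_command` is made up for this example.

```python
import shlex


def speculative_server_command(target_gguf, draft_gguf, binary="llama-server"):
    """Build an argv list that loads a target model plus a draft model.

    Assumption: the runtime is llama.cpp's llama-server, where -m selects
    the main (target) model and -md selects the draft model.
    """
    return [binary, "-m", target_gguf, "-md", draft_gguf]


# Hypothetical path: replace with wherever the target model's GGUF lives.
target = "path/to/DeepSeek-V3-0324.gguf"
# One of the draft files listed on this card.
draft = "DeepSeek-V3-0324-DRAFT-0.5B-Q4_0.gguf"

cmd = speculative_server_command(target, draft)
print(shlex.join(cmd))
```

The list form can be handed directly to `subprocess.run(cmd)`; any further draft-related options are best checked against `llama-server --help` for the installed build.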

Without `imatrix`

Link	Type
DeepSeek-V3-0324-DRAFT-0.5B-BF16.gguf	BF16
DeepSeek-V3-0324-DRAFT-0.5B-F16.gguf	F16
DeepSeek-V3-0324-DRAFT-0.5B-Q8_0.gguf	Q8_0
DeepSeek-V3-0324-DRAFT-0.5B-Q6_K.gguf	Q6_K
DeepSeek-V3-0324-DRAFT-0.5B-Q5_K_M.gguf	Q5_K_M
DeepSeek-V3-0324-DRAFT-0.5B-Q5_K_S.gguf	Q5_K_S
DeepSeek-V3-0324-DRAFT-0.5B-Q4_K_M.gguf	Q4_K_M
DeepSeek-V3-0324-DRAFT-0.5B-Q4_K_S.gguf	Q4_K_S
DeepSeek-V3-0324-DRAFT-0.5B-IQ4_NL.gguf	IQ4_NL
DeepSeek-V3-0324-DRAFT-0.5B-IQ4_XS.gguf	IQ4_XS
DeepSeek-V3-0324-DRAFT-0.5B-Q5_1.gguf	Q5_1
DeepSeek-V3-0324-DRAFT-0.5B-Q5_0.gguf	Q5_0
DeepSeek-V3-0324-DRAFT-0.5B-Q4_1.gguf	Q4_1
DeepSeek-V3-0324-DRAFT-0.5B-Q4_0.gguf	Q4_0

With `imatrix`

Link	Type
DeepSeek-V3-0324-DRAFT-0.5B-iQ6_K.gguf	Q6_K
DeepSeek-V3-0324-DRAFT-0.5B-iQ5_K_M.gguf	Q5_K_M
DeepSeek-V3-0324-DRAFT-0.5B-iQ5_K_S.gguf	Q5_K_S
DeepSeek-V3-0324-DRAFT-0.5B-iQ4_K_M.gguf	Q4_K_M
DeepSeek-V3-0324-DRAFT-0.5B-iQ4_K_S.gguf	Q4_K_S
DeepSeek-V3-0324-DRAFT-0.5B-iIQ4_NL.gguf	IQ4_NL
DeepSeek-V3-0324-DRAFT-0.5B-iIQ4_XS.gguf	IQ4_XS
DeepSeek-V3-0324-DRAFT-0.5B-iQ5_1.gguf	Q5_1
DeepSeek-V3-0324-DRAFT-0.5B-iQ5_0.gguf	Q5_0
DeepSeek-V3-0324-DRAFT-0.5B-iQ4_1.gguf	Q4_1
DeepSeek-V3-0324-DRAFT-0.5B-iQ4_0.gguf	Q4_0
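The two tables follow a simple naming pattern: the `imatrix` variants insert an `i` before the quant type. The sketch below encodes that pattern and, optionally, fetches a file with `hf_hub_download` from the `huggingface_hub` package. The helper names `gguf_filename` and `download` are made up for this example, and the type lists are copied from the tables above rather than queried from the repository.

```python
REPO_ID = "jukofyork/DeepSeek-V3-0324-DRAFT-0.5B-v1.0-GGUF"
PREFIX = "DeepSeek-V3-0324-DRAFT-0.5B"

# Quant types copied from the two tables on this card.
IMATRIX_TYPES = (
    "Q6_K", "Q5_K_M", "Q5_K_S", "Q4_K_M", "Q4_K_S",
    "IQ4_NL", "IQ4_XS", "Q5_1", "Q5_0", "Q4_1", "Q4_0",
)
PLAIN_TYPES = ("BF16", "F16", "Q8_0") + IMATRIX_TYPES


def gguf_filename(quant_type, imatrix=False):
    """Return the file name for a quant type, following the tables above."""
    allowed = IMATRIX_TYPES if imatrix else PLAIN_TYPES
    if quant_type not in allowed:
        raise ValueError(f"{quant_type!r} is not listed (imatrix={imatrix})")
    marker = "i" if imatrix else ""
    return f"{PREFIX}-{marker}{quant_type}.gguf"


def download(quant_type, imatrix=False):
    """Download one file from the repo and return its local path.

    Requires the huggingface_hub package and network access; the import
    is deferred so gguf_filename() stays usable without it.
    """
    from huggingface_hub import hf_hub_download

    return hf_hub_download(
        repo_id=REPO_ID,
        filename=gguf_filename(quant_type, imatrix),
    )


print(gguf_filename("Q4_0"))
print(gguf_filename("IQ4_XS", imatrix=True))
```

Calling `download("Q4_0", imatrix=True)` would then return the cached local path of the corresponding file.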

See DeepSeek-R1-DRAFT-0.5B-v1.0-GGUF for detailed PPL statistics and recommendations on which quant to use.

I have included the imatrix file used to generate the Q4_0 through Q6_K quants, along with the 1MB sample of the fine-tuning data used to create it.

Format: GGUF
Model size: 501M params
Architecture: qwen2

Model tree: base model Qwen/Qwen2.5-0.5B, finetuned as Qwen/Qwen2.5-0.5B-Instruct, quantized as this model.