asynctales/Qwen2.5-Coder-3B-Instruct-Q6_K-GGUF

This model was converted to GGUF format from Qwen/Qwen2.5-Coder-3B-Instruct using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.

An example llama-server invocation (Windows paths):

path\to\llama-server.exe -m path\to\qwen2.5-coder-3b-instruct-q6_k.gguf -ngl 99 -fa -ub 1024 -b 1024 --ctx-size 0 --cache-reuse 256 -np 2 --port [port] --temp 0.5

Adjust --temp (and the other flags) to taste.
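Once llama-server is running, it exposes an OpenAI-compatible HTTP API. A minimal Python sketch for querying it is below; the port (8080) and the helper names (`build_request`, `ask`) are hypothetical, so substitute whatever `--port` you launched the server with.

```python
import json
import urllib.request

# Hypothetical endpoint: llama-server serves an OpenAI-compatible
# chat completions API; replace 8080 with your --port value.
URL = "http://localhost:8080/v1/chat/completions"

def build_request(prompt, temperature=0.5):
    """Build the JSON body for a chat completion request.

    temperature defaults to 0.5 to match the launch command above.
    """
    return {
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }

def ask(prompt):
    """Send a prompt to the local server and return the reply text."""
    body = json.dumps(build_request(prompt)).encode()
    req = urllib.request.Request(
        URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]
```

For example, `ask("Write a Python function that reverses a string.")` would return the model's completion once the server is up.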
GGUF
Model size: 3.09B params
Architecture: qwen2

Quantization: 6-bit


Model tree for asynctales/Qwen2.5-Coder-3B-Instruct-Q6_K-GGUF

Base model: Qwen/Qwen2.5-3B
This model is one of 84 quantized variants of the base model.