asynctales/Qwen2.5-Coder-3B-Instruct-Q6_K-GGUF
This model was converted to GGUF format from Qwen/Qwen2.5-Coder-3B-Instruct using llama.cpp via ggml.ai's GGUF-my-repo space.
Refer to the original model card for more details on the model.
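If you don't yet have the GGUF file locally, one way to fetch it is with the `huggingface-cli` tool from the `huggingface_hub` package; a minimal sketch, assuming you want only the single quant file referenced in the server command below:

```sh
# Download just the Q6_K GGUF file from this repo into the current directory.
huggingface-cli download asynctales/Qwen2.5-Coder-3B-Instruct-Q6_K-GGUF qwen2.5-coder-3b-instruct-q6_k.gguf --local-dir .
```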
A sample llama-server invocation (Windows-style paths shown; adjust the paths and choose a port):

```sh
path\to\llama-server.exe -m path\to\qwen2.5-coder-3b-instruct-q6_k.gguf -ngl 99 -fa -ub 1024 -b 1024 --ctx-size 0 --cache-reuse 256 -np 2 --port [port] --temp 0.5
```

Here `-ngl 99` offloads all layers to the GPU, `-fa` enables flash attention, `-ub 1024` and `-b 1024` set the micro-batch and batch sizes, `--ctx-size 0` uses the model's full trained context length, `--cache-reuse 256` enables KV-cache reuse across requests, `-np 2` serves two requests in parallel, and `--temp 0.5` sets the sampling temperature (adjust to taste).
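Once running, llama-server exposes an OpenAI-compatible HTTP API. A minimal sketch of a chat request with curl, assuming the server was started with `--port 8080` (substitute whatever port you chose):

```sh
# Send a chat completion request to llama-server's OpenAI-compatible endpoint.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Write a Python function that checks whether a string is a palindrome."}
    ],
    "temperature": 0.5
  }'
```

The response comes back as JSON, with the generated text under `choices[0].message.content`.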