asynctales/Qwen2.5-Coder-3B-Instruct-Q6_K-GGUF
This model was converted to GGUF format from Qwen/Qwen2.5-Coder-3B-Instruct using llama.cpp via ggml.ai's GGUF-my-repo space.
Refer to the original model card for more details on the model.
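If you don't yet have the GGUF file locally, one way to fetch it is with the `huggingface-cli` tool from the `huggingface_hub` package; a minimal sketch, assuming you want only the single quant file referenced in the server command below:

```sh
# Download just the Q6_K GGUF file from this repo into the current directory.
huggingface-cli download asynctales/Qwen2.5-Coder-3B-Instruct-Q6_K-GGUF qwen2.5-coder-3b-instruct-q6_k.gguf --local-dir .
```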
A sample llama-server invocation (Windows-style paths shown; adjust the paths and choose a port):

```sh
path\to\llama-server.exe -m path\to\qwen2.5-coder-3b-instruct-q6_k.gguf -ngl 99 -fa -ub 1024 -b 1024 --ctx-size 0 --cache-reuse 256 -np 2 --port [port] --temp 0.5
```

Here `-ngl 99` offloads all layers to the GPU, `-fa` enables flash attention, `-ub 1024` and `-b 1024` set the micro-batch and batch sizes, `--ctx-size 0` uses the model's full trained context length, `--cache-reuse 256` enables KV-cache reuse across requests, `-np 2` serves two requests in parallel, and `--temp 0.5` sets the sampling temperature (adjust to taste).
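Once running, llama-server exposes an OpenAI-compatible HTTP API. A minimal sketch of a chat request with curl, assuming the server was started with `--port 8080` (substitute whatever port you chose):

```sh
# Send a chat completion request to llama-server's OpenAI-compatible endpoint.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Write a Python function that checks whether a string is a palindrome."}
    ],
    "temperature": 0.5
  }'
```

The response comes back as JSON, with the generated text under `choices[0].message.content`.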