Made directly from https://huggingface.co/Qwen/Qwen1.5-14B-Chat. I think the official GGUF was made from the already-compressed AWQ weights, so I converted the original model to f32 first and quantized from that instead. Results are subjectively slightly better than the official GGUF, but I didn't run any perplexity tests.
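The f32-first workflow described above can be sketched as two steps: export the original weights to an unquantized f32 GGUF, then quantize from that file. The card does not say which tools were used. The sketch below assumes a llama.cpp checkout, where the converter script is `convert_hf_to_gguf.py` and the quantizer is `llama-quantize` (older checkouts use `convert-hf-to-gguf.py` and `quantize`). The output file names and the specific quantization types are hypothetical. The helper only builds the command lines and does not run anything.

```python
from pathlib import Path


def build_commands(model_dir, out_dir, quant_types=("Q4_K_M",),
                   convert_script="convert_hf_to_gguf.py",  # assumption: llama.cpp script name
                   quantize_bin="llama-quantize"):          # assumption: llama.cpp binary name
    """Return argv lists for an f32-first conversion. Nothing is executed."""
    out = Path(out_dir)
    f32_path = out / "model-f32.gguf"  # hypothetical output name

    # Step 1: export the original (non-AWQ) weights to an unquantized f32 GGUF.
    commands = [[
        "python", convert_script, str(model_dir),
        "--outtype", "f32", "--outfile", str(f32_path),
    ]]

    # Step 2: quantize from the f32 file, once per target type.
    for qtype in quant_types:
        q_path = out / f"model-{qtype}.gguf"
        commands.append([quantize_bin, str(f32_path), str(q_path), qtype])
    return commands


if __name__ == "__main__":
    # The quantization type names here are illustrative, not taken from the card.
    for cmd in build_commands("Qwen1.5-14B-Chat", "out",
                              ("Q3_K_M", "Q4_K_M", "Q5_K_M", "Q6_K")):
        print(" ".join(cmd))
```

Each printed line can be reviewed and then run by hand, or passed to `subprocess.run(cmd, check=True)` once the script and binary paths match the local setup.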

Format: GGUF

Model size: 14.2B params

Architecture: qwen2

Quantizations: 3-bit, 4-bit, 5-bit, 6-bit
