Llamacpp Quantizations of Qwen3-235B-A22B-Thinking-2507

Original model: Qwen/Qwen3-235B-A22B-Thinking-2507.

All quants were made using bartowski1182's llama.cpp.

All quants use the BF16 conversion from unsloth/Qwen3-235B-A22B-Thinking-2507-GGUF/BF16.
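
In this pipeline the BF16 GGUF is the input to llama.cpp's llama-quantize tool, which writes each quant type listed below. The Python wrapper below is only a minimal sketch of that step: the binary path, file names, and thread count are assumptions, not the exact invocation used for this repo.

```python
import subprocess
from pathlib import Path

# Hypothetical local paths -- adjust to your own llama.cpp build and download location.
LLAMA_QUANTIZE = Path("llama.cpp/build/bin/llama-quantize")
BF16_GGUF = Path("Qwen3-235B-A22B-Thinking-2507-BF16.gguf")

def quantize(quant_type: str, nthreads: int = 16) -> Path:
    """Run llama-quantize: BF16 GGUF in, quantized GGUF (e.g. Q2_K, Q4_K_M) out."""
    out = BF16_GGUF.with_name(BF16_GGUF.stem.replace("BF16", quant_type) + ".gguf")
    subprocess.run(
        [str(LLAMA_QUANTIZE), str(BF16_GGUF), str(out), quant_type, str(nthreads)],
        check=True,
    )
    return out

for qtype in ("Q2_K", "Q4_K_M"):
    print("wrote", quantize(qtype))
```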

Q2_K : 77.60 GiB (2.84 BPW)

Q4_K_M : 133.27 GiB (4.87 BPW)
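
The bits-per-weight figures are consistent with the listed file sizes and the model's nominal 235B total parameters; a quick arithmetic check (only the sizes above and the 235B count are used):

```python
# Cross-check bits-per-weight: file size in bits divided by the nominal 235B parameter count.
N_PARAMS = 235e9
GIB = 1024 ** 3

for name, size_gib in {"Q2_K": 77.60, "Q4_K_M": 133.27}.items():
    bpw = size_gib * GIB * 8 / N_PARAMS
    print(f"{name}: {bpw:.2f} BPW")  # ~2.84 and ~4.87, matching the listing above
```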
