Qwen3-4B-2507-abliterated-GGUF

The Huihui-Qwen3-4B-Instruct-2507-abliterated model is an uncensored, proof-of-concept version of the Qwen3-4B-Instruct-2507 large language model, created with a novel abliteration method that removes refusal responses without relying on TransformerLens. This approach offers a faster, more effective way to bypass the model's standard refusal behaviors, producing less filtered, more raw output; it also lacks rigorous safety filtering and may generate sensitive or controversial content. The model can be used directly with Hugging Face's transformers library, and this repository provides GGUF quantizations ranging from 2-bit up to full 32-bit precision for efficient local use. Because of the reduced content restrictions and associated risks, it is intended primarily for research or experimental use rather than production; users are advised to monitor outputs carefully and ensure ethical and legal compliance when deploying it.
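
As noted above, the unquantized model loads directly with transformers. Below is a minimal sketch, assuming the upstream repo id huihui-ai/Huihui-Qwen3-4B-Instruct-2507-abliterated (substitute the actual checkpoint path if it differs):

```python
# Minimal sketch: chat with the abliterated model via transformers.
# The repo id below is an assumption; point it at the actual checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "huihui-ai/Huihui-Qwen3-4B-Instruct-2507-abliterated"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "Explain GGUF quantization in two sentences."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```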

Qwen3-4B-2507-abliterated-GGUF (GGUF Formats)

| Model Variant | Link |
|---------------|------|
| Qwen3-4B-Thinking-2507-abliterated-GGUF | Hugging Face |
| Qwen3-4B-Instruct-2507-abliterated-GGUF | Hugging Face |

Model Files

Qwen3-4B-Thinking-2507-abliterated

| File Name | Size | Quant Type |
|-----------|------|------------|
| Qwen3-4B-Thinking-2507-abliterated.BF16.gguf | 8.05 GB | BF16 |
| Qwen3-4B-Thinking-2507-abliterated.F16.gguf | 8.05 GB | F16 |
| Qwen3-4B-Thinking-2507-abliterated.F32.gguf | 16.1 GB | F32 |
| Qwen3-4B-Thinking-2507-abliterated.Q2_K.gguf | 1.67 GB | Q2_K |
| Qwen3-4B-Thinking-2507-abliterated.Q3_K_L.gguf | 2.24 GB | Q3_K_L |
| Qwen3-4B-Thinking-2507-abliterated.Q3_K_M.gguf | 2.08 GB | Q3_K_M |
| Qwen3-4B-Thinking-2507-abliterated.Q3_K_S.gguf | 1.89 GB | Q3_K_S |
| Qwen3-4B-Thinking-2507-abliterated.Q4_K_M.gguf | 2.5 GB | Q4_K_M |
| Qwen3-4B-Thinking-2507-abliterated.Q4_K_S.gguf | 2.38 GB | Q4_K_S |
| Qwen3-4B-Thinking-2507-abliterated.Q5_K_M.gguf | 2.89 GB | Q5_K_M |
| Qwen3-4B-Thinking-2507-abliterated.Q5_K_S.gguf | 2.82 GB | Q5_K_S |
| Qwen3-4B-Thinking-2507-abliterated.Q6_K.gguf | 3.31 GB | Q6_K |
| Qwen3-4B-Thinking-2507-abliterated.Q8_0.gguf | 4.28 GB | Q8_0 |

Qwen3-4B-Instruct-2507-abliterated

| File Name | Size | Quant Type |
|-----------|------|------------|
| Qwen3-4B-Instruct-2507-abliterated.BF16.gguf | 8.05 GB | BF16 |
| Qwen3-4B-Instruct-2507-abliterated.F16.gguf | 8.05 GB | F16 |
| Qwen3-4B-Instruct-2507-abliterated.F32.gguf | 16.1 GB | F32 |
| Qwen3-4B-Instruct-2507-abliterated.Q2_K.gguf | 1.67 GB | Q2_K |
| Qwen3-4B-Instruct-2507-abliterated.Q3_K_L.gguf | 2.24 GB | Q3_K_L |
| Qwen3-4B-Instruct-2507-abliterated.Q3_K_M.gguf | 2.08 GB | Q3_K_M |
| Qwen3-4B-Instruct-2507-abliterated.Q3_K_S.gguf | 1.89 GB | Q3_K_S |
| Qwen3-4B-Instruct-2507-abliterated.Q4_K_M.gguf | 2.5 GB | Q4_K_M |
| Qwen3-4B-Instruct-2507-abliterated.Q4_K_S.gguf | 2.38 GB | Q4_K_S |
| Qwen3-4B-Instruct-2507-abliterated.Q5_K_M.gguf | 2.89 GB | Q5_K_M |
| Qwen3-4B-Instruct-2507-abliterated.Q5_K_S.gguf | 2.82 GB | Q5_K_S |
| Qwen3-4B-Instruct-2507-abliterated.Q6_K.gguf | 3.31 GB | Q6_K |
| Qwen3-4B-Instruct-2507-abliterated.Q8_0.gguf | 4.28 GB | Q8_0 |
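
The GGUF files above run on any llama.cpp-compatible runtime. As one option, here is a minimal sketch using the llama-cpp-python bindings, pulling the Q4_K_M instruct quant directly from this repository (repo and file names are taken from the tables above; pick a different quant to trade size for quality):

```python
# Minimal sketch: run a GGUF quant locally with llama-cpp-python.
# Repo and file names come from the tables above; choose the quant you need.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="prithivMLmods/Qwen3-4B-2507-abliterated-GGUF",
    filename="Qwen3-4B-Instruct-2507-abliterated.Q4_K_M.gguf",
    n_ctx=4096,  # context window size
)

result = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Write a haiku about quantization."}],
    max_tokens=128,
)
print(result["choices"][0]["message"]["content"])
```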

Quants Usage

(sorted by size, not necessarily quality; IQ-quants are often preferable over similar-sized non-IQ quants)

ikawrakow has published a handy graph comparing some lower-quality quant types (lower is better). [Graph image omitted.]
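
For local runtimes that expect a file path rather than a repo id, a single chosen quant can be fetched with huggingface_hub. A minimal sketch, with the file name taken from the tables above:

```python
# Minimal sketch: download one chosen quant file from the Hub.
from huggingface_hub import hf_hub_download

local_path = hf_hub_download(
    repo_id="prithivMLmods/Qwen3-4B-2507-abliterated-GGUF",
    filename="Qwen3-4B-Thinking-2507-abliterated.Q5_K_M.gguf",
)
print(f"Saved to: {local_path}")
```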

Model Details

Format: GGUF
Model size: 4.02B params
Architecture: qwen3
Repository: prithivMLmods/Qwen3-4B-2507-abliterated-GGUF

Available precisions: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit, 16-bit, 32-bit

