llama.cpp Quantizations of gpt-oss-20b

Original model: the F16 GGUF from unsloth/gpt-oss-20b-GGUF.

The MXFP4_MOE quant was made with the update from llama.cpp PR #15091.

MXFP4_MOE : 11.27 GiB (4.63 BPW)
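
The BPW (bits per weight) figure can be sanity-checked from the file size and the parameter count. The sketch below assumes the 20.9B parameter count shown on the model page, which is a rounded value, so the result is only approximate. The helper name `bits_per_weight` is made up for illustration.

```python
GIB = 2**30  # bytes in one GiB

def bits_per_weight(size_gib: float, n_params: float) -> float:
    """Average number of bits stored per model parameter."""
    return size_gib * GIB * 8 / n_params

# 11.27 GiB is the MXFP4_MOE file size listed above.
# 20.9e9 is the rounded parameter count from the model page (an assumption).
bpw = bits_per_weight(11.27, 20.9e9)
print(f"{bpw:.2f} BPW")  # prints "4.63 BPW"
```

The value sits above 4 bits because the average includes per-block scale data and any tensors kept at higher precision.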


Download (Example)

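# !pip install huggingface_hub hf_transfer
import os

# Enable hf_transfer for faster downloads; set this before importing huggingface_hub.
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"

from huggingface_hub import snapshot_download

# Download only the files whose names match the MXFP4_MOE pattern.
snapshot_download(
    repo_id="bobchenyx/gpt-oss-20b-GGUF",
    local_dir="bobchenyx/gpt-oss-20b-GGUF",
    allow_patterns=["*MXFP4_MOE*"],
)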
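
After the download finishes, the GGUF file or files need to be located before they can be passed to llama.cpp. The following is a minimal sketch, assuming the same `local_dir` as in the example above. The helper name `find_gguf_files` is hypothetical, and the exact file names inside the repository are not assumed here.

```python
from pathlib import Path

def find_gguf_files(local_dir, pattern="*MXFP4_MOE*"):
    """Return sorted paths of .gguf files under local_dir whose names match pattern."""
    root = Path(local_dir)
    if not root.is_dir():
        return []
    # rglob searches the directory tree recursively for matching names.
    return sorted(
        p for p in root.rglob(pattern) if p.is_file() and p.suffix == ".gguf"
    )

if __name__ == "__main__":
    for path in find_gguf_files("bobchenyx/gpt-oss-20b-GGUF"):
        size_gib = path.stat().st_size / 2**30
        print(f"{path} ({size_gib:.2f} GiB)")
```

The printed path can be used as the model argument of a llama.cpp build recent enough to include MXFP4 support.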

Format: GGUF
Model size: 20.9B params
Architecture: gpt-oss
Quantization: 4-bit

Base model: openai/gpt-oss-20b (this repository is a quantized version)