llama.cpp quantizations of gpt-oss-20b

Original model: the F16 GGUF from unsloth/gpt-oss-20b-GGUF.

The MXFP4_MOE quant was made with the update from llama.cpp PR #15091.

MXFP4_MOE : 11.27 GiB (4.63 BPW)
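
As a quick sanity check, the bits-per-weight (BPW) figure can be reproduced from the file size and the parameter count. This is a minimal sketch, assuming the rounded 20.9B parameter count reported for this model; the helper name `bits_per_weight` is hypothetical and not part of llama.cpp.

```python
GIB = 2**30  # bytes per GiB


def bits_per_weight(size_gib: float, n_params: float) -> float:
    """Average number of bits stored per model parameter."""
    return size_gib * GIB * 8 / n_params


# Assumption: 20.9e9 is the rounded parameter count, so the result is approximate.
bpw = bits_per_weight(11.27, 20.9e9)
print(f"{bpw:.2f} BPW")  # prints "4.63 BPW"
```

The average sits above the 4.25 bits per weight of a pure MXFP4 block (32 four-bit values sharing one 8-bit scale), which is what one would expect if some tensors are stored at higher precision.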

Download (example)

# !pip install huggingface_hub hf_transfer
import os

# Enable hf_transfer for faster downloads; set this before importing huggingface_hub.
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"

from huggingface_hub import snapshot_download

# Download only the MXFP4_MOE files from the repository.
snapshot_download(
    repo_id="bobchenyx/gpt-oss-20b-GGUF",
    local_dir="bobchenyx/gpt-oss-20b-GGUF",
    allow_patterns=["*MXFP4_MOE*"],
)
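
After the download finishes, it can help to confirm what actually landed in `local_dir`. The sketch below lists the matching files and checks the 4-byte `GGUF` magic at the start of each one. It assumes the files end in `.gguf` and carry `MXFP4_MOE` in their names (as the `allow_patterns` filter suggests); the helper names `find_quant_files` and `looks_like_gguf` are hypothetical.

```python
from pathlib import Path

GGUF_MAGIC = b"GGUF"  # first four bytes of a GGUF file


def find_quant_files(local_dir, tag="MXFP4_MOE"):
    """Return sorted .gguf paths under local_dir whose file name contains tag."""
    root = Path(local_dir)
    if not root.is_dir():
        return []
    return sorted(p for p in root.rglob("*.gguf") if tag in p.name)


def looks_like_gguf(path):
    """Check only the magic bytes; this does not validate the rest of the file."""
    with open(path, "rb") as f:
        return f.read(4) == GGUF_MAGIC


# Assumption: same local_dir as in the download example above.
for path in find_quant_files("bobchenyx/gpt-oss-20b-GGUF"):
    size_gib = path.stat().st_size / 2**30
    print(f"{path}  {size_gib:.2f} GiB  gguf={looks_like_gguf(path)}")
```

A reported size far from 11.27 GiB, or a failed magic check, would point to an interrupted or incomplete download.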

Format: GGUF
Model size: 20.9B params
Architecture: gpt-oss
Quantization: 4-bit
Base model: openai/gpt-oss-20b