# GPT-OSS 20B — Children QLoRA (Adapter)

A QLoRA adapter for `openai/gpt-oss-20b`, fine-tuned on children’s stories to produce structured JSON outputs suitable for bedtime content and educational demos.
- Author: @garethpaul
- Base model: `openai/gpt-oss-20b`
- Training data: `garethpaul/children-stories-dataset`
- Format: PEFT LoRA adapter (not full weights)
- License: MIT
## ✨ What this model does

Generates friendly, positive children’s bedtime stories following this JSON schema:

```json
{
  "title": "string",
  "characters": ["string"],
  "setting": "string",
  "story": "string (500–800 words, bedtime tone)",
  "moral": "string"
}
```
## 🚀 Quickstart (Transformers + PEFT)

> Note: vLLM’s GPT-OSS backend does not currently load LoRA adapters for `GptOssForCausalLM`. Use `transformers` + `peft` to run the adapter, or merge and export to MXFP4 for vLLM.
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

BASE = "openai/gpt-oss-20b"
ADAPTER = "garethpaul/gpt-oss-20b-children-qlora"

tokenizer = AutoTokenizer.from_pretrained(BASE)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Load the base model, then attach the adapter
model = AutoModelForCausalLM.from_pretrained(
    BASE,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, ADAPTER)
model.eval()

system = "You are StoryWeaver. Respond ONLY in valid JSON with keys: {title, characters, setting, story, moral}."
messages = [
    {"role": "system", "content": system},
    {"role": "user", "content": "Tell me a bedtime story about a brave little car."},
]

# Apply the chat template, then tokenize to get an attention_mask
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
enc = tokenizer(prompt, return_tensors="pt", return_attention_mask=True).to(model.device)

with torch.no_grad():
    out = model.generate(**enc, max_new_tokens=700, do_sample=True, temperature=0.7, top_p=0.9)

print(tokenizer.decode(out[0], skip_special_tokens=True))
```
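The decoded text above includes the prompt. For downstream use, you can slice off the prompt tokens and parse the JSON body. A minimal sketch, assuming the model returned a single well-formed JSON object (add error handling for malformed outputs):

```python
import json

# Decode only the newly generated tokens (everything after the prompt)
generated = tokenizer.decode(out[0][enc["input_ids"].shape[-1]:], skip_special_tokens=True)

# Extract the first {...} span and parse it (assumes one well-formed JSON object)
story = json.loads(generated[generated.find("{"): generated.rfind("}") + 1])
print(story["title"])
print(story["moral"])
```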
## 🧩 How to merge (optional)

If you want a single checkpoint (e.g., to share without PEFT):
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE = "openai/gpt-oss-20b"
ADAPTER = "garethpaul/gpt-oss-20b-children-qlora"
SAVE_TO = "./gpt-oss-20b-children-merged"

tok = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype="bfloat16", device_map="auto")
model = PeftModel.from_pretrained(model, ADAPTER)

# Fold the LoRA weights into the base model and drop the PEFT wrapper
merged = model.merge_and_unload()
merged.save_pretrained(SAVE_TO)
tok.save_pretrained(SAVE_TO)
```
For vLLM GPT-OSS serving: re-export the merged weights to MXFP4 (GPT-OSS layout) before hosting.
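Before re-exporting, you can sanity-check that the merged folder loads as a plain `transformers` checkpoint (a minimal sketch; no PEFT required at this point):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

SAVE_TO = "./gpt-oss-20b-children-merged"

# The merged folder is a standard transformers checkpoint; PEFT is no longer needed.
tok = AutoTokenizer.from_pretrained(SAVE_TO)
model = AutoModelForCausalLM.from_pretrained(SAVE_TO, torch_dtype="bfloat16", device_map="auto")
print(model.config.architectures)  # should report the GPT-OSS causal LM class
```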
## ✅ Intended uses

- Generating kid-safe bedtime stories with clear morals.
- Producing structured JSON for downstream apps (mobile readers, voice apps, curriculum tools).
## 🔧 Training details

- Method: QLoRA (PEFT) with `r=8`, `lora_alpha=16`, `lora_dropout≈0.05`, `bias="none"`
- Targets: GPT-OSS linear layers (MoE-aware); started with `target_modules="all-linear"`
- Base: `openai/gpt-oss-20b` (MoE; attention unquantized; MXFP4 weights dequantized for training)
- Frameworks: `transformers`, `peft`, `trl` (`SFTTrainer`)
- Objective: supervised fine-tuning to produce JSON stories (500–800 words)
- Typical SFT args (example): `bf16=True`, `gradient_checkpointing=True`, batch size 1 with gradient accumulation, cosine schedule with a minimum LR rate of 0.1, context up to 2048 tokens (see the configuration sketch below)
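The training script itself is not published with this card. The sketch below shows one way these hyperparameters could map onto `peft`/`trl` objects; the learning rate, accumulation steps, output directory, and dataset split are illustrative assumptions, not values from the original run.

```python
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM
from trl import SFTConfig, SFTTrainer

BASE = "openai/gpt-oss-20b"

model = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype="bfloat16", device_map="auto")
dataset = load_dataset("garethpaul/children-stories-dataset", split="train")  # split name is an assumption
# Note: chat-message -> text formatting of the dataset is omitted here for brevity.

# LoRA settings as listed above
peft_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    target_modules="all-linear",
    task_type="CAUSAL_LM",
)

# Illustrative SFT arguments; learning_rate, accumulation, and output_dir are assumptions
sft_config = SFTConfig(
    output_dir="./gpt-oss-20b-children-qlora",
    bf16=True,
    gradient_checkpointing=True,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=2e-4,
    lr_scheduler_type="cosine_with_min_lr",
    lr_scheduler_kwargs={"min_lr_rate": 0.1},
    max_seq_length=2048,
)

trainer = SFTTrainer(
    model=model,
    args=sft_config,
    train_dataset=dataset,
    peft_config=peft_config,
)
trainer.train()
```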
## 📚 Data

- Primary: `garethpaul/children-stories-dataset` (human + synthetic)
- Formatting: chat messages prompting the JSON schema; bedtime tone; positive ending (see the schematic record below).
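The dataset rows are not reproduced here; the following is only a schematic illustration of one chat-formatted record, with all field contents as placeholders:

```python
# Schematic record; the actual dataset contents are not reproduced here.
example = {
    "messages": [
        {"role": "system", "content": "You are StoryWeaver. Respond ONLY in valid JSON with keys: "
                                      "{title, characters, setting, story, moral}."},
        {"role": "user", "content": "Tell me a bedtime story about <topic>."},
        {"role": "assistant", "content": '{"title": "...", "characters": ["..."], "setting": "...", '
                                         '"story": "<500-800 word bedtime story>", "moral": "..."}'},
    ]
}
```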
## 🔬 Evaluation (qualitative)

Manual spot-checks cover:

- JSON validity and required keys.
- Word-count (500–800) adherence.
- Bedtime tone and a positive moral.

A sketch of the mechanical checks follows below. (If you later log structured evals such as JSON pass rate, average word count, or toxicity checks, add them under `model-index.results`.)
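No automated harness ships with this card; a minimal sketch of the first two checks, assuming `raw` holds one raw model completion:

```python
import json

REQUIRED_KEYS = {"title", "characters", "setting", "story", "moral"}

def spot_check(raw: str) -> dict:
    """Check one completion for JSON validity, required keys, and story length."""
    start, end = raw.find("{"), raw.rfind("}") + 1
    try:
        obj = json.loads(raw[start:end])
    except ValueError:
        return {"valid_json": False}
    if not isinstance(obj, dict):
        return {"valid_json": False}
    words = len(obj.get("story", "").split())
    return {
        "valid_json": True,
        "has_required_keys": REQUIRED_KEYS <= obj.keys(),
        "word_count": words,
        "in_range": 500 <= words <= 800,
    }
```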
## 🏗 Technical specs

- Architecture: GPT-OSS 20B MoE (low active parameter count)
- Context window: 8192 tokens (prompt + output)
- Adapter size: ~16 MB (safetensors, PEFT)
Framework versions:
- transformers ≈ 4.56
- peft ≈ 0.12
- trl ≈ 0.9
- accelerate ≈ 0.34
## 📄 Citation

```bibtex
@misc{gpt-oss-20b-children-qlora,
  author       = {Gareth Paul},
  title        = {GPT-OSS 20B — Children QLoRA (Adapter)},
  year         = {2025},
  howpublished = {\url{https://huggingface.co/garethpaul/gpt-oss-20b-children-qlora}}
}
```
Contact: ping @garethpaul on the Hub.