# GPT-OSS 20B — Children QLoRA (Adapter)

A QLoRA adapter for `openai/gpt-oss-20b`, fine-tuned on children’s stories to produce structured JSON outputs suitable for bedtime content and educational demos.
- Author: @garethpaul
- Base model: `openai/gpt-oss-20b`
- Training data: `garethpaul/children-stories-dataset`
- Format: PEFT LoRA adapter (not full weights)
- License: MIT
## ✨ What this model does

Generates friendly, positive children’s bedtime stories following this JSON schema:

```json
{
  "title": "string",
  "characters": ["string"],
  "setting": "string",
  "story": "string (500–800 words, bedtime tone)",
  "moral": "string"
}
```
## 🚀 Quickstart (Transformers + PEFT)

> Note: vLLM’s GPT-OSS backend does not currently load LoRA adapters for `GptOssForCausalLM`. Use `transformers` + `peft` to run the adapter, or merge and export to MXFP4 for vLLM.
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

BASE = "openai/gpt-oss-20b"
ADAPTER = "garethpaul/gpt-oss-20b-children-qlora"

tokenizer = AutoTokenizer.from_pretrained(BASE)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Load the base model, then attach the adapter
model = AutoModelForCausalLM.from_pretrained(
    BASE,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, ADAPTER)
model.eval()

system = "You are StoryWeaver. Respond ONLY in valid JSON with keys: {title, characters, setting, story, moral}."
messages = [
    {"role": "system", "content": system},
    {"role": "user", "content": "Tell me a bedtime story about a brave little car."},
]

# Apply the chat template, then tokenize to get an attention_mask
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
enc = tokenizer(prompt, return_tensors="pt", return_attention_mask=True).to(model.device)

with torch.no_grad():
    out = model.generate(**enc, max_new_tokens=700, do_sample=True, temperature=0.7, top_p=0.9)

print(tokenizer.decode(out[0], skip_special_tokens=True))
```
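The decoded text above includes the prompt. For downstream use, you can slice off the prompt tokens and parse the JSON body. A minimal sketch, assuming the model returned a single well-formed JSON object (add error handling for malformed outputs):

```python
import json

# Decode only the newly generated tokens (everything after the prompt)
generated = tokenizer.decode(out[0][enc["input_ids"].shape[-1]:], skip_special_tokens=True)

# Extract the first {...} span and parse it (assumes one well-formed JSON object)
story = json.loads(generated[generated.find("{"): generated.rfind("}") + 1])
print(story["title"])
print(story["moral"])
```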
## 🧩 How to merge (optional)

If you want a single checkpoint (e.g., to share without PEFT):
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE = "openai/gpt-oss-20b"
ADAPTER = "garethpaul/gpt-oss-20b-children-qlora"
SAVE_TO = "./gpt-oss-20b-children-merged"

tok = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype="bfloat16", device_map="auto")
model = PeftModel.from_pretrained(model, ADAPTER)

# Fold the LoRA weights into the base model and drop the PEFT wrapper
merged = model.merge_and_unload()
merged.save_pretrained(SAVE_TO)
tok.save_pretrained(SAVE_TO)
```
For vLLM GPT-OSS serving: re-export the merged weights to MXFP4 (GPT-OSS layout) before hosting.
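Before re-exporting, you can sanity-check that the merged folder loads as a plain `transformers` checkpoint (a minimal sketch; no PEFT required at this point):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

SAVE_TO = "./gpt-oss-20b-children-merged"

# The merged folder is a standard transformers checkpoint; PEFT is no longer needed.
tok = AutoTokenizer.from_pretrained(SAVE_TO)
model = AutoModelForCausalLM.from_pretrained(SAVE_TO, torch_dtype="bfloat16", device_map="auto")
print(model.config.architectures)  # should report the GPT-OSS causal LM class
```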
## ✅ Intended uses

- Generating kid-safe bedtime stories with clear morals.
- Producing structured JSON for downstream apps (mobile readers, voice apps, curriculum tools).
## 🔧 Training details

- Method: QLoRA (PEFT) with `r=8`, `lora_alpha=16`, `lora_dropout≈0.05`, `bias="none"`
- Targets: GPT-OSS linear layers (MoE-aware); started with `target_modules="all-linear"`
- Base: `openai/gpt-oss-20b` (MoE; attention unquantized; MXFP4 weights dequantized for training)
- Frameworks: `transformers`, `peft`, `trl` (`SFTTrainer`)
- Objective: supervised fine-tuning to produce JSON stories (500–800 words)
- Typical SFT args (example): `bf16=True`, `gradient_checkpointing=True`, batch size 1 with gradient accumulation, cosine schedule with a minimum LR rate of 0.1, context up to 2048 tokens (see the configuration sketch below)
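The training script itself is not published with this card. The sketch below shows one way these hyperparameters could map onto `peft`/`trl` objects; the learning rate, accumulation steps, output directory, and dataset split are illustrative assumptions, not values from the original run.

```python
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM
from trl import SFTConfig, SFTTrainer

BASE = "openai/gpt-oss-20b"

model = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype="bfloat16", device_map="auto")
dataset = load_dataset("garethpaul/children-stories-dataset", split="train")  # split name is an assumption
# Note: chat-message -> text formatting of the dataset is omitted here for brevity.

# LoRA settings as listed above
peft_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    target_modules="all-linear",
    task_type="CAUSAL_LM",
)

# Illustrative SFT arguments; learning_rate, accumulation, and output_dir are assumptions
sft_config = SFTConfig(
    output_dir="./gpt-oss-20b-children-qlora",
    bf16=True,
    gradient_checkpointing=True,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=2e-4,
    lr_scheduler_type="cosine_with_min_lr",
    lr_scheduler_kwargs={"min_lr_rate": 0.1},
    max_seq_length=2048,
)

trainer = SFTTrainer(
    model=model,
    args=sft_config,
    train_dataset=dataset,
    peft_config=peft_config,
)
trainer.train()
```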
## 📚 Data

- Primary: `garethpaul/children-stories-dataset` (human + synthetic)
- Formatting: chat messages prompting the JSON schema; bedtime tone; positive ending (see the schematic record below).
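The dataset rows are not reproduced here; the following is only a schematic illustration of one chat-formatted record, with all field contents as placeholders:

```python
# Schematic record; the actual dataset contents are not reproduced here.
example = {
    "messages": [
        {"role": "system", "content": "You are StoryWeaver. Respond ONLY in valid JSON with keys: "
                                      "{title, characters, setting, story, moral}."},
        {"role": "user", "content": "Tell me a bedtime story about <topic>."},
        {"role": "assistant", "content": '{"title": "...", "characters": ["..."], "setting": "...", '
                                         '"story": "<500-800 word bedtime story>", "moral": "..."}'},
    ]
}
```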
## 🔬 Evaluation (qualitative)

Manual spot-checks cover:

- JSON validity and required keys.
- Word-count (500–800) adherence.
- Bedtime tone and a positive moral.

A sketch of the mechanical checks follows below. (If you later log structured evals such as JSON pass rate, average word count, or toxicity checks, add them under `model-index.results`.)
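No automated harness ships with this card; a minimal sketch of the first two checks, assuming `raw` holds one raw model completion:

```python
import json

REQUIRED_KEYS = {"title", "characters", "setting", "story", "moral"}

def spot_check(raw: str) -> dict:
    """Check one completion for JSON validity, required keys, and story length."""
    start, end = raw.find("{"), raw.rfind("}") + 1
    try:
        obj = json.loads(raw[start:end])
    except ValueError:
        return {"valid_json": False}
    if not isinstance(obj, dict):
        return {"valid_json": False}
    words = len(obj.get("story", "").split())
    return {
        "valid_json": True,
        "has_required_keys": REQUIRED_KEYS <= obj.keys(),
        "word_count": words,
        "in_range": 500 <= words <= 800,
    }
```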
## 🏗 Technical specs

- Architecture: GPT-OSS 20B MoE (low active parameter count)
- Context window: 8192 tokens (prompt + output)
- Adapter size: ~16 MB (safetensors, PEFT)
Framework versions:
- transformers ≈ 4.56
- peft ≈ 0.12
- trl ≈ 0.9
- accelerate ≈ 0.34
## 📄 Citation

```bibtex
@misc{gpt-oss-20b-children-qlora,
  author       = {Gareth Paul},
  title        = {GPT-OSS 20B — Children QLoRA (Adapter)},
  year         = {2025},
  howpublished = {\url{https://huggingface.co/garethpaul/gpt-oss-20b-children-qlora}}
}
```
Contact: ping @garethpaul on the Hub.