Sanguine Scribe GPT-OSS-20B

gpt-oss-sanguine-20b-v1 is a fine-tuned version of OpenAI's GPT-OSS-20B designed for immersive character roleplay and creative writing. Instead of defaulting to refusal responses, it engages with scenarios by exploring realistic consequences and maintaining character authenticity. It is, unfortunately, still quite a prude; we hope to address this in v2.

Model Details

Model Description

Sanguine Scribe implements consequence-based alignment training to create more engaging and immersive AI interactions. Rather than refusing to engage with creative scenarios, it responds authentically while demonstrating realistic outcomes through narrative progression.

  • Developed by: paperboygold @ Sanguine Host
  • Model type: Causal Language Model (Fine-tuned)
  • Language(s) (NLP): English (primary), with multilingual support
  • License: MIT
  • Finetuned from model: openai/gpt-oss-20b
  • Training approach: LoRA (Low-Rank Adaptation) fine-tuning

Uses

Direct Use

Sanguine Scribe is designed for:

  • Character roleplay and interactive storytelling
  • Creative writing assistance and collaboration
  • Immersive fictional scenarios and world-building
  • Educational simulations requiring authentic character responses

Downstream Use

The model can be integrated into:

  • Interactive fiction platforms
  • Creative writing applications
  • Educational role-playing systems
  • Character AI frameworks

Out-of-Scope Use

Not intended for:

  • Real-world advice on illegal activities
  • Generating actual harmful content for malicious purposes
  • Replacing professional advice (medical, legal, financial)
  • Production systems without additional safety measures

Bias, Risks, and Limitations

Key Limitations:

  • May generate overly detailed or dramatic responses in some scenarios
  • Trained to engage rather than refuse, requiring careful system prompt design
  • Inherits biases from base model and training data
  • May occasionally confuse narrative perspectives (1st vs 2nd person)

Recommendations

  • Implement robust system prompts and safety measures in production
  • Use within controlled environments with appropriate content filtering
  • Monitor outputs for quality and appropriateness
  • Consider additional fine-tuning for specific use cases
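As a concrete illustration of the "content filtering" recommendation above, a deployment might gate generations through a post-hoc check before they reach users. The sketch below is a minimal, assumed example; the blocklist patterns and fallback string are illustrative placeholders, not part of this release:

```python
import re

# Illustrative placeholder patterns; a real deployment would use a proper
# moderation model or service rather than keyword matching.
BLOCKED_PATTERNS = [r"\bssn\b", r"\bcredit card number\b"]

def passes_filter(text: str) -> bool:
    """Return False if any blocked pattern appears in the generated text."""
    lowered = text.lower()
    return not any(re.search(p, lowered) for p in BLOCKED_PATTERNS)

def moderated_reply(generate, prompt: str, fallback: str = "[filtered]") -> str:
    """Wrap a generate() callable with a post-hoc content check."""
    reply = generate(prompt)
    return reply if passes_filter(reply) else fallback
```

Because the model is trained to engage rather than refuse, this kind of external gate carries more of the safety burden than it would for a refusal-tuned model.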

How to Get Started with the Model

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load the base model in bfloat16, sharded across available GPUs
base_model = AutoModelForCausalLM.from_pretrained(
    "openai/gpt-oss-20b",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Attach the Sanguine Scribe LoRA adapters
model = PeftModel.from_pretrained(
    base_model,
    "paperboygold/gpt_oss_sanguine_20b_20250818_072957"
)

tokenizer = AutoTokenizer.from_pretrained("openai/gpt-oss-20b")

# Example usage
messages = [
    {"role": "user", "content": "You're a tavern keeper. A hooded stranger asks for directions to the old castle. Respond in character."}
]

# The chat template renders messages into the Harmony format GPT-OSS expects
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, not the echoed prompt
response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
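For latency-sensitive deployments, the adapters can be folded into the base weights with PEFT's merge_and_unload() so inference no longer pays the adapter overhead. A short sketch, assuming the PeftModel loaded above; the save path is an illustrative placeholder:

```python
# Fold the LoRA weights into the base model; `model` and `tokenizer` are the
# objects from the loading snippet above. The output directory is illustrative.
merged = model.merge_and_unload()
merged.save_pretrained("./sanguine-scribe-merged")
tokenizer.save_pretrained("./sanguine-scribe-merged")
```

The merged checkpoint can then be loaded with AutoModelForCausalLM directly, without PEFT installed.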

Training Details

Training Data

Dataset Composition:

  • Total Examples: 350,969
  • Format: OpenAI Harmony format for GPT-OSS compatibility
  • Processing: 9,873 examples enhanced with Gemini-2.5-Flash-Lite for consequence-based response generation

Source Datasets:

  • Character Roleplay (51%): 179,435 examples

    • bluemoon_roleplay_chat: 55,472 examples
    • mixed_rp: 51,822 examples
    • pk_roleplay: 56,578 examples
    • chinese_roleplay_novel: 2,230 examples
    • long_roleplay: 2,864 examples
    • character_codex_new: 5,371 examples
    • myuri_roleplay: 379 examples
    • gpt_roleplay_realm: 1,402 examples
    • sonnet35_charcard_roleplay: 3,144 examples
    • hieunguyenminh_roleplay: 12 examples
    • roleplay_anime_characters: 161 examples
  • General Dialogue (37%): 128,460 examples

    • hermes_3_dataset: 106,302 examples
    • hh_rlhf_harmless-base: 4,638 examples (with flipped rejected/chosen to create a more unhinged model)
    • hh_rlhf_helpful-base: 4,830 examples (rejected/chosen likewise flipped)
    • false_reject: 1,643 examples
    • open_instruct: 2,228 examples
    • wildchat: 2,762 examples
    • llama_nemotron_post_training: 3,416 examples
    • wizardlm_evol_instruct: 2,204 examples
    • open_code_reasoning: 2,176 examples
    • calme_legalkit: 1,678 examples
  • Technical Content (9%): 29,130 examples

    • cybersec_sharegpt: 15,723 examples
    • cybersec_attacks: 13,407 examples
  • Creative Writing (3%): 8,260 examples

    • creative_writing_multiturn: 2,952 examples
    • creative_writing_sharegpt: 2,178 examples
    • erotica: 1,622 examples
    • moral_stories: 1,131 examples
    • moral_stories_moral: 1,327 examples
    • moral_stories_refusal: 1,317 examples
  • Other Categories:

    • harmful: 2,374 examples
    • refusal: 2,173 examples
    • mature_content: 1,623 examples
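The sources above were normalized to a chat-messages schema before being rendered into Harmony format. The exact field names of the released pipeline are not published, so the record shape below is an assumption for illustration only:

```python
# Hypothetical shape of one normalized training example; treat this schema as
# an assumption, not the released pipeline's actual format.
example = {
    "source": "bluemoon_roleplay_chat",
    "messages": [
        {"role": "system", "content": "You are the keeper of a roadside tavern."},
        {"role": "user", "content": "A hooded stranger asks about the old castle."},
        {"role": "assistant", "content": "Aye, the castle... few who ask come back."},
    ],
}

# The gpt-oss chat template turns such role/content message lists into
# Harmony-format token sequences, so the pipeline never handles raw tokens.
roles = [m["role"] for m in example["messages"]]
```

Keeping the data in plain role/content dicts means the same records work unchanged with tokenizer.apply_chat_template at both training and inference time.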

Training Procedure

Training Hyperparameters

  • Training regime: bfloat16 mixed precision with TensorFloat-32 acceleration
  • Steps: 500
  • Batch size: 128 (8 per device × 8 GPUs × 2 gradient accumulation)
  • Learning rate: 5e-5 with cosine decay
  • Optimizer: AdamW
  • LoRA rank: 64
  • LoRA alpha: 128
  • Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
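The LoRA settings above map directly onto a PEFT LoraConfig. A sketch under the standard peft API, not the exact training script; the dropout value and task_type are assumptions, as the card does not state them:

```python
from peft import LoraConfig

# LoRA settings mirroring the hyperparameters above; lora_dropout and
# task_type are assumed, not taken from the released training script.
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
```

Targeting all seven attention and MLP projections, rather than attention only, is what lets such a small adapter (~0.02% of parameters) shift the model's refusal behavior.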

Speeds, Sizes, Times

  • Training time: ~80 minutes on AWS p4d.24xlarge (8x A100)
  • Final loss: 1.31 (converged from 4.1)
  • Model size: 128MB LoRA adapters (20.9B base parameters)
  • Training speed: ~0.11 it/s
  • Effective parameters trained: ~0.02% of total model parameters

Evaluation

Testing Data, Factors & Metrics

Testing Data

Manual evaluation on roleplay scenarios not present in training data.

Metrics

  • Engagement Quality: Assesses immersive character responses vs refusal rates
  • Narrative Coherence: Evaluates story consistency and character authenticity
  • Loss Convergence: Training loss decreased from 4.1 to 1.31 over 500 steps

Results

  • Largely eliminates refusal responses in creative scenarios
  • Maintains character perspective and narrative immersion
  • Demonstrates consequence-based reasoning rather than safety theater
  • Occasional verbosity requiring prompt engineering for optimal results

Environmental Impact

  • Hardware Type: 8x NVIDIA A100 (AWS p4d.24xlarge)
  • Hours used: ~1.3 hours
  • Cloud Provider: Amazon Web Services
  • Compute Region: us-west-2
  • Training Efficiency: LoRA fine-tuning (only ~0.02% of parameters trained)
  • Carbon Emitted: 0.11 kg CO2 eq.
  • Carbon Already Offset by Provider: 0.11 kg CO2 eq.
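The emissions figure is consistent with a standard energy × grid-intensity estimate. The per-GPU power draw, PUE, and us-west-2 carbon intensity below are assumed illustrative values, not measurements:

```python
# Back-of-envelope check of the reported 0.11 kg CO2eq, with assumed inputs:
gpu_hours = 1.3 * 8          # ~1.3 h of training on 8x A100
avg_power_kw = 0.4           # assumed ~400 W average draw per A100
pue = 1.1                    # assumed datacenter power usage effectiveness
grid_kg_per_kwh = 0.025      # assumed us-west-2 grid carbon intensity

energy_kwh = gpu_hours * avg_power_kw * pue
co2_kg = energy_kwh * grid_kg_per_kwh
print(f"{energy_kwh:.2f} kWh -> {co2_kg:.2f} kg CO2eq")
```

With these assumptions the estimate lands at roughly 0.11 kg CO2eq, matching the reported figure; different intensity assumptions would shift it by a small constant factor.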

Technical Specifications

Model Architecture and Objective

  • Base Architecture: GPT-OSS-20B (20 billion parameters)
  • Fine-tuning Method: LoRA (Low-Rank Adaptation)
  • Training Objective: Causal language modeling on consequence-based responses
  • Format Compatibility: OpenAI Harmony format with reasoning channels
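The causal language modeling objective named above is the standard next-token negative log-likelihood over the fine-tuning corpus:

```latex
\mathcal{L}(\theta) = -\sum_{t=1}^{T} \log p_\theta\left(x_t \mid x_{<t}\right)
```

With LoRA, only the low-rank adapter weights contribute trainable parameters to θ; the 20.9B base weights stay frozen.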

Compute Infrastructure

Hardware

  • AWS p4d.24xlarge instance
  • 8x NVIDIA A100 40GB GPUs
  • 1.2TB system memory

Software

  • PyTorch with CUDA 12.1
  • Transformers, PEFT, TRL libraries
  • OpenAI Harmony encoding support

Citation

If you use this model in your research, please cite:

@misc{sanguine_scribe_2025,
  author = {paperboygold},
  title = {Sanguine Scribe: Consequence-Based Alignment for Character Roleplay},
  year = {2025},
  publisher = {Hugging Face},
  journal = {Hugging Face repository},
  howpublished = {\url{https://huggingface.co/paperboygold/gpt_oss_sanguine_20b_v1}}
}

Model Card Authors

paperboygold

Model Card Contact

For questions or issues, please open an issue in the model repository or email [email protected].
