# Cass-Beta1.3: From-Scratch Meme-Teen AI Transformer
Cass-Beta1.3 is a fully-from-scratch Transformer language model with a PG-13 meme-teen personality. It does not use any pretrained weights; all knowledge comes from auto-generated personality prompts and adaptive learning from user interactions.
## Model Overview
- Architecture: GPT-2 style Transformer (`GPT2LMHeadModel`)
- Parameters: Small and lightweight (~1 million parameters), suitable for a 12 GB GPU
- Tokenizer: Custom BPE tokenizer trained from scratch
- Training Data:
- 100 auto-generated personality prompts (PG-13, meme-teen)
- Incrementally updated with user chat memory for adaptive learning
- Personality: Funny, chill, slang-heavy, PG-13
- Memory Learning: Model fine-tunes itself every 10 user messages, adapting to user style
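The "fine-tunes itself every 10 user messages" behavior above can be sketched as a simple buffer that triggers an update when it fills. This is an illustrative sketch only; the buffer size constant and the `ChatMemory` class name are assumptions, not the actual Cass implementation.

```python
MEMORY_THRESHOLD = 10  # assumed trigger: fine-tune after this many exchanges


class ChatMemory:
    """Hypothetical buffer of recent exchanges awaiting a fine-tuning pass."""

    def __init__(self, threshold=MEMORY_THRESHOLD):
        self.threshold = threshold
        self.buffer = []  # recent (user_msg, reply) pairs

    def add(self, user_msg, reply):
        """Store one exchange; return True when a fine-tune is due."""
        self.buffer.append((user_msg, reply))
        return len(self.buffer) >= self.threshold

    def drain(self):
        """Hand the buffered exchanges to the fine-tuning step and reset."""
        batch, self.buffer = self.buffer, []
        return batch


memory = ChatMemory()
for i in range(10):
    due = memory.add(f"msg {i}", f"reply {i}")
print(due)                   # fine-tune is due after the 10th message
print(len(memory.drain()))   # exchanges passed to the training step
```

In a real loop, `drain()` would feed the batch into a short training pass over the model before chatting resumes.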
## Intended Use
- Personal chatbot with a meme-teen style
- Text generation for PG-13 contexts
- Educational/demo purposes for small-scale Transformer training
## Limitations
- Small parameter count → limited reasoning capability
- Slang-heavy personality may produce nonsensical or repetitive output
- Memory learning is local to user interactions; may overfit short-term style
- Lookup functionality is simulated; no live web access
## Files Included
| File | Description |
|---|---|
| `pytorch_model.bin` | Model weights (from scratch) |
| `config.json` | Model configuration and hyperparameters |
| `tokenizer.json` | Custom BPE tokenizer |
| `tokenizer_config.json` | Tokenizer configuration for Hugging Face |
| `special_tokens_map.json` | Mapping for special tokens (`<pad>`, `<s>`, `</s>`, `<unk>`) |
| `cass_memory.json` | Optional saved user chats for adaptive learning |
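Since `cass_memory.json` stores user chats for adaptive learning, it can be maintained as a simple append-only JSON list. The field names (`user`, `cass`) and the flat-list layout here are assumptions for illustration; the actual schema used by Cass-Beta1.3 is not documented.

```python
import json
from pathlib import Path


def append_exchange(path, user_msg, reply):
    """Append one chat exchange to the memory file, creating it if absent.

    The {"user": ..., "cass": ...} record shape is a hypothetical schema.
    """
    p = Path(path)
    history = json.loads(p.read_text(encoding="utf-8")) if p.exists() else []
    history.append({"user": user_msg, "cass": reply})
    p.write_text(json.dumps(history, indent=2), encoding="utf-8")
    return history


history = append_exchange("cass_memory.json", "yo cass", "not much, vibin")
print(len(history))
```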
## Usage Example
```python
from transformers import GPT2LMHeadModel, PreTrainedTokenizerFast

# Load model and tokenizer
model = GPT2LMHeadModel.from_pretrained("DSDUDEd/Cass-Beta1.3")
tokenizer = PreTrainedTokenizerFast.from_pretrained("DSDUDEd/Cass-Beta1.3")

# Encode user input
input_text = "yo cass, what's up?"
inputs = tokenizer(input_text, return_tensors="pt")

# Generate reply
outputs = model.generate(
    **inputs,
    max_length=32,
    do_sample=True,
    temperature=0.8,
    pad_token_id=tokenizer.pad_token_id,
)
reply = tokenizer.decode(outputs[0], skip_special_tokens=True)
print("Cass:", reply)
```
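The `temperature=0.8` argument in the usage example controls how sharply sampling favors high-probability tokens. A minimal, library-free sketch of temperature sampling over toy logits (the function name and toy values are illustrative, not part of the model):

```python
import math
import random


def sample_with_temperature(logits, temperature=0.8, rng=random):
    """Softmax over logits / temperature, then draw one index."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]  # subtract max for numerical stability
    total = sum(exps)
    probs = [e / total for e in exps]
    r, acc = rng.random(), 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1


# Lower temperature concentrates probability mass on the highest logit
random.seed(0)
picks = [sample_with_temperature([2.0, 1.0, 0.1], temperature=0.2) for _ in range(100)]
print(picks.count(0))  # index 0 dominates at low temperature
```

Raising the temperature toward 1.0 and above flattens the distribution, which makes replies more varied but also more prone to the nonsensical output noted under Limitations.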