Qwen3-50M Storyteller
Fine-tuned version of Qwen3-50M specialized for storytelling tasks, trained on the TinyStories dataset.
Training Results
Loss Metrics
- Final Training Loss: 4.9083
- Final Validation Loss: 4.2214
- Initial Validation Loss: 7.9470
- Validation Loss Improvement: 3.7256 (46.88% reduction)
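The reduction figure follows directly from the initial and final validation losses. The sketch below recomputes it from the values rounded to four decimal places. It also converts the losses to perplexity, assuming they are mean per-token cross-entropy in nats (typical for causal language modeling losses, but not stated for this run).

```python
import math

# Validation losses from the metrics above, rounded to four decimal places.
initial_val_loss = 7.9470
final_val_loss = 4.2214

# Absolute and relative improvement in validation loss.
improvement = initial_val_loss - final_val_loss
reduction_pct = 100 * improvement / initial_val_loss

# Assumption: losses are mean per-token cross-entropy in nats,
# in which case perplexity is exp(loss).
initial_ppl = math.exp(initial_val_loss)
final_ppl = math.exp(final_val_loss)

print(f"improvement: {improvement:.4f} ({reduction_pct:.2f}% reduction)")
print(f"perplexity: {initial_ppl:.0f} -> {final_ppl:.0f}")
```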
Training Configuration
- Training Epochs: 3
- Learning Rate: 2e-05
- Batch Size: 4
- Max Sequence Length: 512 tokens
- Weight Decay: 0.01
- Warmup Ratio: 0.1
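A warmup ratio is a fraction of the total number of optimizer steps, so the warmup length depends on the dataset size, which is not reported here. The sketch below shows the arithmetic under stated assumptions: a single device, no gradient accumulation, and a hypothetical example count chosen only for illustration.

```python
import math

def schedule_steps(num_examples, epochs=3, batch_size=4, warmup_ratio=0.1):
    """Return (total_steps, warmup_steps).

    Assumes one device and no gradient accumulation, so each
    optimizer step consumes exactly `batch_size` examples.
    """
    steps_per_epoch = math.ceil(num_examples / batch_size)
    total_steps = steps_per_epoch * epochs
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    return total_steps, warmup_steps

# Hypothetical subset size, not the size actually used for this model.
total, warmup = schedule_steps(num_examples=10_000)
print(f"total steps: {total}, warmup steps: {warmup}")
```

With these defaults the learning rate would ramp up to 2e-05 over the first tenth of training.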
Model Details
- Precision: FP16 (Half Precision)
- Base Model: Mostafa8Mehrabi/qwen3-50m
- Dataset: TinyStories
- Task: Causal Language Modeling (Story Generation)
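As a rough guide to what FP16 saves, the sketch below estimates memory for the weights alone. It assumes that "50M" in the model name means about 50 million parameters. The exact count is not given here, and activations and the generation cache add to the total.

```python
def weight_memory_mb(num_params, bytes_per_param):
    """Approximate memory for model weights only, in megabytes (10^6 bytes)."""
    return num_params * bytes_per_param / 1e6

# Assumption: roughly 50 million parameters, inferred from the model name.
NUM_PARAMS = 50_000_000

fp16_mb = weight_memory_mb(NUM_PARAMS, 2)  # FP16 stores 2 bytes per value
fp32_mb = weight_memory_mb(NUM_PARAMS, 4)  # FP32 stores 4 bytes per value
print(f"FP16: ~{fp16_mb:.0f} MB, FP32: ~{fp32_mb:.0f} MB")
```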
Usage
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("Mostafa8Mehrabi/qwen3-50m-storyteller")
model = AutoModelForCausalLM.from_pretrained(
    "Mostafa8Mehrabi/qwen3-50m-storyteller",
    torch_dtype=torch.float16,  # Use fp16 for efficiency
    device_map="auto",
)

# Generate a story
prompt = "<|story|>Once upon a time, there was a brave little mouse who"
# Move the inputs to the device that device_map="auto" placed the model on
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_length=200,  # total length in tokens, prompt included
        do_sample=True,
        temperature=0.8,
        top_p=0.9,
        repetition_penalty=1.1,
        pad_token_id=tokenizer.pad_token_id,
    )

story = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(story)
Story Format
The model expects stories to be formatted with special tokens:
- Start: <|story|>
- End: <|endstory|>
Example:
<|story|>Once upon a time, there was a magical forest where animals could talk...<|endstory|>
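The format above can be applied and undone with two small helpers. This is a minimal sketch: `build_prompt` and `extract_story` are hypothetical names, not part of the model's repository. It also assumes the markers appear as literal text in the decoded output, which may require decoding with `skip_special_tokens=False` if the tokenizer registers them as special tokens.

```python
STORY_START = "<|story|>"
STORY_END = "<|endstory|>"

def build_prompt(opening: str) -> str:
    """Prefix a story opening with the start marker."""
    return STORY_START + opening

def extract_story(generated: str) -> str:
    """Return the text between the start marker and the first end marker.

    Missing markers are tolerated: without a start marker the text is
    read from the beginning, without an end marker it is read to the end.
    """
    start = generated.find(STORY_START)
    if start != -1:
        generated = generated[start + len(STORY_START):]
    end = generated.find(STORY_END)
    if end != -1:
        generated = generated[:end]
    return generated.strip()

prompt = build_prompt("Once upon a time, there was a magical forest")
print(prompt)
print(extract_story(prompt + " where animals could talk.<|endstory|>extra"))
```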
Intended Use
This model is specifically designed for:
- Children's story generation
- Creative writing assistance
- Educational content creation
- Interactive storytelling applications
Limitations
- Optimized for short stories (up to 512 tokens)
- Trained primarily on simple, child-friendly narratives
- May not perform well on other text generation tasks
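One way to stay within the 512-token range the model was trained on is to cap the number of generated tokens by the space left after the prompt. The helper below is a sketch with a hypothetical name, `new_token_budget`. It works on plain token counts, so it makes no assumptions about the tokenizer.

```python
MAX_SEQ_LEN = 512  # maximum sequence length used during fine-tuning

def new_token_budget(prompt_token_count, requested_new_tokens,
                     max_seq_len=MAX_SEQ_LEN):
    """Cap the requested generation length so prompt plus output
    stays within max_seq_len tokens."""
    remaining = max(0, max_seq_len - prompt_token_count)
    return min(requested_new_tokens, remaining)

print(new_token_budget(20, 200))   # short prompt: the request fits
print(new_token_budget(400, 200))  # long prompt: the request is capped
```

The result can be passed to `model.generate` as `max_new_tokens`, with the prompt length taken from `inputs["input_ids"].shape[1]`.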
Performance
Fine-tuning improved the model's storytelling capability:
- Validation loss fell from 7.9470 to 4.2214 during training, a 46.88% reduction
- Generates coherent, engaging short stories
- Maintains appropriate tone and structure for children's content