arxiv:2504.20605

TF1-EN-3M: Three Million Synthetic Moral Fables for Training Small, Open Language Models

Published on Apr 29 · Submitted by mihainadas on May 2
Authors: Mihai Nădaș, Laura Dioșan, Andreea Tomescu, Andrei Pișcoran
Abstract

Moral stories are a time-tested vehicle for transmitting values, yet modern NLP lacks a large, structured corpus that couples coherent narratives with explicit ethical lessons. We close this gap with TF1-EN-3M, the first open dataset of three million English-language fables generated exclusively by instruction-tuned models no larger than 8B parameters. Each story follows a six-slot scaffold (character -> trait -> setting -> conflict -> resolution -> moral), produced through a combinatorial prompt engine that guarantees genre fidelity while covering a broad thematic space. A hybrid evaluation pipeline blends (i) a GPT-based critic that scores grammar, creativity, moral clarity, and template adherence with (ii) reference-free diversity and readability metrics. Among ten open-weight candidates, an 8B-parameter Llama-3 variant delivers the best quality-speed trade-off, producing high-scoring fables on a single consumer GPU (<24 GB VRAM) at approximately 13.5 cents per 1,000 fables. We release the dataset, generation code, evaluation scripts, and full metadata under a permissive license, enabling exact reproducibility and cost benchmarking. TF1-EN-3M opens avenues for research in instruction following, narrative intelligence, value alignment, and child-friendly educational AI, demonstrating that large-scale moral storytelling no longer requires proprietary giant models.

Community


🦊📚 Introducing TF1-EN-3M — Three Million Synthetic Moral Fables for Small Open-Weight LLMs

We’ve just released TF1-EN-3M, the largest open corpus of machine-generated moral fables to date — and it was created entirely with models no larger than 8B parameters. 🎉

📄 TF1-EN-3M: Three Million Synthetic Moral Fables for Training Small, Open Language Models


🌟 Why Another Story Dataset?

  • Existing collections such as Aesop’s Fables top out at a few hundred examples — far too small for today’s data-hungry models.
  • Most educational, on-device, or open-source projects can’t deploy 70B-parameter giants.
  • We asked: Can compact, fully open models (≤ 8B) generate a massive, high-quality, ethics-focused story corpus that anyone can fine-tune?

📦 What’s Inside TF1-EN-3M?

| Feature | Details |
| --- | --- |
| Size | 3,000,000 English fables (≈ 1B tokens) |
| Structure | Six-slot scaffold: character → trait → setting → conflict → resolution → moral |
| Audience | Written for 4–7-year-olds (simple vocabulary, explicit morals) |
| Metadata | Prompt, model name, token counts, latency, GPU type & cost per story |
| License | CC-BY-4.0 (free to remix, filter, or extend) |

👉 Dataset on the Hub: klusai/ds-tf1-en-3m


🤖 One-Paragraph Generation Recipe

A combinatorial engine expands six curated lists (100 options each) into millions of unique prompts.
Ten open-weight instruction models (1B–8B) compete; we score Grammar, Creativity, Moral Clarity, and Prompt Adherence with a gpt-o3-mini critic, plus Self-BLEU & Distinct-1 diversity checks.
LLaMA-3.1-8B-Instruct wins — great quality, tiny VRAM footprint, and costs < $0.0005 per story on an L40S GPU.
All code lives in the public tinyfabulist repo.
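To make the combinatorial engine concrete, here is a minimal sketch of the idea: take the Cartesian product of the six slot lists and fill a prompt template. The slot values and template wording below are illustrative stand-ins, not the paper's actual lists (which have 100 options per slot).

```python
import itertools

# Hypothetical slot lists -- the real dataset uses six curated lists of
# 100 options each; these short stand-ins just illustrate the mechanism.
characters = ["a clever fox", "a patient turtle"]
traits = ["greedy", "honest"]
settings = ["an ancient forest", "a busy market"]
conflicts = ["loses a treasured possession", "is tricked by a rival"]
resolutions = ["learns to ask for help", "makes amends"]
morals = ["honesty earns trust", "patience is rewarded"]

TEMPLATE = (
    "Write a short fable for children aged 4-7. "
    "Character: {c}. Trait: {t}. Setting: {s}. "
    "Conflict: {x}. Resolution: {r}. Moral: {m}."
)

def prompts():
    # Cartesian product over all six slots yields every unique combination.
    for c, t, s, x, r, m in itertools.product(
        characters, traits, settings, conflicts, resolutions, morals
    ):
        yield TEMPLATE.format(c=c, t=t, s=s, x=x, r=r, m=m)

# 2 options per slot -> 2**6 = 64 unique prompts here; with 100 options
# per slot the same product gives 100**6 = 10^12 combinations to sample.
all_prompts = list(prompts())
print(len(all_prompts))  # 64
```

With 100 options per slot, the product space is far larger than the 3M stories actually generated, so the engine samples from it rather than enumerating it exhaustively.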


🔍 Quick Quality Peek

  • Mean critic score: 7.8 / 10 (four axes)
  • Age fit: 80% tagged “Age B” (4–7 yrs)
  • Diversity: Self-BLEU 0.31 • Distinct-1 0.16
A quick way to peek at the data:

```python
from datasets import load_dataset, disable_caching

disable_caching()  # skip writing a local cache for a quick look
ds = load_dataset("klusai/ds-tf1-en-3m", split="train[:3%]")  # first 3% of rows
print(ds.shuffle(seed=42)[0]["fable"])  # print one random fable
```
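
The Distinct-1 number above is just the fraction of unique unigrams across the corpus (higher means more lexical diversity). A minimal sketch, using naive whitespace tokenization, which may differ from the paper's exact setup:

```python
from collections import Counter

def distinct_n(texts, n=1):
    """Fraction of unique n-grams across a corpus (higher = more diverse)."""
    ngrams = Counter()
    for text in texts:
        tokens = text.lower().split()  # naive whitespace tokenization
        for i in range(len(tokens) - n + 1):
            ngrams[tuple(tokens[i:i + n])] += 1
    total = sum(ngrams.values())
    return len(ngrams) / total if total else 0.0

# Toy two-sentence corpus: 14 tokens total, 11 unique -> 11/14 ~ 0.79
sample = [
    "the fox learned that honesty earns trust",
    "the turtle learned that patience is rewarded",
]
print(round(distinct_n(sample, n=1), 2))  # 0.79
```

Self-BLEU works the opposite way: each story is scored with BLEU against the rest of the corpus as references, so lower values indicate more diversity.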

🛠️ What Can You Do With It?

  • Fine-tune tiny LMs (1–3B) into bedtime-story generators that run on phones or edge devices.
  • Build moral-inference benchmarks: given a fable, predict its lesson.
  • Train alignment critics to verify kid-safe morals in generated text.
  • Translate the prompt lists and spawn French, Hindi, or Swahili mega-fable sets in a weekend GPU sprint.
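
For the moral-inference idea, a benchmark pair can be derived by splitting each fable into (story, lesson). The sketch below assumes the moral follows an explicit "Moral:" marker in the text; that convention is hypothetical, so inspect the dataset's actual format before relying on it.

```python
def split_moral(fable_text, marker="Moral:"):
    """Split a fable into (story, moral).

    Assumes the moral follows an explicit 'Moral:' marker -- a
    hypothetical convention; check the dataset's real layout first.
    """
    story, sep, moral = fable_text.rpartition(marker)
    if not sep:  # marker absent: treat the whole text as the story
        return fable_text.strip(), ""
    return story.strip(), moral.strip()

# Illustrative fable, not a real dataset record.
example = (
    "A greedy fox hoarded every berry in the forest until winter came "
    "and the other animals would not share with him. "
    "Moral: Kindness shared returns when you need it most."
)
story, moral = split_moral(example)
print(moral)  # Kindness shared returns when you need it most.
```

The resulting (story, moral) pairs could feed either a generation benchmark (predict the lesson) or a classification one (pick the lesson from distractors).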

Paper: TF1-EN-3M: Three Million Synthetic Moral Fables for Training Small, Open Language Models
Authors: Mihai Nădaș, Laura Dioșan, Andreea Tomescu & Andrei Pișcoran (KlusAI Labs & Babeș-Bolyai University)

Happy storytelling! 🎈

