📚 Archive of Our Own (AO3) Dataset: nyuuzyou/archiveofourown
Collection of approximately 12.6 million fanfiction works (from 63.2M processed IDs) featuring:
- Full text content from diverse fandoms across television, film, books, anime, and more
- Comprehensive metadata including warnings, relationships, characters, and tags
- Multilingual content with works in 40+ languages, though English predominates
- Rich classification data preserving author-created folksonomy and content categorization
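For a sense of scale, here is a quick back-of-the-envelope check of how many processed IDs turned into works. It uses only the two approximate figures quoted above; the variable names are just for illustration, and the post does not say why the remaining IDs yielded nothing.

```python
# Figures quoted in the post (both approximate).
processed_ids = 63_200_000
collected_works = 12_600_000

# Fraction of processed IDs that ended up as works in the dataset.
yield_ratio = collected_works / processed_ids
print(f"{yield_ratio:.1%} of processed IDs yielded a work")
```

So roughly one in five processed IDs corresponds to a work in the collection.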
P.S. This is the most expensive dataset I've created so far! And also, thank you all for the 100 followers on Hugging Face!