izumi-lab's Collections

Miscellaneous Text Datasets for Language Models

updated Feb 20

  • izumi-lab/oscar2301-ja-filter-ja-normal
    Updated Jul 29, 2023 • 31.4M • 300 • 5

  • izumi-lab/mc4-ja
    Updated Jul 29, 2023 • 87.4M • 3.73k • 6

  • izumi-lab/mc4-ja-filter-ja-normal
    Updated Jul 29, 2023 • 52.6M • 1.76k • 4

  • izumi-lab/wikinews-ja-20230728
    Updated Jul 29, 2023 • 4.28k • 91 • 5

  • izumi-lab/wikipedia-ja-20230720
    Updated Jul 29, 2023 • 1.36M • 346 • 12

  • izumi-lab/open-text-books
    Updated Aug 1, 2023 • 150k • 81 • 16

  • izumi-lab/pile-modified
    Updated Aug 5, 2023 • 211M • 2.35k • 3

  • izumi-lab/wikinews-en-20230728
    Updated Jul 29, 2023 • 43.2k • 34 • 2

  • izumi-lab/wikipedia-en-20230720
    Updated Jul 29, 2023 • 6.65M • 334 • 7