GRMenon's Collections
BigBanyanTree CommonCrawl Data (2018 - 2024)

updated Jun 15

A collection of processed CommonCrawl data as part of the BigBanyanTree initiative. Each dataset is extracted from a random 1% sample of the data.

  • big-banyan-tree/BBT_CommonCrawl_2018

    Updated Oct 11, 2024 • 61.5M rows • 79 downloads • 3 likes

  • big-banyan-tree/BBT_CommonCrawl_2019

    Updated Oct 11, 2024 • 55.8M rows • 49 downloads • 2 likes

  • big-banyan-tree/BBT_CommonCrawl_2020

    Updated Oct 11, 2024 • 46.9M rows • 62 downloads • 2 likes

  • big-banyan-tree/BBT_CommonCrawl_2021

    Updated Oct 11, 2024 • 48.5M rows • 71 downloads • 2 likes

  • big-banyan-tree/BBT_CommonCrawl_2022

    Updated Oct 11, 2024 • 14.2M rows • 78 downloads • 2 likes

  • big-banyan-tree/BBT_CommonCrawl_2023

    Updated Oct 11, 2024 • 44M rows • 93 downloads • 2 likes

  • big-banyan-tree/BBT_CommonCrawl_2024

    Updated Oct 11, 2024 • 33.6M rows • 341 downloads • 4 likes