datablations

https://github.com/huggingface/datablations

AI & ML interests

Scaling Data-Constrained Language Models

Recent Activity

craffel authored a paper about 23 hours ago: Risk Under Pressure: Compute-Aware Evaluation of Adversarial Robustness in Language Models
srush authored a paper 15 days ago: Composer 2 Technical Report
Muennighoff submitted a paper 3 months ago: Composer 2 Technical Report

Members: Niklas Muennighoff, Teven Le Scao, Nouamane Tazi, Risto Luukkonen, Aleksandra Piktus, Sampo Pyysalo, Colin Raffel, Thomas Wolf, Sasha Rush

datablations's datasets (13)

datablations/scripts

Viewer • Updated Jun 15, 2023 • 3.48M • 580

datablations/oscar-subsets

Viewer • Updated Jun 14, 2023 • 365k • 1.29k

datablations/c4-subsets

Viewer • Updated Jun 14, 2023 • 729k • 1.18k • 6

datablations/c4-filter-megatron

Updated May 28, 2023 • 1.81k

datablations/oscar-filter-megatron

Updated May 27, 2023 • 762

datablations/python-megatron

Updated May 22, 2023 • 4.36k • 1

datablations/subsets

Viewer • Updated May 10, 2023 • 365k • 64

datablations/oscar-filter

Viewer • Updated May 10, 2023 • 432M • 3.03k

datablations/oscar-dedup-expanded

Viewer • Updated May 10, 2023 • 432M • 993 • 1

datablations/mup

Updated Apr 24, 2023 • 269

datablations/c4-filter

Viewer • Updated Feb 1, 2023 • 365M • 395

datablations/c4-filter-small

Viewer • Updated Jan 17, 2023 • 100k • 52

datablations/oscar-filter-small

Viewer • Updated Nov 24, 2022 • 100k • 19