FineInstructions Pretraining Corpora

Recent Activity

AjayP13  updated a dataset 3 days ago
fineinstructions-pretraining/ipt_fineinstructions_all_raw_0_test
AjayP13  published a dataset 16 days ago
fineinstructions-pretraining/ipt_fineinstructions_all_raw_0_test
craffel  authored a paper 20 days ago
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language

Team members: Colin Raffel, Ajay Patel

models 0

None public yet

datasets 11

fineinstructions-pretraining/ipt_fineinstructions_all_raw_0_test

Viewer • Updated 3 days ago • 133k • 156

fineinstructions-pretraining/ipt_fineinstructions_all.bak

Viewer • Updated 27 days ago • 96.2M • 226

fineinstructions-pretraining/ipt_fineinstructions_all_raw_0.bak

Viewer • Updated 27 days ago • 223M • 851

fineinstructions-pretraining/nemotron_wrap_1T

Viewer • Updated May 6 • 763M • 163

fineinstructions-pretraining/nemotron_synthetic_1T

Viewer • Updated May 6 • 1.13B • 3.83k • 1

fineinstructions-pretraining/nemotron_qa_1T

Viewer • Updated May 2 • 972M • 195 • 1

fineinstructions-pretraining/nemotron_actual_1T

Viewer • Updated May 1 • 744M • 174

fineinstructions-pretraining/ipt_actual_all

Viewer • Updated Apr 26 • 40M • 341

fineinstructions-pretraining/ipt_synthetic_all

Viewer • Updated Apr 26 • 40M • 102

fineinstructions-pretraining/longform_actual_all

Viewer • Updated Apr 25 • 27.7k • 45