Info Tokenizers
non-profit
AI & ML interests
None defined yet.
Recent Activity
codebyzeb updated a model about 18 hours ago: InfoTokenizers/tokenizers
pietrolesci authored a paper about 2 months ago: PolyPythias: Stability and Outliers across Fifty Language Model Pre-Training Runs
suchirsalhan authored a paper about 2 months ago: Less is More: Pre-Training Cross-Lingual Small-Scale Language Models with Cognitively-Plausible Curriculum Learning Strategies
Team members: 4
Models: 9
InfoTokenizers/tokenizers (updated about 18 hours ago)
InfoTokenizers/bytelevel-models (updated 6 days ago)
InfoTokenizers/fw57M-tied_finewebedu-20B_bytelevel2 (updated 6 days ago)
InfoTokenizers/fw57M-tied_common-corpus_bytelevel2 (updated 6 days ago)
InfoTokenizers/fw57M-tied_finewebedu-20B_fw57M_Surprisal_thresholdB_32000 (updated 15 days ago)
InfoTokenizers/finewebedu-20B (updated 20 days ago)
InfoTokenizers/fw57M-multi-tied_bytelevel (updated 21 days ago)
InfoTokenizers/fw57M-tied_finewebedu-20B_frequency_32000 (updated 22 days ago)
InfoTokenizers/fw57M-tied_finewebedu-20B_bytelevel (updated Apr 15)
Datasets: 2
InfoTokenizers/finewebedu-20B • Updated 3 days ago • 81.3M • 1.01k
InfoTokenizers/common-corpus • Updated 10 days ago • 390k • 295