metadata
license: mit
datasets:
- allenai/c4
language:
- en
library_name: transformers
Bingus-v0.1-60M-Base
A not-so-state-of-the-art 60M-parameter transformer model.
Uses the default OLMo architecture.
Specs
Heads: 8
Layers: 8
Model dimension: 512
MLP dimension: 4096
Validation perplexity (eval/v3-small-c4_en-validation/Perplexity): 40.33
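For readers who think in terms of loss rather than perplexity, the reported value can be converted back. This is a minimal sketch, assuming the metric is the usual exponential of the mean per-token cross-entropy measured in nats; the variable names are illustrative and not taken from the training code.

```python
import math

# Validation perplexity reported in the specs above.
perplexity = 40.33

# Assumption: perplexity = exp(mean per-token cross-entropy in nats),
# so the loss is recovered with a natural log.
loss_nats = math.log(perplexity)

# The same quantity expressed in bits per token.
loss_bits = math.log2(perplexity)

print(f"cross-entropy: {loss_nats:.3f} nats/token ({loss_bits:.3f} bits/token)")
```

Under that assumption, the perplexity corresponds to a validation loss of roughly 3.7 nats per token.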
Training Data
Pretraining:
- 5B tokens of C4 (preprocessed, from olmo-data.org)
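As a rough sense of scale, the training budget can be expressed as tokens seen per parameter. The sketch below assumes the nominal 60M figure from the model name as the parameter count (the exact count is not stated here) and takes the 5B token figure at face value.

```python
# Assumption: nominal parameter count taken from the model name (60M),
# not an exact count of the checkpoint's weights.
n_params = 60e6

# Pretraining tokens as listed above.
n_tokens = 5e9

tokens_per_param = n_tokens / n_params
print(f"{tokens_per_param:.1f} tokens per parameter")
```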