---
license: mit
datasets:
- allenai/c4
language:
- en
library_name: transformers
---
# Bingus-v0.1-60M-Base

A not-so-state-of-the-art 60M-parameter transformer language model, using the default OLMo architecture.
### Specs

- Attention heads: 8
- Layers: 8
- Model dimension (d_model): 512
- MLP dimension (d_mlp): 4096
- Perplexity (eval/v3-small-c4_en-validation): 40.33
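A back-of-the-envelope parameter count from the specs above lands near the 60M in the model name. This sketch assumes OLMo's defaults (a SwiGLU MLP where d_mlp counts both the gate and up halves, tied input/output embeddings, and a 50,304-token vocabulary); none of these are stated in this card.

```python
# Rough parameter count for the specs above.
# Assumptions (not stated in this card): SwiGLU MLP where d_mlp
# spans both gate and up halves, tied embeddings, vocab = 50304.
d_model = 512
d_mlp = 4096
n_layers = 8
vocab = 50304  # assumed OLMo tokenizer size

attn = 4 * d_model * d_model      # Q, K, V, and output projections
ff_proj = d_model * d_mlp         # fused gate + up projection
ff_out = (d_mlp // 2) * d_model   # down projection
per_layer = attn + ff_proj + ff_out

embed = vocab * d_model           # tied input/output embedding
total = n_layers * per_layer + embed
print(f"~{total / 1e6:.1f}M parameters")  # ~59.3M, consistent with "60M"
```

Under a non-gated MLP the count would come out noticeably higher, which is why the SwiGLU convention is assumed here.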
### Training Data

Pretraining:
- 5B tokens of C4 (preprocessed, from olmo-data.org)