KT313 committed on
Commit 19df664 · verified · 1 Parent(s): 5fda600

Update README.md

Files changed (1)
  1. README.md +4 -3
README.md CHANGED
@@ -2,16 +2,17 @@
 license: mit
 ---
 
-A not-so-state-of-the-art 60M parameter transformer model.
+# Bingus-v0.1-60M-Base
 
+A not-so-state-of-the-art 60M parameter transformer model.
 Uses the olmo default architecture.
 
+### Specs
 - Heads: 8
 - Layers: 8
 - Dimension model: 512
 - Dimension mlp: 4096
 
-Training Data:
-
+### Training Data
 Pretraining:
 - 5B Tokens C4 (preprocessed, from olmo-data.org)
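
The specs listed in the README can be sanity-checked against the "60M" figure with a rough parameter count. The sketch below is only an estimate under stated assumptions, not the model's actual accounting: the function names are hypothetical, the vocabulary size of 50304 is a placeholder (the README does not state it), norms and biases are ignored, and the `gated_split` and `tie_embeddings` options are guesses about how the MLP and embeddings might be laid out.

```python
def head_dim(d_model: int, n_heads: int) -> int:
    """Per-head dimension; d_model must divide evenly across heads."""
    if d_model % n_heads != 0:
        raise ValueError("d_model must be divisible by n_heads")
    return d_model // n_heads


def estimate_params(
    d_model: int,
    n_layers: int,
    d_mlp: int,
    vocab_size: int,
    gated_split: bool = True,      # assumption: MLP up-projection is split in half for gating
    tie_embeddings: bool = True,   # assumption: input and output embeddings share weights
) -> int:
    """Rough weight count; ignores layer norms and biases."""
    # Attention: Q, K, V and output projections, each d_model x d_model.
    attn = 4 * d_model * d_model
    if gated_split:
        # Up-projection to d_mlp, then down-projection from d_mlp / 2.
        mlp = d_model * d_mlp + (d_mlp // 2) * d_model
    else:
        # Plain two-matrix MLP: up and down projections of equal size.
        mlp = 2 * d_model * d_mlp
    blocks = n_layers * (attn + mlp)
    embeddings = vocab_size * d_model * (1 if tie_embeddings else 2)
    return blocks + embeddings


if __name__ == "__main__":
    # Values from the README specs; VOCAB_SIZE is a hypothetical placeholder.
    VOCAB_SIZE = 50304
    print("head dim:", head_dim(512, 8))
    for gated in (True, False):
        total = estimate_params(512, 8, 4096, VOCAB_SIZE, gated_split=gated)
        print(f"gated_split={gated}: ~{total / 1e6:.1f}M parameters")
```

Under these assumptions the gated variant comes out close to the advertised size, while the plain MLP variant comes out somewhat larger. The result is sensitive to the assumed vocabulary size, since the embedding table is a large share of the total at this scale.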