KT313 committed on
Commit 5872d14 · verified · 1 Parent(s): 56cc2c6

Update README.md

Files changed (1)
  1. README.md +6 -4
README.md CHANGED
@@ -13,10 +13,12 @@ A not-so-state-of-the-art 60M parameter transformer model.
  Uses the olmo default architecture.
 
  ### Specs
- - Heads: 8
- - Layers: 8
- - Dimension model: 512
- - Dimension mlp: 4096
+ Heads: 8
+ Layers: 8
+ Dimension model: 512
+ Dimension mlp: 4096
+
+ eval/v3-small-c4_en-validation/Perplexity: 40.33
 
  ### Training Data
  Pretraining:
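
Note on the added metric: the perplexity line is easier to compare against training logs when expressed as a cross-entropy loss. The sketch below assumes the reported value is a token-level perplexity defined as the exponential of the mean cross-entropy in nats, which is the usual convention but is not stated in this commit. The helper names are hypothetical and not part of any library.

```python
import math


def perplexity_to_loss(perplexity: float) -> float:
    """Return the mean per-token cross-entropy (nats) implied by a perplexity.

    Assumes perplexity = exp(mean cross-entropy in nats).
    """
    if perplexity <= 0:
        raise ValueError("perplexity must be positive")
    return math.log(perplexity)


def loss_to_perplexity(loss_nats: float) -> float:
    """Inverse conversion: mean cross-entropy (nats) to perplexity."""
    return math.exp(loss_nats)


# Value added to the README in this commit.
reported_ppl = 40.33

loss = perplexity_to_loss(reported_ppl)
bits = loss / math.log(2)  # same quantity expressed in bits per token
print(f"cross-entropy: {loss:.3f} nats/token ({bits:.3f} bits/token)")
```

If the evaluation used a different normalization (for example per byte or per word), the conversion still holds, but the resulting loss is per that unit rather than per token.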