lgcharpe committed on
Commit 3609f9d (verified) · 1 Parent(s): 929e896

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -96,7 +96,7 @@ We used the BabyLM 10M (Strict-small) dataset to train the model. It is composed
 | Sequence Length | 128 → 512 |
 | Batch Size (in tokens) | 16 384 |
 | Learning Rate | 0.007 |
-| Number of Steps | 9914 |
+| Number of Steps | 9 914 |
 | Warmup Ratio | 1.6% |
 | Cooldown Ratio | 1.6% |
 | Mask Ratio | 0.3 → 0.15 |
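
For readers who want to sanity-check the table touched by this change, the following is a minimal sketch of the quantities implied by its values. It assumes that the warmup and cooldown ratios are fractions of the total step count, that fractional steps round to the nearest integer, and that the batch size stays at 16 384 tokens for every step; the README does not state any of these, and the function name `derived_schedule` is hypothetical.

```python
# Values copied from the hyperparameter table in the diff above.
NUM_STEPS = 9_914
BATCH_SIZE_TOKENS = 16_384
WARMUP_RATIO = 0.016    # 1.6%
COOLDOWN_RATIO = 0.016  # 1.6%


def derived_schedule(num_steps, batch_tokens, warmup_ratio, cooldown_ratio):
    """Return quantities implied by the table (hypothetical helper).

    Assumption: ratios are fractions of the total number of steps and
    are rounded to the nearest whole step.
    """
    return {
        # Assumes a constant token budget per step for the whole run.
        "total_tokens": num_steps * batch_tokens,
        "warmup_steps": round(warmup_ratio * num_steps),
        "cooldown_steps": round(cooldown_ratio * num_steps),
    }


schedule = derived_schedule(
    NUM_STEPS, BATCH_SIZE_TOKENS, WARMUP_RATIO, COOLDOWN_RATIO
)
print(schedule)
```

Under these assumptions the run processes roughly 162M tokens in total, about 16 passes over a 10M-token dataset if tokens and dataset words are counted comparably, which the README does not confirm.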