Update README.md
README.md CHANGED
@@ -96,7 +96,7 @@ We used the BabyLM 10M (Strict-small) dataset to train the model. It is composed
| Sequence Length | 128 → 512 |
| Batch Size (in tokens) | 16 384 |
| Learning Rate | 0.007 |
-| Number of Steps |
+| Number of Steps | 9 914 |
| Warmup Ratio | 1.6% |
| Cooldown Ratio | 1.6% |
| Mask Ratio | 0.3 → 0.15 |
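
The added "Number of Steps" value also pins down the warmup and cooldown lengths implied by the ratios in the same table. Below is a minimal Python sketch of that arithmetic, assuming the 1.6% ratios are fractions of the total optimizer steps and an illustrative linear warmup / constant / linear cooldown shape; the schedule shape and the names used here are assumptions, not taken from the training code.

```python
# Hedged sketch: derive warmup/cooldown step counts from the README table,
# assuming the ratios are fractions of the total number of optimizer steps.
# The schedule shape below is illustrative only and not specified by the table.

TOTAL_STEPS = 9_914      # "Number of Steps" added in this commit
WARMUP_RATIO = 0.016     # "Warmup Ratio" (1.6%)
COOLDOWN_RATIO = 0.016   # "Cooldown Ratio" (1.6%)
PEAK_LR = 0.007          # "Learning Rate"

warmup_steps = int(TOTAL_STEPS * WARMUP_RATIO)      # ~158 steps
cooldown_steps = int(TOTAL_STEPS * COOLDOWN_RATIO)  # ~158 steps

def lr_at(step: int) -> float:
    """Linear warmup, constant peak, then linear cooldown (illustrative)."""
    if step < warmup_steps:
        return PEAK_LR * step / max(1, warmup_steps)
    if step >= TOTAL_STEPS - cooldown_steps:
        remaining = TOTAL_STEPS - step
        return PEAK_LR * remaining / max(1, cooldown_steps)
    return PEAK_LR

if __name__ == "__main__":
    for s in (0, warmup_steps, TOTAL_STEPS // 2, TOTAL_STEPS - 1):
        print(f"step {s:5d}: lr = {lr_at(s):.5f}")
```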