Update README.md
README.md CHANGED
@@ -14,7 +14,7 @@ This model was obtained by linearly decaying the learning rate of the [OLMo-2-04
 
 The model is described in the paper "Train Once, Answer All: Many Pretraining Experiments for the Cost of One".
 
-**Note:** This is the model that is named OLMo-2-1B in the paper. To avoid confusion, it is named differently on Huggingface.
+**Note:** This is the model that is named OLMo-2-1B in the paper. To avoid confusion with the fully trained OLMo-2-1B base model, it is named differently on Huggingface.
 
 ## Usage
 
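For context, the hunk header above notes that the model was obtained by linearly decaying the learning rate of an OLMo-2 checkpoint. The sketch below only illustrates that kind of schedule, assuming PyTorch's built-in `LinearLR` scheduler; the model, base learning rate, and step count are placeholders, not the actual OLMo-2 training recipe.

```python
# Illustrative sketch of a linear learning-rate decay to zero.
# Everything below is a placeholder, not the authors' training setup.
import torch

model = torch.nn.Linear(16, 16)                       # stand-in for the real model
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

total_steps = 1000                                    # placeholder decay horizon
scheduler = torch.optim.lr_scheduler.LinearLR(
    optimizer,
    start_factor=1.0,                                 # begin at the current learning rate
    end_factor=0.0,                                   # decay linearly to zero
    total_iters=total_steps,
)

for step in range(total_steps):
    # ... forward/backward pass would go here ...
    optimizer.step()
    scheduler.step()                                  # learning rate shrinks linearly each step
```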