Update README.md
README.md CHANGED
@@ -14,7 +14,7 @@ This model was obtained by linearly decaying the learning rate of the [OLMo-2-04
 
 The model is described in the paper "Train Once, Answer All: Many Pretraining Experiments for the Cost of One".
 
-**Note:** This is the model that is named OLMo-2-1B in the paper. To avoid confusion, it is named differently on Huggingface.
+**Note:** This is the model that is named OLMo-2-1B in the paper. To avoid confusion with the fully trained OLMo-2-1B base model, it is named differently on Huggingface.
 
 ## Usage
 
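For context, the hunk header above notes that the model was obtained by linearly decaying the learning rate of an OLMo-2 checkpoint. The sketch below only illustrates that kind of schedule, assuming PyTorch's built-in `LinearLR` scheduler; the model, base learning rate, and step count are placeholders, not the actual OLMo-2 training recipe.

```python
# Illustrative sketch of a linear learning-rate decay to zero.
# Everything below is a placeholder, not the authors' training setup.
import torch

model = torch.nn.Linear(16, 16)                       # stand-in for the real model
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

total_steps = 1000                                    # placeholder decay horizon
scheduler = torch.optim.lr_scheduler.LinearLR(
    optimizer,
    start_factor=1.0,                                 # begin at the current learning rate
    end_factor=0.0,                                   # decay linearly to zero
    total_iters=total_steps,
)

for step in range(total_steps):
    # ... forward/backward pass would go here ...
    optimizer.step()
    scheduler.step()                                  # learning rate shrinks linearly each step
```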