Update README.md
README.md CHANGED
@@ -11,10 +11,10 @@ language:
 
 # Model Card for OLMo 2 32B
 
-We introduce OLMo 2 32B, to the family of 7B and 13B models featuring a 9-point increase in MMLU, among other evaluation improvements, compared to the original [OLMo 7B](https://huggingface.co/allenai/OLMo-7B) model. These gains come from training on [OLMo-mix-
+We introduce OLMo 2 32B, an addition to the family of 7B and 13B models, featuring a 9-point increase in MMLU, among other evaluation improvements, compared to the original [OLMo 7B](https://huggingface.co/allenai/OLMo-7B) model. These gains come from training on the [OLMo-mix-0325](https://huggingface.co/datasets/allenai/olmo-mix-1124) and [Dolmino-mix-0325](https://huggingface.co/datasets/allenai/dolmino-mix-1124) datasets and a staged training approach.
 
 OLMo is a series of **O**pen **L**anguage **Mo**dels designed to enable the science of language models.
-These models are trained on the Dolma dataset. We have released all code, checkpoints, logs, and associated training details on [GitHub](https://github.com/allenai/OLMo).
+These models are trained on the Dolma dataset. We have released all code, checkpoints, logs, and associated training details on [GitHub](https://github.com/allenai/OLMo-core).
 
 | Size | Training Tokens | Layers | Hidden Size | Attention Heads | Context Length |
 |------|--------|---------|-------------|-----------------|----------------|
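
The table above lists architecture fields (layers, hidden size, attention heads, context length). As a minimal sketch, and assuming the released checkpoint is published on the Hub under a repo id like `allenai/OLMo-2-0325-32B` (a name not stated in this change) with OLMo 2 support in the installed `transformers` version, those fields can be read back from the model config:

```python
# Minimal sketch: read the architecture fields from the published config.
# Assumes the 32B checkpoint lives at "allenai/OLMo-2-0325-32B" (check the
# model card for the exact repo id) and that transformers includes OLMo 2.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("allenai/OLMo-2-0325-32B")

print("layers:         ", config.num_hidden_layers)
print("hidden size:    ", config.hidden_size)
print("attention heads:", config.num_attention_heads)
print("context length: ", config.max_position_embeddings)
```

Loading only the config avoids downloading the 32B weights while still letting you confirm the numbers in the table.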