Commit bf9471a (verified) · vwxyzjn · 1 Parent(s): 63f7e0b

Create README.md

Files changed (1): README.md (+133, -0)

---
license: apache-2.0
language:
- en
pipeline_tag: text-generation
base_model:
- allenai/OLMo-2-1124-7B-DPO
library_name: transformers
datasets:
- allenai/RLVR-GSM
---

<img alt="OLMo Logo" src="https://huggingface.co/datasets/allenai/blog-images/resolve/main/olmo2/olmo.png" width="242px">

# OLMo-2-1124-7B-Instruct

OLMo 2 7B Instruct November 2024 is a post-trained variant of the [OLMo-2 7B November 2024](https://huggingface.co/allenai/OLMo2-7B-1124) model, which has undergone supervised finetuning on an OLMo-specific variant of the [Tülu 3 dataset](https://huggingface.co/datasets/allenai/tulu-3-sft-olmo-2-mixture), further DPO training on [this dataset](https://huggingface.co/datasets/allenai/olmo-2-1124-7b-preference-mix), and finally RLVR training using [this data](https://huggingface.co/datasets/allenai/RLVR-GSM).
Tülu 3 is designed for state-of-the-art performance on a diversity of tasks in addition to chat, such as MATH, GSM8K, and IFEval.
Check out the OLMo 2 paper (forthcoming) or the [Tülu 3 paper](https://arxiv.org/abs/2411.15124) for more details!

OLMo is a series of **O**pen **L**anguage **Mo**dels designed to enable the science of language models.
These models are trained on the Dolma dataset. We are releasing all code, checkpoints, logs (coming soon), and associated training details.
The core models released in this batch include the following:

| **Stage** | **OLMo 2 7B** | **OLMo 2 13B** |
|-----------|---------------|----------------|
| **Base Model** | [allenai/OLMo2-7B-1124](https://huggingface.co/allenai/OLMo2-7B-1124) | [allenai/OLMo-2-13B-1124](https://huggingface.co/allenai/OLMo-2-13B-1124) |
| **SFT** | [allenai/OLMo-2-1124-7B-SFT](https://huggingface.co/allenai/OLMo-2-1124-7B-SFT) | [allenai/OLMo-2-1124-13B-SFT](https://huggingface.co/allenai/OLMo-2-1124-13B-SFT) |
| **DPO** | [allenai/OLMo-2-1124-7B-DPO](https://huggingface.co/allenai/OLMo-2-1124-7B-DPO) | [allenai/OLMo-2-1124-13B-DPO](https://huggingface.co/allenai/OLMo-2-1124-13B-DPO) |
| **Final Models (RLVR)** | [allenai/OLMo-2-1124-7B-Instruct](https://huggingface.co/allenai/OLMo-2-1124-7B-Instruct) | [allenai/OLMo-2-1124-13B-Instruct](https://huggingface.co/allenai/OLMo-2-1124-13B-Instruct) |
| **Reward Model (RM)** | [allenai/OLMo-2-1124-7B-RM](https://huggingface.co/allenai/OLMo-2-1124-7B-RM) | (Same as 7B) |

## Model description

- **Model type:** A model trained on a mix of publicly available, synthetic, and human-created datasets.
- **Language(s) (NLP):** Primarily English
- **License:** Apache 2.0
- **Finetuned from model:** allenai/OLMo-2-1124-7B-DPO

### Model Sources

- **Project Page:** https://allenai.org/olmo
- **Repositories:**
  - Core repo (training, inference, fine-tuning etc.): https://github.com/allenai/OLMo
  - Evaluation code: https://github.com/allenai/olmes
  - Further fine-tuning code: https://github.com/allenai/open-instruct
- **Paper:** Coming soon!
- **Demo:** https://playground.allenai.org/

## Installation

OLMo 2 will be supported in the next stable release of Transformers. Until then, install it from the main branch using:
```bash
pip install --upgrade git+https://github.com/huggingface/transformers.git
```

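To confirm that the source install includes OLMo 2 support, here is a quick sanity-check sketch (the exact dev version string will vary with the main branch you install):
```python
# Verify the installed Transformers build recognizes the OLMo 2 architecture.
import transformers

print(transformers.__version__)  # expect a dev version newer than the latest stable release

# This import only succeeds on builds that ship the "olmo2" model type.
from transformers import Olmo2Config
print(Olmo2Config().model_type)  # "olmo2"
```
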
## Using the model

### Loading with HuggingFace

To load the model with HuggingFace, use the following snippet:
```python
from transformers import AutoModelForCausalLM

olmo_model = AutoModelForCausalLM.from_pretrained("allenai/OLMo-2-1124-7B-Instruct")
```

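For an end-to-end example, here is a minimal generation sketch. It relies on the chat template embedded in the tokenizer (see below); `device_map="auto"` requires the `accelerate` package, and the sampling settings are illustrative rather than recommended defaults:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/OLMo-2-1124-7B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # bfloat16 keeps the 7B model within a single modern GPU
    device_map="auto",            # requires `accelerate`
)

# Format the conversation with the chat template embedded in the tokenizer.
messages = [{"role": "user", "content": "How are you doing?"}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,   # append the <|assistant|> turn for the model to complete
    return_tensors="pt",
).to(model.device)

# Illustrative sampling settings; adjust for your use case.
output = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7, top_p=0.9)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```
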
### Chat template

The chat template for our models is formatted as:
```
<|endoftext|><|user|>\nHow are you doing?\n<|assistant|>\nI'm just a computer program, so I don't have feelings, but I'm functioning as expected. How can I assist you today?<|endoftext|>
```
Or with new lines expanded:
```
<|endoftext|><|user|>
How are you doing?
<|assistant|>
I'm just a computer program, so I don't have feelings, but I'm functioning as expected. How can I assist you today?<|endoftext|>
```
It is embedded within the tokenizer as well, for `tokenizer.apply_chat_template`.

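As a minimal sketch, applying the embedded template to a single user turn should reproduce the format shown above (verify the exact string against the tokenizer you load):
```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-2-1124-7B-Instruct")
messages = [{"role": "user", "content": "How are you doing?"}]

# tokenize=False returns the formatted prompt string instead of token ids;
# add_generation_prompt=True appends the <|assistant|> header so the model replies next.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
```
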
### System prompt

In Ai2 demos, we use this system prompt by default:
```
You are OLMo 2, a helpful and harmless AI Assistant built by the Allen Institute for AI.
```
The model has not been trained with a specific system prompt in mind.

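To reproduce the demo behaviour, the prompt can simply be passed as a system message (a sketch; since the model was not trained against a fixed system prompt, treat this as a convention rather than a requirement):
```python
SYSTEM_PROMPT = "You are OLMo 2, a helpful and harmless AI Assistant built by the Allen Institute for AI."

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": "How are you doing?"},
]
# Pass `messages` to tokenizer.apply_chat_template as in the examples above.
```
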
### Bias, Risks, and Limitations

The OLMo 2 models have limited safety training and are not deployed with automatic in-the-loop filtering of responses the way ChatGPT is, so they can produce problematic outputs (especially when prompted to do so).
See the Falcon 180B model card for an example of this.

## Performance

| Model | Average | AlpacaEval | BBH | DROP | GSM8k | IFEval | MATH | MMLU | Safety | PopQA | TruthQA |
|-------|---------|------------|-----|------|-------|--------|------|------|--------|-------|---------|
| **Open weights models** |
| Gemma-2-9B-it | 51.9 | 43.7 | 2.5 | 58.8 | 79.7 | 69.9 | 29.8 | 69.1 | 75.5 | 28.3 | 61.4 |
| Ministral-8B-Instruct | 52.1 | 31.4 | 56.2 | 56.2 | 80.0 | 56.4 | 40.0 | 68.5 | 56.2 | 20.2 | 55.5 |
| Mistral-Nemo-Instruct-2407 | 50.9 | 45.8 | 54.6 | 23.6 | 81.4 | 64.5 | 31.9 | 70.0 | 52.7 | 26.9 | 57.7 |
| Qwen-2.5-7B-Instruct | 57.1 | 29.7 | 25.3 | 54.4 | 83.8 | 74.7 | 69.9 | 76.6 | 75.0 | 18.1 | 63.1 |
| Llama-3.1-8B-Instruct | 58.9 | 25.8 | 69.7 | 61.7 | 83.4 | 80.6 | 42.5 | 71.3 | 70.2 | 28.4 | 55.1 |
| Tülu 3 8B | 60.4 | 34.0 | 66.0 | 62.6 | 87.6 | 82.4 | 43.7 | 68.2 | 75.4 | 29.1 | 55.0 |
| Qwen-2.5-14B-Instruct | 60.8 | 34.6 | 34.0 | 50.5 | 83.9 | 82.4 | 70.6 | 81.1 | 79.3 | 21.1 | 70.8 |
| **Fully open models** |
| OLMo-7B-Instruct | 28.2 | 5.2 | 35.3 | 30.7 | 14.3 | 32.2 | 2.1 | 46.3 | 54.0 | 17.1 | 44.5 |
| OLMo-7B-0424-Instruct | 33.1 | 8.5 | 34.4 | 47.9 | 23.2 | 39.2 | 5.2 | 48.9 | 49.3 | 18.9 | 55.2 |
| OLMoE-1B-7B-0924-Instruct | 35.5 | 8.5 | 37.2 | 34.3 | 47.2 | 46.2 | 8.4 | 51.6 | 51.6 | 20.6 | 49.1 |
| MAP-Neo-7B-Instruct | 42.9 | 17.6 | 26.4 | 48.2 | 69.4 | 35.9 | 31.5 | 56.5 | 73.7 | 18.4 | 51.6 |
| *OLMo-2-7B-SFT* | 50.2 | 10.2 | 49.7 | 59.6 | 74.6 | 66.9 | 25.3 | 61.1 | 82.1 | 23.6 | 48.6 |
| *OLMo-2-7B-DPO* | 54.2 | 27.9 | 46.7 | 60.2 | 82.6 | 73.0 | 30.3 | 60.8 | 81.0 | 23.5 | 56.0 |
| *OLMo-2-13B-SFT* | 55.3 | 11.5 | 59.6 | 71.3 | 76.3 | 68.6 | 29.5 | 68.0 | 82.3 | 29.4 | 57.1 |
| *OLMo-2-13B-DPO* | 60.6 | 38.3 | 57.9 | 71.5 | 82.3 | 80.2 | 35.2 | 67.9 | 79.7 | 29.0 | 63.9 |
| **OLMo-2-7B-1124-Instruct** | 54.8 | 29.1 | 46.6 | 60.5 | 85.1 | 72.3 | 32.5 | 61.3 | 80.6 | 23.2 | 56.5 |
| **OLMo-2-13B-1124-Instruct** | 62.0 | 39.5 | 58.8 | 71.5 | 87.4 | 82.6 | 39.2 | 68.5 | 79.1 | 28.8 | 64.3 |

## License and use

OLMo 2 is licensed under the Apache 2.0 license.
OLMo 2 is intended for research and educational use.
For more information, please see our [Responsible Use Guidelines](https://allenai.org/responsible-use).
This model has been fine-tuned using a dataset mix with outputs generated from third-party models and is subject to additional terms: [Gemma Terms of Use](https://ai.google.dev/gemma/terms).

## Citation

A technical manuscript is forthcoming!