MisterAI Jeronymous committed on
Commit ff9caf2 · verified · 0 Parent(s)

Duplicate from OpenLLM-France/Lucie-7B

Co-authored-by: Jérôme Louradour <[email protected]>

.gitattributes ADDED
@@ -0,0 +1,36 @@
+ *.7z filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.gz filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.npy filter=lfs diff=lfs merge=lfs -text
+ *.npz filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.parquet filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pickle filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.rar filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
+ *.tar filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tgz filter=lfs diff=lfs merge=lfs -text
+ *.wasm filter=lfs diff=lfs merge=lfs -text
+ *.xz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
+ *.zst filter=lfs diff=lfs merge=lfs -text
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
+ figures/pie_dataset_composition_training.png filter=lfs diff=lfs merge=lfs -text
LICENSE.md ADDED
@@ -0,0 +1,237 @@
+ ---
+ title: Apache License 2.0
+ spdx-id: Apache-2.0
+ redirect_from: /licenses/apache/
+ featured: true
+ hidden: false
+
+ description: A permissive license whose main conditions require preservation of copyright and license notices. Contributors provide an express grant of patent rights. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
+
+ how: Create a text file (typically named LICENSE or LICENSE.txt) in the root of your source code and copy the text of the license into the file.
+
+ note: The Apache Software Foundation <a href="https://apache.org/foundation/license-faq.html#Apply-My-Software">recommends</a> taking the additional step of adding a boilerplate notice to the header of each source file. You can find the notice in the appendix at the very end of the license text.
+
+ using:
+   Kubernetes: https://github.com/kubernetes/kubernetes/blob/master/LICENSE
+   PDF.js: https://github.com/mozilla/pdf.js/blob/master/LICENSE
+   Swift: https://github.com/apple/swift/blob/main/LICENSE.txt
+
+ permissions:
+ - commercial-use
+ - modifications
+ - distribution
+ - patent-use
+ - private-use
+
+ conditions:
+ - include-copyright
+ - document-changes
+
+ limitations:
+ - trademark-use
+ - liability
+ - warranty
+
+ ---
+
+ Apache License
+ Version 2.0, January 2004
+ http://www.apache.org/licenses/
+
+ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+ 1. Definitions.
+
+ "License" shall mean the terms and conditions for use, reproduction,
+ and distribution as defined by Sections 1 through 9 of this document.
+
+ "Licensor" shall mean the copyright owner or entity authorized by
+ the copyright owner that is granting the License.
+
+ "Legal Entity" shall mean the union of the acting entity and all
+ other entities that control, are controlled by, or are under common
+ control with that entity. For the purposes of this definition,
+ "control" means (i) the power, direct or indirect, to cause the
+ direction or management of such entity, whether by contract or
+ otherwise, or (ii) ownership of fifty percent (50%) or more of the
+ outstanding shares, or (iii) beneficial ownership of such entity.
+
+ "You" (or "Your") shall mean an individual or Legal Entity
+ exercising permissions granted by this License.
+
+ "Source" form shall mean the preferred form for making modifications,
+ including but not limited to software source code, documentation
+ source, and configuration files.
+
+ "Object" form shall mean any form resulting from mechanical
+ transformation or translation of a Source form, including but
+ not limited to compiled object code, generated documentation,
+ and conversions to other media types.
+
+ "Work" shall mean the work of authorship, whether in Source or
+ Object form, made available under the License, as indicated by a
+ copyright notice that is included in or attached to the work
+ (an example is provided in the Appendix below).
+
+ "Derivative Works" shall mean any work, whether in Source or Object
+ form, that is based on (or derived from) the Work and for which the
+ editorial revisions, annotations, elaborations, or other modifications
+ represent, as a whole, an original work of authorship. For the purposes
+ of this License, Derivative Works shall not include works that remain
+ separable from, or merely link (or bind by name) to the interfaces of,
+ the Work and Derivative Works thereof.
+
+ "Contribution" shall mean any work of authorship, including
+ the original version of the Work and any modifications or additions
+ to that Work or Derivative Works thereof, that is intentionally
+ submitted to Licensor for inclusion in the Work by the copyright owner
+ or by an individual or Legal Entity authorized to submit on behalf of
+ the copyright owner. For the purposes of this definition, "submitted"
+ means any form of electronic, verbal, or written communication sent
+ to the Licensor or its representatives, including but not limited to
+ communication on electronic mailing lists, source code control systems,
+ and issue tracking systems that are managed by, or on behalf of, the
+ Licensor for the purpose of discussing and improving the Work, but
+ excluding communication that is conspicuously marked or otherwise
+ designated in writing by the copyright owner as "Not a Contribution."
+
+ "Contributor" shall mean Licensor and any individual or Legal Entity
+ on behalf of whom a Contribution has been received by Licensor and
+ subsequently incorporated within the Work.
+
+ 2. Grant of Copyright License. Subject to the terms and conditions of
+ this License, each Contributor hereby grants to You a perpetual,
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+ copyright license to reproduce, prepare Derivative Works of,
+ publicly display, publicly perform, sublicense, and distribute the
+ Work and such Derivative Works in Source or Object form.
+
+ 3. Grant of Patent License. Subject to the terms and conditions of
+ this License, each Contributor hereby grants to You a perpetual,
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+ (except as stated in this section) patent license to make, have made,
+ use, offer to sell, sell, import, and otherwise transfer the Work,
+ where such license applies only to those patent claims licensable
+ by such Contributor that are necessarily infringed by their
+ Contribution(s) alone or by combination of their Contribution(s)
+ with the Work to which such Contribution(s) was submitted. If You
+ institute patent litigation against any entity (including a
+ cross-claim or counterclaim in a lawsuit) alleging that the Work
+ or a Contribution incorporated within the Work constitutes direct
+ or contributory patent infringement, then any patent licenses
+ granted to You under this License for that Work shall terminate
+ as of the date such litigation is filed.
+
+ 4. Redistribution. You may reproduce and distribute copies of the
+ Work or Derivative Works thereof in any medium, with or without
+ modifications, and in Source or Object form, provided that You
+ meet the following conditions:
+
+ (a) You must give any other recipients of the Work or
+ Derivative Works a copy of this License; and
+
+ (b) You must cause any modified files to carry prominent notices
+ stating that You changed the files; and
+
+ (c) You must retain, in the Source form of any Derivative Works
+ that You distribute, all copyright, patent, trademark, and
+ attribution notices from the Source form of the Work,
+ excluding those notices that do not pertain to any part of
+ the Derivative Works; and
+
+ (d) If the Work includes a "NOTICE" text file as part of its
+ distribution, then any Derivative Works that You distribute must
+ include a readable copy of the attribution notices contained
+ within such NOTICE file, excluding those notices that do not
+ pertain to any part of the Derivative Works, in at least one
+ of the following places: within a NOTICE text file distributed
+ as part of the Derivative Works; within the Source form or
+ documentation, if provided along with the Derivative Works; or,
+ within a display generated by the Derivative Works, if and
+ wherever such third-party notices normally appear. The contents
+ of the NOTICE file are for informational purposes only and
+ do not modify the License. You may add Your own attribution
+ notices within Derivative Works that You distribute, alongside
+ or as an addendum to the NOTICE text from the Work, provided
+ that such additional attribution notices cannot be construed
+ as modifying the License.
+
+ You may add Your own copyright statement to Your modifications and
+ may provide additional or different license terms and conditions
+ for use, reproduction, or distribution of Your modifications, or
+ for any such Derivative Works as a whole, provided Your use,
+ reproduction, and distribution of the Work otherwise complies with
+ the conditions stated in this License.
+
+ 5. Submission of Contributions. Unless You explicitly state otherwise,
+ any Contribution intentionally submitted for inclusion in the Work
+ by You to the Licensor shall be under the terms and conditions of
+ this License, without any additional terms or conditions.
+ Notwithstanding the above, nothing herein shall supersede or modify
+ the terms of any separate license agreement you may have executed
+ with Licensor regarding such Contributions.
+
+ 6. Trademarks. This License does not grant permission to use the trade
+ names, trademarks, service marks, or product names of the Licensor,
+ except as required for reasonable and customary use in describing the
+ origin of the Work and reproducing the content of the NOTICE file.
+
+ 7. Disclaimer of Warranty. Unless required by applicable law or
+ agreed to in writing, Licensor provides the Work (and each
+ Contributor provides its Contributions) on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+ implied, including, without limitation, any warranties or conditions
+ of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+ PARTICULAR PURPOSE. You are solely responsible for determining the
+ appropriateness of using or redistributing the Work and assume any
+ risks associated with Your exercise of permissions under this License.
+
+ 8. Limitation of Liability. In no event and under no legal theory,
+ whether in tort (including negligence), contract, or otherwise,
+ unless required by applicable law (such as deliberate and grossly
+ negligent acts) or agreed to in writing, shall any Contributor be
+ liable to You for damages, including any direct, indirect, special,
+ incidental, or consequential damages of any character arising as a
+ result of this License or out of the use or inability to use the
+ Work (including but not limited to damages for loss of goodwill,
+ work stoppage, computer failure or malfunction, or any and all
+ other commercial damages or losses), even if such Contributor
+ has been advised of the possibility of such damages.
+
+ 9. Accepting Warranty or Additional Liability. While redistributing
+ the Work or Derivative Works thereof, You may choose to offer,
+ and charge a fee for, acceptance of support, warranty, indemnity,
+ or other liability obligations and/or rights consistent with this
+ License. However, in accepting such obligations, You may act only
+ on Your own behalf and on Your sole responsibility, not on behalf
+ of any other Contributor, and only if You agree to indemnify,
+ defend, and hold each Contributor harmless for any liability
+ incurred by, or claims asserted against, such Contributor by reason
+ of your accepting any such warranty or additional liability.
+
+ END OF TERMS AND CONDITIONS
+
+ APPENDIX: How to apply the Apache License to your work.
+
+ To apply the Apache License to your work, attach the following
+ boilerplate notice, with the fields enclosed by brackets "[]"
+ replaced with your own identifying information. (Don't include
+ the brackets!) The text should be enclosed in the appropriate
+ comment syntax for the file format. We also recommend that a
+ file or class name and description of purpose be included on the
+ same "printed page" as the copyright notice for easier
+ identification within third-party archives.
+
+ Copyright [yyyy] [name of copyright owner]
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
README.md ADDED
@@ -0,0 +1,363 @@
+ ---
+ license: apache-2.0
+ pipeline_tag: text-generation
+ language:
+ - fr
+ - en
+ - it
+ - de
+ - es
+ tags:
+ - pretrained
+ - llama-3
+ - openllm-france
+ datasets:
+ - OpenLLM-France/Lucie-Training-Dataset
+ widget:
+ - text: |-
+     Quelle est la capitale de l'Espagne ? Madrid.
+     Quelle est la capitale de la France ?
+   example_title: Capital cities in French
+   group: 1-shot Question Answering
+ training_progress:
+   num_steps: 756291
+   num_tokens: 3131736326144
+   context_length: 32000
+ ---
+
+ # Model Card for Lucie-7B
+
+ <!-- inspired from the following template:
+ https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/templates/modelcard_template.md?plain=1
+ -->
+
+ * [Model Description](#model-description)
+ <!-- * [Uses](#uses) -->
+ * [Example Code in Python](#example-code-in-python)
+   * [Load the model](#load-the-model)
+   * [Sentence completion](#sentence-completion)
+   * [Load a checkpoint](#load-a-checkpoint)
+ * [Training Details](#training-details)
+   * [Training Data](#training-data)
+   * [Training Procedure](#training-procedure)
+     * [Neural Network Architecture](#neural-network-architecture)
+     * [Training Hyperparameters](#training-hyperparameters)
+       1. [Main Pre-training](#1-main-pre-training)
+       2. [Context Length Extension](#2-context-length-extension)
+       3. [Annealing](#3-annealing)
+   * [Training Logs and Learning Curves](#training-logs-and-learning-curves)
+ <!-- * [Evaluation](#evaluation) -->
+ * [Disclaimer](#disclaimer)
+ * [Citation](#citation)
+ * [Acknowledgements](#acknowledgements)
+ * [Contact](#contact)
+
+ ## Model Description
+
+ Lucie-7B is a pretrained 7B parameter causal language model built by [LINAGORA](https://labs.linagora.com/) and [OpenLLM-France](https://github.com/OpenLLM-France).
+
+ Lucie-7B was trained on 3 trillion tokens of multilingual data, including
+ English (33.2%),
+ French (32.4%),
+ German (6.9%),
+ Spanish (6.6%),
+ Italian (3.8%),
+ and parallel data from those languages (2.5%),
+ as well as several programming languages (14.7%).
+
+ ## Example Code in Python
+
+ ### Load the model
+
+ Load the model (quantized version on GPU if possible, for efficient inference):
+ ```python
+ import transformers
+
+ model_name = "OpenLLM-France/Lucie-7B"
+
+ tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)
+ model = transformers.AutoModelForCausalLM.from_pretrained(model_name,
+     device_map="auto",
+     load_in_4bit=True  # For efficient inference, if quantization is supported by the GPU card
+ )
+ ```
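+
+ Note that recent versions of `transformers` deprecate passing `load_in_4bit` directly to `from_pretrained` in favor of a `BitsAndBytesConfig`. A minimal sketch of an equivalent call, assuming `bitsandbytes` is installed (the `nf4` settings are illustrative choices, not values prescribed by this card):
+
+ ```python
+ import torch
+ import transformers
+
+ model_name = "OpenLLM-France/Lucie-7B"
+
+ # Equivalent 4-bit setup via BitsAndBytesConfig (newer transformers API).
+ quantization_config = transformers.BitsAndBytesConfig(
+     load_in_4bit=True,
+     bnb_4bit_quant_type="nf4",              # illustrative choice
+     bnb_4bit_compute_dtype=torch.bfloat16,  # illustrative choice
+ )
+ model = transformers.AutoModelForCausalLM.from_pretrained(
+     model_name,
+     device_map="auto",
+     quantization_config=quantization_config,
+ )
+ ```
+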
+ ### Sentence completion
+
+ Wrap the model in a text generation pipeline, and specify some generation parameters:
+ ```python
+ pipeline = transformers.pipeline("text-generation", model=model, tokenizer=tokenizer)
+
+ generation_kwargs = dict(
+     num_return_sequences=1,   # Number of variants to generate.
+     return_full_text=False,   # Do not include the prompt in the generated text.
+     do_sample=True,
+     temperature=1.0, top_p=1, top_k=None,  # Sampling parameters.
+     max_new_tokens=200,       # Maximum length for the output text (in number of tokens).
+ )
+ ```
+
+ Try 1-shot question answering:
+ ```python
+ prompt = """\
+ Quelle est la capitale de l'Espagne ? Madrid\n\
+ Quelle est la capitale de la France ?\
+ """
+ completions = pipeline(prompt, **generation_kwargs)
+ for completion in completions:
+     print(prompt + " […]" + completion['generated_text'])
+ ```
+ This will print something like:
+ ```
+ Quelle est la capitale de l'Espagne ? Madrid
+ Quelle est la capitale de la France ? […] Paris
+ Quelle est la capitale de l'Italie? Rome
+ Quelle est la capitale de la Grande-Bretagne? Londres
+ Quelle est la capitale de la Suisse? Berne
+ Quelle est la capitale du Portugal? Lisbonne
+ Quelle est la capitale de l'Algérie? Alger
+ ...
+ ```
+
+ If running on GPU (`cuda` device), you will need at least 6GB of VRAM to run inference using 4bit quantization (16GB of VRAM without 4bit quantization).
+
+ ### Load a checkpoint
+
+ Checkpoints at several training steps are available under revision tags:
+ every 5000 steps during the first 25000 steps, and then every 25000 steps.
+
+ Intermediate checkpoints can be loaded using the `revision` parameter:
+ ```python
+ model = transformers.AutoModelForCausalLM.from_pretrained(model_name,
+     revision="step0753851",
+     ...
+ )
+ ```
+ where `revision` can be one of:
+ * "[`step0005000`](https://huggingface.co/OpenLLM-France/Lucie-7B/tree/step0005000)", "[`step0010000`](https://huggingface.co/OpenLLM-France/Lucie-7B/tree/step0010000)", "[`step0015000`](https://huggingface.co/OpenLLM-France/Lucie-7B/tree/step0015000)", "[`step0020000`](https://huggingface.co/OpenLLM-France/Lucie-7B/tree/step0020000)": every 5000 steps for the first pre-training steps (with a context length of 4096).
+ * "[`step0025000`](https://huggingface.co/OpenLLM-France/Lucie-7B/tree/step0025000)", "[`step0050000`](https://huggingface.co/OpenLLM-France/Lucie-7B/tree/step0050000)", "[`step0075000`](https://huggingface.co/OpenLLM-France/Lucie-7B/tree/step0075000)", "[`step0100000`](https://huggingface.co/OpenLLM-France/Lucie-7B/tree/step0100000)", ..., "[`step0750000`](https://huggingface.co/OpenLLM-France/Lucie-7B/tree/step0750000)": every 25000 steps from 25k to 750k steps.
+ * "[`step0753851`](https://huggingface.co/OpenLLM-France/Lucie-7B/tree/step0753851)": last pre-training step before context length extension and annealing.
+ * "[`extension_step0000250`](https://huggingface.co/OpenLLM-France/Lucie-7B/tree/extension_step0000250)", "[`extension_step0000500`](https://huggingface.co/OpenLLM-France/Lucie-7B/tree/extension_step0000500)", "[`extension_step0000750`](https://huggingface.co/OpenLLM-France/Lucie-7B/tree/extension_step0000750)", "[`extension_step0001000`](https://huggingface.co/OpenLLM-France/Lucie-7B/tree/extension_step0001000)", "[`extension_step0001220`](https://huggingface.co/OpenLLM-France/Lucie-7B/tree/extension_step0001220)": several checkpoints during context length extension (with a context length of 32000).
+
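+ The available checkpoint tags can also be listed programmatically with the `huggingface_hub` client. A minimal sketch (assuming `huggingface_hub` is installed):
+
+ ```python
+ from huggingface_hub import HfApi
+
+ # Revision tags of the repository; checkpoint tags are named
+ # "step..." or "extension_step..." as described above.
+ refs = HfApi().list_repo_refs("OpenLLM-France/Lucie-7B")
+ print(sorted(tag.name for tag in refs.tags if "step" in tag.name))
+ ```
+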
+ ## Training Details
+
+ ### Training Data
+
+ The training dataset used for the pretraining of Lucie-7B is available
+ at [OpenLLM-France/Lucie-Training-Dataset](https://huggingface.co/datasets/OpenLLM-France/Lucie-Training-Dataset).
+ <!-- and described in ["The Lucie Training Dataset" (2024/12)](https://arxiv.org/abs/xxxx.xxxxx). -->
+
+ The initial composition of the training data is as follows:
+
+ ![Initial Data Composition](figures/pie_dataset_composition.png)
+
+ Some of the data was upsampled to balance the training data distribution, yielding the following composition for training:
+
+ ![Training Data Composition](figures/pie_dataset_composition_training.png)
+
+ ### Training Procedure
+
+ Lucie-7B is a causal decoder-only model trained on a causal language modeling task (i.e., predict the next token).
+
+ It was pre-trained on 512 H100 80GB GPUs for about 550,000 GPU hours on the [Jean Zay supercomputer](http://www.idris.fr/eng/jean-zay/jean-zay-presentation-eng.html).
+
+ The training code is available at [https://github.com/OpenLLM-France/Lucie-Training](https://github.com/OpenLLM-France/Lucie-Training).
+ It is based on [this fork of Megatron-DeepSpeed](https://github.com/OpenLLM-France/Megatron-DeepSpeed).
+
+ Optimizer checkpoints are available at [OpenLLM-France/Lucie-7B-optimizer-states](https://huggingface.co/OpenLLM-France/Lucie-7B-optimizer-states).
+
+ #### Neural Network Architecture
+
+ Lucie-7B has the same neural network architecture as [Llama3.1](https://huggingface.co/meta-llama/Llama-3.1-8B).
+ It has exactly 6 706 958 336 free parameters,
+ with the following hyperparameters:
+
+ | **Hyperparameter**          | **Value** |
+ |-----------------------------|-----------|
+ | Vocabulary size (\# tokens) | 65 024    |
+ | \# transformer blocks       | 32        |
+ | \# attention heads          | 32        |
+ | \# key-value heads          | 8         |
+ | Hidden size                 | 4 096     |
+ | Feed-Forward hidden size    | 12 288    |
+ | Activation                  | `silu`    |
+ | RMS norm epsilon            | 1e-5      |
+
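+ As a sanity check, this parameter count can be reproduced from the hyperparameters above, assuming the standard Llama layout (grouped-query attention, SwiGLU feed-forward, untied input/output embeddings). A minimal sketch:
+
+ ```python
+ # Values from the table above (and config.json).
+ vocab, hidden, layers = 65024, 4096, 32
+ heads, kv_heads, intermediate = 32, 8, 12288
+
+ head_dim = hidden // heads                                       # 128
+ attn = 2 * hidden * hidden + 2 * hidden * (kv_heads * head_dim)  # q, o + k, v projections
+ mlp = 3 * hidden * intermediate                                  # gate, up, down (SwiGLU)
+ norms = 2 * hidden                                               # two RMSNorms per block
+ embeddings = 2 * vocab * hidden                                  # input embeddings + untied LM head
+
+ total = layers * (attn + mlp + norms) + embeddings + hidden      # "+ hidden" = final RMSNorm
+ print(total)  # 6706958336
+ ```
+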
+ The "theta" parameter of Rotary Positional Embedding (RoPE) was increased during the training process. Its values are indicated in the tables with training hyperparameters below.
+
+ #### Training Hyperparameters
+
+ The training consisted of three main phases:
+ 1. Main pre-training on 3.1T tokens, with a context length of 4096,
+ 2. Context length extension on 5B tokens, with a context length of 32000,
+ 3. Annealing on 5B tokens of high-quality data composed of a mixture of new data and data seen during training.
+ <!-- perhaps cite the dataset for annealing -->
+
+ The details of each phase are given below.
+
+ ##### 1. Main Pre-training
+
+ Training hyperparameters in torch/Megatron-DeepSpeed were as follows:
+
+ | **Hyperparameter**     | **Value**  |
+ |------------------------|------------|
+ | Total \# samples       | 762 144 586 (3.1T tokens) |
+ | Total \# steps         | 753 851    |
+ | RoPE theta             | 500 000    |
+ | Context length         | 4 096      |
+ | Initial Batch size     | 256        |
+ | Final Batch size       | 1 024      |
+ | Batch size rampup      | by steps of 64 over 10M samples |
+ | Learning rate schedule | warmup (2M samples) + cosine annealing |
+ | Maximum Learning rate  | 3e-4       |
+ | Final Learning rate    | 3e-5       |
+ | Weight decay           | 0.1        |
+ | Dropout                | _          |
+ | Gradient clipping      | 1          |
+ | Initializer range      | 0.009      |
+ | Optimizer              | `AdamW` (β₁=0.9, β₂=0.95, ε=1e-5) |
+ | Precision              | `bfloat16` |
+ | Tensor Parallelism (with 512 GPUs)   | 4  |
+ | Pipeline Parallelism (with 512 GPUs) | 4  |
+ | Data Parallelism (with 512 GPUs)     | 32 |
+
+ ##### 2. Context Length Extension
+
+ Training hyperparameters are the same as above, with the following changes:
+
+ | **Hyperparameter**     | **Value**  |
+ |------------------------|------------|
+ | Total \# samples       | 156 250 (5B tokens) |
+ | Total \# steps         | 1 220      |
+ | RoPE theta             | 20 000 000 |
+ | Context length         | 32 000     |
+ | Batch size             | 128        |
+ | Learning rate          | 2e-5       |
+ | Learning rate schedule | constant   |
+ | Tensor Parallelism (with 128 GPUs)   | 4 |
+ | Pipeline Parallelism (with 128 GPUs) | 4 |
+ | Data Parallelism (with 128 GPUs)     | 8 |
+
+ ##### 3. Annealing
+
+ Training hyperparameters are the same as for context length extension, with the following changes:
+
+ | **Hyperparameter**     | **Value**  |
+ |------------------------|------------|
+ | Total \# samples       | 156 250 (5B tokens) |
+ | Total \# steps         | 1 220      |
+ | Learning rate schedule | linear annealing |
+ | Maximum Learning rate  | 3e-5       |
+ | Final Learning rate    | 0          |
+
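+ As a worked example of the schedule above, the learning rate decays linearly from 3e-5 to 0 over the 1 220 annealing steps. A minimal sketch (the helper name `annealing_lr` is ours, not from the training code):
+
+ ```python
+ def annealing_lr(step: int, max_lr: float = 3e-5, total_steps: int = 1220) -> float:
+     """Linear annealing from max_lr down to 0, per the table above."""
+     return max_lr * (1 - step / total_steps)
+
+ print(annealing_lr(0))     # 3e-05 at the start of annealing
+ print(annealing_lr(610))   # 1.5e-05 halfway through
+ print(annealing_lr(1220))  # 0.0 at the end
+ ```
+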
+ ### Training Logs and Learning Curves
+
+ #### Training loss
+
+ Training logs can be found in Tensorboard format in:
+ * [`metadata/training_logs/`](https://huggingface.co/OpenLLM-France/Lucie-7B/tree/main/metadata/training_logs)
+   <br> ├── [`1_pretraining.zip`](metadata/training_logs/1_pretraining.zip) training logs for the first pre-training phase,
+   in a zip file. Each file in the zip corresponds to a job of at most 20H of training (parallelized over 512 GPUs).
+   <br> ├── [`2_extension/`](https://huggingface.co/OpenLLM-France/Lucie-7B/tree/main/metadata/training_logs/2_extension) folder containing the training log for the context length extension phase,
+   <br> └── [`3_annealing/`](https://huggingface.co/OpenLLM-France/Lucie-7B/tree/main/metadata/training_logs/3_annealing) folder containing the training log for the annealing phase, which also took around 13H of training (parallelized over 128 GPUs).
+
+ The convergence curves of the three pre-training phases are the following:
+
+ ![figures/convergence-curve-pretraining.png](figures/convergence-curve-pretraining.png)
+
+ Data corresponding to these plots were extracted from tensorboard logs and are available in the following CSV files:
+ * [`metadata/training_logs/`](https://huggingface.co/OpenLLM-France/Lucie-7B/tree/main/metadata/training_logs)
+   <br> ├── [`1_pretraining.csv`](metadata/training_logs/1_pretraining.csv)
+   <br> ├── [`2_extension.csv`](metadata/training_logs/2_extension.csv)
+   <br> └── [`3_annealing.csv`](metadata/training_logs/3_annealing.csv)
+
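+ These CSV files can be loaded directly for inspection. A minimal sketch of plotting the main pre-training loss with pandas and matplotlib (assuming the CSV has been downloaded locally; the column names are those of `1_pretraining.csv`):
+
+ ```python
+ import pandas as pd
+ import matplotlib.pyplot as plt
+
+ # Columns: training_steps, training_tokens, training_loss, walltime, gputime, learning_rate
+ logs = pd.read_csv("metadata/training_logs/1_pretraining.csv")
+
+ plt.plot(logs["training_tokens"], logs["training_loss"])
+ plt.xlabel("training tokens")
+ plt.ylabel("training loss")
+ plt.title("Lucie-7B main pre-training loss")
+ plt.show()
+ ```
+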
+ #### Evaluations
+
+ Multiple evaluations were conducted during Lucie-7B's training to assess its performance on standard benchmarks,
+ primarily in French and English, as well as in Spanish, German, and Italian.
+
+ Evaluation results on benchmark datasets of checkpoints of Lucie-7B throughout the training process are available at
+ [metadata/evaluation_learning_curve_lucie.csv](metadata/evaluation_learning_curve_lucie.csv).
+ Evaluation results of baseline models on the same benchmark datasets are available at
+ [metadata/evaluation_baselines.csv](metadata/evaluation_baselines.csv).
+
+ Main results are summarized in the following figures:
+
+ ##### French
+ ![figures/learning-curve-evaluation-french-bench.png](figures/learning-curve-evaluation-french-bench.png)
+
+ ##### English
+ ![figures/learning-curve-evaluation-benchmarks-in-english.png](figures/learning-curve-evaluation-benchmarks-in-english.png)
+
+ ##### Other
+ ![figures/learning-curve-evaluation-multilingual-arc-benchmark.png](figures/learning-curve-evaluation-multilingual-arc-benchmark.png)
+
+ #### Needle in a Haystack
+
+ ##### Pretraining
+ ![figures/needle-in-a-haystack/Lucie-7B-main.png](figures/needle-in-a-haystack/Lucie-7B-main.png)
+
+ ##### Context Length Extension
+ ![figures/needle-in-a-haystack/Lucie-7B-extension.png](figures/needle-in-a-haystack/Lucie-7B-extension.png)
+
+ ##### Annealing
+ ![figures/needle-in-a-haystack/Lucie-7B-annealing.png](figures/needle-in-a-haystack/Lucie-7B-annealing.png)
+
+ ## Disclaimer
+
+ Lucie-7B is a language model trained solely to predict the most probable next word in a sequence. Despite efforts to filter the [Lucie Training Dataset](https://huggingface.co/datasets/OpenLLM-France/Lucie-Training-Dataset), it is possible that Lucie-7B encountered strings containing toxic or offensive language during its training and, as a result, it may generate strings of similar quality. To limit such behavior, it is advised to fine-tune Lucie-7B through instruction and/or preference tuning (DPO, RLHF, etc.).
+
+ ## Citation
+
+ When using the Lucie-7B model, please cite the following paper:
+
+ ✍ Olivier Gouvert, Julie Hunter, Jérôme Louradour,
+ Evan Dufraisse, Yaya Sy, Pierre-Carl Langlais, Anastasia Stasenko,
+ Laura Rivière, Christophe Cerisara, Jean-Pierre Lorré (2025)
+ The Lucie-7B LLM and the Lucie Training Dataset: open resources for multilingual language generation
+ ```bibtex
+ @misc{openllm2025lucie,
+   title={The Lucie-7B LLM and the Lucie Training Dataset:
+          open resources for multilingual language generation},
+   author={Olivier Gouvert and Julie Hunter and Jérôme Louradour and Evan Dufraisse and Yaya Sy and Pierre-Carl Langlais and Anastasia Stasenko and Laura Rivière and Christophe Cerisara and Jean-Pierre Lorré},
+   year={2025},
+   archivePrefix={arXiv},
+   primaryClass={cs.CL}
+ }
+ ```
+
+ ## Acknowledgements
+
+ This work was performed using HPC resources from GENCI–IDRIS (Grant 2024-GC011015444). We gratefully acknowledge support from GENCI and IDRIS and from Pierre-François Lavallée (IDRIS) and Stephane Requena (GENCI) in particular.
+
+ Lucie-7B was created by members of [LINAGORA](https://labs.linagora.com/) and the [OpenLLM-France](https://www.openllm-france.fr/) community, including in alphabetical order:
+ Agustin Martin Picard (IRT),
+ Thibaut Boissin (IRT),
+ Christophe Cerisara (LORIA),
+ Evan Dufraisse (CEA),
+ Julie Hunter (LINAGORA),
+ Jean-Pierre Lorré (LINAGORA),
+ Jérôme Louradour (LINAGORA),
+ Lucas Hervier (IRT),
+ Michel-Marie Maudet (LINAGORA),
+ Olivier Gouvert (LINAGORA), and
+ Yaya Sy (LORIA).
+
+ We thank
+ Anastasia Stasenko (OpSci/Pleias),
+ Clément Bénesse (Opsci),
+ Guokan Shang (MBZUAI),
+ Ismaïl Harrando (LINAGORA),
+ Joël Gombin (Opsci),
+ Jordan Ricker (Opsci),
+ Julien Tourille (EDF),
+ Manuel Faysse (ILLUIN Technology),
+ Olivier Ferret (CEA),
+ Pierre-Carl Langlais (OpSci/Pleias), and
+ Rachel Bawden (INRIA)
+ for their helpful input.
+
+ We also thank the support teams from IDRIS, in particular Myriam Peyrounette and Hatim Bourfoune, and from Hugging Face, in particular Thomas Wolf, Guilherme Penedo, Elie Bakouch, Haojun Zhao, and Lucain Pouget for their technical guidance.
+
+ Finally, we thank the entire OpenLLM-France community, whose members have helped in diverse ways.
+
+ ## Contact
+
config.json ADDED
@@ -0,0 +1,29 @@
+ {
+   "architectures": [
+     "LlamaForCausalLM"
+   ],
+   "attention_bias": false,
+   "attention_dropout": 0.0,
+   "bos_token_id": 0,
+   "eos_token_id": 1,
+   "hidden_act": "silu",
+   "hidden_size": 4096,
+   "initializer_range": 0.02,
+   "intermediate_size": 12288,
+   "max_position_embeddings": 32000,
+   "model_type": "llama",
+   "num_attention_heads": 32,
+   "num_hidden_layers": 32,
+   "num_key_value_heads": 8,
+   "pretraining_tp": 1,
+   "rms_norm_eps": 1e-05,
+   "rope_scaling": null,
+   "rope_theta": 20000000.0,
+   "tie_word_embeddings": false,
+   "torch_dtype": "bfloat16",
+   "transformers_version": "4.36.1",
+   "use_cache": true,
+   "vocab_size": 65024,
+   "training_steps": 756291,
+   "training_tokens": 3131736326144
+ }
figures/convergence-curve-pretraining.png ADDED
figures/learning-curve-evaluation-benchmarks-in-english.png ADDED
figures/learning-curve-evaluation-french-bench.png ADDED
figures/learning-curve-evaluation-multilingual-arc-benchmark.png ADDED
figures/needle-in-a-haystack/Lucie-7B-annealing.png ADDED
figures/needle-in-a-haystack/Lucie-7B-extension.png ADDED
figures/needle-in-a-haystack/Lucie-7B-main.png ADDED
figures/pie_dataset_composition.png ADDED
figures/pie_dataset_composition_training.png ADDED
generation_config.json ADDED
@@ -0,0 +1,8 @@
+ {
+   "bos_token_id": 0,
+   "eos_token_id": 1,
+   "max_length": 32000,
+   "do_sample": true,
+   "temperature": 0.6,
+   "transformers_version": "4.36.1"
+ }
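
These generation defaults are picked up automatically by `generate()` and the text-generation pipeline; to inspect them, one can load the file with the standard `transformers` API. A minimal sketch:

```python
import transformers

# Reads generation_config.json from the Hub repository.
generation_config = transformers.GenerationConfig.from_pretrained("OpenLLM-France/Lucie-7B")
print(generation_config.temperature)  # 0.6
print(generation_config.max_length)   # 32000
```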
metadata/training_logs/1_pretraining.csv ADDED
@@ -0,0 +1,387 @@
+ training_steps,training_tokens,training_loss,walltime,gputime,learning_rate
+ 1865,1955594240,5.540690021514893,0.7975208023282855,408.33065079208217,7.161599933169782e-05
+ 3790,4114087936,3.00810284614563,1.6655786928923269,852.7762907608713,0.00015066239575389773
+ 5313,6110314496,2.6936181354522706,2.419871313692319,1238.9741126104673,0.0002237664011772722
+ 7402,9252634624,2.525974779129028,3.525297917627946,1804.9525338255085,0.000299999926937744
+ 9616,13150715904,2.3369024181365967,4.839082616515136,2477.6102996557497,0.00029999829712323844
+ 11655,17390895104,2.276480207443237,6.211945974481024,3180.5163389342842,0.00029999419348314404
+ 13968,23110877184,2.2075621032714845,7.9962738511337434,4094.0922117804766,0.00029998470563441515
+ 15910,28752740352,2.1356527423858642,9.694892240850036,4963.7848273152185,0.0002999709395226091
+ 17655,34566045696,2.187712240219116,11.403844424466975,5838.768345327091,0.000299952196655795
+ 19561,41906601984,2.1478218460083007,13.512215933090056,6918.254557742109,0.0002999218995682895
+ 21599,50454593536,2.108550395965576,15.963069546280945,8173.091607695844,0.00029987728339619935
+ 22929,56033017856,2.08190354347229,17.555688533989233,8988.512529402487,0.0002998427371494472
+ 24879,64211910656,2.063254547119141,20.013122328626352,10246.718632256692,0.00029978438396938145
+ 26435,70738247680,2.0415384721755983,22.095803890509558,11313.051591940894,0.00029973124037496746
+ 28408,79013609472,2.0200176572799684,24.742381052788318,12668.099099027619,0.0002996554540004581
+ 30712,88677285888,2.009900689125061,27.834729681593487,14251.381596975865,0.00029955507488921285
+ 32719,97095254016,1.997526035308838,30.529593802309225,15631.152026782323,0.00029945719870738685
+ 34620,105068625920,1.9822303581237792,33.088940144262565,16941.537353862434,0.00029935556813143194
+ 37346,116502298624,1.9698213863372802,36.75907196609288,18820.644846639556,0.0002991946239490062
+ 39233,124416950272,1.9548299026489258,39.309065435078146,20126.24150276001,0.00029907276621088386
+ 40846,131182362624,1.951894497871399,41.485595499098174,21240.624895538265,0.0002989618224091828
+ 42956,140032344064,1.9350375270843505,44.33258508443755,22698.283563232024,0.0002988072519656271
+ 44939,148349648896,1.9308405590057374,47.01033564593735,24069.291850719925,0.00029865227406844497
+ 46768,156021030912,1.9222344255447388,49.47616130587861,25331.79458860985,0.00029850099235773087
+ 48882,164887789568,1.9197535276412965,52.35843906614072,26807.520801864048,0.00029831615393050015
+ 50706,172538200064,1.912511978149414,54.81750936548799,28066.564795129852,0.00029814810841344297
+ 52358,179467190272,1.907254514694214,57.04858303080124,29208.874511770235,0.0002979890559799969
+ 53857,185754451968,1.9000687837600707,59.087224691490626,30252.6590420432,0.00029783911304548383
+ 56177,195485237248,1.897755184173584,62.22864813516615,31861.067845205067,0.0002975965035147965
+ 58449,205014695936,1.8944290637969972,65.2858384233869,33426.349272774096,0.0002973465307150036
+ 60122,212031766528,1.885751051902771,67.54159698311844,34581.29765535664,0.00029715464916080236
+ 62532,222140039168,1.8851725578308105,70.83456301174779,36267.29626201487,0.0002968665794469416
+ 64820,231736606720,1.875647120475769,73.90514303951494,37839.43323623165,0.0002965803723782301
+ 67145,241488363520,1.8717713451385498,77.07916604916753,39464.53301717377,0.0002962769358418882
+ 69418,251022016512,1.8673745584487915,80.13863182021647,41030.979491950835,0.00029596799868158996
+ 70813,256873070592,1.8599135208129882,82.01500265421159,41991.68135895633,0.0002957723627332598
+ 73257,267123949568,1.8653322410583497,85.29531497185086,43671.20126558764,0.00029541869298554957
+ 74982,274359123968,1.8608362674713135,87.62793199229114,44865.50118005306,0.0002951606293208897
+ 76878,282311524352,1.8511997079849243,90.18867519614679,46176.601700427156,0.00029486900893971324
+ 79546,293501927424,1.8491973686218262,93.81005986021884,48030.75064843205,0.0002944445004686713
+ 81690,302494515200,1.8464100503921508,96.70326836685862,49512.073403831615,0.00029409138369373977
+ 83549,310291726336,1.847438826560974,99.1904628869521,50785.51699811948,0.0002937766257673502
+ 86305,321851228160,1.8410191345214844,102.87963783826878,52674.374573193614,0.00029329530661925673
+ 87821,328209793024,1.8423179483413696,104.95056398590805,53734.68876078492,0.0002930230984929949
+ 89624,335772123136,1.8375876760482788,107.36518641720015,54970.97544560648,0.00029269250808283687
+ 91624,344160731136,1.8310838747024536,110.08795571576191,56365.0333264701,0.00029231709777377546
+ 94363,355648929792,1.8548078060150146,113.81864126870086,58275.14432957484,0.0002917881647590548
+ 96400,364192727040,1.8301151275634766,116.56662833875006,59682.11370944003,0.000291383737931028
+ 98285,372098990080,1.825421872138977,119.117614549062,60988.21864911974,0.0002910011389758438
+ 99761,378289782784,1.828107204437256,121.11521686213217,62010.99103341167,0.0002906959562096745
+ 101660,386254766080,1.8232884979248047,123.67062551266181,63319.36026248285,0.0002902960986830294
+ 102797,391023689728,1.8284022760391236,125.20398345403945,64104.439528468196,0.0002900528197642416
+ 104466,398023983104,1.8237161207199097,127.48315444020703,65271.375073386,0.0002896904479712248
+ 106544,406739746816,1.8209293365478516,130.37456855220438,66751.77909872864,0.0002892306074500084
+ 108654,415589728256,1.813926682472229,133.23329634167231,68215.44772693623,0.0002887538284994662
+ 110464,423181418496,1.8184159755706788,135.6778559462546,69467.06224448235,0.00028833700343966484
+ 113489,435869188096,1.8118912790502821,139.763474430842,71558.8989085911,0.0002876241924241185
+ 115252,443263746048,1.8117296981811524,142.15046140346098,72781.03623857202,0.0002871994802262634
+ 116687,449282572288,1.8083889770507813,144.07612540268525,73766.97620617485,0.00028684877906925976
+ 119106,459428593664,1.8078338527679443,147.34195422902133,75439.08056525892,0.0002862474066205323
+ 121073,467678789632,1.808600254058838,149.9825934684281,76791.08785583518,0.00028574903262779117
+ 122958,475585052672,1.7992388534545898,152.52741438835127,78094.03616683585,0.0002852636098396033
+ 124630,482597928960,1.8028169393539428,154.7699610268964,79242.22004577096,0.000284826586721465
+ 126358,489845686272,1.8049570083618165,157.08673302915514,80428.40731092743,0.00028436866705305874
+ 128236,497722589184,1.8046081829071046,159.62801352390318,81729.54292423843,0.00028386374469846487
+ 130926,509005266944,1.7990764093399048,163.24389280245367,83580.87311485628,0.0002831274177879095
+ 132497,515594518528,1.794449429512024,165.37276175601704,84670.85401908073,0.0002826903073582798
+ 135020,526176747520,1.799844126701355,168.77996288452678,86415.34099687771,0.0002819774381350726
+ 136777,533546139648,1.7945206785202026,171.14425289637555,87625.85748294428,0.0002814731269609183
+ 139067,543151095808,1.7959566926956176,174.2254043554347,89203.40702998257,0.00028080615447834134
+ 140573,549467717632,1.8007401327292125,176.28781320277358,90259.36035982007,0.00028036159346811473
+ 142990,559605350400,1.790226879119873,179.52959129813544,91919.15074464535,0.00027963833417743444
+ 144717,566848913408,1.7950389575958252,181.85608978862726,93110.31797177716,0.00027911417419090867
+ 147235,577410170880,1.787166004180908,185.26090506606388,94853.58339382471,0.0002783390518743545
+ 149646,587522637824,1.7874946737289428,188.52470080996864,96524.64681470394,0.00027758482610806823
+ 152254,598461382656,1.7876266622543335,192.05541756899714,98332.37379532654,0.00027675574528984725
+ 153497,603674902528,1.7870644330978394,193.73033408004696,99189.93104898404,0.0002763558004517108
+ 155091,610360623104,1.7821037721633912,195.90790076383007,100304.845191081,0.00027583842165768147
+ 157080,618703093760,1.7816897821426392,198.57830149684983,101672.09036638711,0.00027518573915585876
+ 159083,627104284672,1.7820564126968383,201.2724561886386,103051.49756858297,0.00027452060021460056
+ 160774,634196852736,1.7861653804779052,203.56257374424416,104224.03775705301,0.00027395293000154197
+ 162912,643164274688,1.7875264167785645,206.45703154702142,105706.00015207496,0.00027322719688527286
+ 164504,649841606656,1.7890281875928242,208.59348014120815,106799.86183229857,0.00027268106350675225
+ 166091,656497967104,1.7810548543930054,210.73933679447862,107898.54043877305,0.0002721317869145423
+ 168208,665377308672,1.7790726804733277,213.599654382709,109363.023043947,0.0002713915309868753
+ 170798,676240556032,1.7792628765106202,217.10682849737182,111158.69619065437,0.0002704742655623704
+ 173178,686222999552,1.7723922634124756,220.3199128641763,112803.79538645827,0.00026962021365761757
+ 175477,695865704448,1.7773733282089232,223.42362898933956,114392.89804254186,0.0002687851374503225
+ 177319,703591612416,1.7751788663864136,225.91816708412628,115670.10154707266,0.0002681089681573212
+ 179285,711837614080,1.773937292098999,228.55287790967017,117019.07348975113,0.00026738038286566734
+ 181269,720159113216,1.7757549047470094,231.22714584410738,118388.29867218298,0.00026663794415071607
+ 182472,725204860928,1.7724720859527587,232.86696931923933,119227.88829145054,0.00026618424453772604
+ 183903,731206909952,1.7717243003845216,234.78111301701074,120207.9298647095,0.0002656411670614034
+ 186140,740589568000,1.7713909292221068,237.79312360069395,121750.0792835553,0.00026478481595404446
+ 188681,751247294464,1.7675371074676514,241.22784125286,123508.65472146432,0.00026380125200375915
+ 190733,759854006272,1.770745587348938,243.9752635411917,124915.33493309015,0.00026299862656742334
+ 192470,767139512320,1.7692016410827636,246.32182628229677,126116.77505653595,0.00026231343508698046
+ 194277,774718619648,1.7701459550857543,248.78614687347053,127378.50719921691,0.00026159503613598645
+ 195792,781072990208,1.7644649791717528,250.8326970803534,128426.34090514094,0.00026098836679011583
+ 198006,790359179264,1.7638875579833984,253.84096263601015,129966.5728696372,0.0002600947336759418
+ 199191,795329429504,1.7611687517166137,255.44903973567233,130789.90834466423,0.0002596129779703915
+ 201402,804603035648,1.7673775005340575,258.4134064139623,132307.6640839487,0.0002587077615316957
+ 203281,812484132864,1.7651812601089478,260.93973858808965,133601.1461571019,0.00025793202803470194
+ 204903,819287293952,1.7612399768829345,263.1386813809225,134727.00486703232,0.00025725766317918897
+ 206833,827382300672,1.7639558458328246,265.7458362247564,136061.86814707526,0.00025644959532655776
+ 209473,838455263232,1.7610173416137695,269.30228242161377,137882.76859986625,0.00025533439475111663
+ 211312,846168588288,1.759269905090332,271.79195232891027,139157.47959240206,0.00025455086142756045
+ 213227,854200680448,1.7630469226837158,274.3732711282936,140479.11481768632,0.00025372920208610594
+ 215847,865189756928,1.7609153509140014,277.9089087475696,142289.36127875565,0.00025259560788981616
+ 217356,871518961664,1.7579796981811524,279.9560044781967,143337.4742928367,0.0002519378322176635
+ 219172,879135817728,1.7634607887268066,282.2283169674041,144500.8982873109,0.0002511414932087064
+ 222150,891626455040,1.7575871229171753,285.76951404988944,146313.9911935434,0.000249824661295861
+ 223853,898769354752,1.758929591178894,287.8028389246148,147355.05352940276,0.000249065546086058
+ 225145,904188395520,1.7552893447875977,289.3465113940193,148145.41383373787,0.00024848670000210404
+ 226928,911666839552,1.751872878074646,291.46764839194356,149231.4359766751,0.00024768381263129413
+ 227290,913185177600,1.756528417269389,291.96401525144137,149485.57580873798,0.00024752022000029683
+ 229251,921410207744,1.7513698720932007,294.6090092848358,150839.81275383593,0.0002466306905262172
+ 231418,930499264512,1.7536880302429199,297.51891305445406,152329.68348388048,0.00024564118939451873
+ 232756,936111243264,1.7567100238800049,299.3165955686522,153250.09693114992,0.0002450268075335771
+ 234747,944462102528,1.7647992753982544,301.9957619307597,154621.83010854898,0.00024410788319073617
+ 236656,952469028864,1.759487557411194,304.57088663315164,155940.29395617364,0.00024322151148226112
+ 238174,958835982336,1.760659966468811,306.609707181298,156984.17007682458,0.00024251305148936808
+ 240473,968478687232,1.754089126586914,309.6952503732213,158563.9681910893,0.00024143399787135422
+ 242768,978104614912,1.7523606967926026,312.7955168953312,160151.30465040958,0.00024034960370045155
+ 244243,984291213312,1.7513489294052125,314.7860534490736,161170.4593659257,0.00023964889987837523
+ 245978,991568330752,1.7525734424591064,317.14563100169306,162378.56307286685,0.0002388209686614573
+ 248018,1000124710912,1.7499562788009644,319.9244360278179,163801.31124624275,0.00023784241057001054
+ 249886,1007959670784,1.7509011316299439,322.45674101591925,165097.85140015066,0.00023694158880971372
+ 251357,1014129491968,1.7487824440002442,324.4791230457826,166133.3109994407,0.00023622905428055674
+ 252908,1020634857472,1.7466645431518555,326.58318169927685,167210.58903002975,0.00023547477030660957
+ 255231,1030378225664,1.7461054420471191,329.7483051645596,168831.13224425452,0.00023433937167283148
+ 256784,1036891979776,1.7455737209320068,331.87937631953497,169922.2406756019,0.00023357658938039094
+ 258875,1045662269440,1.7457578039169313,334.731770063507,171382.66627251558,0.00023254487314261496
+ 260952,1054373838848,1.7445444059371948,337.5729135723391,172837.3317490376,0.00023151483037509024
+ 262527,1060979867648,1.741570553779602,339.7600884215228,173957.1652718197,0.00023073032207321376
+ 264811,1070559657984,1.740293960571289,342.88246244716737,175555.8207729497,0.00022958747285883874
+ 266420,1077308293120,1.7429306316375732,345.0796831538715,176680.7977747822,0.00022877875017002225
+ 268308,1085227139072,1.744356451034546,347.65395848590146,177998.82674478155,0.00022782600717619061
+ 269559,1090474213376,1.7428239250183106,349.3460391367528,178865.17203801742,0.00022719251865055412
+ 271809,1099911397376,1.7441408920288086,352.4030745393055,180430.37416412443,0.00022604875266551971
+ 274026,1109210169344,1.7409614515304566,355.4137426253769,181971.836224193,0.00022491635172627866
+ 276200,1118328586240,1.7383298921585082,358.35239492776185,183476.42620301407,0.00022380080190487206
+ 279273,1131217682432,1.7371133943883383,362.5356638252912,185618.25987854908,0.00022221545805223286
+ 281879,1142148038656,1.7415697383880615,366.08764050357314,187436.87193782945,0.00022086345416028053
+ 283803,1150217879552,1.733839235305786,368.711490408947,188780.28308938086,0.0002198608999606222
+ 286646,1162142285824,1.7431989669799806,372.57302363409656,190757.38810065744,0.00021837285021319985
+ 288431,1169629118464,1.739035325050354,375.010250589488,192005.24830181786,0.00021743458637502044
+ 289869,1175660527616,1.733289074897766,376.974500554174,193010.9442837371,0.00021667654800694436
+ 292495,1186674769920,1.7315478420257568,380.5901660706151,194862.16502815494,0.00021528734941966832
+ 295211,1198066499584,1.733106060028076,384.2868979963938,196754.89177415363,0.00021384400315582752
+ 297195,1206387998720,1.7410508108139038,386.9830853080197,198135.3396777061,0.00021278555504977703
+ 298925,1213644144640,1.734706358909607,389.3362156377362,199340.14240652093,0.00021185987861827016
+ 300889,1221881757696,1.73500732421875,392.0144997620718,200711.42387818077,0.00021080594160594046
+ 302662,1229318258688,1.7418826770782472,394.42366878365414,201944.91841723092,0.00020985178707633168
+ 304473,1236914143232,1.7310642719268798,396.8901623202001,203207.76310794245,0.00020887458231300116
+ 306476,1245315334144,1.7335577774047852,399.62016656875767,204605.52528320393,0.0002077907556667924
+ 308518,1253880102912,1.7285248136520386,402.42136555782304,206039.7391656054,0.00020668267097789794
+ 310574,1262503591936,1.733399453163147,405.2239282874955,207474.6512831977,0.00020556384697556496
+ 312781,1271760420864,1.730285539627075,408.3160193839066,209057.8019245602,0.00020435944315977395
+ 314612,1279440191488,1.733029899597168,410.81628584431604,210337.9383522898,0.0002033576020039618
+ 316647,1287975600128,1.731144299507141,413.5669219080583,211746.26401692585,0.00020224145555403084
+ 319268,1298968870912,1.729205231666565,416.9754043734229,213491.40703919253,0.00020079984096810222
+ 320942,1305990135808,1.7270024967193605,418.99536520049196,214525.62698265188,0.00019987679843325168
+ 323057,1314861088768,1.7282018089294433,421.5389839125796,215827.95976324077,0.0001987080613616854
+ 324688,1321701998592,1.727188115119934,423.4974892227614,216830.71448205382,0.0001978049403987825
+ 325782,1326290567168,1.7273711681365966,424.81380970657455,217504.67056976617,0.00019719830015674233
+ 328381,1337191563264,1.7251214504241943,427.95426755280107,219112.58498703415,0.00019575434271246195
+ 330172,1344703561728,1.7238556051254272,430.13040363881544,220226.7666630735,0.0001947571145137772
+ 332386,1353989750784,1.7249505424499512,432.82640173550084,221607.11768857643,0.00019352202070876956
+ 334457,1362676154368,1.7219355773925782,435.3433918957347,222895.81665061618,0.00019236441585235298
+ 336042,1369324126208,1.72216224193573,437.23262586380747,223863.10444226942,0.0001914770546136424
+ 337560,1375691079680,1.7201724815368653,439.0764706419921,224807.15296869996,0.00019062607316300273
+ 338748,1380673912832,1.7237060013271512,440.51795167100903,225545.19125555662,0.00018995934806298465
+ 340933,1389838467072,1.7192433309555053,443.16397790562996,226899.95668768254,0.00018873147200793028
+ 342332,1395706298368,1.7205400276184082,444.8788870963792,227777.99019334614,0.00018794421339407563
+ 344302,1403969077248,1.7203184127807618,447.26131719529656,228997.79440399184,0.00018683428061194718
+ 345968,1410956787712,1.720102686882019,449.29559398623377,230039.3441209517,0.0001858944451669231
+ 347502,1417390850048,1.720639853477478,451.16690631718046,230997.4560343964,0.0001850281551014632
+ 349991,1427830472704,1.7197772884368896,454.2264308151358,232563.93257734954,0.00018362075206823647
+ 352052,1436474933248,1.7142712354660035,456.81043543511663,233886.94294277971,0.00018245380488224328
+ 354054,1444871929856,1.7160462188720702,459.453295596552,235240.08734543461,0.00018131898832507432
+ 355452,1450735566848,1.7196543836593627,461.30180067648325,236186.52194635943,0.0001805258507374674
+ 357345,1458675384320,1.7139066517353059,463.9481888459466,237541.47268912467,0.00017945100262295455
+ 359270,1466749419520,1.7155957555770873,466.5414323894858,238869.21338341673,0.00017835704784374684
+ 361118,1474500493312,1.7160774993896484,469.038111011795,240147.51283803905,0.0001773060066625476
+ 363330,1483778293760,1.716701912879944,472.0381865977867,241683.5515380668,0.0001760469749569893
+ 365934,1494700261376,1.7132270240783691,475.5838090708114,243498.91024425544,0.0001745635672705248
+ 367689,1502061264896,1.7156493997573852,477.96177319155544,244716.42787407638,0.00017356315220240504
+ 369392,1509204164608,1.7158074569702149,480.26144646743097,245893.86059132466,0.00017259192827623338
+ 371001,1515952799744,1.715286192893982,482.4363151349443,247007.3933490915,0.00017167393525596708
+ 373275,1525490647040,1.712519268989563,485.5007621892336,248576.39024088762,0.0001703760353848338
+ 375882,1536425197568,1.7095552492141723,489.0597850280192,250398.60993434582,0.00016888746176846325
+ 378308,1546600579072,1.7065624713897705,492.38097163360044,252099.05747640342,0.0001675017992965877
+ 380320,1555039518720,1.7018836069107055,495.12656034960173,253504.7988989961,0.00016635240172035992
+ 381423,1559665836032,1.712470350265503,496.61899996729517,254268.92798325513,0.00016572224558331072
+ 384180,1571229532160,1.70383526802063,500.35769686916666,256183.14079701333,0.0001641471026232466
+ 386393,1580511526912,1.7026185607910156,503.3778410258571,257729.45460523883,0.0001628828322282061
+ 388342,1588686225408,1.703223738670349,506.02086480091833,259082.6827780702,0.00016176952340174466
+ 389711,1594428227584,1.7003657007217408,507.8724601782208,260030.69961124906,0.00016098766354843974
+ 391539,1602095415296,1.6955804538726806,510.35535180901985,261301.94012621816,0.000159943854669109
+ 393541,1610492411904,1.6995060205459596,513.0894965861288,262701.82225209795,0.00015880104911047965
+ 395741,1619719880704,1.7011328125,516.0827098217362,264234.34742872894,0.00015754574269521981
+ 398543,1631472320512,1.7008357238769531,519.8783201021156,266177.6998922832,0.000155947869643569
+ 400565,1639953203200,1.695354881286621,522.6083451644168,267575.4727241814,0.000154795590788126
+ 402527,1648182427648,1.6946661138534547,525.2668895711776,268936.6474604429,0.00015367820742540061
+ 403945,1654129950720,1.6960719108581543,527.1914807180993,269922.03812766686,0.00015287112910300493
+ 405313,1659867758592,1.697601842880249,529.0370912977813,270866.99074446404,0.0001520929072285071
+ 406788,1666054356992,1.6927887296676636,531.0349452814536,271889.89198410424,0.000151254324009642
+ 409387,1676955353088,1.69403892993927,534.5653393185967,273697.4537311215,0.00014977800310589373
+ 411155,1684370882560,1.691395869255066,536.9709613285441,274929.1322002146,0.0001487747795181349
+ 413089,1692482666496,1.6922231245040893,539.5964738321247,276273.39460204786,0.00014767838001716882
+ 414732,1699373907968,1.6947362184524537,541.7996070183738,277401.39879340737,0.00014674787234980613
+ 416436,1706521001984,1.6899096250534058,544.0972045027258,278577.76870539563,0.0001457837497582659
+ 417863,1712506273792,1.6881012392044068,545.9991046208406,279551.5415658704,0.0001449771225452423
+ 419237,1718269247488,1.691229190826416,547.8411327792202,280494.65998296073,0.00014420114166568965
+ 421285,1726859182080,1.6884182071685792,550.5797526560408,281896.8333598929,0.00014304582145996392
+ 424030,1738372546560,1.6905427885055542,554.2562450989998,283779.1974906879,0.0001414999132975936
+ 426068,1746920538112,1.6902626466751098,557.0390545870044,285203.9959485463,0.00014035419735591859
+ 427364,1752356356096,1.6865596675872803,558.7964797184435,286103.79761584307,0.0001396265724906698
+ 429281,1760396836864,1.6867563104629517,561.412200978866,287443.0469011794,0.00013855169527232647
+ 432073,1772107333632,1.6869472122192384,565.1983350192647,289381.54752986354,0.0001369893434457481
+ 433564,1778361040896,1.6836950588226318,567.304830058465,290460.07298993407,0.00013615658099297434
+ 435451,1786275692544,1.682388458251953,569.8640171882282,291770.37680037285,0.0001351043174508959
+ 436975,1792667811840,1.6846920680999755,571.9226006626227,292824.3715392628,0.00013425585348159075
+ 439052,1801379381248,1.6854845952987672,574.7291045055265,294261.30150682956,0.00013310158101376146
+ 440966,1809407279104,1.6832574605941772,577.334748714053,295595.39134159515,0.000132040077005513
+ 443227,1818890600448,1.6824970388412475,580.0256552258438,296973.13547563204,0.00013078891788609326
+ 445179,1827077881856,1.6792938709259033,582.2320406351785,298102.8048052114,0.00012971126125194132
+ 447934,1838633189376,1.6817043399810792,585.2915628118471,299669.28015966574,0.00012819441326428205
+ 449710,1846082273280,1.6782426357269287,587.2799727474821,300687.34604671085,0.00012721921666525304
+ 452524,1857885044736,1.6820737409591675,590.4152128734677,302292.58899121545,0.00012567844532895833
+ 454426,1865862610944,1.6787804031372071,592.544702035982,303382.8874424228,0.00012464018072932959
+ 456289,1873676599296,1.681654314994812,594.6135522814485,304442.13876810163,0.00012362573761492968
235
+ 458583,1883298332672,1.6730261087417602,597.1887328853452,305760.63123729674,0.00012238013732712716
236
+ 460981,1893356273664,1.672509970664978,599.8846338122594,307140.9325118768,0.00012108237569918856
237
+ 462659,1900394315776,1.6751681089401245,601.761152752987,308101.7102095293,0.00012017694825772196
238
+ 464445,1907885342720,1.671654839515686,603.7396514160039,309114.701524994,0.0001192157287732698
239
+ 465759,1913396658176,1.6746321821212768,605.2131596599303,309869.1377458843,0.00011851020099129528
240
+ 467953,1922598961152,1.6700562286376952,607.6498691665898,311116.733013294,0.00011733539577107877
241
+ 470480,1933197967360,1.6706900882720948,610.4716868327131,312561.5036583491,0.000115987379103899
242
+ 472362,1941091647488,1.670096197128296,612.5532063371096,313627.2416446001,0.00011498706589918584
243
+ 474479,1949970989056,1.672633442878723,614.9096566533404,314833.7442065103,0.00011386564437998459
244
+ 475718,1955167731712,1.670849027633667,616.2887570926199,315539.8436314214,0.00011321121564833447
245
+ 477678,1963388567552,1.6683212041854858,618.4896738461576,316666.7130092327,0.00011217887367820367
246
+ 479405,1970632130560,1.6645479679107666,620.5010973832233,317696.5618602103,0.00011127226753160357
247
+ 481490,1979377254400,1.6670279264450074,622.9267749258964,318938.508762059,0.00011018155055353418
248
+ 483395,1987367403520,1.6700633478164673,625.1531309454637,320078.4030440774,0.00010918873158516362
249
+ 484660,1992673198080,1.665715732574463,626.6322892833138,320835.73211305664,0.00010853145067812875
250
+ 486388,1999920955392,1.6684691619873047,628.6401761626622,321863.77019528305,0.00010763623140519485
251
+ 487760,2005675540480,1.6899535751342774,630.2428063296861,322684.3168407993,0.00010692761861719191
252
+ 489679,2013724409856,1.6870677042007447,632.4661689389804,323822.678496758,0.00010593978367978707
253
+ 491707,2022230458368,1.683250117301941,634.8101256682668,325022.7843421526,0.00010490007844055071
254
+ 493795,2030988165120,1.6834172248840331,637.2431921455864,326268.5143785402,0.00010383423796156421
255
+ 495562,2038399500288,1.6794629859924317,639.3029817152653,327323.12663821585,0.00010293597733834758
256
+ 497249,2045475291136,1.6757073497772217,641.2957346101667,328343.41612040537,0.00010208162711933255
257
+ 499270,2053951979520,1.6750809478759765,643.6868357490317,329567.65990350425,0.00010106235276907682
258
+ 501917,2065054302208,1.6795844745635986,646.8346841992168,331179.358309999,9.973444684874266e-05
259
+ 504501,2075892383744,1.6791791486740113,649.9504268523947,332774.6185484261,9.844604937825352e-05
260
+ 505920,2081844101120,1.6794115686416626,651.6345685349747,333636.89908990706,9.774190402822569e-05
261
+ 507735,2089456762880,1.6736989212036133,653.8060941808081,334748.72022057377,9.684478573035449e-05
262
+ 509781,2098038308864,1.6739554023742675,656.2464802015245,335998.19786318054,9.583831706549972e-05
263
+ 511352,2104627560448,1.6752780342102052,658.1126194182707,336953.6611421546,9.50690227909945e-05
264
+ 512792,2110667358208,1.6734393405914307,659.8307014095916,337833.3191217109,9.436659456696361e-05
265
+ 514516,2117898338304,1.6712262153625488,661.8950038467839,338890.24196955335,9.352908818982542e-05
266
+ 516978,2128224714752,1.670324263572693,664.8669427431114,340411.87468447303,9.233966557076201e-05
267
+ 519037,2136860786688,1.6687762260437011,667.3400292872751,341678.09499508486,9.135099389823154e-05
268
+ 520968,2144959987712,1.6679233884811402,669.6824876479714,342877.43367576133,9.042886085808277e-05
269
+ 523402,2155168923648,1.6661997795104981,672.6088669252829,344375.73986574484,8.927362796384841e-05
270
+ 525489,2163922436096,1.667925977706909,675.1255832034733,345664.2986001783,8.828948193695396e-05
271
+ 527861,2173871325184,1.6650946950912475,677.9682576375237,347119.74791041214,8.71782103786245e-05
272
+ 529924,2182524174336,1.6656306838989259,680.4565219490987,348393.73923793854,8.621808228781447e-05
273
+ 531732,2190107475968,1.6626214647293092,682.6502407960503,349516.92328757775,8.538155816495419e-05
274
+ 533512,2197573337088,1.6671656608581542,684.8457100742603,350641.0035580213,8.456254727207124e-05
275
+ 535979,2207920685056,1.6623902940750122,687.8088072102388,352158.10929164226,8.343499212060124e-05
276
+ 538441,2218247061504,1.6625070142745972,690.8060061686901,353692.67515836935,8.231859101215377e-05
277
+ 539611,2223154397184,1.6638607454299927,692.2126377288533,354412.8705171729,8.179118594853207e-05
278
+ 541835,2232482529280,1.6632291555404664,694.9052978233159,355791.51248553774,8.07943069958128e-05
279
+ 543513,2239520571392,1.6591477632522582,696.9350562256079,356830.74878751126,8.004709525266662e-05
280
+ 544834,2245061246976,1.6582461643218993,698.5149932145702,357639.6765258599,7.946186815388501e-05
281
+ 546541,2252220923904,1.6569991302490235,700.5529090210858,358683.08941879595,7.870959962019697e-05
282
+ 548386,2259959414784,1.6534462308883666,702.7537166691092,359809.9029345839,7.790157542331144e-05
283
+ 550107,2267177811968,1.6563579320907593,704.8354743227627,360875.7628532545,7.715264655416831e-05
284
+ 551929,2274819833856,1.652633171081543,707.0277662998835,361998.21634554036,7.636484951945022e-05
285
+ 553695,2282226974720,1.6558599662780762,709.1717543189402,363095.9382112974,7.560628728242591e-05
286
+ 555203,2288551985152,1.6531555843353272,710.9832715057181,364023.4350109277,7.496249600080773e-05
287
+ 557414,2297825591296,1.658251905441284,713.6318656351355,365379.51520518935,7.402522169286385e-05
288
+ 559184,2305249509376,1.6528305721282959,715.7625381680682,366470.4195420509,7.328063657041639e-05
289
+ 561554,2315190009856,1.6525710678100587,718.5529555138435,367899.11322308786,7.229171023936942e-05
290
+ 562859,2320663576576,1.651809163093567,720.0850926347636,368683.56742899894,7.175114296842366e-05
291
+ 565169,2330352418816,1.6541441106796264,722.8299401458767,370088.92935468885,7.080127397784963e-05
292
+ 567134,2338594226176,1.6530265474319459,725.1825567879315,371293.46907542093,7.000035111559555e-05
293
+ 569714,2349415530496,1.6502766132354736,728.2318701144516,372854.7174985992,6.895873957546428e-05
294
+ 571324,2356168359936,1.651259379386902,730.1611943364807,373842.5315002781,6.831453356426209e-05
295
+ 574248,2368432504832,1.6482793807983398,733.6468108414203,375627.1671508072,6.715607014484704e-05
296
+ 576334,2377181822976,1.644599723815918,736.1349917772365,376901.11578994506,6.63387545500882e-05
297
+ 578782,2387449479168,1.6505870151519775,739.0739024216011,378405.8380398598,6.538941670442e-05
298
+ 580516,2394722402304,1.6470952558517455,741.1653197225369,379476.6436979389,6.472343375207856e-05
299
+ 582668,2403748544512,1.6487390184402466,743.7736260008398,380812.09651243,6.390442285919562e-05
300
+ 584669,2412141346816,1.6448141527175903,746.1684989620086,382038.2714685484,6.315039354376495e-05
301
+ 586818,2421154906112,1.646544461250305,748.775306891994,383372.9571287009,6.234873580979183e-05
302
+ 589001,2430311071744,1.6475654029846192,751.4020625109125,384717.8560055872,6.154309085104614e-05
303
+ 591113,2439169441792,1.6450910377502441,753.9527754451661,386023.82102792506,6.077205034671351e-05
304
+ 593228,2448040394752,1.6460276412963868,756.5354509950179,387346.15090944915,6.0008260334143415e-05
305
+ 594605,2453815951360,1.6390350008010863,758.1939404388438,388195.29750468803,5.951549974270165e-05
306
+ 597153,2464503037952,1.637695927619934,761.2579777614424,389764.0846138585,5.8613157307263464e-05
307
+ 598759,2471239090176,1.6295817375183106,763.2197744939834,390768.5245409195,5.8050762163475156e-05
308
+ 600863,2480063905792,1.6312317228317261,765.7391971130864,392058.46892190026,5.732145291403867e-05
309
+ 602611,2487395549184,1.6288472127914428,767.8321921499497,393130.08238077426,5.67220376979094e-05
310
+ 604453,2495121457152,1.6286871242523193,770.052846124808,394267.0572159017,5.609680010820739e-05
311
+ 606186,2502390185984,1.626492328643799,772.1559839896788,395343.86380271555,5.5514599807793275e-05
312
+ 606868,2505250701312,1.6294204493363698,772.9715416562736,395761.4293280121,5.528709516511299e-05
313
+ 608527,2512209051648,1.6318567752838136,774.9540150309879,396776.4556958658,5.4737502068746835e-05
314
+ 609912,2518018162688,1.6299163818359375,776.6102461643201,397624.4460361319,5.4282838391372934e-05
315
+ 612696,2529695105024,1.6283214426040649,779.9288053733754,399323.5483511682,5.338044138625264e-05
316
+ 614703,2538113073152,1.624042534828186,782.3509216785588,400563.67189942213,5.2739502280019224e-05
317
+ 617059,2547994853376,1.64087806224823,785.185724072868,402015.0907253084,5.19974491908215e-05
318
+ 618445,2553808158720,1.642133264541626,786.8536925417527,402869.0905813774,5.156615225132555e-05
319
+ 620429,2562129657856,1.6360367107391358,789.3145963788388,404129.0733459655,5.0955564802279696e-05
320
+ 622331,2570107224064,1.6323289918899535,791.575088748329,405286.44543914445,5.0377762818243355e-05
321
+ 624509,2579242418176,1.6336705684661865,794.1814413488138,406620.89797059266,4.9725240387488157e-05
322
+ 625869,2584946671616,1.63454927444458,795.8170756094358,407458.34271203115,4.932274896418676e-05
323
+ 628359,2595390488576,1.6327994871139526,798.8134971637937,408992.5105478624,4.8595778935123235e-05
324
+ 629947,2602051043328,1.634876675605774,800.7342204672283,409975.9208792209,4.8138899728655815e-05
325
+ 631685,2609340743680,1.6314793634414673,802.8435091534811,411055.87668658234,4.764491313835606e-05
326
+ 633406,2616559140864,1.630139570236206,804.9092638572018,412113.54309488734,4.716201510746032e-05
327
+ 635185,2624020807680,1.6301208162307739,807.085845822091,413227.9530609106,4.666941094910726e-05
328
+ 636953,2631436337152,1.6280076551437377,809.2236022954875,414322.4843752896,4.61865020042751e-05
329
+ 638743,2638944141312,1.6293576526641846,811.3632367052068,415417.97719306586,4.570435703499243e-05
330
+ 641116,2648897224704,1.6281150197982788,814.1943204899674,416867.4920908633,4.507573976297863e-05
331
+ 643518,2658971942912,1.626808156967163,817.0790731657314,418344.4854608545,4.4451757275965065e-05
332
+ 645200,2666026762240,1.6276088953018188,819.1059377729686,419382.2401397599,4.402222475619055e-05
333
+ 646903,2673169661952,1.6250265979766845,821.1531966983101,420430.4367095348,4.3593576265266165e-05
334
+ 649246,2682996916224,1.6259414958953857,823.9828252710962,421879.20653880126,4.3014148104703054e-05
335
+ 652174,2695277838336,1.6242961645126344,827.4949100539828,423677.3939476392,4.2306914110668004e-05
336
+ 654123,2703452536832,1.623875789642334,829.8569429315521,424886.7547809547,4.184658610029146e-05
337
+ 656975,2715414691840,1.621957221031189,833.2863375114017,426642.6048058377,4.1188093746313825e-05
338
+ 658694,2722624700416,1.626935167312622,835.3677633732899,427708.29484712443,4.07998995797243e-05
339
+ 660347,2729557884928,1.6193980121612548,837.4232649987488,428760.7116793594,4.0432812966173515e-05
340
+ 662381,2738089099264,1.6226060914993286,839.8879330915463,430022.6217428717,3.9989481592783704e-05
341
+ 664665,2747668889600,1.6181461000442505,842.6497982943592,431436.69672671193,3.9502705476479605e-05
342
+ 666543,2755545792512,1.6224317026138306,844.9222846863987,432600.20975943614,3.91112407669425e-05
343
+ 668035,2761803694080,1.6199790287017821,846.7305949538775,433526.06461638527,3.8805901567684487e-05
344
+ 669716,2768854319104,1.6205505228042603,848.7580324526521,434564.11261575785,3.8467911508632824e-05
345
+ 671548,2776538284032,1.6194300413131715,850.955616959009,435689.2758830126,3.81068566639442e-05
346
+ 673792,2785950302208,1.618673267364502,853.719618993503,437104.44492467353,3.767499583773315e-05
347
+ 675542,2793290334208,1.6200080347061157,855.8460515671908,438193.1784024017,3.7346177123254165e-05
348
+ 677331,2800793944064,1.6195425510406494,858.0345433676414,439313.6862042324,3.7017267459305e-05
349
+ 679068,2808079450112,1.6178715658187866,860.1212525989619,440382.0813306685,3.670493606477976e-05
350
+ 681041,2816354811904,1.619278564453125,862.4906129164967,441595.1938132463,3.6358578654471785e-05
351
+ 682885,2824089108480,1.6154922294616698,864.7152048434652,442734.1848798542,3.6042976716998965e-05
352
+ 684471,2830741274624,1.6204235363006592,866.6373480686542,443718.32221115095,3.577781535568647e-05
353
+ 686548,2839452844032,1.6162836599349975,869.1610365124169,445010.45069435745,3.5439366911305115e-05
354
+ 688512,2847690457088,1.6139143562316896,871.5226151134167,446219.57893806935,3.51285380020272e-05
355
+ 690770,2857161195520,1.6140108585357666,874.2491468152524,447615.56316940923,3.47822715411894e-05
356
+ 692653,2865059069952,1.612514958381653,876.5247202096293,448780.6567473302,3.450260192039423e-05
357
+ 695067,2875184119808,1.6133524417877196,879.4364041243566,450271.4389116706,3.4156193578382954e-05
358
+ 697014,2883350429696,1.613981342315674,881.7887764691106,451475.85355218465,3.3886746678035706e-05
359
+ 698746,2890614964224,1.613626651763916,883.8767033670731,452544.87212394143,3.3654530852800235e-05
360
+ 701452,2901964750848,1.6106493425369264,887.1577838441956,454224.78532822814,3.330586332594976e-05
361
+ 703123,2908973432832,1.6123323392868043,889.1781213319924,455259.1981219801,3.309917883598246e-05
362
+ 704756,2915822731264,1.6120843172073365,891.1229818776349,456254.96672134905,3.2903564715525135e-05
363
+ 706984,2925167640576,1.6084727716445923,893.7894090776833,457620.17744777387,3.264685801696032e-05
364
+ 708614,2932004356096,1.6117757987976074,895.7409758135146,458619.3796165195,3.2466501579619944e-05
365
+ 710902,2941600923648,1.6067533922195434,898.5324710232566,460048.62516390736,3.222398299840279e-05
366
+ 712655,2948953538560,1.6069003009796143,900.6844297494815,461150.42803173454,3.204659151379019e-05
367
+ 714446,2956465537024,1.6114630460739137,902.828141202652,462248.0082957578,3.187291440553963e-05
368
+ 716271,2964120141824,1.6145262241363525,905.0301348181574,463375.4290268966,3.170380659867078e-05
369
+ 718439,2973213392896,1.6167470741271972,907.64100600771,464712.1950759475,3.1513252906734124e-05
370
+ 720429,2981560057856,1.615721011161804,910.0234193265205,465931.9906951785,3.1348234188044444e-05
371
+ 722346,2989600538624,1.6144047689437866,912.3105630227034,467103.0082676241,3.119822940789163e-05
372
+ 724097,2996944764928,1.6113934993743897,914.3946314318039,468170.0512930836,3.106891381321475e-05
373
+ 726062,3005186572288,1.6150265264511108,916.7757618370391,469389.190060564,3.093255145358853e-05
374
+ 728431,3015122878464,1.6120684432983399,919.6071913458889,470838.8819690951,3.078047666349448e-05
375
+ 730800,3025059184640,1.6116199398040771,922.4232348246237,472280.69623020734,3.064189513679594e-05
376
+ 732477,3032093032448,1.60649507522583,924.4540804354708,473320.48918296106,3.0551960662705824e-05
377
+ 734574,3040888487936,1.6101762390136718,927.0469926404139,474648.0602318919,3.0449025871348567e-05
378
+ 736132,3047423213568,1.6150142908096314,929.0061162371555,475651.13151342364,3.0379411327885464e-05
379
+ 737978,3055165898752,1.6060742330551148,931.2157941709505,476782.48661552666,3.0304503525258042e-05
380
+ 740325,3065009930240,1.5952490282058716,934.0287519650691,478222.7210061154,3.0221137421904132e-05
381
+ 741809,3071234277376,1.5957392265922146,935.8210044055174,479140.35425562493,3.0175287974998355e-05
382
+ 744212,3081313189888,1.5954618740081787,938.745490208727,480637.6909868682,3.0112320018815808e-05
383
+ 746818,3092243546112,1.5991949892044068,941.9034764933106,482254.579964575,3.0059802156756632e-05
384
+ 748203,3098052657152,1.5962308692932128,943.6019803061257,483124.21391673636,3.0038570912438445e-05
385
+ 749620,3103995985920,1.597407283782959,945.3358183610009,484011.93900083244,3.0021647035027854e-05
386
+ 751627,3112413954048,1.5941282081604005,947.7966657645534,485271.89287145133,3.0005981898284517e-05
387
+ 753851,3121742086144,1.5939428043365478,950.4965920289866,486654.25511884113,2.9999999242136255e-05
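The pretraining log ends here; each row records training_steps, training_tokens, training_loss, walltime, gputime and learning_rate (the header line is visible in the extension and annealing CSVs below). A minimal sketch for plotting the curve, assuming the repo has been cloned locally so the CSV path resolves:

```python
# Minimal sketch: plot loss and learning rate from one of the training-log
# CSVs in this commit. Assumes a local clone; 1_pretraining.csv is assumed to
# share the header of 2_extension.csv / 3_annealing.csv below:
# training_steps,training_tokens,training_loss,walltime,gputime,learning_rate
import pandas as pd
import matplotlib.pyplot as plt

log = pd.read_csv("metadata/training_logs/1_pretraining.csv")

fig, ax = plt.subplots()
ax.plot(log["training_tokens"], log["training_loss"], label="training loss")
ax.set_xlabel("training tokens")
ax.set_ylabel("training loss")

ax2 = ax.twinx()  # learning-rate schedule on a secondary axis
ax2.plot(log["training_tokens"], log["learning_rate"], color="tab:orange")
ax2.set_ylabel("learning rate")

fig.tight_layout()
fig.savefig("pretraining_curve.png")
```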
metadata/training_logs/1_pretraining.zip ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:debd5c63735b96a9e62fa5b44b0127c9452c341047ec2b919f82d8612674edce
3
+ size 418213162
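Like all large files in this commit, the zip is stored as a Git LFS pointer: the three lines above give the spec version, a sha256 oid and the byte size. A hedged sketch for verifying a downloaded blob against such a pointer; the file paths are illustrative, not part of the repo:

```python
# Sketch: parse a Git LFS pointer (the version / oid / size lines above) and
# verify a downloaded blob against it. Paths are illustrative.
import hashlib

def parse_lfs_pointer(path):
    fields = dict(line.split(" ", 1) for line in open(path).read().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    assert algo == "sha256", "only sha256 oids are expected here"
    return digest, int(fields["size"])

def verify_blob(blob_path, pointer_path):
    digest, size = parse_lfs_pointer(pointer_path)
    h, n = hashlib.sha256(), 0
    with open(blob_path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
            n += len(chunk)
    return n == size and h.hexdigest() == digest

# e.g. verify_blob("1_pretraining.zip", "1_pretraining.zip.pointer")
```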
metadata/training_logs/2_extension.csv ADDED
@@ -0,0 +1,50 @@
1
+ training_steps,training_tokens,training_loss,walltime,gputime,learning_rate
2
+ 25,102400000,4.309772815704346,0.26922979407836667,34.461413642030934,1.9999999494757503e-05
3
+ 50,204800000,2.6095064067840577,0.5344728346386659,68.41252283374924,1.9999999494757503e-05
4
+ 75,307200000,2.1840998935699463,0.7994251677144519,102.32642146744985,1.9999999494757503e-05
5
+ 100,409600000,1.9669664239883422,1.066323601762864,136.48942102564658,1.9999999494757503e-05
6
+ 125,512000000,1.8014992809295653,1.3312230805499679,170.39655431039589,1.9999999494757503e-05
7
+ 150,614400000,1.7220078945159911,1.5958814102547383,204.2728205126065,1.9999999494757503e-05
8
+ 175,716800000,1.6801378870010375,1.8606056269711062,238.1575202523016,1.9999999494757503e-05
9
+ 200,819200000,1.6575293684005736,2.126106606112505,272.14164558240066,1.9999999494757503e-05
10
+ 225,921600000,1.639773211479187,2.391510878373186,306.1133924317678,1.9999999494757503e-05
11
+ 250,1024000000,1.6256405591964722,2.6564186745483864,340.02159034219346,1.9999999494757503e-05
12
+ 275,1126400000,1.614159688949585,2.922717172069702,374.10779802492186,1.9999999494757503e-05
13
+ 300,1228800000,1.6140153646469115,3.187462724955874,407.9952287943519,1.9999999494757503e-05
14
+ 325,1331200000,1.6082664394378663,3.453434134056764,442.0395691592658,1.9999999494757503e-05
15
+ 350,1433600000,1.5916787338256837,3.718426102696622,475.9585411451676,1.9999999494757503e-05
16
+ 375,1536000000,1.5903722620010377,3.9834007768839617,509.8752994411471,1.9999999494757503e-05
17
+ 400,1638400000,1.5937609910964965,4.248280933908213,543.7799595402513,1.9999999494757503e-05
18
+ 425,1740800000,1.5833971071243287,4.514559538923331,577.8636209821864,1.9999999494757503e-05
19
+ 450,1843200000,1.588483600616455,4.7794486881652904,611.7694320851572,1.9999999494757503e-05
20
+ 475,1945600000,1.5811502838134766,5.044273908992154,645.6670603509957,1.9999999494757503e-05
21
+ 500,2048000000,1.5776061773300172,5.308712950952829,679.5152577219621,1.9999999494757503e-05
22
+ 525,2150400000,1.5762306451797485,5.575429370141868,713.6549593781591,1.9999999494757503e-05
23
+ 550,2252800000,1.577330994606018,5.8403744682127785,747.5679319312356,1.9999999494757503e-05
24
+ 575,2355200000,1.5774771738052369,6.105136559348899,781.457479596659,1.9999999494757503e-05
25
+ 600,2457600000,1.571889362335205,6.370161146941192,815.3806268084726,1.9999999494757503e-05
26
+ 625,2560000000,1.5669999837875366,6.636842231150017,849.5158055872022,1.9999999494757503e-05
27
+ 650,2662400000,1.5683012199401856,6.901824726711475,883.4335650190689,1.9999999494757503e-05
28
+ 675,2764800000,1.5606089782714845,7.166626843748192,917.3282359997686,1.9999999494757503e-05
29
+ 700,2867200000,1.569625825881958,7.431464089033584,951.2274033962988,1.9999999494757503e-05
30
+ 725,2969600000,1.5637955999374389,7.697967706009089,985.3398663691634,1.9999999494757503e-05
31
+ 750,3072000000,1.5669568061828614,7.962892068477961,1019.250184765179,1.9999999494757503e-05
32
+ 775,3174400000,1.578919801712036,8.229167067304108,1053.3333846149258,1.9999999494757503e-05
33
+ 800,3276800000,1.5597226810455322,8.493772274765757,1087.2028511700169,1.9999999494757503e-05
34
+ 825,3379200000,1.5684496641159058,8.760366790807014,1121.3269492232978,1.9999999494757503e-05
35
+ 850,3481600000,1.555274577140808,9.025527741439927,1155.2675509043106,1.9999999494757503e-05
36
+ 875,3584000000,1.5589488697052003,9.290825104582503,1189.2256133865603,1.9999999494757503e-05
37
+ 900,3686400000,1.56228581905365,9.555587218982518,1223.1151640297624,1.9999999494757503e-05
38
+ 925,3788800000,1.5693172216415405,9.821702597393193,1257.1779324663287,1.9999999494757503e-05
39
+ 950,3891200000,1.547282567024231,10.086582040948787,1291.0825012414448,1.9999999494757503e-05
40
+ 975,3993600000,1.552180905342102,10.35216691473301,1325.0773650858252,1.9999999494757503e-05
41
+ 1000,4096000000,1.5544623231887817,10.617501541588314,1359.0401973233043,1.9999999494757503e-05
42
+ 1025,4198400000,1.5621129417419433,10.884025346612145,1393.1552443663545,1.9999999494757503e-05
43
+ 1050,4300800000,1.5600895547866822,11.148848565988725,1427.0526164465568,1.9999999494757503e-05
44
+ 1075,4403200000,1.5528885984420777,11.413467440123783,1460.9238323358443,1.9999999494757503e-05
45
+ 1100,4505600000,1.5599483346939087,11.678333667439569,1494.8267094322648,1.9999999494757503e-05
46
+ 1125,4608000000,1.5648639726638793,11.943580217023005,1528.7782677789446,1.9999999494757503e-05
47
+ 1150,4710400000,1.549267168045044,12.20822147437584,1562.6523487201075,1.9999999494757503e-05
48
+ 1175,4812800000,1.5537393379211426,12.473067252154218,1596.5526082757399,1.9999999494757503e-05
49
+ 1200,4915200000,1.5565906286239624,12.737777670184224,1630.4355417835807,1.9999999494757503e-05
50
+ 1220,4997120000,1.549389386177063,12.94956159459549,1657.5438841082228,1.9999999494757503e-05
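Two derived quantities are worth noting in this log: gputime/walltime is ~128 on every row (consistent with a 128-GPU job), and training_tokens/training_steps is a constant 4,096,000 tokens per step. A sketch, assuming both time columns are in hours (the units are not stated in the CSV itself):

```python
# Sketch: derive throughput figures from 2_extension.csv. Assumption: walltime
# and gputime are in hours; their ratio (~128) then reads as the GPU count.
import pandas as pd

log = pd.read_csv("metadata/training_logs/2_extension.csv")
last = log.iloc[-1]

n_gpus = last["gputime"] / last["walltime"]                          # ~128
tokens_per_step = last["training_tokens"] / last["training_steps"]   # 4,096,000
tokens_per_gpu_hour = last["training_tokens"] / last["gputime"]      # ~3.0M

print(f"GPUs ~ {n_gpus:.0f}")
print(f"tokens/step = {tokens_per_step:,.0f}")
print(f"tokens per GPU-hour ~ {tokens_per_gpu_hour:,.0f}")
```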
metadata/training_logs/2_extension/events.out.tfevents.1731919080.jzxh169.2097150.0 ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e922e0c4112bf78d634ff506c400a651620f43e966b11e2a6fe98206c6e9a423
3
+ size 3379212
metadata/training_logs/3_annealing.csv ADDED
@@ -0,0 +1,50 @@
1
+ training_steps,training_tokens,training_loss,walltime,gputime,learning_rate
2
+ 25,102400000,1.4206320714950562,0.26831979325136307,34.34493353617447,2.938559919130057e-05
3
+ 50,204800000,1.3607354307174682,0.532874067107453,68.20788058975398,2.8771199140464887e-05
4
+ 75,307200000,1.344892144203186,0.797669098578563,102.10164461805607,2.8156800908618607e-05
5
+ 100,409600000,1.3339987087249756,1.0622922554949321,135.9734087033513,2.7542400857782923e-05
6
+ 125,512000000,1.3272197246551514,1.3275567747589183,169.92726716914154,2.692800080694724e-05
7
+ 150,614400000,1.3201901483535767,1.5918107887810053,203.75178096396868,2.6313600756111555e-05
8
+ 175,716800000,1.3260539388656616,1.856745663885765,237.6634449773779,2.569920070527587e-05
9
+ 200,819200000,1.3183754396438598,2.1223753731163395,271.66404775889146,2.5084800654440187e-05
10
+ 225,921600000,1.3132305669784545,2.3882311399600185,305.69358591488236,2.4470400603604503e-05
11
+ 250,1024000000,1.3129970407485962,2.65354331628931,339.65354448503166,2.385600055276882e-05
12
+ 275,1126400000,1.3064435338973999,2.9190579786122703,373.6394212623706,2.3241600501933135e-05
13
+ 300,1228800000,1.3097565698623657,3.1846308088092967,407.63274352759,2.262720045109745e-05
14
+ 325,1331200000,1.2990926265716554,3.450655307807062,441.6838793993039,2.2012800400261767e-05
15
+ 350,1433600000,1.3015284061431884,3.7156531358282527,475.60360138601635,2.1398400349426083e-05
16
+ 375,1536000000,1.300846767425537,3.9803074113379786,509.47934865126126,2.07840002985904e-05
17
+ 400,1638400000,1.2993078660964965,4.245298594229775,543.3982200614112,2.0169600247754715e-05
18
+ 425,1740800000,1.2959114503860474,4.510613332994379,577.3585066232805,1.955520019691903e-05
19
+ 450,1843200000,1.2931818628311158,4.775614008341895,611.2785930677626,1.8940800146083347e-05
20
+ 475,1945600000,1.2935280084609986,5.040500104048112,645.1840133181583,1.8326400095247664e-05
21
+ 500,2048000000,1.297581434249878,5.305721228775273,679.1323172832349,1.771200004441198e-05
22
+ 525,2150400000,1.2973516607284545,5.571934178461597,713.2075748430844,1.7097599993576296e-05
23
+ 550,2252800000,1.2920558738708496,5.836718603762894,747.0999812816505,1.6483199942740612e-05
24
+ 575,2355200000,1.2921710443496703,6.101621121729093,781.0075035813239,1.5868799891904928e-05
25
+ 600,2457600000,1.2922473382949828,6.3664986443806235,814.9118264807198,1.5254399841069244e-05
26
+ 625,2560000000,1.286066074371338,6.632426781060178,848.9506279757028,1.463999979023356e-05
27
+ 650,2662400000,1.2801355123519897,6.897544031576831,882.8856360418343,1.4025599739397876e-05
28
+ 675,2764800000,1.2844274616241456,7.16281749378081,916.8406392039436,1.3411199688562192e-05
29
+ 700,2867200000,1.2837993860244752,7.427888308341158,950.7697034676683,1.2796799637726508e-05
30
+ 725,2969600000,1.277207851409912,7.694053462864512,984.8388432466576,1.2182399586890824e-05
31
+ 750,3072000000,1.2725739479064941,7.9594902915458245,1018.8147573178655,1.1568000445549842e-05
32
+ 775,3174400000,1.279445676803589,8.224737335718995,1052.7663789720314,1.0953600394714158e-05
33
+ 800,3276800000,1.2785338878631591,8.4897940390995,1086.693637004736,1.0339200343878474e-05
34
+ 825,3379200000,1.2763902473449706,8.754925038348853,1120.6304049086532,9.72480029304279e-06
35
+ 850,3481600000,1.273976821899414,9.020435715932496,1154.6157716393595,9.110400242207106e-06
36
+ 875,3584000000,1.2713893747329712,9.285925992178,1188.598526998784,8.496000191371422e-06
37
+ 900,3686400000,1.2784578609466553,9.551624662507189,1222.6079568009202,7.881600140535738e-06
38
+ 925,3788800000,1.2703066444396973,9.81712425603072,1256.5919047719321,7.267200089700054e-06
39
+ 950,3891200000,1.271108751296997,10.082206830408738,1290.5224742923185,6.6528000388643704e-06
40
+ 975,3993600000,1.2680458974838258,10.347317189690123,1324.4566002803358,6.0383999880286865e-06
41
+ 1000,4096000000,1.2702019023895263,10.612797046162356,1358.4380219087816,5.4239999371930026e-06
42
+ 1025,4198400000,1.2751475191116333,10.87931673718002,1392.5525423590425,4.809599886357319e-06
43
+ 1050,4300800000,1.266355185508728,11.143856210205902,1426.4135949063555,4.195199835521635e-06
44
+ 1075,4403200000,1.2681057167053222,11.408356219250548,1460.2695960640701,3.580800012059626e-06
45
+ 1100,4505600000,1.271327452659607,11.672805533763803,1494.1191083217668,2.9663999612239422e-06
46
+ 1125,4608000000,1.2650820541381835,11.938276334099923,1528.0993707647901,2.3519999103882583e-06
47
+ 1150,4710400000,1.270536971092224,12.203680372205294,1562.0710876422777,1.737599973239412e-06
48
+ 1175,4812800000,1.2631586408615112,12.468635068036095,1595.9852887086201,1.1232000360905658e-06
49
+ 1200,4915200000,1.2647824430465697,12.734026105942206,1629.9553415606024,5.087999852548819e-07
50
+ 1220,4997120000,1.2646446466445922,12.946103368596539,1657.101231180357,1.727999965339677e-08
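The annealing learning-rate column is a clean linear decay: it starts from an implied 3e-5 at step 0 and loses about 2.4576e-8 per step, reaching ~1.73e-8 at the final step 1220. A sketch that checks this against the CSV:

```python
# Sketch: verify that the annealing schedule is a linear decay from 3e-5
# toward zero. The slope below is fitted from consecutive rows (2.4576e-8 per
# step) and reproduces the whole column up to float32 rounding.
import pandas as pd

log = pd.read_csv("metadata/training_logs/3_annealing.csv")

lr0, slope = 3.0e-5, 2.4576e-8          # implied lr(step) = lr0 - slope * step
predicted = lr0 - slope * log["training_steps"]
max_err = (predicted - log["learning_rate"]).abs().max()
print(f"max deviation from linear decay: {max_err:.2e}")
```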
metadata/training_logs/3_annealing/events.out.tfevents.1734352939.jzxh040.1110305.0 ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e67c420a23acc8902f5b1ef57e8baa6c7ffcd5c15b90783828bfaf156e1219a1
3
+ size 3357018
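The *.tfevents.* files alongside the CSVs are raw TensorBoard logs. They can be inspected programmatically; the scalar tag names inside are not documented in this commit, so the sketch below simply enumerates whatever is present:

```python
# Sketch: read a raw TensorBoard event file with EventAccumulator. The tag
# names are unknown here, so we list them rather than assume any.
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

ea = EventAccumulator("metadata/training_logs/3_annealing")  # dir holding the events file
ea.Reload()
for tag in ea.Tags()["scalars"]:
    events = ea.Scalars(tag)
    print(tag, "->", len(events), "points, last value", events[-1].value)
```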
model-00001-of-00003.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:cc8b337f3bd69430af103f927c1d838d29c158bd29bdfdc12f69405b37e49441
3
+ size 4924315872
model-00002-of-00003.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:efe1c9aaf6bb27991347d4f1fa47eb53ba71ef4163db6b1d5491c48690626b9a
3
+ size 4983047384
model-00003-of-00003.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:5117b407cc109ad9d83616ae7cad460566421acc167668c2c992fc073ff4c113
3
+ size 3506598760
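The three shards above total the ~13.4 GB reported in the index that follows. Rather than resolving the LFS pointers by hand, the usual route is huggingface_hub, which downloads the actual blobs:

```python
# Sketch: fetch just the weight shards and their index from the Hub.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="OpenLLM-France/Lucie-7B",
    allow_patterns=["model-*-of-00003.safetensors", "model.safetensors.index.json"],
)
print("weights in", local_dir)
```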
model.safetensors.index.json ADDED
@@ -0,0 +1,330 @@
1
+ {
2
+ "metadata": {
3
+ "total_size": 13413924864
4
+ },
5
+ "weight_map": {
6
+ "model.embed_tokens.weight": "model-00001-of-00003.safetensors",
7
+ "model.norm.weight": "model-00001-of-00003.safetensors",
8
+ "lm_head.weight": "model-00001-of-00003.safetensors",
9
+ "model.layers.0.input_layernorm.weight": "model-00001-of-00003.safetensors",
10
+ "model.layers.0.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
11
+ "model.layers.0.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
12
+ "model.layers.0.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
13
+ "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
14
+ "model.layers.0.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
15
+ "model.layers.0.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
16
+ "model.layers.0.self_attn.rotary_emb.inv_freq": "model-00001-of-00003.safetensors",
17
+ "model.layers.1.input_layernorm.weight": "model-00001-of-00003.safetensors",
18
+ "model.layers.1.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
19
+ "model.layers.1.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
20
+ "model.layers.1.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
21
+ "model.layers.1.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
22
+ "model.layers.1.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
23
+ "model.layers.1.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
24
+ "model.layers.1.self_attn.rotary_emb.inv_freq": "model-00001-of-00003.safetensors",
25
+ "model.layers.2.input_layernorm.weight": "model-00001-of-00003.safetensors",
26
+ "model.layers.2.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
27
+ "model.layers.2.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
28
+ "model.layers.2.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
29
+ "model.layers.2.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
30
+ "model.layers.2.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
31
+ "model.layers.2.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
32
+ "model.layers.2.self_attn.rotary_emb.inv_freq": "model-00001-of-00003.safetensors",
33
+ "model.layers.3.input_layernorm.weight": "model-00001-of-00003.safetensors",
34
+ "model.layers.3.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
35
+ "model.layers.3.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
36
+ "model.layers.3.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
37
+ "model.layers.3.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
38
+ "model.layers.3.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
39
+ "model.layers.3.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
40
+ "model.layers.3.self_attn.rotary_emb.inv_freq": "model-00001-of-00003.safetensors",
41
+ "model.layers.4.input_layernorm.weight": "model-00001-of-00003.safetensors",
42
+ "model.layers.4.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
43
+ "model.layers.4.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
44
+ "model.layers.4.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
45
+ "model.layers.4.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
46
+ "model.layers.4.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
47
+ "model.layers.4.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
48
+ "model.layers.4.self_attn.rotary_emb.inv_freq": "model-00001-of-00003.safetensors",
49
+ "model.layers.5.input_layernorm.weight": "model-00001-of-00003.safetensors",
50
+ "model.layers.5.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
51
+ "model.layers.5.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
52
+ "model.layers.5.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
53
+ "model.layers.5.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
54
+ "model.layers.5.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
55
+ "model.layers.5.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
56
+ "model.layers.5.self_attn.rotary_emb.inv_freq": "model-00001-of-00003.safetensors",
57
+ "model.layers.6.input_layernorm.weight": "model-00001-of-00003.safetensors",
58
+ "model.layers.6.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
59
+ "model.layers.6.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
60
+ "model.layers.6.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
61
+ "model.layers.6.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
62
+ "model.layers.6.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
63
+ "model.layers.6.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
64
+ "model.layers.6.self_attn.rotary_emb.inv_freq": "model-00001-of-00003.safetensors",
65
+ "model.layers.7.input_layernorm.weight": "model-00001-of-00003.safetensors",
66
+ "model.layers.7.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
67
+ "model.layers.7.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
68
+ "model.layers.7.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
69
+ "model.layers.7.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
70
+ "model.layers.7.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
71
+ "model.layers.7.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
72
+ "model.layers.7.self_attn.rotary_emb.inv_freq": "model-00001-of-00003.safetensors",
73
+ "model.layers.8.input_layernorm.weight": "model-00001-of-00003.safetensors",
74
+ "model.layers.8.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
75
+ "model.layers.8.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
76
+ "model.layers.8.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
77
+ "model.layers.8.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
78
+ "model.layers.8.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
79
+ "model.layers.8.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
80
+ "model.layers.8.self_attn.rotary_emb.inv_freq": "model-00001-of-00003.safetensors",
81
+ "model.layers.9.input_layernorm.weight": "model-00001-of-00003.safetensors",
82
+ "model.layers.9.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
83
+ "model.layers.9.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
84
+ "model.layers.9.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
85
+ "model.layers.9.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
86
+ "model.layers.9.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
87
+ "model.layers.9.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
88
+ "model.layers.9.self_attn.rotary_emb.inv_freq": "model-00001-of-00003.safetensors",
89
+ "model.layers.10.input_layernorm.weight": "model-00001-of-00003.safetensors",
90
+ "model.layers.10.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
91
+ "model.layers.0.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
92
+ "model.layers.0.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
93
+ "model.layers.1.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
94
+ "model.layers.1.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
95
+ "model.layers.2.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
96
+ "model.layers.2.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
97
+ "model.layers.3.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
98
+ "model.layers.3.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
99
+ "model.layers.4.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
100
+ "model.layers.4.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
101
+ "model.layers.5.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
102
+ "model.layers.5.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
103
+ "model.layers.6.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
104
+ "model.layers.6.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
105
+ "model.layers.7.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
106
+ "model.layers.7.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
107
+ "model.layers.8.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
108
+ "model.layers.8.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
109
+ "model.layers.9.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
110
+ "model.layers.9.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
111
+ "model.layers.10.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
112
+ "model.layers.10.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
113
+ "model.layers.10.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
114
+ "model.layers.10.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
115
+ "model.layers.10.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
116
+ "model.layers.10.self_attn.rotary_emb.inv_freq": "model-00002-of-00003.safetensors",
117
+ "model.layers.11.input_layernorm.weight": "model-00002-of-00003.safetensors",
118
+ "model.layers.11.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
119
+ "model.layers.11.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
120
+ "model.layers.11.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
121
+ "model.layers.11.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
122
+ "model.layers.11.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
123
+ "model.layers.11.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
124
+ "model.layers.11.self_attn.rotary_emb.inv_freq": "model-00002-of-00003.safetensors",
125
+ "model.layers.12.input_layernorm.weight": "model-00002-of-00003.safetensors",
126
+ "model.layers.12.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
127
+ "model.layers.12.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
128
+ "model.layers.12.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
129
+ "model.layers.12.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
130
+ "model.layers.12.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
131
+ "model.layers.12.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
132
+ "model.layers.12.self_attn.rotary_emb.inv_freq": "model-00002-of-00003.safetensors",
133
+ "model.layers.13.input_layernorm.weight": "model-00002-of-00003.safetensors",
134
+ "model.layers.13.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
135
+ "model.layers.13.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
136
+ "model.layers.13.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
137
+ "model.layers.13.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
138
+ "model.layers.13.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
139
+ "model.layers.13.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
140
+ "model.layers.13.self_attn.rotary_emb.inv_freq": "model-00002-of-00003.safetensors",
141
+ "model.layers.14.input_layernorm.weight": "model-00002-of-00003.safetensors",
142
+ "model.layers.14.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
143
+ "model.layers.14.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
144
+ "model.layers.14.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
145
+ "model.layers.14.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
146
+ "model.layers.14.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
147
+ "model.layers.14.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
148
+ "model.layers.14.self_attn.rotary_emb.inv_freq": "model-00002-of-00003.safetensors",
149
+ "model.layers.15.input_layernorm.weight": "model-00002-of-00003.safetensors",
150
+ "model.layers.15.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
151
+ "model.layers.15.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
152
+ "model.layers.15.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
153
+ "model.layers.15.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
154
+ "model.layers.15.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
155
+ "model.layers.15.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
156
+ "model.layers.15.self_attn.rotary_emb.inv_freq": "model-00002-of-00003.safetensors",
157
+ "model.layers.16.input_layernorm.weight": "model-00002-of-00003.safetensors",
158
+ "model.layers.16.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
159
+ "model.layers.16.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
160
+ "model.layers.16.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
161
+ "model.layers.16.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
162
+ "model.layers.16.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
163
+ "model.layers.16.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
164
+ "model.layers.16.self_attn.rotary_emb.inv_freq": "model-00002-of-00003.safetensors",
165
+ "model.layers.17.input_layernorm.weight": "model-00002-of-00003.safetensors",
166
+ "model.layers.17.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
167
+ "model.layers.17.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
168
+ "model.layers.17.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
169
+ "model.layers.17.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
170
+ "model.layers.17.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
171
+ "model.layers.17.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
172
+ "model.layers.17.self_attn.rotary_emb.inv_freq": "model-00002-of-00003.safetensors",
173
+ "model.layers.18.input_layernorm.weight": "model-00002-of-00003.safetensors",
174
+ "model.layers.18.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
175
+ "model.layers.18.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
176
+ "model.layers.18.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
177
+ "model.layers.18.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
178
+ "model.layers.18.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
179
+ "model.layers.18.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
180
+ "model.layers.18.self_attn.rotary_emb.inv_freq": "model-00002-of-00003.safetensors",
181
+ "model.layers.19.input_layernorm.weight": "model-00002-of-00003.safetensors",
182
+ "model.layers.19.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
183
+ "model.layers.19.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
184
+ "model.layers.19.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
185
+ "model.layers.19.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
186
+ "model.layers.19.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
187
+ "model.layers.19.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
188
+ "model.layers.19.self_attn.rotary_emb.inv_freq": "model-00002-of-00003.safetensors",
189
+ "model.layers.20.input_layernorm.weight": "model-00002-of-00003.safetensors",
190
+ "model.layers.20.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
191
+ "model.layers.20.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
192
+ "model.layers.20.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
193
+ "model.layers.20.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
194
+ "model.layers.20.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
195
+ "model.layers.20.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
196
+ "model.layers.20.self_attn.rotary_emb.inv_freq": "model-00002-of-00003.safetensors",
197
+ "model.layers.21.input_layernorm.weight": "model-00002-of-00003.safetensors",
198
+ "model.layers.21.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
199
+ "model.layers.21.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
200
+ "model.layers.21.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
201
+ "model.layers.21.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
202
+ "model.layers.21.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
203
+ "model.layers.21.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
204
+ "model.layers.21.self_attn.rotary_emb.inv_freq": "model-00002-of-00003.safetensors",
205
+ "model.layers.22.input_layernorm.weight": "model-00002-of-00003.safetensors",
206
+ "model.layers.22.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
207
+ "model.layers.22.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
208
+ "model.layers.22.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
209
+ "model.layers.22.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
210
+ "model.layers.22.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
211
+ "model.layers.10.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
212
+ "model.layers.10.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
213
+ "model.layers.11.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
214
+ "model.layers.11.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
215
+ "model.layers.12.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
216
+ "model.layers.12.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
217
+ "model.layers.13.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
218
+ "model.layers.13.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
219
+ "model.layers.14.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
220
+ "model.layers.14.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
221
+ "model.layers.15.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
222
+ "model.layers.15.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
223
+ "model.layers.16.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
224
+ "model.layers.16.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
225
+ "model.layers.17.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
226
+ "model.layers.17.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
227
+ "model.layers.18.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
228
+ "model.layers.18.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
229
+ "model.layers.19.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
230
+ "model.layers.19.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
231
+ "model.layers.20.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
232
+ "model.layers.20.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
233
+ "model.layers.21.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
234
+ "model.layers.21.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
235
+ "model.layers.22.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
236
+ "model.layers.22.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
237
+ "model.layers.22.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
238
+ "model.layers.22.self_attn.rotary_emb.inv_freq": "model-00003-of-00003.safetensors",
239
+ "model.layers.23.input_layernorm.weight": "model-00003-of-00003.safetensors",
240
+ "model.layers.23.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
241
+ "model.layers.23.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
242
+ "model.layers.23.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
243
+ "model.layers.23.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
244
+ "model.layers.23.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
245
+ "model.layers.23.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
246
+ "model.layers.23.self_attn.rotary_emb.inv_freq": "model-00003-of-00003.safetensors",
247
+ "model.layers.24.input_layernorm.weight": "model-00003-of-00003.safetensors",
248
+ "model.layers.24.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
249
+ "model.layers.24.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
250
+ "model.layers.24.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
251
+ "model.layers.24.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
252
+ "model.layers.24.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
253
+ "model.layers.24.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
254
+ "model.layers.24.self_attn.rotary_emb.inv_freq": "model-00003-of-00003.safetensors",
255
+ "model.layers.25.input_layernorm.weight": "model-00003-of-00003.safetensors",
256
+ "model.layers.25.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
257
+ "model.layers.25.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
258
+ "model.layers.25.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
259
+ "model.layers.25.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
260
+ "model.layers.25.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
261
+ "model.layers.25.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
262
+ "model.layers.25.self_attn.rotary_emb.inv_freq": "model-00003-of-00003.safetensors",
263
+ "model.layers.26.input_layernorm.weight": "model-00003-of-00003.safetensors",
264
+ "model.layers.26.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
265
+ "model.layers.26.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
266
+ "model.layers.26.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
267
+ "model.layers.26.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
268
+ "model.layers.26.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
269
+ "model.layers.26.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
270
+ "model.layers.26.self_attn.rotary_emb.inv_freq": "model-00003-of-00003.safetensors",
271
+ "model.layers.27.input_layernorm.weight": "model-00003-of-00003.safetensors",
272
+ "model.layers.27.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
273
+ "model.layers.27.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
274
+ "model.layers.27.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
275
+ "model.layers.27.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
276
+ "model.layers.27.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
277
+ "model.layers.27.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
278
+ "model.layers.27.self_attn.rotary_emb.inv_freq": "model-00003-of-00003.safetensors",
279
+ "model.layers.28.input_layernorm.weight": "model-00003-of-00003.safetensors",
280
+ "model.layers.28.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
281
+ "model.layers.28.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
282
+ "model.layers.28.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
283
+ "model.layers.28.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
284
+ "model.layers.28.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
285
+ "model.layers.28.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
286
+ "model.layers.28.self_attn.rotary_emb.inv_freq": "model-00003-of-00003.safetensors",
287
+ "model.layers.29.input_layernorm.weight": "model-00003-of-00003.safetensors",
288
+ "model.layers.29.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
289
+ "model.layers.29.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
290
+ "model.layers.29.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
291
+ "model.layers.29.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
292
+ "model.layers.29.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
293
+ "model.layers.29.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
294
+ "model.layers.29.self_attn.rotary_emb.inv_freq": "model-00003-of-00003.safetensors",
295
+ "model.layers.30.input_layernorm.weight": "model-00003-of-00003.safetensors",
296
+ "model.layers.30.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
297
+ "model.layers.30.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
298
+ "model.layers.30.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
299
+ "model.layers.30.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
300
+ "model.layers.30.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
301
+ "model.layers.30.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
302
+ "model.layers.30.self_attn.rotary_emb.inv_freq": "model-00003-of-00003.safetensors",
303
+ "model.layers.31.input_layernorm.weight": "model-00003-of-00003.safetensors",
304
+ "model.layers.31.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
305
+ "model.layers.31.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
306
+ "model.layers.31.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
307
+ "model.layers.31.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
308
+ "model.layers.31.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
309
+ "model.layers.31.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
310
+ "model.layers.31.self_attn.rotary_emb.inv_freq": "model-00003-of-00003.safetensors",
311
+ "model.layers.23.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
312
+ "model.layers.23.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
313
+ "model.layers.24.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
314
+ "model.layers.24.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
315
+ "model.layers.25.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
316
+ "model.layers.25.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
317
+ "model.layers.26.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
318
+ "model.layers.26.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
319
+ "model.layers.27.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
320
+ "model.layers.27.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
321
+ "model.layers.28.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
322
+ "model.layers.28.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
323
+ "model.layers.29.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
324
+ "model.layers.29.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
325
+ "model.layers.30.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
326
+ "model.layers.30.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
327
+ "model.layers.31.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
328
+ "model.layers.31.self_attn.v_proj.weight": "model-00003-of-00003.safetensors"
329
+ }
330
+ }
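The `weight_map` above tells loaders which of the three shard files holds each tensor. As a minimal sketch (not part of this commit, assuming the `safetensors` package and a local checkout of the repo), a single tensor can be read from its shard without touching the other two files:

```python
import json
from safetensors import safe_open

# weight_map: parameter name -> shard file, as listed in the index above.
with open("model.safetensors.index.json") as f:
    weight_map = json.load(f)["weight_map"]

# Read one tensor from its shard only (here the layer-31 value projection,
# which the index places in model-00003-of-00003.safetensors).
name = "model.layers.31.self_attn.v_proj.weight"
with safe_open(weight_map[name], framework="pt") as shard:
    tensor = shard.get_tensor(name)
print(name, tuple(tensor.shape))
```

In practice `transformers.AutoModelForCausalLM.from_pretrained` resolves the index and shards automatically; the sketch only illustrates what the mapping encodes.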
special_tokens_map.json ADDED
@@ -0,0 +1,30 @@
+ {
+ "bos_token": {
+ "content": "<s>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "eos_token": {
+ "content": "</s>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "pad_token": {
+ "content": "<pad>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "unk_token": {
+ "content": "<unk>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ }
+ }
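The file declares Llama-style `<s>`/`</s>`/`<pad>`/`<unk>` special tokens with no stripping or normalization. A minimal sketch (assuming `transformers` is installed and the script runs from a checkout of this repo) of how they surface on the loaded tokenizer:

```python
from transformers import AutoTokenizer

# Load the tokenizer files added in this commit from the local checkout.
tok = AutoTokenizer.from_pretrained(".")

# The four special tokens declared in special_tokens_map.json:
print(tok.bos_token, tok.eos_token, tok.pad_token, tok.unk_token)

# tokenizer_config.json (below) sets add_bos_token=true, so encodings start
# with <s> (id 0 per its added_tokens_decoder).
print(tok("Bonjour").input_ids)
```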
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,51 @@
+ {
+ "add_bos_token": true,
+ "add_eos_token": false,
+ "added_tokens_decoder": {
+ "0": {
+ "content": "<s>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "1": {
+ "content": "</s>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "2": {
+ "content": "<unk>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "3": {
+ "content": "<pad>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ }
+ },
+ "additional_special_tokens": [],
+ "bos_token": "<s>",
+ "clean_up_tokenization_spaces": false,
+ "eos_token": "</s>",
+ "legacy": true,
+ "model_max_length": 1000000000000000000000000000000,
+ "pad_token": "<pad>",
+ "sp_model_kwargs": {},
+ "spaces_between_special_tokens": false,
+ "tokenizer_class": "LlamaTokenizer",
+ "unk_token": "<unk>",
+ "use_default_system_prompt": false,
+ "chat_template": "{{- bos_token }}\n{%- for message in messages %}\n    {{- '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n'+ message['content'] | trim + '<|eot_id|>' }}\n{%- endfor %}\n{%- if add_generation_prompt %}\n    {{- '<|start_header_id|>assistant<|end_header_id|>\n\n' }}\n{%- endif %}"
+ }
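The `chat_template` wraps each message in Llama-3-style headers (`<|start_header_id|>`, `<|end_header_id|>`, `<|eot_id|>`); these tokens are not listed in special_tokens_map.json above, so they are presumably defined in tokenizer.json, whose diff is too large to render. A minimal sketch (again assuming `transformers` and a local checkout) of applying the template:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained(".")

messages = [{"role": "user", "content": "Quelle est la capitale de la France ?"}]

# tokenize=False returns the rendered prompt string; add_generation_prompt=True
# appends the assistant header so the model continues as the assistant.
prompt = tok.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
# Expected shape, per the template above:
# <s><|start_header_id|>user<|end_header_id|>
#
# Quelle est la capitale de la France ?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
```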