skfrost19 committed on
Commit 17b7c8a · verified · 1 Parent(s): 5bcd077

Add new CrossEncoder model

Files changed (6)
  1. README.md +484 -0
  2. config.json +56 -0
  3. model.safetensors +3 -0
  4. special_tokens_map.json +37 -0
  5. tokenizer.json +0 -0
  6. tokenizer_config.json +945 -0
README.md ADDED
@@ -0,0 +1,484 @@
1
+ ---
2
+ language:
3
+ - en
4
+ tags:
5
+ - sentence-transformers
6
+ - cross-encoder
7
+ - generated_from_trainer
8
+ - dataset_size:1990000
9
+ - loss:BinaryCrossEntropyLoss
10
+ base_model: answerdotai/ModernBERT-base
11
+ datasets:
12
+ - sentence-transformers/msmarco
13
+ pipeline_tag: text-ranking
14
+ library_name: sentence-transformers
15
+ metrics:
16
+ - map
17
+ - mrr@10
18
+ - ndcg@10
19
+ model-index:
20
+ - name: CrossEncoder based on answerdotai/ModernBERT-base
21
+ results:
22
+ - task:
23
+ type: cross-encoder-reranking
24
+ name: Cross Encoder Reranking
25
+ dataset:
26
+ name: NanoMSMARCO R100
27
+ type: NanoMSMARCO_R100
28
+ metrics:
29
+ - type: map
30
+ value: 0.6611
31
+ name: Map
32
+ - type: mrr@10
33
+ value: 0.6577
34
+ name: Mrr@10
35
+ - type: ndcg@10
36
+ value: 0.7254
37
+ name: Ndcg@10
38
+ - task:
39
+ type: cross-encoder-reranking
40
+ name: Cross Encoder Reranking
41
+ dataset:
42
+ name: NanoNFCorpus R100
43
+ type: NanoNFCorpus_R100
44
+ metrics:
45
+ - type: map
46
+ value: 0.3144
47
+ name: Map
48
+ - type: mrr@10
49
+ value: 0.5085
50
+ name: Mrr@10
51
+ - type: ndcg@10
52
+ value: 0.3421
53
+ name: Ndcg@10
54
+ - task:
55
+ type: cross-encoder-reranking
56
+ name: Cross Encoder Reranking
57
+ dataset:
58
+ name: NanoNQ R100
59
+ type: NanoNQ_R100
60
+ metrics:
61
+ - type: map
62
+ value: 0.6828
63
+ name: Map
64
+ - type: mrr@10
65
+ value: 0.7167
66
+ name: Mrr@10
67
+ - type: ndcg@10
68
+ value: 0.7314
69
+ name: Ndcg@10
70
+ - task:
71
+ type: cross-encoder-nano-beir
72
+ name: Cross Encoder Nano BEIR
73
+ dataset:
74
+ name: NanoBEIR R100 mean
75
+ type: NanoBEIR_R100_mean
76
+ metrics:
77
+ - type: map
78
+ value: 0.5527
79
+ name: Map
80
+ - type: mrr@10
81
+ value: 0.6276
82
+ name: Mrr@10
83
+ - type: ndcg@10
84
+ value: 0.5996
85
+ name: Ndcg@10
86
+ ---
87
+
88
+ # CrossEncoder based on answerdotai/ModernBERT-base
89
+
90
+ This is a [Cross Encoder](https://www.sbert.net/docs/cross_encoder/usage/usage.html) model finetuned from [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) on the [msmarco](https://huggingface.co/datasets/sentence-transformers/msmarco) dataset using the [sentence-transformers](https://www.SBERT.net) library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.
91
+
92
+ ## Model Details
93
+
94
+ ### Model Description
95
+ - **Model Type:** Cross Encoder
96
+ - **Base model:** [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) <!-- at revision 8949b909ec900327062f0ebf497f51aef5e6f0c8 -->
97
+ - **Maximum Sequence Length:** 8192 tokens
98
+ - **Number of Output Labels:** 1 label
99
+ - **Training Dataset:**
100
+ - [msmarco](https://huggingface.co/datasets/sentence-transformers/msmarco)
101
+ - **Language:** en
102
+ <!-- - **License:** Unknown -->
103
+
104
+ ### Model Sources
105
+
106
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
107
+ - **Documentation:** [Cross Encoder Documentation](https://www.sbert.net/docs/cross_encoder/usage/usage.html)
108
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
109
+ - **Hugging Face:** [Cross Encoders on Hugging Face](https://huggingface.co/models?library=sentence-transformers&other=cross-encoder)
110
+
111
+ ## Usage
112
+
113
+ ### Direct Usage (Sentence Transformers)
114
+
115
+ First install the Sentence Transformers library:
116
+
117
+ ```bash
118
+ pip install -U sentence-transformers
119
+ ```
120
+
121
+ Then you can load this model and run inference.
122
+ ```python
123
+ from sentence_transformers import CrossEncoder
124
+
125
+ # Download from the 🤗 Hub
126
+ model = CrossEncoder("skfrost19/reranker-ModernBERT-base-msmarco-bce-AdamW.Cosine-ep-1-3")
127
+ # Get scores for pairs of texts
128
+ pairs = [
129
+ ['what symptoms might a patient with a tmd have', 'TMD sufferers have a long list of symptoms, including chronic pain (https://youtu.be/SvMaJb8o2RI), many of which are in common with Parkinson’s disease (PD) symptoms.'],
130
+ ['what is a thermal protector', 'The word hero comes from the Greek ἥρως (hērōs), hero, warrior, particularly one such as Heracles with divine ancestry or later given divine honors. literally protector or defender.'],
131
+ ['how many copies of call of duty wwii sold', 'Call of Duty 3. Call of Duty 3 is a World War II first-person shooter and the third installment in the Call of Duty video game series. Released on November 7, 2006, the game was developed by Treyarch, and was the first major installment in the Call of Duty series not to be developed by Infinity Ward. It was also the first not to be released on the PC platform. It was released on the PlayStation 2, PlayStation 3, Wii, Xbox, and Xbox 360.'],
132
+ ['what is the desired temperature for the fresh food compartment in a refrigerator', 'A refrigerator maintains a temperature a few degrees above the freezing point of water. Optimum temperature range for perishable food storage is 3 to 5 °C (37 to 41 °F).emperature settings for refrigerator and freezer compartments are often given arbitrary numbers by manufacturers (for example, 1 through 9, warmest to coldest), but generally 3 to 5 °C (37 to 41 °F) is ideal for the refrigerator compartment and −18 °C (0 °F) for the freezer.'],
133
+ ['what is gsm alarm system', 'I’m sure you would have these questions in your mind when you heard GSM alarm system at the first time. GSM alarm system is an alarm system that operating through GSM (global system for mobile communications) network; not requiring a telephone line.urthermore, in the case of burglar entering the premises and cutting the telephone line, the GSM alarm would not be affected and still work as it does not require the use of a fixed phone line. So this security alarm is ideal for the place where no fixed phone line or hard to get one.'],
134
+ ]
135
+ scores = model.predict(pairs)
136
+ print(scores.shape)
137
+ # (5,)
138
+
139
+ # Or rank different texts based on similarity to a single text
140
+ ranks = model.rank(
141
+ 'what symptoms might a patient with a tmd have',
142
+ [
143
+ 'TMD sufferers have a long list of symptoms, including chronic pain (https://youtu.be/SvMaJb8o2RI), many of which are in common with Parkinson’s disease (PD) symptoms.',
144
+ 'The word hero comes from the Greek ἥρως (hērōs), hero, warrior, particularly one such as Heracles with divine ancestry or later given divine honors. literally protector or defender.',
145
+ 'Call of Duty 3. Call of Duty 3 is a World War II first-person shooter and the third installment in the Call of Duty video game series. Released on November 7, 2006, the game was developed by Treyarch, and was the first major installment in the Call of Duty series not to be developed by Infinity Ward. It was also the first not to be released on the PC platform. It was released on the PlayStation 2, PlayStation 3, Wii, Xbox, and Xbox 360.',
146
+ 'A refrigerator maintains a temperature a few degrees above the freezing point of water. Optimum temperature range for perishable food storage is 3 to 5 °C (37 to 41 °F).emperature settings for refrigerator and freezer compartments are often given arbitrary numbers by manufacturers (for example, 1 through 9, warmest to coldest), but generally 3 to 5 °C (37 to 41 °F) is ideal for the refrigerator compartment and −18 °C (0 °F) for the freezer.',
147
+ 'I’m sure you would have these questions in your mind when you heard GSM alarm system at the first time. GSM alarm system is an alarm system that operating through GSM (global system for mobile communications) network; not requiring a telephone line.urthermore, in the case of burglar entering the premises and cutting the telephone line, the GSM alarm would not be affected and still work as it does not require the use of a fixed phone line. So this security alarm is ideal for the place where no fixed phone line or hard to get one.',
148
+ ]
149
+ )
150
+ # [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
151
+ ```
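+
+ Since the `sentence_transformers` entry in this repository's `config.json` configures a `Sigmoid` activation, `model.predict` should return relevance scores in the 0 to 1 range. As a minimal sketch (reusing `model`, `pairs`, `scores` and `ranks` from the snippet above), the outputs can be inspected like this:
+
+ ```python
+ # Continues the example above: `model`, `pairs`, `scores` and `ranks` are already defined.
+ for (query, passage), score in zip(pairs, scores):
+     # Higher scores indicate a more relevant query-passage pair.
+     print(f"{score:.4f}  {query!r}")
+
+ # `model.rank` returns one dict per candidate with `corpus_id` (the index into the
+ # candidate list) and `score`, sorted from most to least relevant.
+ for entry in ranks:
+     print(f"{entry['score']:.4f}  corpus_id={entry['corpus_id']}")
+ ```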
152
+
153
+ <!--
154
+ ### Direct Usage (Transformers)
155
+
156
+ <details><summary>Click to see the direct usage in Transformers</summary>
157
+
158
+ </details>
159
+ -->
160
+
161
+ <!--
162
+ ### Downstream Usage (Sentence Transformers)
163
+
164
+ You can finetune this model on your own dataset.
165
+
166
+ <details><summary>Click to expand</summary>
167
+
168
+ </details>
169
+ -->
170
+
171
+ <!--
172
+ ### Out-of-Scope Use
173
+
174
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
175
+ -->
176
+
177
+ ## Evaluation
178
+
179
+ ### Metrics
180
+
181
+ #### Cross Encoder Reranking
182
+
183
+ * Datasets: `NanoMSMARCO_R100`, `NanoNFCorpus_R100` and `NanoNQ_R100`
184
+ * Evaluated with [<code>CrossEncoderRerankingEvaluator</code>](https://sbert.net/docs/package_reference/cross_encoder/evaluation.html#sentence_transformers.cross_encoder.evaluation.CrossEncoderRerankingEvaluator) with these parameters:
185
+ ```json
186
+ {
187
+ "at_k": 10,
188
+ "always_rerank_positives": true
189
+ }
190
+ ```
191
+
192
+ | Metric | NanoMSMARCO_R100 | NanoNFCorpus_R100 | NanoNQ_R100 |
193
+ |:------------|:---------------------|:---------------------|:---------------------|
194
+ | map | 0.6611 (+0.1715) | 0.3144 (+0.0534) | 0.6828 (+0.2632) |
195
+ | mrr@10 | 0.6577 (+0.1802) | 0.5085 (+0.0087) | 0.7167 (+0.2900) |
196
+ | **ndcg@10** | **0.7254 (+0.1850)** | **0.3421 (+0.0171)** | **0.7314 (+0.2308)** |
197
+
198
+ #### Cross Encoder Nano BEIR
199
+
200
+ * Dataset: `NanoBEIR_R100_mean`
201
+ * Evaluated with [<code>CrossEncoderNanoBEIREvaluator</code>](https://sbert.net/docs/package_reference/cross_encoder/evaluation.html#sentence_transformers.cross_encoder.evaluation.CrossEncoderNanoBEIREvaluator) with these parameters:
202
+ ```json
203
+ {
204
+ "dataset_names": [
205
+ "msmarco",
206
+ "nfcorpus",
207
+ "nq"
208
+ ],
209
+ "rerank_k": 100,
210
+ "at_k": 10,
211
+ "always_rerank_positives": true
212
+ }
213
+ ```
214
+
215
+ | Metric | Value |
216
+ |:------------|:---------------------|
217
+ | map | 0.5527 (+0.1627) |
218
+ | mrr@10 | 0.6276 (+0.1596) |
219
+ | **ndcg@10** | **0.5996 (+0.1443)** |
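+
+ As a rough sketch of how these numbers could be reproduced (assuming the evaluator accepts the same parameters as listed above), the NanoBEIR evaluator can be run directly against the published model:
+
+ ```python
+ from sentence_transformers import CrossEncoder
+ from sentence_transformers.cross_encoder.evaluation import CrossEncoderNanoBEIREvaluator
+
+ model = CrossEncoder("skfrost19/reranker-ModernBERT-base-msmarco-bce-AdamW.Cosine-ep-1-3")
+
+ # Same configuration as reported above: rerank the top 100 retrieved candidates,
+ # score the top 10, and always keep the known positives in the candidate set.
+ evaluator = CrossEncoderNanoBEIREvaluator(
+     dataset_names=["msmarco", "nfcorpus", "nq"],
+     rerank_k=100,
+     at_k=10,
+     always_rerank_positives=True,
+ )
+ results = evaluator(model)
+ print(results)  # per-dataset map / mrr@10 / ndcg@10 plus the NanoBEIR mean
+ ```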
220
+
221
+ <!--
222
+ ## Bias, Risks and Limitations
223
+
224
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
225
+ -->
226
+
227
+ <!--
228
+ ### Recommendations
229
+
230
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
231
+ -->
232
+
233
+ ## Training Details
234
+
235
+ ### Training Dataset
236
+
237
+ #### msmarco
238
+
239
+ * Dataset: [msmarco](https://huggingface.co/datasets/sentence-transformers/msmarco) at [9e329ed](https://huggingface.co/datasets/sentence-transformers/msmarco/tree/9e329ed2e649c9d37b0d91dd6b764ff6fe671d83)
240
+ * Size: 1,990,000 training samples
241
+ * Columns: <code>query</code>, <code>passage</code>, and <code>score</code>
242
+ * Approximate statistics based on the first 1000 samples:
243
+ | | query | passage | score |
244
+ |:--------|:------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------|:---------------------------------------------------------------|
245
+ | type | string | string | float |
246
+ | details | <ul><li>min: 11 characters</li><li>mean: 34.61 characters</li><li>max: 124 characters</li></ul> | <ul><li>min: 82 characters</li><li>mean: 357.43 characters</li><li>max: 1034 characters</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.49</li><li>max: 1.0</li></ul> |
247
+ * Samples:
248
+ | query | passage | score |
249
+ |:---------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------|
250
+ | <code>what causes your tailbone to hurt</code> | <code>A coccyx injury results in pain and discomfort in the tailbone area (the condition is called coccydynia). These injuries may result in a bruise, dislocation, or fracture (break) of the coccyx. Although they may be slow to heal, the majority of coccyx injuries can be managed with cautious treatment.ost tailbone injuries are caused by trauma to the coccyx area. 1 A fall onto the tailbone in the seated position, usually against a hard surface, is the most common cause of coccyx injuries. 2 A direct blow to the tailbone, such as those that occur during contact sports, can injure the coccyx.</code> | <code>1.0</code> |
251
+ | <code>what muscles do trunk lateral flexion</code> | <code>It’s the same with the External Obliques, but unlike the External Obliques, they are not visible when fully developed. Action: 1 Supports abdominal wall, assists forced respiration, aids raising intra-abdominal pressure and, with muscles of other side, abducts and rotates trunk. 2 Contraction of one side alone laterally bends the trunk to that side and rotates the trunk to the other side.</code> | <code>0.0</code> |
252
+ | <code>brake horsepower definition</code> | <code>When the brake lights will not come on, the first thing to check is the third-brake light. If it too is not working, the brake-light switch, a bad fuse or an unplugged harness is likely.ull up on the brake pedal and if the lights go out, switch mis-alignment or pedal position error is the likely cause. The final possibility is a wire shorted to power. Unplug the brake-light switch and if the lights stay on, a short circuit is the case.</code> | <code>0.0</code> |
253
+ * Loss: [<code>BinaryCrossEntropyLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#binarycrossentropyloss) with these parameters:
254
+ ```json
255
+ {
256
+ "activation_fn": "torch.nn.modules.linear.Identity",
257
+ "pos_weight": null
258
+ }
259
+ ```
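+
+ With `activation_fn` set to `Identity` and no `pos_weight`, this loss amounts to binary cross-entropy over the model's raw pair logits against the 0/1 `score` column. A rough plain-PyTorch equivalent (the tensors below are made up for illustration, not taken from training):
+
+ ```python
+ import torch
+
+ logits = torch.tensor([2.3, -1.7, 0.4])  # raw model outputs for three (query, passage) pairs
+ labels = torch.tensor([1.0, 0.0, 0.0])   # the `score` column
+
+ # BinaryCrossEntropyLoss with Identity activation and pos_weight=None reduces to
+ # BCE-with-logits over the pair scores.
+ loss = torch.nn.functional.binary_cross_entropy_with_logits(logits, labels)
+ print(loss.item())
+ ```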
260
+
261
+ ### Evaluation Dataset
262
+
263
+ #### msmarco
264
+
265
+ * Dataset: [msmarco](https://huggingface.co/datasets/sentence-transformers/msmarco) at [9e329ed](https://huggingface.co/datasets/sentence-transformers/msmarco/tree/9e329ed2e649c9d37b0d91dd6b764ff6fe671d83)
266
+ * Size: 10,000 evaluation samples
267
+ * Columns: <code>query</code>, <code>passage</code>, and <code>score</code>
268
+ * Approximate statistics based on the first 1000 samples:
269
+ | | query | passage | score |
270
+ |:--------|:-----------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------|:--------------------------------------------------------------|
271
+ | type | string | string | float |
272
+ | details | <ul><li>min: 9 characters</li><li>mean: 33.72 characters</li><li>max: 193 characters</li></ul> | <ul><li>min: 55 characters</li><li>mean: 353.35 characters</li><li>max: 895 characters</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.5</li><li>max: 1.0</li></ul> |
273
+ * Samples:
274
+ | query | passage | score |
275
+ |:-----------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------|
276
+ | <code>what symptoms might a patient with a tmd have</code> | <code>TMD sufferers have a long list of symptoms, including chronic pain (https://youtu.be/SvMaJb8o2RI), many of which are in common with Parkinson’s disease (PD) symptoms.</code> | <code>1.0</code> |
277
+ | <code>what is a thermal protector</code> | <code>The word hero comes from the Greek ἥρως (hērōs), hero, warrior, particularly one such as Heracles with divine ancestry or later given divine honors. literally protector or defender.</code> | <code>0.0</code> |
278
+ | <code>how many copies of call of duty wwii sold</code> | <code>Call of Duty 3. Call of Duty 3 is a World War II first-person shooter and the third installment in the Call of Duty video game series. Released on November 7, 2006, the game was developed by Treyarch, and was the first major installment in the Call of Duty series not to be developed by Infinity Ward. It was also the first not to be released on the PC platform. It was released on the PlayStation 2, PlayStation 3, Wii, Xbox, and Xbox 360.</code> | <code>0.0</code> |
279
+ * Loss: [<code>BinaryCrossEntropyLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#binarycrossentropyloss) with these parameters:
280
+ ```json
281
+ {
282
+ "activation_fn": "torch.nn.modules.linear.Identity",
283
+ "pos_weight": null
284
+ }
285
+ ```
286
+
287
+ ### Training Hyperparameters
288
+ #### Non-Default Hyperparameters
289
+
290
+ - `eval_strategy`: epoch
291
+ - `per_device_train_batch_size`: 64
292
+ - `per_device_eval_batch_size`: 64
293
+ - `learning_rate`: 2e-05
294
+ - `warmup_ratio`: 0.1
295
+ - `seed`: 12
296
+ - `bf16`: True
297
+ - `dataloader_num_workers`: 4
298
+ - `load_best_model_at_end`: True
299
+ - `resume_from_checkpoint`: True
300
+
301
+ #### All Hyperparameters
302
+ <details><summary>Click to expand</summary>
303
+
304
+ - `overwrite_output_dir`: False
305
+ - `do_predict`: False
306
+ - `eval_strategy`: epoch
307
+ - `prediction_loss_only`: True
308
+ - `per_device_train_batch_size`: 64
309
+ - `per_device_eval_batch_size`: 64
310
+ - `per_gpu_train_batch_size`: None
311
+ - `per_gpu_eval_batch_size`: None
312
+ - `gradient_accumulation_steps`: 1
313
+ - `eval_accumulation_steps`: None
314
+ - `torch_empty_cache_steps`: None
315
+ - `learning_rate`: 2e-05
316
+ - `weight_decay`: 0.0
317
+ - `adam_beta1`: 0.9
318
+ - `adam_beta2`: 0.999
319
+ - `adam_epsilon`: 1e-08
320
+ - `max_grad_norm`: 1.0
321
+ - `num_train_epochs`: 3
322
+ - `max_steps`: -1
323
+ - `lr_scheduler_type`: linear
324
+ - `lr_scheduler_kwargs`: {}
325
+ - `warmup_ratio`: 0.1
326
+ - `warmup_steps`: 0
327
+ - `log_level`: passive
328
+ - `log_level_replica`: warning
329
+ - `log_on_each_node`: True
330
+ - `logging_nan_inf_filter`: True
331
+ - `save_safetensors`: True
332
+ - `save_on_each_node`: False
333
+ - `save_only_model`: False
334
+ - `restore_callback_states_from_checkpoint`: False
335
+ - `no_cuda`: False
336
+ - `use_cpu`: False
337
+ - `use_mps_device`: False
338
+ - `seed`: 12
339
+ - `data_seed`: None
340
+ - `jit_mode_eval`: False
341
+ - `use_ipex`: False
342
+ - `bf16`: True
343
+ - `fp16`: False
344
+ - `fp16_opt_level`: O1
345
+ - `half_precision_backend`: auto
346
+ - `bf16_full_eval`: False
347
+ - `fp16_full_eval`: False
348
+ - `tf32`: None
349
+ - `local_rank`: 2
350
+ - `ddp_backend`: None
351
+ - `tpu_num_cores`: None
352
+ - `tpu_metrics_debug`: False
353
+ - `debug`: []
354
+ - `dataloader_drop_last`: True
355
+ - `dataloader_num_workers`: 4
356
+ - `dataloader_prefetch_factor`: None
357
+ - `past_index`: -1
358
+ - `disable_tqdm`: False
359
+ - `remove_unused_columns`: True
360
+ - `label_names`: None
361
+ - `load_best_model_at_end`: True
362
+ - `ignore_data_skip`: False
363
+ - `fsdp`: []
364
+ - `fsdp_min_num_params`: 0
365
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
366
+ - `tp_size`: 0
367
+ - `fsdp_transformer_layer_cls_to_wrap`: None
368
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
369
+ - `deepspeed`: None
370
+ - `label_smoothing_factor`: 0.0
371
+ - `optim`: adamw_torch
372
+ - `optim_args`: None
373
+ - `adafactor`: False
374
+ - `group_by_length`: False
375
+ - `length_column_name`: length
376
+ - `ddp_find_unused_parameters`: None
377
+ - `ddp_bucket_cap_mb`: None
378
+ - `ddp_broadcast_buffers`: False
379
+ - `dataloader_pin_memory`: True
380
+ - `dataloader_persistent_workers`: False
381
+ - `skip_memory_metrics`: True
382
+ - `use_legacy_prediction_loop`: False
383
+ - `push_to_hub`: False
384
+ - `resume_from_checkpoint`: True
385
+ - `hub_model_id`: None
386
+ - `hub_strategy`: every_save
387
+ - `hub_private_repo`: None
388
+ - `hub_always_push`: False
389
+ - `gradient_checkpointing`: False
390
+ - `gradient_checkpointing_kwargs`: None
391
+ - `include_inputs_for_metrics`: False
392
+ - `include_for_metrics`: []
393
+ - `eval_do_concat_batches`: True
394
+ - `fp16_backend`: auto
395
+ - `push_to_hub_model_id`: None
396
+ - `push_to_hub_organization`: None
397
+ - `mp_parameters`:
398
+ - `auto_find_batch_size`: False
399
+ - `full_determinism`: False
400
+ - `torchdynamo`: None
401
+ - `ray_scope`: last
402
+ - `ddp_timeout`: 1800
403
+ - `torch_compile`: False
404
+ - `torch_compile_backend`: None
405
+ - `torch_compile_mode`: None
406
+ - `dispatch_batches`: None
407
+ - `split_batches`: None
408
+ - `include_tokens_per_second`: False
409
+ - `include_num_input_tokens_seen`: False
410
+ - `neftune_noise_alpha`: None
411
+ - `optim_target_modules`: None
412
+ - `batch_eval_metrics`: False
413
+ - `eval_on_start`: False
414
+ - `use_liger_kernel`: False
415
+ - `eval_use_gather_object`: False
416
+ - `average_tokens_across_devices`: False
417
+ - `prompts`: None
418
+ - `batch_sampler`: batch_sampler
419
+ - `multi_dataset_batch_sampler`: proportional
420
+
421
+ </details>
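+
+ Putting the dataset, loss, and the non-default hyperparameters above together, a minimal training sketch could look like the following. This is not the exact script used for this model: the dataset is stubbed with the three samples shown earlier, evaluation and checkpoint selection are omitted, and the import paths assume sentence-transformers 4.x.
+
+ ```python
+ from datasets import Dataset
+ from sentence_transformers.cross_encoder import (
+     CrossEncoder,
+     CrossEncoderTrainer,
+     CrossEncoderTrainingArguments,
+ )
+ from sentence_transformers.cross_encoder.losses import BinaryCrossEntropyLoss
+
+ # Stand-in for the 1,990,000 msmarco (query, passage, score) triples described above.
+ train_dataset = Dataset.from_dict({
+     "query": [
+         "what causes your tailbone to hurt",
+         "what muscles do trunk lateral flexion",
+         "brake horsepower definition",
+     ],
+     "passage": [
+         "A coccyx injury results in pain and discomfort in the tailbone area ...",
+         "It's the same with the External Obliques, but unlike the External Obliques ...",
+         "When the brake lights will not come on, the first thing to check is ...",
+     ],
+     "score": [1.0, 0.0, 0.0],
+ })
+
+ model = CrossEncoder("answerdotai/ModernBERT-base", num_labels=1)
+ loss = BinaryCrossEntropyLoss(model)
+
+ args = CrossEncoderTrainingArguments(
+     output_dir="reranker-ModernBERT-base-msmarco-bce",
+     num_train_epochs=3,
+     per_device_train_batch_size=64,
+     learning_rate=2e-5,
+     warmup_ratio=0.1,
+     seed=12,
+     bf16=True,  # assumes bfloat16-capable hardware; drop on CPU
+     dataloader_num_workers=4,
+ )
+
+ trainer = CrossEncoderTrainer(
+     model=model,
+     args=args,
+     train_dataset=train_dataset,
+     loss=loss,
+ )
+ trainer.train()
+ ```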
422
+
423
+ ### Training Logs
424
+ | Epoch | Step | Training Loss | Validation Loss | NanoMSMARCO_R100_ndcg@10 | NanoNFCorpus_R100_ndcg@10 | NanoNQ_R100_ndcg@10 | NanoBEIR_R100_mean_ndcg@10 |
425
+ |:-------:|:---------:|:-------------:|:---------------:|:------------------------:|:-------------------------:|:--------------------:|:--------------------------:|
426
+ | -1 | -1 | - | - | 0.0186 (-0.5218) | 0.2929 (-0.0321) | 0.0429 (-0.4577) | 0.1182 (-0.3372) |
427
+ | 0.0001 | 1 | 0.7403 | - | - | - | - | - |
428
+ | 0.3860 | 4000 | 0.201 | - | - | - | - | - |
429
+ | 0.7719 | 8000 | 0.1544 | - | - | - | - | - |
430
+ | 1.0 | 10364 | - | 0.1478 | 0.7029 (+0.1625) | 0.3798 (+0.0548) | 0.7394 (+0.2388) | 0.6074 (+0.1520) |
431
+ | 1.1579 | 12000 | 0.1364 | - | - | - | - | - |
432
+ | 1.5438 | 16000 | 0.1227 | - | - | - | - | - |
433
+ | 1.9298 | 20000 | 0.1173 | - | - | - | - | - |
434
+ | **2.0** | **20728** | **-** | **0.1297** | **0.7089 (+0.1685)** | **0.3785 (+0.0535)** | **0.7382 (+0.2375)** | **0.6085 (+0.1532)** |
435
+ | 2.3157 | 24000 | 0.1014 | - | - | - | - | - |
436
+ | 2.7017 | 28000 | 0.0969 | - | - | - | - | - |
437
+ | 3.0 | 31092 | - | 0.1195 | 0.6846 (+0.1442) | 0.3906 (+0.0655) | 0.7433 (+0.2426) | 0.6062 (+0.1508) |
438
+ | -1 | -1 | - | - | 0.7254 (+0.1850) | 0.3421 (+0.0171) | 0.7314 (+0.2308) | 0.5996 (+0.1443) |
439
+
440
+ * The bold row denotes the saved checkpoint.
441
+
442
+ ### Framework Versions
443
+ - Python: 3.11.5
444
+ - Sentence Transformers: 4.0.1
445
+ - Transformers: 4.50.3
446
+ - PyTorch: 2.6.0+cu124
447
+ - Accelerate: 1.6.0
448
+ - Datasets: 3.5.0
449
+ - Tokenizers: 0.21.1
450
+
451
+ ## Citation
452
+
453
+ ### BibTeX
454
+
455
+ #### Sentence Transformers
456
+ ```bibtex
457
+ @inproceedings{reimers-2019-sentence-bert,
458
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
459
+ author = "Reimers, Nils and Gurevych, Iryna",
460
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
461
+ month = "11",
462
+ year = "2019",
463
+ publisher = "Association for Computational Linguistics",
464
+ url = "https://arxiv.org/abs/1908.10084",
465
+ }
466
+ ```
467
+
468
+ <!--
469
+ ## Glossary
470
+
471
+ *Clearly define terms in order to be accessible across audiences.*
472
+ -->
473
+
474
+ <!--
475
+ ## Model Card Authors
476
+
477
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
478
+ -->
479
+
480
+ <!--
481
+ ## Model Card Contact
482
+
483
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
484
+ -->
config.json ADDED
@@ -0,0 +1,56 @@
1
+ {
2
+ "architectures": [
3
+ "ModernBertForSequenceClassification"
4
+ ],
5
+ "attention_bias": false,
6
+ "attention_dropout": 0.0,
7
+ "bos_token_id": 50281,
8
+ "classifier_activation": "gelu",
9
+ "classifier_bias": false,
10
+ "classifier_dropout": 0.0,
11
+ "classifier_pooling": "mean",
12
+ "cls_token_id": 50281,
13
+ "decoder_bias": true,
14
+ "deterministic_flash_attn": false,
15
+ "embedding_dropout": 0.0,
16
+ "eos_token_id": 50282,
17
+ "global_attn_every_n_layers": 3,
18
+ "global_rope_theta": 160000.0,
19
+ "gradient_checkpointing": false,
20
+ "hidden_activation": "gelu",
21
+ "hidden_size": 768,
22
+ "id2label": {
23
+ "0": "LABEL_0"
24
+ },
25
+ "initializer_cutoff_factor": 2.0,
26
+ "initializer_range": 0.02,
27
+ "intermediate_size": 1152,
28
+ "label2id": {
29
+ "LABEL_0": 0
30
+ },
31
+ "layer_norm_eps": 1e-05,
32
+ "local_attention": 128,
33
+ "local_rope_theta": 10000.0,
34
+ "max_position_embeddings": 8192,
35
+ "mlp_bias": false,
36
+ "mlp_dropout": 0.0,
37
+ "model_type": "modernbert",
38
+ "norm_bias": false,
39
+ "norm_eps": 1e-05,
40
+ "num_attention_heads": 12,
41
+ "num_hidden_layers": 22,
42
+ "pad_token_id": 50283,
43
+ "position_embedding_type": "absolute",
44
+ "reference_compile": true,
45
+ "repad_logits_with_grad": false,
46
+ "sentence_transformers": {
47
+ "activation_fn": "torch.nn.modules.activation.Sigmoid",
48
+ "version": "4.0.1"
49
+ },
50
+ "sep_token_id": 50282,
51
+ "sparse_pred_ignore_index": -100,
52
+ "sparse_prediction": false,
53
+ "torch_dtype": "float32",
54
+ "transformers_version": "4.50.3",
55
+ "vocab_size": 50368
56
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a54a43d13352d4e0b9a3f014dd89af63d4f8653cd936325472d606565bbbfbd4
3
+ size 598436708
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
1
+ {
2
+ "cls_token": {
3
+ "content": "[CLS]",
4
+ "lstrip": false,
5
+ "normalized": false,
6
+ "rstrip": false,
7
+ "single_word": false
8
+ },
9
+ "mask_token": {
10
+ "content": "[MASK]",
11
+ "lstrip": true,
12
+ "normalized": false,
13
+ "rstrip": false,
14
+ "single_word": false
15
+ },
16
+ "pad_token": {
17
+ "content": "[PAD]",
18
+ "lstrip": false,
19
+ "normalized": false,
20
+ "rstrip": false,
21
+ "single_word": false
22
+ },
23
+ "sep_token": {
24
+ "content": "[SEP]",
25
+ "lstrip": false,
26
+ "normalized": false,
27
+ "rstrip": false,
28
+ "single_word": false
29
+ },
30
+ "unk_token": {
31
+ "content": "[UNK]",
32
+ "lstrip": false,
33
+ "normalized": false,
34
+ "rstrip": false,
35
+ "single_word": false
36
+ }
37
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,945 @@
1
+ {
2
+ "added_tokens_decoder": {
3
+ "0": {
4
+ "content": "|||IP_ADDRESS|||",
5
+ "lstrip": false,
6
+ "normalized": true,
7
+ "rstrip": false,
8
+ "single_word": false,
9
+ "special": false
10
+ },
11
+ "1": {
12
+ "content": "<|padding|>",
13
+ "lstrip": false,
14
+ "normalized": false,
15
+ "rstrip": false,
16
+ "single_word": false,
17
+ "special": true
18
+ },
19
+ "50254": {
20
+ "content": " ",
21
+ "lstrip": false,
22
+ "normalized": true,
23
+ "rstrip": false,
24
+ "single_word": false,
25
+ "special": false
26
+ },
27
+ "50255": {
28
+ "content": " ",
29
+ "lstrip": false,
30
+ "normalized": true,
31
+ "rstrip": false,
32
+ "single_word": false,
33
+ "special": false
34
+ },
35
+ "50256": {
36
+ "content": " ",
37
+ "lstrip": false,
38
+ "normalized": true,
39
+ "rstrip": false,
40
+ "single_word": false,
41
+ "special": false
42
+ },
43
+ "50257": {
44
+ "content": " ",
45
+ "lstrip": false,
46
+ "normalized": true,
47
+ "rstrip": false,
48
+ "single_word": false,
49
+ "special": false
50
+ },
51
+ "50258": {
52
+ "content": " ",
53
+ "lstrip": false,
54
+ "normalized": true,
55
+ "rstrip": false,
56
+ "single_word": false,
57
+ "special": false
58
+ },
59
+ "50259": {
60
+ "content": " ",
61
+ "lstrip": false,
62
+ "normalized": true,
63
+ "rstrip": false,
64
+ "single_word": false,
65
+ "special": false
66
+ },
67
+ "50260": {
68
+ "content": " ",
69
+ "lstrip": false,
70
+ "normalized": true,
71
+ "rstrip": false,
72
+ "single_word": false,
73
+ "special": false
74
+ },
75
+ "50261": {
76
+ "content": " ",
77
+ "lstrip": false,
78
+ "normalized": true,
79
+ "rstrip": false,
80
+ "single_word": false,
81
+ "special": false
82
+ },
83
+ "50262": {
84
+ "content": " ",
85
+ "lstrip": false,
86
+ "normalized": true,
87
+ "rstrip": false,
88
+ "single_word": false,
89
+ "special": false
90
+ },
91
+ "50263": {
92
+ "content": " ",
93
+ "lstrip": false,
94
+ "normalized": true,
95
+ "rstrip": false,
96
+ "single_word": false,
97
+ "special": false
98
+ },
99
+ "50264": {
100
+ "content": " ",
101
+ "lstrip": false,
102
+ "normalized": true,
103
+ "rstrip": false,
104
+ "single_word": false,
105
+ "special": false
106
+ },
107
+ "50265": {
108
+ "content": " ",
109
+ "lstrip": false,
110
+ "normalized": true,
111
+ "rstrip": false,
112
+ "single_word": false,
113
+ "special": false
114
+ },
115
+ "50266": {
116
+ "content": " ",
117
+ "lstrip": false,
118
+ "normalized": true,
119
+ "rstrip": false,
120
+ "single_word": false,
121
+ "special": false
122
+ },
123
+ "50267": {
124
+ "content": " ",
125
+ "lstrip": false,
126
+ "normalized": true,
127
+ "rstrip": false,
128
+ "single_word": false,
129
+ "special": false
130
+ },
131
+ "50268": {
132
+ "content": " ",
133
+ "lstrip": false,
134
+ "normalized": true,
135
+ "rstrip": false,
136
+ "single_word": false,
137
+ "special": false
138
+ },
139
+ "50269": {
140
+ "content": " ",
141
+ "lstrip": false,
142
+ "normalized": true,
143
+ "rstrip": false,
144
+ "single_word": false,
145
+ "special": false
146
+ },
147
+ "50270": {
148
+ "content": " ",
149
+ "lstrip": false,
150
+ "normalized": true,
151
+ "rstrip": false,
152
+ "single_word": false,
153
+ "special": false
154
+ },
155
+ "50271": {
156
+ "content": " ",
157
+ "lstrip": false,
158
+ "normalized": true,
159
+ "rstrip": false,
160
+ "single_word": false,
161
+ "special": false
162
+ },
163
+ "50272": {
164
+ "content": " ",
165
+ "lstrip": false,
166
+ "normalized": true,
167
+ "rstrip": false,
168
+ "single_word": false,
169
+ "special": false
170
+ },
171
+ "50273": {
172
+ "content": " ",
173
+ "lstrip": false,
174
+ "normalized": true,
175
+ "rstrip": false,
176
+ "single_word": false,
177
+ "special": false
178
+ },
179
+ "50274": {
180
+ "content": " ",
181
+ "lstrip": false,
182
+ "normalized": true,
183
+ "rstrip": false,
184
+ "single_word": false,
185
+ "special": false
186
+ },
187
+ "50275": {
188
+ "content": " ",
189
+ "lstrip": false,
190
+ "normalized": true,
191
+ "rstrip": false,
192
+ "single_word": false,
193
+ "special": false
194
+ },
195
+ "50276": {
196
+ "content": " ",
197
+ "lstrip": false,
198
+ "normalized": true,
199
+ "rstrip": false,
200
+ "single_word": false,
201
+ "special": false
202
+ },
203
+ "50277": {
204
+ "content": "|||EMAIL_ADDRESS|||",
205
+ "lstrip": false,
206
+ "normalized": true,
207
+ "rstrip": false,
208
+ "single_word": false,
209
+ "special": false
210
+ },
211
+ "50278": {
212
+ "content": "|||PHONE_NUMBER|||",
213
+ "lstrip": false,
214
+ "normalized": true,
215
+ "rstrip": false,
216
+ "single_word": false,
217
+ "special": false
218
+ },
219
+ "50279": {
220
+ "content": "<|endoftext|>",
221
+ "lstrip": false,
222
+ "normalized": false,
223
+ "rstrip": false,
224
+ "single_word": false,
225
+ "special": true
226
+ },
227
+ "50280": {
228
+ "content": "[UNK]",
229
+ "lstrip": false,
230
+ "normalized": false,
231
+ "rstrip": false,
232
+ "single_word": false,
233
+ "special": true
234
+ },
235
+ "50281": {
236
+ "content": "[CLS]",
237
+ "lstrip": false,
238
+ "normalized": false,
239
+ "rstrip": false,
240
+ "single_word": false,
241
+ "special": true
242
+ },
243
+ "50282": {
244
+ "content": "[SEP]",
245
+ "lstrip": false,
246
+ "normalized": false,
247
+ "rstrip": false,
248
+ "single_word": false,
249
+ "special": true
250
+ },
251
+ "50283": {
252
+ "content": "[PAD]",
253
+ "lstrip": false,
254
+ "normalized": false,
255
+ "rstrip": false,
256
+ "single_word": false,
257
+ "special": true
258
+ },
259
+ "50284": {
260
+ "content": "[MASK]",
261
+ "lstrip": true,
262
+ "normalized": false,
263
+ "rstrip": false,
264
+ "single_word": false,
265
+ "special": true
266
+ },
267
+ "50285": {
268
+ "content": "[unused0]",
269
+ "lstrip": false,
270
+ "normalized": true,
271
+ "rstrip": false,
272
+ "single_word": false,
273
+ "special": false
274
+ },
275
+ "50286": {
276
+ "content": "[unused1]",
277
+ "lstrip": false,
278
+ "normalized": true,
279
+ "rstrip": false,
280
+ "single_word": false,
281
+ "special": false
282
+ },
283
+ "50287": {
284
+ "content": "[unused2]",
285
+ "lstrip": false,
286
+ "normalized": true,
287
+ "rstrip": false,
288
+ "single_word": false,
289
+ "special": false
290
+ },
291
+ "50288": {
292
+ "content": "[unused3]",
293
+ "lstrip": false,
294
+ "normalized": true,
295
+ "rstrip": false,
296
+ "single_word": false,
297
+ "special": false
298
+ },
299
+ "50289": {
300
+ "content": "[unused4]",
301
+ "lstrip": false,
302
+ "normalized": true,
303
+ "rstrip": false,
304
+ "single_word": false,
305
+ "special": false
306
+ },
307
+ "50290": {
308
+ "content": "[unused5]",
309
+ "lstrip": false,
310
+ "normalized": true,
311
+ "rstrip": false,
312
+ "single_word": false,
313
+ "special": false
314
+ },
315
+ "50291": {
316
+ "content": "[unused6]",
317
+ "lstrip": false,
318
+ "normalized": true,
319
+ "rstrip": false,
320
+ "single_word": false,
321
+ "special": false
322
+ },
323
+ "50292": {
324
+ "content": "[unused7]",
325
+ "lstrip": false,
326
+ "normalized": true,
327
+ "rstrip": false,
328
+ "single_word": false,
329
+ "special": false
330
+ },
331
+ "50293": {
332
+ "content": "[unused8]",
333
+ "lstrip": false,
334
+ "normalized": true,
335
+ "rstrip": false,
336
+ "single_word": false,
337
+ "special": false
338
+ },
339
+ "50294": {
340
+ "content": "[unused9]",
341
+ "lstrip": false,
342
+ "normalized": true,
343
+ "rstrip": false,
344
+ "single_word": false,
345
+ "special": false
346
+ },
347
+ "50295": {
348
+ "content": "[unused10]",
349
+ "lstrip": false,
350
+ "normalized": true,
351
+ "rstrip": false,
352
+ "single_word": false,
353
+ "special": false
354
+ },
355
+ "50296": {
356
+ "content": "[unused11]",
357
+ "lstrip": false,
358
+ "normalized": true,
359
+ "rstrip": false,
360
+ "single_word": false,
361
+ "special": false
362
+ },
363
+ "50297": {
364
+ "content": "[unused12]",
365
+ "lstrip": false,
366
+ "normalized": true,
367
+ "rstrip": false,
368
+ "single_word": false,
369
+ "special": false
370
+ },
371
+ "50298": {
372
+ "content": "[unused13]",
373
+ "lstrip": false,
374
+ "normalized": true,
375
+ "rstrip": false,
376
+ "single_word": false,
377
+ "special": false
378
+ },
379
+ "50299": {
380
+ "content": "[unused14]",
381
+ "lstrip": false,
382
+ "normalized": true,
383
+ "rstrip": false,
384
+ "single_word": false,
385
+ "special": false
386
+ },
387
+ "50300": {
388
+ "content": "[unused15]",
389
+ "lstrip": false,
390
+ "normalized": true,
391
+ "rstrip": false,
392
+ "single_word": false,
393
+ "special": false
394
+ },
395
+ "50301": {
396
+ "content": "[unused16]",
397
+ "lstrip": false,
398
+ "normalized": true,
399
+ "rstrip": false,
400
+ "single_word": false,
401
+ "special": false
402
+ },
403
+ "50302": {
404
+ "content": "[unused17]",
405
+ "lstrip": false,
406
+ "normalized": true,
407
+ "rstrip": false,
408
+ "single_word": false,
409
+ "special": false
410
+ },
411
+ "50303": {
412
+ "content": "[unused18]",
413
+ "lstrip": false,
414
+ "normalized": true,
415
+ "rstrip": false,
416
+ "single_word": false,
417
+ "special": false
418
+ },
419
+ "50304": {
420
+ "content": "[unused19]",
421
+ "lstrip": false,
422
+ "normalized": true,
423
+ "rstrip": false,
424
+ "single_word": false,
425
+ "special": false
426
+ },
427
+ "50305": {
428
+ "content": "[unused20]",
429
+ "lstrip": false,
430
+ "normalized": true,
431
+ "rstrip": false,
432
+ "single_word": false,
433
+ "special": false
434
+ },
435
+ "50306": {
436
+ "content": "[unused21]",
437
+ "lstrip": false,
438
+ "normalized": true,
439
+ "rstrip": false,
440
+ "single_word": false,
441
+ "special": false
442
+ },
443
+ "50307": {
444
+ "content": "[unused22]",
445
+ "lstrip": false,
446
+ "normalized": true,
447
+ "rstrip": false,
448
+ "single_word": false,
449
+ "special": false
450
+ },
451
+ "50308": {
452
+ "content": "[unused23]",
453
+ "lstrip": false,
454
+ "normalized": true,
455
+ "rstrip": false,
456
+ "single_word": false,
457
+ "special": false
458
+ },
459
+ "50309": {
460
+ "content": "[unused24]",
461
+ "lstrip": false,
462
+ "normalized": true,
463
+ "rstrip": false,
464
+ "single_word": false,
465
+ "special": false
466
+ },
467
+ "50310": {
468
+ "content": "[unused25]",
469
+ "lstrip": false,
470
+ "normalized": true,
471
+ "rstrip": false,
472
+ "single_word": false,
473
+ "special": false
474
+ },
475
+ "50311": {
476
+ "content": "[unused26]",
477
+ "lstrip": false,
478
+ "normalized": true,
479
+ "rstrip": false,
480
+ "single_word": false,
481
+ "special": false
482
+ },
483
+ "50312": {
484
+ "content": "[unused27]",
485
+ "lstrip": false,
486
+ "normalized": true,
487
+ "rstrip": false,
488
+ "single_word": false,
489
+ "special": false
490
+ },
491
+ "50313": {
492
+ "content": "[unused28]",
493
+ "lstrip": false,
494
+ "normalized": true,
495
+ "rstrip": false,
496
+ "single_word": false,
497
+ "special": false
498
+ },
499
+ "50314": {
500
+ "content": "[unused29]",
501
+ "lstrip": false,
502
+ "normalized": true,
503
+ "rstrip": false,
504
+ "single_word": false,
505
+ "special": false
506
+ },
507
+ "50315": {
508
+ "content": "[unused30]",
509
+ "lstrip": false,
510
+ "normalized": true,
511
+ "rstrip": false,
512
+ "single_word": false,
513
+ "special": false
514
+ },
515
+ "50316": {
516
+ "content": "[unused31]",
517
+ "lstrip": false,
518
+ "normalized": true,
519
+ "rstrip": false,
520
+ "single_word": false,
521
+ "special": false
522
+ },
523
+ "50317": {
524
+ "content": "[unused32]",
525
+ "lstrip": false,
526
+ "normalized": true,
527
+ "rstrip": false,
528
+ "single_word": false,
529
+ "special": false
530
+ },
531
+ "50318": {
532
+ "content": "[unused33]",
533
+ "lstrip": false,
534
+ "normalized": true,
535
+ "rstrip": false,
536
+ "single_word": false,
537
+ "special": false
538
+ },
539
+ "50319": {
540
+ "content": "[unused34]",
541
+ "lstrip": false,
542
+ "normalized": true,
543
+ "rstrip": false,
544
+ "single_word": false,
545
+ "special": false
546
+ },
547
+ "50320": {
548
+ "content": "[unused35]",
549
+ "lstrip": false,
550
+ "normalized": true,
551
+ "rstrip": false,
552
+ "single_word": false,
553
+ "special": false
554
+ },
555
+ "50321": {
556
+ "content": "[unused36]",
557
+ "lstrip": false,
558
+ "normalized": true,
559
+ "rstrip": false,
560
+ "single_word": false,
561
+ "special": false
562
+ },
563
+ "50322": {
564
+ "content": "[unused37]",
565
+ "lstrip": false,
566
+ "normalized": true,
567
+ "rstrip": false,
568
+ "single_word": false,
569
+ "special": false
570
+ },
571
+ "50323": {
572
+ "content": "[unused38]",
573
+ "lstrip": false,
574
+ "normalized": true,
575
+ "rstrip": false,
576
+ "single_word": false,
577
+ "special": false
578
+ },
579
+ "50324": {
580
+ "content": "[unused39]",
581
+ "lstrip": false,
582
+ "normalized": true,
583
+ "rstrip": false,
584
+ "single_word": false,
585
+ "special": false
586
+ },
587
+ "50325": {
588
+ "content": "[unused40]",
589
+ "lstrip": false,
590
+ "normalized": true,
591
+ "rstrip": false,
592
+ "single_word": false,
593
+ "special": false
594
+ },
595
+ "50326": {
596
+ "content": "[unused41]",
597
+ "lstrip": false,
598
+ "normalized": true,
599
+ "rstrip": false,
600
+ "single_word": false,
601
+ "special": false
602
+ },
603
+ "50327": {
604
+ "content": "[unused42]",
605
+ "lstrip": false,
606
+ "normalized": true,
607
+ "rstrip": false,
608
+ "single_word": false,
609
+ "special": false
610
+ },
611
+ "50328": {
612
+ "content": "[unused43]",
613
+ "lstrip": false,
614
+ "normalized": true,
615
+ "rstrip": false,
616
+ "single_word": false,
617
+ "special": false
618
+ },
619
+ "50329": {
620
+ "content": "[unused44]",
621
+ "lstrip": false,
622
+ "normalized": true,
623
+ "rstrip": false,
624
+ "single_word": false,
625
+ "special": false
626
+ },
627
+ "50330": {
628
+ "content": "[unused45]",
629
+ "lstrip": false,
630
+ "normalized": true,
631
+ "rstrip": false,
632
+ "single_word": false,
633
+ "special": false
634
+ },
635
+ "50331": {
636
+ "content": "[unused46]",
637
+ "lstrip": false,
638
+ "normalized": true,
639
+ "rstrip": false,
640
+ "single_word": false,
641
+ "special": false
642
+ },
643
+ "50332": {
644
+ "content": "[unused47]",
645
+ "lstrip": false,
646
+ "normalized": true,
647
+ "rstrip": false,
648
+ "single_word": false,
649
+ "special": false
650
+ },
651
+ "50333": {
652
+ "content": "[unused48]",
653
+ "lstrip": false,
654
+ "normalized": true,
655
+ "rstrip": false,
656
+ "single_word": false,
657
+ "special": false
658
+ },
659
+ "50334": {
660
+ "content": "[unused49]",
661
+ "lstrip": false,
662
+ "normalized": true,
663
+ "rstrip": false,
664
+ "single_word": false,
665
+ "special": false
666
+ },
667
+ "50335": {
668
+ "content": "[unused50]",
669
+ "lstrip": false,
670
+ "normalized": true,
671
+ "rstrip": false,
672
+ "single_word": false,
673
+ "special": false
674
+ },
675
+ "50336": {
676
+ "content": "[unused51]",
677
+ "lstrip": false,
678
+ "normalized": true,
679
+ "rstrip": false,
680
+ "single_word": false,
681
+ "special": false
682
+ },
683
+ "50337": {
684
+ "content": "[unused52]",
685
+ "lstrip": false,
686
+ "normalized": true,
687
+ "rstrip": false,
688
+ "single_word": false,
689
+ "special": false
690
+ },
691
+ "50338": {
692
+ "content": "[unused53]",
693
+ "lstrip": false,
694
+ "normalized": true,
695
+ "rstrip": false,
696
+ "single_word": false,
697
+ "special": false
698
+ },
699
+ "50339": {
700
+ "content": "[unused54]",
701
+ "lstrip": false,
702
+ "normalized": true,
703
+ "rstrip": false,
704
+ "single_word": false,
705
+ "special": false
706
+ },
707
+ "50340": {
708
+ "content": "[unused55]",
709
+ "lstrip": false,
710
+ "normalized": true,
711
+ "rstrip": false,
712
+ "single_word": false,
713
+ "special": false
714
+ },
715
+ "50341": {
716
+ "content": "[unused56]",
717
+ "lstrip": false,
718
+ "normalized": true,
719
+ "rstrip": false,
720
+ "single_word": false,
721
+ "special": false
722
+ },
723
+ "50342": {
724
+ "content": "[unused57]",
725
+ "lstrip": false,
726
+ "normalized": true,
727
+ "rstrip": false,
728
+ "single_word": false,
729
+ "special": false
730
+ },
731
+ "50343": {
732
+ "content": "[unused58]",
733
+ "lstrip": false,
734
+ "normalized": true,
735
+ "rstrip": false,
736
+ "single_word": false,
737
+ "special": false
738
+ },
739
+ "50344": {
740
+ "content": "[unused59]",
741
+ "lstrip": false,
742
+ "normalized": true,
743
+ "rstrip": false,
744
+ "single_word": false,
745
+ "special": false
746
+ },
747
+ "50345": {
748
+ "content": "[unused60]",
749
+ "lstrip": false,
750
+ "normalized": true,
751
+ "rstrip": false,
752
+ "single_word": false,
753
+ "special": false
754
+ },
755
+ "50346": {
756
+ "content": "[unused61]",
757
+ "lstrip": false,
758
+ "normalized": true,
759
+ "rstrip": false,
760
+ "single_word": false,
761
+ "special": false
762
+ },
763
+ "50347": {
764
+ "content": "[unused62]",
765
+ "lstrip": false,
766
+ "normalized": true,
767
+ "rstrip": false,
768
+ "single_word": false,
769
+ "special": false
770
+ },
771
+ "50348": {
772
+ "content": "[unused63]",
773
+ "lstrip": false,
774
+ "normalized": true,
775
+ "rstrip": false,
776
+ "single_word": false,
777
+ "special": false
778
+ },
779
+ "50349": {
780
+ "content": "[unused64]",
781
+ "lstrip": false,
782
+ "normalized": true,
783
+ "rstrip": false,
784
+ "single_word": false,
785
+ "special": false
786
+ },
787
+ "50350": {
788
+ "content": "[unused65]",
789
+ "lstrip": false,
790
+ "normalized": true,
791
+ "rstrip": false,
792
+ "single_word": false,
793
+ "special": false
794
+ },
795
+ "50351": {
796
+ "content": "[unused66]",
797
+ "lstrip": false,
798
+ "normalized": true,
799
+ "rstrip": false,
800
+ "single_word": false,
801
+ "special": false
802
+ },
803
+ "50352": {
804
+ "content": "[unused67]",
805
+ "lstrip": false,
806
+ "normalized": true,
807
+ "rstrip": false,
808
+ "single_word": false,
809
+ "special": false
810
+ },
811
+ "50353": {
812
+ "content": "[unused68]",
813
+ "lstrip": false,
814
+ "normalized": true,
815
+ "rstrip": false,
816
+ "single_word": false,
817
+ "special": false
818
+ },
819
+ "50354": {
820
+ "content": "[unused69]",
821
+ "lstrip": false,
822
+ "normalized": true,
823
+ "rstrip": false,
824
+ "single_word": false,
825
+ "special": false
826
+ },
827
+ "50355": {
828
+ "content": "[unused70]",
829
+ "lstrip": false,
830
+ "normalized": true,
831
+ "rstrip": false,
832
+ "single_word": false,
833
+ "special": false
834
+ },
835
+ "50356": {
836
+ "content": "[unused71]",
837
+ "lstrip": false,
838
+ "normalized": true,
839
+ "rstrip": false,
840
+ "single_word": false,
841
+ "special": false
842
+ },
843
+ "50357": {
844
+ "content": "[unused72]",
845
+ "lstrip": false,
846
+ "normalized": true,
847
+ "rstrip": false,
848
+ "single_word": false,
849
+ "special": false
850
+ },
851
+ "50358": {
852
+ "content": "[unused73]",
853
+ "lstrip": false,
854
+ "normalized": true,
855
+ "rstrip": false,
856
+ "single_word": false,
857
+ "special": false
858
+ },
859
+ "50359": {
860
+ "content": "[unused74]",
861
+ "lstrip": false,
862
+ "normalized": true,
863
+ "rstrip": false,
864
+ "single_word": false,
865
+ "special": false
866
+ },
867
+ "50360": {
868
+ "content": "[unused75]",
869
+ "lstrip": false,
870
+ "normalized": true,
871
+ "rstrip": false,
872
+ "single_word": false,
873
+ "special": false
874
+ },
875
+ "50361": {
876
+ "content": "[unused76]",
877
+ "lstrip": false,
878
+ "normalized": true,
879
+ "rstrip": false,
880
+ "single_word": false,
881
+ "special": false
882
+ },
883
+ "50362": {
884
+ "content": "[unused77]",
885
+ "lstrip": false,
886
+ "normalized": true,
887
+ "rstrip": false,
888
+ "single_word": false,
889
+ "special": false
890
+ },
891
+ "50363": {
892
+ "content": "[unused78]",
893
+ "lstrip": false,
894
+ "normalized": true,
895
+ "rstrip": false,
896
+ "single_word": false,
897
+ "special": false
898
+ },
899
+ "50364": {
900
+ "content": "[unused79]",
901
+ "lstrip": false,
902
+ "normalized": true,
903
+ "rstrip": false,
904
+ "single_word": false,
905
+ "special": false
906
+ },
907
+ "50365": {
908
+ "content": "[unused80]",
909
+ "lstrip": false,
910
+ "normalized": true,
911
+ "rstrip": false,
912
+ "single_word": false,
913
+ "special": false
914
+ },
915
+ "50366": {
916
+ "content": "[unused81]",
917
+ "lstrip": false,
918
+ "normalized": true,
919
+ "rstrip": false,
920
+ "single_word": false,
921
+ "special": false
922
+ },
923
+ "50367": {
924
+ "content": "[unused82]",
925
+ "lstrip": false,
926
+ "normalized": true,
927
+ "rstrip": false,
928
+ "single_word": false,
929
+ "special": false
930
+ }
931
+ },
932
+ "clean_up_tokenization_spaces": true,
933
+ "cls_token": "[CLS]",
934
+ "extra_special_tokens": {},
935
+ "mask_token": "[MASK]",
936
+ "model_input_names": [
937
+ "input_ids",
938
+ "attention_mask"
939
+ ],
940
+ "model_max_length": 8192,
941
+ "pad_token": "[PAD]",
942
+ "sep_token": "[SEP]",
943
+ "tokenizer_class": "PreTrainedTokenizer",
944
+ "unk_token": "[UNK]"
945
+ }