Commit bb66f45 (verified) · tomaarsen (HF Staff) · 1 Parent(s): 3a93583

Add new CrossEncoder model

Files changed (6)
  1. README.md +514 -0
  2. config.json +56 -0
  3. model.safetensors +3 -0
  4. special_tokens_map.json +37 -0
  5. tokenizer.json +0 -0
  6. tokenizer_config.json +945 -0
README.md ADDED
@@ -0,0 +1,514 @@
+ ---
+ language:
+ - en
+ license: apache-2.0
+ tags:
+ - sentence-transformers
+ - cross-encoder
+ - generated_from_trainer
+ - dataset_size:482388
+ - loss:BinaryCrossEntropyLoss
+ base_model: answerdotai/ModernBERT-base
+ pipeline_tag: text-ranking
+ library_name: sentence-transformers
+ metrics:
+ - map
+ - mrr@10
+ - ndcg@10
+ model-index:
+ - name: ModernBERT-base trained on GooAQ
+   results:
+   - task:
+       type: cross-encoder-reranking
+       name: Cross Encoder Reranking
+     dataset:
+       name: gooaq dev
+       type: gooaq-dev
+     metrics:
+     - type: map
+       value: 0.7089
+       name: Map
+     - type: mrr@10
+       value: 0.7076
+       name: Mrr@10
+     - type: ndcg@10
+       value: 0.755
+       name: Ndcg@10
+   - task:
+       type: cross-encoder-reranking
+       name: Cross Encoder Reranking
+     dataset:
+       name: NanoMSMARCO R100
+       type: NanoMSMARCO_R100
+     metrics:
+     - type: map
+       value: 0.554
+       name: Map
+     - type: mrr@10
+       value: 0.5472
+       name: Mrr@10
+     - type: ndcg@10
+       value: 0.6229
+       name: Ndcg@10
+   - task:
+       type: cross-encoder-reranking
+       name: Cross Encoder Reranking
+     dataset:
+       name: NanoNFCorpus R100
+       type: NanoNFCorpus_R100
+     metrics:
+     - type: map
+       value: 0.3421
+       name: Map
+     - type: mrr@10
+       value: 0.5284
+       name: Mrr@10
+     - type: ndcg@10
+       value: 0.3792
+       name: Ndcg@10
+   - task:
+       type: cross-encoder-reranking
+       name: Cross Encoder Reranking
+     dataset:
+       name: NanoNQ R100
+       type: NanoNQ_R100
+     metrics:
+     - type: map
+       value: 0.6312
+       name: Map
+     - type: mrr@10
+       value: 0.638
+       name: Mrr@10
+     - type: ndcg@10
+       value: 0.6915
+       name: Ndcg@10
+   - task:
+       type: cross-encoder-nano-beir
+       name: Cross Encoder Nano BEIR
+     dataset:
+       name: NanoBEIR R100 mean
+       type: NanoBEIR_R100_mean
+     metrics:
+     - type: map
+       value: 0.5091
+       name: Map
+     - type: mrr@10
+       value: 0.5712
+       name: Mrr@10
+     - type: ndcg@10
+       value: 0.5645
+       name: Ndcg@10
+ ---
+
+ # ModernBERT-base trained on GooAQ
+
+ This is a [Cross Encoder](https://www.sbert.net/docs/cross_encoder/usage/usage.html) model finetuned from [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) using the [sentence-transformers](https://www.SBERT.net) library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Cross Encoder
+ - **Base model:** [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) <!-- at revision 8949b909ec900327062f0ebf497f51aef5e6f0c8 -->
+ - **Maximum Sequence Length:** 8192 tokens
+ - **Number of Output Labels:** 1 label
+ <!-- - **Training Dataset:** Unknown -->
+ - **Language:** en
+ - **License:** apache-2.0
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Documentation:** [Cross Encoder Documentation](https://www.sbert.net/docs/cross_encoder/usage/usage.html)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Cross Encoders on Hugging Face](https://huggingface.co/models?library=sentence-transformers&other=cross-encoder)
+
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import CrossEncoder
+
+ # Download from the 🤗 Hub
+ model = CrossEncoder("tomaarsen/reranker-ModernBERT-base-gooaq-bce-soft-negs")
+ # Get scores for pairs of texts
+ pairs = [
+     ['what is the difference between ground level ozone and the ozone layer?', 'Here, ground-level or "bad" ozone is an air pollutant that is harmful to breathe and it damages crops, trees and other vegetation. ... The stratosphere or "good" ozone layer extends upward from about 6 to 30 miles and protects life on Earth from the sun\'s harmful ultraviolet (UV) rays.'],
+     ['what is the difference between ground level ozone and the ozone layer?', 'In the stratosphere, temperature increases with altitude. The reason is that the direct heat source for the stratosphere is the Sun. A layer of ozone molecules absorbs solar radiation, which heats the stratosphere.'],
+     ['what is the difference between ground level ozone and the ozone layer?', "Atmosphere layers. Earth's atmosphere is divided into five main layers: the exosphere, the thermosphere, the mesosphere, the stratosphere and the troposphere. ... Ozone is abundant here and it heats the atmosphere while also absorbing harmful radiation from the sun."],
+     ['what is the difference between ground level ozone and the ozone layer?', "['Water vapor (H. 2O)', 'Carbon dioxide (CO. ... ', 'Methane (CH. ... ', 'Nitrous oxide (N. 2O)', 'Ozone (O. ... ', 'Chlorofluorocarbons (CFCs)', 'Hydrofluorocarbons (includes HCFCs and HFCs)']"],
+     ['what is the difference between ground level ozone and the ozone layer?', "Gases in the atmosphere, such as carbon dioxide, trap heat just like the glass roof of a greenhouse. These heat-trapping gases are called greenhouse gases. During the day, the Sun shines through the atmosphere. Earth's surface warms up in the sunlight."],
+ ]
+ scores = model.predict(pairs)
+ print(scores.shape)
+ # (5,)
+
+ # Or rank different texts based on similarity to a single text
+ ranks = model.rank(
+     'what is the difference between ground level ozone and the ozone layer?',
+     [
+         'Here, ground-level or "bad" ozone is an air pollutant that is harmful to breathe and it damages crops, trees and other vegetation. ... The stratosphere or "good" ozone layer extends upward from about 6 to 30 miles and protects life on Earth from the sun\'s harmful ultraviolet (UV) rays.',
+         'In the stratosphere, temperature increases with altitude. The reason is that the direct heat source for the stratosphere is the Sun. A layer of ozone molecules absorbs solar radiation, which heats the stratosphere.',
+         "Atmosphere layers. Earth's atmosphere is divided into five main layers: the exosphere, the thermosphere, the mesosphere, the stratosphere and the troposphere. ... Ozone is abundant here and it heats the atmosphere while also absorbing harmful radiation from the sun.",
+         "['Water vapor (H. 2O)', 'Carbon dioxide (CO. ... ', 'Methane (CH. ... ', 'Nitrous oxide (N. 2O)', 'Ozone (O. ... ', 'Chlorofluorocarbons (CFCs)', 'Hydrofluorocarbons (includes HCFCs and HFCs)']",
+         "Gases in the atmosphere, such as carbon dioxide, trap heat just like the glass roof of a greenhouse. These heat-trapping gases are called greenhouse gases. During the day, the Sun shines through the atmosphere. Earth's surface warms up in the sunlight.",
+     ]
+ )
+ # [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
+ ```
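
The `rank` call in the README returns a list of dictionaries keyed by `corpus_id` and `score`. A minimal sketch of mapping that structure back to the candidate texts is shown below. The `ranks` values and the `documents` list are made-up placeholders for illustration, not real model outputs, and `top_k_documents` is a hypothetical helper rather than part of the library.

```python
# Hypothetical rank-style output: the scores are placeholders, not real model outputs.
ranks = [
    {"corpus_id": 0, "score": 0.9},
    {"corpus_id": 2, "score": 0.4},
    {"corpus_id": 1, "score": 0.1},
]
# Placeholder candidate texts, indexed the same way as the list passed to rank().
documents = ["doc a", "doc b", "doc c"]


def top_k_documents(ranks, documents, k=2):
    """Return the k best (text, score) pairs, highest score first."""
    ordered = sorted(ranks, key=lambda r: r["score"], reverse=True)
    return [(documents[r["corpus_id"]], r["score"]) for r in ordered[:k]]


best = top_k_documents(ranks, documents, k=2)
print(best)
```

Sorting explicitly keeps the helper correct even if the input list is not already ordered by score.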
+
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ ## Evaluation
+
+ ### Metrics
+
+ #### Cross Encoder Reranking
+
+ * Dataset: `gooaq-dev`
+ * Evaluated with [<code>CrossEncoderRerankingEvaluator</code>](https://sbert.net/docs/package_reference/cross_encoder/evaluation.html#sentence_transformers.cross_encoder.evaluation.CrossEncoderRerankingEvaluator) with these parameters:
+   ```json
+   {
+       "at_k": 10,
+       "always_rerank_positives": false
+   }
+   ```
+
+ | Metric | Value |
+ |:------------|:---------------------|
+ | map | 0.7089 (+0.1778) |
+ | mrr@10 | 0.7076 (+0.1836) |
+ | **ndcg@10** | **0.7550 (+0.1637)** |
+
+ #### Cross Encoder Reranking
+
+ * Datasets: `NanoMSMARCO_R100`, `NanoNFCorpus_R100` and `NanoNQ_R100`
+ * Evaluated with [<code>CrossEncoderRerankingEvaluator</code>](https://sbert.net/docs/package_reference/cross_encoder/evaluation.html#sentence_transformers.cross_encoder.evaluation.CrossEncoderRerankingEvaluator) with these parameters:
+   ```json
+   {
+       "at_k": 10,
+       "always_rerank_positives": true
+   }
+   ```
+
+ | Metric | NanoMSMARCO_R100 | NanoNFCorpus_R100 | NanoNQ_R100 |
+ |:------------|:---------------------|:---------------------|:---------------------|
+ | map | 0.5540 (+0.0644) | 0.3421 (+0.0811) | 0.6312 (+0.2116) |
+ | mrr@10 | 0.5472 (+0.0697) | 0.5284 (+0.0286) | 0.6380 (+0.2113) |
+ | **ndcg@10** | **0.6229 (+0.0825)** | **0.3792 (+0.0541)** | **0.6915 (+0.1908)** |
+
+ #### Cross Encoder Nano BEIR
+
+ * Dataset: `NanoBEIR_R100_mean`
+ * Evaluated with [<code>CrossEncoderNanoBEIREvaluator</code>](https://sbert.net/docs/package_reference/cross_encoder/evaluation.html#sentence_transformers.cross_encoder.evaluation.CrossEncoderNanoBEIREvaluator) with these parameters:
+   ```json
+   {
+       "dataset_names": [
+           "msmarco",
+           "nfcorpus",
+           "nq"
+       ],
+       "rerank_k": 100,
+       "at_k": 10,
+       "always_rerank_positives": true
+   }
+   ```
+
+ | Metric | Value |
+ |:------------|:---------------------|
+ | map | 0.5091 (+0.1190) |
+ | mrr@10 | 0.5712 (+0.1032) |
+ | **ndcg@10** | **0.5645 (+0.1092)** |
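
For readers unfamiliar with the three reported metrics, the sketch below computes textbook versions of average precision, MRR@10 and NDCG@10 for a single query, given binary relevance labels in ranked order. It assumes the standard definitions; the evaluators named above may differ in details such as how they treat queries whose positives were not retrieved, and the `relevance` list is a made-up example.

```python
import math


def average_precision(relevance):
    """Mean of precision values at each rank where a relevant item appears."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def mrr_at_k(relevance, k=10):
    """Reciprocal rank of the first relevant item within the top k."""
    for rank, rel in enumerate(relevance[:k], start=1):
        if rel:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(relevance, k=10):
    """Discounted cumulative gain in the top k, normalised by the ideal ordering."""
    def dcg(labels):
        return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(labels, start=1))

    ideal = dcg(sorted(relevance, reverse=True)[:k])
    return dcg(relevance[:k]) / ideal if ideal > 0 else 0.0


# Made-up ranking: relevant documents at positions 2 and 4.
relevance = [0, 1, 0, 1]
print(average_precision(relevance), mrr_at_k(relevance), ndcg_at_k(relevance))
```

The reported `map` value is then the mean of the per-query average precision, and likewise for the other two metrics.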
251
+
252
+ <!--
253
+ ## Bias, Risks and Limitations
254
+
255
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
256
+ -->
257
+
258
+ <!--
259
+ ### Recommendations
260
+
261
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
262
+ -->
263
+
264
+ ## Training Details
265
+
266
+ ### Training Dataset
267
+
268
+ #### Unnamed Dataset
269
+
270
+ * Size: 482,388 training samples
271
+ * Columns: <code>question</code>, <code>answer</code>, and <code>label</code>
272
+ * Approximate statistics based on the first 1000 samples:
273
+ | | question | answer | label |
274
+ |:--------|:----------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------|:------------------------------------------------|
275
+ | type | string | string | int |
276
+ | details | <ul><li>min: 17 characters</li><li>mean: 43.7 characters</li><li>max: 91 characters</li></ul> | <ul><li>min: 53 characters</li><li>mean: 250.44 characters</li><li>max: 393 characters</li></ul> | <ul><li>0: ~79.30%</li><li>1: ~20.70%</li></ul> |
277
+ * Samples:
278
+ | question | answer | label |
279
+ |:------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------|
280
+ | <code>what is the difference between ground level ozone and the ozone layer?</code> | <code>Here, ground-level or "bad" ozone is an air pollutant that is harmful to breathe and it damages crops, trees and other vegetation. ... The stratosphere or "good" ozone layer extends upward from about 6 to 30 miles and protects life on Earth from the sun's harmful ultraviolet (UV) rays.</code> | <code>1</code> |
281
+ | <code>what is the difference between ground level ozone and the ozone layer?</code> | <code>In the stratosphere, temperature increases with altitude. The reason is that the direct heat source for the stratosphere is the Sun. A layer of ozone molecules absorbs solar radiation, which heats the stratosphere.</code> | <code>0</code> |
282
+ | <code>what is the difference between ground level ozone and the ozone layer?</code> | <code>Atmosphere layers. Earth's atmosphere is divided into five main layers: the exosphere, the thermosphere, the mesosphere, the stratosphere and the troposphere. ... Ozone is abundant here and it heats the atmosphere while also absorbing harmful radiation from the sun.</code> | <code>0</code> |
283
+ * Loss: [<code>BinaryCrossEntropyLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#binarycrossentropyloss) with these parameters:
284
+ ```json
285
+ {
286
+ "activation_fct": "torch.nn.modules.linear.Identity",
287
+ "pos_weight": 5
288
+ }
289
+ ```
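
To illustrate what `pos_weight: 5` with an identity activation implies, here is a dependency-free sketch of binary cross-entropy on a raw logit with a positive-class weight. It assumes the loss follows the usual weighted form, `-(w * y * log σ(x) + (1 - y) * log(1 - σ(x)))`, as in PyTorch's `BCEWithLogitsLoss`; the function name and the example logits are illustrative and not taken from the training code.

```python
import math


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    return min(x, 0.0) - math.log1p(math.exp(-abs(x)))


def weighted_bce_with_logits(logit, label, pos_weight=5.0):
    """Per-example loss: positives (label 1) are scaled by pos_weight."""
    positive_term = pos_weight * label * log_sigmoid(logit)
    negative_term = (1 - label) * log_sigmoid(-logit)  # log(1 - sigmoid(x))
    return -(positive_term + negative_term)


# At logit 0 the model is undecided: a missed positive costs 5x a missed negative.
loss_pos = weighted_bce_with_logits(0.0, 1)
loss_neg = weighted_bce_with_logits(0.0, 0)
print(loss_pos / loss_neg)
```

Up-weighting positives in this way is a common counterbalance when negatives outnumber positives, as the label statistics above (~79% label 0) suggest.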
+
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+
+ - `eval_strategy`: steps
+ - `per_device_train_batch_size`: 64
+ - `per_device_eval_batch_size`: 64
+ - `learning_rate`: 2e-05
+ - `num_train_epochs`: 1
+ - `warmup_ratio`: 0.1
+ - `seed`: 12
+ - `bf16`: True
+ - `dataloader_num_workers`: 4
+ - `load_best_model_at_end`: True
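
As a rough illustration of how `learning_rate`, `warmup_ratio` and the linear scheduler interact, the sketch below reproduces a linear warmup followed by linear decay. It assumes a single device with no gradient accumulation, so that one epoch over 482,388 samples at batch size 64 gives 7,538 optimizer steps (consistent with the step and epoch columns in the training logs), and that warmup steps are the ceiling of `total_steps * warmup_ratio`. `linear_schedule_lr` is a hypothetical helper, not a library function.

```python
import math


def linear_schedule_lr(step, total_steps, base_lr=2e-5, warmup_ratio=0.1):
    """Learning rate at a given optimizer step: linear warmup, then linear decay to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))


# Assumption: single device, gradient_accumulation_steps = 1, last partial batch kept.
total_steps = math.ceil(482_388 / 64)

for step in (0, 200, 754, 4000, total_steps):
    print(step, linear_schedule_lr(step, total_steps))
```

Under these assumptions the peak rate of 2e-05 is reached at step 754, roughly a tenth of the way through the epoch.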
+
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: steps
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 64
+ - `per_device_eval_batch_size`: 64
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 1
+ - `eval_accumulation_steps`: None
+ - `torch_empty_cache_steps`: None
+ - `learning_rate`: 2e-05
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1.0
+ - `num_train_epochs`: 1
+ - `max_steps`: -1
+ - `lr_scheduler_type`: linear
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.1
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 12
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: True
+ - `fp16`: False
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 4
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: True
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: None
+ - `hub_always_push`: False
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `include_for_metrics`: []
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`:
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `dispatch_batches`: None
+ - `split_batches`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `eval_on_start`: False
+ - `use_liger_kernel`: False
+ - `eval_use_gather_object`: False
+ - `average_tokens_across_devices`: False
+ - `prompts`: None
+ - `batch_sampler`: batch_sampler
+ - `multi_dataset_batch_sampler`: proportional
+
+ </details>
+
+ ### Training Logs
+ | Epoch | Step | Training Loss | gooaq-dev_ndcg@10 | NanoMSMARCO_R100_ndcg@10 | NanoNFCorpus_R100_ndcg@10 | NanoNQ_R100_ndcg@10 | NanoBEIR_R100_mean_ndcg@10 |
+ |:----------:|:--------:|:-------------:|:--------------------:|:------------------------:|:-------------------------:|:--------------------:|:--------------------------:|
+ | -1 | -1 | - | 0.1488 (-0.4424) | 0.0573 (-0.4832) | 0.2647 (-0.0604) | 0.0388 (-0.4619) | 0.1202 (-0.3351) |
+ | 0.0001 | 1 | 1.3143 | - | - | - | - | - |
+ | 0.0265 | 200 | 1.2539 | - | - | - | - | - |
+ | 0.0531 | 400 | 0.9497 | - | - | - | - | - |
+ | 0.0796 | 600 | 0.5613 | - | - | - | - | - |
+ | 0.1061 | 800 | 0.4687 | - | - | - | - | - |
+ | 0.1327 | 1000 | 0.4042 | 0.7103 (+0.1191) | 0.5262 (-0.0142) | 0.3298 (+0.0048) | 0.5589 (+0.0583) | 0.4717 (+0.0163) |
+ | 0.1592 | 1200 | 0.3562 | - | - | - | - | - |
+ | 0.1857 | 1400 | 0.3543 | - | - | - | - | - |
+ | 0.2123 | 1600 | 0.3467 | - | - | - | - | - |
+ | 0.2388 | 1800 | 0.3153 | - | - | - | - | - |
+ | 0.2653 | 2000 | 0.3033 | 0.7317 (+0.1405) | 0.5662 (+0.0258) | 0.3859 (+0.0609) | 0.6828 (+0.1822) | 0.5450 (+0.0896) |
+ | 0.2919 | 2200 | 0.2986 | - | - | - | - | - |
+ | 0.3184 | 2400 | 0.3016 | - | - | - | - | - |
+ | 0.3449 | 2600 | 0.2984 | - | - | - | - | - |
+ | 0.3715 | 2800 | 0.2646 | - | - | - | - | - |
+ | 0.3980 | 3000 | 0.3048 | 0.7359 (+0.1447) | 0.5713 (+0.0309) | 0.3987 (+0.0736) | 0.6960 (+0.1953) | 0.5553 (+0.1000) |
+ | 0.4245 | 3200 | 0.2714 | - | - | - | - | - |
+ | 0.4510 | 3400 | 0.2773 | - | - | - | - | - |
+ | 0.4776 | 3600 | 0.2621 | - | - | - | - | - |
+ | 0.5041 | 3800 | 0.2529 | - | - | - | - | - |
+ | 0.5306 | 4000 | 0.2533 | 0.7459 (+0.1546) | 0.5893 (+0.0489) | 0.3887 (+0.0637) | 0.6749 (+0.1743) | 0.5510 (+0.0956) |
+ | 0.5572 | 4200 | 0.2822 | - | - | - | - | - |
+ | 0.5837 | 4400 | 0.2299 | - | - | - | - | - |
+ | 0.6102 | 4600 | 0.2554 | - | - | - | - | - |
+ | 0.6368 | 4800 | 0.2373 | - | - | - | - | - |
+ | 0.6633 | 5000 | 0.2248 | 0.7497 (+0.1584) | 0.6110 (+0.0706) | 0.3782 (+0.0531) | 0.6885 (+0.1878) | 0.5592 (+0.1038) |
+ | 0.6898 | 5200 | 0.2315 | - | - | - | - | - |
+ | 0.7164 | 5400 | 0.2313 | - | - | - | - | - |
+ | 0.7429 | 5600 | 0.2294 | - | - | - | - | - |
+ | 0.7694 | 5800 | 0.2384 | - | - | - | - | - |
+ | 0.7960 | 6000 | 0.2195 | 0.7530 (+0.1617) | 0.6249 (+0.0845) | 0.3873 (+0.0623) | 0.6773 (+0.1766) | 0.5632 (+0.1078) |
+ | 0.8225 | 6200 | 0.2047 | - | - | - | - | - |
+ | 0.8490 | 6400 | 0.2192 | - | - | - | - | - |
+ | 0.8756 | 6600 | 0.1926 | - | - | - | - | - |
+ | 0.9021 | 6800 | 0.2185 | - | - | - | - | - |
+ | **0.9286** | **7000** | **0.2365** | **0.7550 (+0.1637)** | **0.6229 (+0.0825)** | **0.3792 (+0.0541)** | **0.6915 (+0.1908)** | **0.5645 (+0.1092)** |
+ | 0.9552 | 7200 | 0.2173 | - | - | - | - | - |
+ | 0.9817 | 7400 | 0.2249 | - | - | - | - | - |
+ | -1 | -1 | - | 0.7550 (+0.1637) | 0.6229 (+0.0825) | 0.3792 (+0.0541) | 0.6915 (+0.1908) | 0.5645 (+0.1092) |
+
+ * The bold row denotes the saved checkpoint.
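
Each evaluation cell in the log pairs a score with a signed delta in parentheses. The sketch below parses such a cell and recovers the implied reference value as `score - delta`. Reading the delta as the change relative to the candidate order before reranking is an assumption here, not something the log itself states, and `parse_cell` is a hypothetical helper.

```python
import re

# Matches cells such as "0.7550 (+0.1637)" with optional Markdown bold markers.
CELL = re.compile(r"\*{0,2}(-?\d+\.\d+) \(([+-]\d+\.\d+)\)\*{0,2}")


def parse_cell(cell):
    """Return (score, delta, implied_reference), or None for empty '-' cells."""
    match = CELL.fullmatch(cell.strip())
    if match is None:
        return None
    score, delta = float(match.group(1)), float(match.group(2))
    return score, delta, round(score - delta, 4)


# Cells copied from the gooaq-dev column: before training and at the saved checkpoint.
print(parse_cell("0.1488 (-0.4424)"))
print(parse_cell("**0.7550 (+0.1637)**"))
print(parse_cell("-"))
```

Both gooaq-dev cells imply nearly the same reference NDCG@10 (about 0.591), which is what one would expect if the delta is measured against a fixed baseline.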
+
+ ### Framework Versions
+ - Python: 3.11.10
+ - Sentence Transformers: 3.5.0.dev0
+ - Transformers: 4.49.0
+ - PyTorch: 2.5.1+cu124
+ - Accelerate: 1.5.2
+ - Datasets: 2.21.0
+ - Tokenizers: 0.21.0
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,56 @@
+ {
+   "_name_or_path": "answerdotai/ModernBERT-base",
+   "architectures": [
+     "ModernBertForSequenceClassification"
+   ],
+   "attention_bias": false,
+   "attention_dropout": 0.0,
+   "bos_token_id": 50281,
+   "classifier_activation": "gelu",
+   "classifier_bias": false,
+   "classifier_dropout": 0.0,
+   "classifier_pooling": "mean",
+   "cls_token_id": 50281,
+   "decoder_bias": true,
+   "deterministic_flash_attn": false,
+   "embedding_dropout": 0.0,
+   "eos_token_id": 50282,
+   "global_attn_every_n_layers": 3,
+   "global_rope_theta": 160000.0,
+   "gradient_checkpointing": false,
+   "hidden_activation": "gelu",
+   "hidden_size": 768,
+   "id2label": {
+     "0": "LABEL_0"
+   },
+   "initializer_cutoff_factor": 2.0,
+   "initializer_range": 0.02,
+   "intermediate_size": 1152,
+   "label2id": {
+     "LABEL_0": 0
+   },
+   "layer_norm_eps": 1e-05,
+   "local_attention": 128,
+   "local_rope_theta": 10000.0,
+   "max_position_embeddings": 8192,
+   "mlp_bias": false,
+   "mlp_dropout": 0.0,
+   "model_type": "modernbert",
+   "norm_bias": false,
+   "norm_eps": 1e-05,
+   "num_attention_heads": 12,
+   "num_hidden_layers": 22,
+   "pad_token_id": 50283,
+   "position_embedding_type": "absolute",
+   "reference_compile": true,
+   "repad_logits_with_grad": false,
+   "sentence_transformers": {
+     "activation_fn": "torch.nn.modules.activation.Sigmoid"
+   },
+   "sep_token_id": 50282,
+   "sparse_pred_ignore_index": -100,
+   "sparse_prediction": false,
+   "torch_dtype": "float32",
+   "transformers_version": "4.49.0",
+   "vocab_size": 50368
+ }
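
The config declares a single output label and a `Sigmoid` activation under the `sentence_transformers` key. A plausible reading is that the classifier's one raw logit is squashed into a (0, 1) relevance score at inference time. The sketch below shows that mapping in plain Python on a trimmed copy of the config; the example logits are made up, and how the library actually applies the activation is assumed here rather than confirmed by this diff.

```python
import json
import math

# Trimmed copy of the two config entries relevant to scoring.
config_fragment = json.loads(
    '{"id2label": {"0": "LABEL_0"},'
    ' "sentence_transformers": {"activation_fn": "torch.nn.modules.activation.Sigmoid"}}'
)
num_labels = len(config_fragment["id2label"])


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


# Made-up logits: higher logits map to scores closer to 1.
for logit in (-4.0, 0.0, 4.0):
    print(logit, round(sigmoid(logit), 4))
```

Because the sigmoid is monotonic, it changes the scale of the scores but not the ranking order of the candidates.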
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fa71065db93c0ac9a96a20447e537a58c29091b33f5c6e6e9175a00a485cf4d9
+ size 598436708
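
The three lines above are a Git LFS pointer rather than the weights themselves. As a rough sanity check, the sketch below parses the pointer and estimates the parameter count from the byte size, assuming 4 bytes per parameter (the config lists `float32`) and ignoring the safetensors header, so the figure is only an approximation. `parse_lfs_pointer` is a hypothetical helper.

```python
pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:fa71065db93c0ac9a96a20447e537a58c29091b33f5c6e6e9175a00a485cf4d9
size 598436708
"""


def parse_lfs_pointer(text):
    """Split each 'key value' line of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


pointer = parse_lfs_pointer(pointer_text)
size_bytes = int(pointer["size"])
# Assumption: float32 weights (4 bytes each), header overhead ignored.
approx_params = size_bytes // 4
print(f"{size_bytes / 1e6:.1f} MB, roughly {approx_params / 1e6:.1f}M parameters")
```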
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": true,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,945 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "|||IP_ADDRESS|||",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "1": {
+       "content": "<|padding|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "50254": {
+       "content": " ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50255": {
+       "content": " ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50256": {
+       "content": " ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50257": {
+       "content": " ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50258": {
+       "content": " ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50259": {
+       "content": " ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50260": {
+       "content": " ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50261": {
+       "content": " ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50262": {
+       "content": " ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50263": {
+       "content": " ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50264": {
+       "content": " ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50265": {
+       "content": " ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50266": {
+       "content": " ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50267": {
+       "content": " ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50268": {
+       "content": " ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50269": {
+       "content": " ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50270": {
+       "content": " ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50271": {
+       "content": " ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50272": {
+       "content": " ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50273": {
+       "content": " ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50274": {
+       "content": " ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50275": {
+       "content": " ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50276": {
+       "content": " ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50277": {
+       "content": "|||EMAIL_ADDRESS|||",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50278": {
+       "content": "|||PHONE_NUMBER|||",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50279": {
+       "content": "<|endoftext|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "50280": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "50281": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "50282": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
251
+ "50283": {
252
+ "content": "[PAD]",
253
+ "lstrip": false,
254
+ "normalized": false,
255
+ "rstrip": false,
256
+ "single_word": false,
257
+ "special": true
258
+ },
259
+ "50284": {
260
+ "content": "[MASK]",
261
+ "lstrip": true,
262
+ "normalized": false,
263
+ "rstrip": false,
264
+ "single_word": false,
265
+ "special": true
266
+ },
267
+ "50285": {
268
+ "content": "[unused0]",
269
+ "lstrip": false,
270
+ "normalized": true,
271
+ "rstrip": false,
272
+ "single_word": false,
273
+ "special": false
274
+ },
275
+ "50286": {
276
+ "content": "[unused1]",
277
+ "lstrip": false,
278
+ "normalized": true,
279
+ "rstrip": false,
280
+ "single_word": false,
281
+ "special": false
282
+ },
283
+ "50287": {
284
+ "content": "[unused2]",
285
+ "lstrip": false,
286
+ "normalized": true,
287
+ "rstrip": false,
288
+ "single_word": false,
289
+ "special": false
290
+ },
291
+ "50288": {
292
+ "content": "[unused3]",
293
+ "lstrip": false,
294
+ "normalized": true,
295
+ "rstrip": false,
296
+ "single_word": false,
297
+ "special": false
298
+ },
299
+ "50289": {
300
+ "content": "[unused4]",
301
+ "lstrip": false,
302
+ "normalized": true,
303
+ "rstrip": false,
304
+ "single_word": false,
305
+ "special": false
306
+ },
307
+ "50290": {
308
+ "content": "[unused5]",
309
+ "lstrip": false,
310
+ "normalized": true,
311
+ "rstrip": false,
312
+ "single_word": false,
313
+ "special": false
314
+ },
315
+ "50291": {
316
+ "content": "[unused6]",
317
+ "lstrip": false,
318
+ "normalized": true,
319
+ "rstrip": false,
320
+ "single_word": false,
321
+ "special": false
322
+ },
323
+ "50292": {
324
+ "content": "[unused7]",
325
+ "lstrip": false,
326
+ "normalized": true,
327
+ "rstrip": false,
328
+ "single_word": false,
329
+ "special": false
330
+ },
331
+ "50293": {
332
+ "content": "[unused8]",
333
+ "lstrip": false,
334
+ "normalized": true,
335
+ "rstrip": false,
336
+ "single_word": false,
337
+ "special": false
338
+ },
339
+ "50294": {
340
+ "content": "[unused9]",
341
+ "lstrip": false,
342
+ "normalized": true,
343
+ "rstrip": false,
344
+ "single_word": false,
345
+ "special": false
346
+ },
347
+ "50295": {
348
+ "content": "[unused10]",
349
+ "lstrip": false,
350
+ "normalized": true,
351
+ "rstrip": false,
352
+ "single_word": false,
353
+ "special": false
354
+ },
355
+ "50296": {
356
+ "content": "[unused11]",
357
+ "lstrip": false,
358
+ "normalized": true,
359
+ "rstrip": false,
360
+ "single_word": false,
361
+ "special": false
362
+ },
363
+ "50297": {
364
+ "content": "[unused12]",
365
+ "lstrip": false,
366
+ "normalized": true,
367
+ "rstrip": false,
368
+ "single_word": false,
369
+ "special": false
370
+ },
371
+ "50298": {
372
+ "content": "[unused13]",
373
+ "lstrip": false,
374
+ "normalized": true,
375
+ "rstrip": false,
376
+ "single_word": false,
377
+ "special": false
378
+ },
379
+ "50299": {
380
+ "content": "[unused14]",
381
+ "lstrip": false,
382
+ "normalized": true,
383
+ "rstrip": false,
384
+ "single_word": false,
385
+ "special": false
386
+ },
387
+ "50300": {
388
+ "content": "[unused15]",
389
+ "lstrip": false,
390
+ "normalized": true,
391
+ "rstrip": false,
392
+ "single_word": false,
393
+ "special": false
394
+ },
395
+ "50301": {
396
+ "content": "[unused16]",
397
+ "lstrip": false,
398
+ "normalized": true,
399
+ "rstrip": false,
400
+ "single_word": false,
401
+ "special": false
402
+ },
403
+ "50302": {
404
+ "content": "[unused17]",
405
+ "lstrip": false,
406
+ "normalized": true,
407
+ "rstrip": false,
408
+ "single_word": false,
409
+ "special": false
410
+ },
411
+ "50303": {
412
+ "content": "[unused18]",
413
+ "lstrip": false,
414
+ "normalized": true,
415
+ "rstrip": false,
416
+ "single_word": false,
417
+ "special": false
418
+ },
419
+ "50304": {
420
+ "content": "[unused19]",
421
+ "lstrip": false,
422
+ "normalized": true,
423
+ "rstrip": false,
424
+ "single_word": false,
425
+ "special": false
426
+ },
427
+ "50305": {
428
+ "content": "[unused20]",
429
+ "lstrip": false,
430
+ "normalized": true,
431
+ "rstrip": false,
432
+ "single_word": false,
433
+ "special": false
434
+ },
435
+ "50306": {
436
+ "content": "[unused21]",
437
+ "lstrip": false,
438
+ "normalized": true,
439
+ "rstrip": false,
440
+ "single_word": false,
441
+ "special": false
442
+ },
443
+ "50307": {
444
+ "content": "[unused22]",
445
+ "lstrip": false,
446
+ "normalized": true,
447
+ "rstrip": false,
448
+ "single_word": false,
449
+ "special": false
450
+ },
451
+ "50308": {
452
+ "content": "[unused23]",
453
+ "lstrip": false,
454
+ "normalized": true,
455
+ "rstrip": false,
456
+ "single_word": false,
457
+ "special": false
458
+ },
459
+ "50309": {
460
+ "content": "[unused24]",
461
+ "lstrip": false,
462
+ "normalized": true,
463
+ "rstrip": false,
464
+ "single_word": false,
465
+ "special": false
466
+ },
467
+ "50310": {
468
+ "content": "[unused25]",
469
+ "lstrip": false,
470
+ "normalized": true,
471
+ "rstrip": false,
472
+ "single_word": false,
473
+ "special": false
474
+ },
475
+ "50311": {
476
+ "content": "[unused26]",
477
+ "lstrip": false,
478
+ "normalized": true,
479
+ "rstrip": false,
480
+ "single_word": false,
481
+ "special": false
482
+ },
483
+ "50312": {
484
+ "content": "[unused27]",
485
+ "lstrip": false,
486
+ "normalized": true,
487
+ "rstrip": false,
488
+ "single_word": false,
489
+ "special": false
490
+ },
491
+ "50313": {
492
+ "content": "[unused28]",
493
+ "lstrip": false,
494
+ "normalized": true,
495
+ "rstrip": false,
496
+ "single_word": false,
497
+ "special": false
498
+ },
499
+ "50314": {
500
+ "content": "[unused29]",
501
+ "lstrip": false,
502
+ "normalized": true,
503
+ "rstrip": false,
504
+ "single_word": false,
505
+ "special": false
506
+ },
507
+ "50315": {
508
+ "content": "[unused30]",
509
+ "lstrip": false,
510
+ "normalized": true,
511
+ "rstrip": false,
512
+ "single_word": false,
513
+ "special": false
514
+ },
515
+ "50316": {
516
+ "content": "[unused31]",
517
+ "lstrip": false,
518
+ "normalized": true,
519
+ "rstrip": false,
520
+ "single_word": false,
521
+ "special": false
522
+ },
523
+ "50317": {
524
+ "content": "[unused32]",
525
+ "lstrip": false,
526
+ "normalized": true,
527
+ "rstrip": false,
528
+ "single_word": false,
529
+ "special": false
530
+ },
531
+ "50318": {
532
+ "content": "[unused33]",
533
+ "lstrip": false,
534
+ "normalized": true,
535
+ "rstrip": false,
536
+ "single_word": false,
537
+ "special": false
538
+ },
539
+ "50319": {
540
+ "content": "[unused34]",
541
+ "lstrip": false,
542
+ "normalized": true,
543
+ "rstrip": false,
544
+ "single_word": false,
545
+ "special": false
546
+ },
547
+ "50320": {
548
+ "content": "[unused35]",
549
+ "lstrip": false,
550
+ "normalized": true,
551
+ "rstrip": false,
552
+ "single_word": false,
553
+ "special": false
554
+ },
555
+ "50321": {
556
+ "content": "[unused36]",
557
+ "lstrip": false,
558
+ "normalized": true,
559
+ "rstrip": false,
560
+ "single_word": false,
561
+ "special": false
562
+ },
563
+ "50322": {
564
+ "content": "[unused37]",
565
+ "lstrip": false,
566
+ "normalized": true,
567
+ "rstrip": false,
568
+ "single_word": false,
569
+ "special": false
570
+ },
571
+ "50323": {
572
+ "content": "[unused38]",
573
+ "lstrip": false,
574
+ "normalized": true,
575
+ "rstrip": false,
576
+ "single_word": false,
577
+ "special": false
578
+ },
579
+ "50324": {
580
+ "content": "[unused39]",
581
+ "lstrip": false,
582
+ "normalized": true,
583
+ "rstrip": false,
584
+ "single_word": false,
585
+ "special": false
586
+ },
587
+ "50325": {
588
+ "content": "[unused40]",
589
+ "lstrip": false,
590
+ "normalized": true,
591
+ "rstrip": false,
592
+ "single_word": false,
593
+ "special": false
594
+ },
595
+ "50326": {
596
+ "content": "[unused41]",
597
+ "lstrip": false,
598
+ "normalized": true,
599
+ "rstrip": false,
600
+ "single_word": false,
601
+ "special": false
602
+ },
603
+ "50327": {
604
+ "content": "[unused42]",
605
+ "lstrip": false,
606
+ "normalized": true,
607
+ "rstrip": false,
608
+ "single_word": false,
609
+ "special": false
610
+ },
611
+ "50328": {
612
+ "content": "[unused43]",
613
+ "lstrip": false,
614
+ "normalized": true,
615
+ "rstrip": false,
616
+ "single_word": false,
617
+ "special": false
618
+ },
619
+ "50329": {
620
+ "content": "[unused44]",
621
+ "lstrip": false,
622
+ "normalized": true,
623
+ "rstrip": false,
624
+ "single_word": false,
625
+ "special": false
626
+ },
627
+ "50330": {
628
+ "content": "[unused45]",
629
+ "lstrip": false,
630
+ "normalized": true,
631
+ "rstrip": false,
632
+ "single_word": false,
633
+ "special": false
634
+ },
635
+ "50331": {
636
+ "content": "[unused46]",
637
+ "lstrip": false,
638
+ "normalized": true,
639
+ "rstrip": false,
640
+ "single_word": false,
641
+ "special": false
642
+ },
643
+ "50332": {
644
+ "content": "[unused47]",
645
+ "lstrip": false,
646
+ "normalized": true,
647
+ "rstrip": false,
648
+ "single_word": false,
649
+ "special": false
650
+ },
651
+ "50333": {
652
+ "content": "[unused48]",
653
+ "lstrip": false,
654
+ "normalized": true,
655
+ "rstrip": false,
656
+ "single_word": false,
657
+ "special": false
658
+ },
659
+ "50334": {
660
+ "content": "[unused49]",
661
+ "lstrip": false,
662
+ "normalized": true,
663
+ "rstrip": false,
664
+ "single_word": false,
665
+ "special": false
666
+ },
667
+ "50335": {
668
+ "content": "[unused50]",
669
+ "lstrip": false,
670
+ "normalized": true,
671
+ "rstrip": false,
672
+ "single_word": false,
673
+ "special": false
674
+ },
675
+ "50336": {
676
+ "content": "[unused51]",
677
+ "lstrip": false,
678
+ "normalized": true,
679
+ "rstrip": false,
680
+ "single_word": false,
681
+ "special": false
682
+ },
683
+ "50337": {
684
+ "content": "[unused52]",
685
+ "lstrip": false,
686
+ "normalized": true,
687
+ "rstrip": false,
688
+ "single_word": false,
689
+ "special": false
690
+ },
691
+ "50338": {
692
+ "content": "[unused53]",
693
+ "lstrip": false,
694
+ "normalized": true,
695
+ "rstrip": false,
696
+ "single_word": false,
697
+ "special": false
698
+ },
699
+ "50339": {
700
+ "content": "[unused54]",
701
+ "lstrip": false,
702
+ "normalized": true,
703
+ "rstrip": false,
704
+ "single_word": false,
705
+ "special": false
706
+ },
707
+ "50340": {
708
+ "content": "[unused55]",
709
+ "lstrip": false,
710
+ "normalized": true,
711
+ "rstrip": false,
712
+ "single_word": false,
713
+ "special": false
714
+ },
715
+ "50341": {
716
+ "content": "[unused56]",
717
+ "lstrip": false,
718
+ "normalized": true,
719
+ "rstrip": false,
720
+ "single_word": false,
721
+ "special": false
722
+ },
723
+ "50342": {
724
+ "content": "[unused57]",
725
+ "lstrip": false,
726
+ "normalized": true,
727
+ "rstrip": false,
728
+ "single_word": false,
729
+ "special": false
730
+ },
731
+ "50343": {
732
+ "content": "[unused58]",
733
+ "lstrip": false,
734
+ "normalized": true,
735
+ "rstrip": false,
736
+ "single_word": false,
737
+ "special": false
738
+ },
739
+ "50344": {
740
+ "content": "[unused59]",
741
+ "lstrip": false,
742
+ "normalized": true,
743
+ "rstrip": false,
744
+ "single_word": false,
745
+ "special": false
746
+ },
747
+ "50345": {
748
+ "content": "[unused60]",
749
+ "lstrip": false,
750
+ "normalized": true,
751
+ "rstrip": false,
752
+ "single_word": false,
753
+ "special": false
754
+ },
755
+ "50346": {
756
+ "content": "[unused61]",
757
+ "lstrip": false,
758
+ "normalized": true,
759
+ "rstrip": false,
760
+ "single_word": false,
761
+ "special": false
762
+ },
763
+ "50347": {
764
+ "content": "[unused62]",
765
+ "lstrip": false,
766
+ "normalized": true,
767
+ "rstrip": false,
768
+ "single_word": false,
769
+ "special": false
770
+ },
771
+ "50348": {
772
+ "content": "[unused63]",
773
+ "lstrip": false,
774
+ "normalized": true,
775
+ "rstrip": false,
776
+ "single_word": false,
777
+ "special": false
778
+ },
779
+ "50349": {
780
+ "content": "[unused64]",
781
+ "lstrip": false,
782
+ "normalized": true,
783
+ "rstrip": false,
784
+ "single_word": false,
785
+ "special": false
786
+ },
787
+ "50350": {
788
+ "content": "[unused65]",
789
+ "lstrip": false,
790
+ "normalized": true,
791
+ "rstrip": false,
792
+ "single_word": false,
793
+ "special": false
794
+ },
795
+ "50351": {
796
+ "content": "[unused66]",
797
+ "lstrip": false,
798
+ "normalized": true,
799
+ "rstrip": false,
800
+ "single_word": false,
801
+ "special": false
802
+ },
803
+ "50352": {
804
+ "content": "[unused67]",
805
+ "lstrip": false,
806
+ "normalized": true,
807
+ "rstrip": false,
808
+ "single_word": false,
809
+ "special": false
810
+ },
811
+ "50353": {
812
+ "content": "[unused68]",
813
+ "lstrip": false,
814
+ "normalized": true,
815
+ "rstrip": false,
816
+ "single_word": false,
817
+ "special": false
818
+ },
819
+ "50354": {
820
+ "content": "[unused69]",
821
+ "lstrip": false,
822
+ "normalized": true,
823
+ "rstrip": false,
824
+ "single_word": false,
825
+ "special": false
826
+ },
827
+ "50355": {
828
+ "content": "[unused70]",
829
+ "lstrip": false,
830
+ "normalized": true,
831
+ "rstrip": false,
832
+ "single_word": false,
833
+ "special": false
834
+ },
835
+ "50356": {
836
+ "content": "[unused71]",
837
+ "lstrip": false,
838
+ "normalized": true,
839
+ "rstrip": false,
840
+ "single_word": false,
841
+ "special": false
842
+ },
843
+ "50357": {
844
+ "content": "[unused72]",
845
+ "lstrip": false,
846
+ "normalized": true,
847
+ "rstrip": false,
848
+ "single_word": false,
849
+ "special": false
850
+ },
851
+ "50358": {
852
+ "content": "[unused73]",
853
+ "lstrip": false,
854
+ "normalized": true,
855
+ "rstrip": false,
856
+ "single_word": false,
857
+ "special": false
858
+ },
859
+ "50359": {
860
+ "content": "[unused74]",
861
+ "lstrip": false,
862
+ "normalized": true,
863
+ "rstrip": false,
864
+ "single_word": false,
865
+ "special": false
866
+ },
867
+ "50360": {
868
+ "content": "[unused75]",
869
+ "lstrip": false,
870
+ "normalized": true,
871
+ "rstrip": false,
872
+ "single_word": false,
873
+ "special": false
874
+ },
875
+ "50361": {
876
+ "content": "[unused76]",
877
+ "lstrip": false,
878
+ "normalized": true,
879
+ "rstrip": false,
880
+ "single_word": false,
881
+ "special": false
882
+ },
883
+ "50362": {
884
+ "content": "[unused77]",
885
+ "lstrip": false,
886
+ "normalized": true,
887
+ "rstrip": false,
888
+ "single_word": false,
889
+ "special": false
890
+ },
891
+ "50363": {
892
+ "content": "[unused78]",
893
+ "lstrip": false,
894
+ "normalized": true,
895
+ "rstrip": false,
896
+ "single_word": false,
897
+ "special": false
898
+ },
899
+ "50364": {
900
+ "content": "[unused79]",
901
+ "lstrip": false,
902
+ "normalized": true,
903
+ "rstrip": false,
904
+ "single_word": false,
905
+ "special": false
906
+ },
907
+ "50365": {
908
+ "content": "[unused80]",
909
+ "lstrip": false,
910
+ "normalized": true,
911
+ "rstrip": false,
912
+ "single_word": false,
913
+ "special": false
914
+ },
915
+ "50366": {
916
+ "content": "[unused81]",
917
+ "lstrip": false,
918
+ "normalized": true,
919
+ "rstrip": false,
920
+ "single_word": false,
921
+ "special": false
922
+ },
923
+ "50367": {
924
+ "content": "[unused82]",
925
+ "lstrip": false,
926
+ "normalized": true,
927
+ "rstrip": false,
928
+ "single_word": false,
929
+ "special": false
930
+ }
931
+ },
932
+ "clean_up_tokenization_spaces": true,
933
+ "cls_token": "[CLS]",
934
+ "extra_special_tokens": {},
935
+ "mask_token": "[MASK]",
936
+ "model_input_names": [
937
+ "input_ids",
938
+ "attention_mask"
939
+ ],
940
+ "model_max_length": 8192,
941
+ "pad_token": "[PAD]",
942
+ "sep_token": "[SEP]",
943
+ "tokenizer_class": "PreTrainedTokenizer",
944
+ "unk_token": "[UNK]"
945
+ }
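The `added_tokens_decoder` entries above share one shape: a token id mapped to its `content` plus the `lstrip`, `normalized`, `rstrip`, `single_word`, and `special` flags. As a minimal sketch, assuming only the Python standard library, the following shows one way such a config could be inspected to list the tokens flagged as special. The helper name `special_tokens` and the `CONFIG_EXCERPT` string are hypothetical and introduced only for illustration; the excerpt copies three entries from the diff rather than loading the real file.

```python
import json

# Hypothetical excerpt: three entries copied from the tokenizer_config.json diff above.
CONFIG_EXCERPT = """
{
  "added_tokens_decoder": {
    "50279": {"content": "<|endoftext|>", "lstrip": false, "normalized": false,
              "rstrip": false, "single_word": false, "special": true},
    "50284": {"content": "[MASK]", "lstrip": true, "normalized": false,
              "rstrip": false, "single_word": false, "special": true},
    "50285": {"content": "[unused0]", "lstrip": false, "normalized": true,
              "rstrip": false, "single_word": false, "special": false}
  },
  "mask_token": "[MASK]",
  "model_max_length": 8192
}
"""


def special_tokens(config: dict) -> dict[int, str]:
    """Map token id -> content for entries whose "special" flag is true."""
    decoder = config.get("added_tokens_decoder", {})
    return {
        int(token_id): entry["content"]
        for token_id, entry in decoder.items()
        if entry.get("special", False)
    }


config = json.loads(CONFIG_EXCERPT)
specials = special_tokens(config)

# Print the special tokens in id order.
for token_id in sorted(specials):
    print(token_id, specials[token_id])
```

To run the same check against the full file, the excerpt string could be swapped for the result of `json.load` on a local copy of `tokenizer_config.json`.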