tomaarsen HF Staff committed
Commit 01ab7d7 · verified · 1 Parent(s): 8a623fd

Add new SparseEncoder model
1_SpladePooling/config.json ADDED
@@ -0,0 +1,5 @@
{
  "pooling_strategy": "max",
  "activation_function": "relu",
  "word_embedding_dimension": 30522
}
README.md ADDED
@@ -0,0 +1,835 @@
---
language:
- en
license: apache-2.0
tags:
- sentence-transformers
- sparse-encoder
- sparse
- splade
- generated_from_trainer
- dataset_size:99000
- loss:SpladeLoss
- loss:SparseMultipleNegativesRankingLoss
- loss:FlopsLoss
base_model: distilbert/distilbert-base-uncased
widget:
- text: Rollin' (Limp Bizkit song) The music video was filmed atop the South Tower
    of the former World Trade Center in New York City. The introduction features Ben
    Stiller and Stephen Dorff mistaking Fred Durst for the valet and giving him the
    keys to their Bentley Azure. Also making a cameo is break dancer Mr. Wiggles.
    The rest of the video has several cuts to Durst and his bandmates hanging out
    of the Bentley as they drive about Manhattan. The song Ben Stiller is playing
    at the beginning is "My Generation" from the same album. The video also features
    scenes of Fred Durst with five girls dancing in a room. The video was filmed around
    the same time as the film Zoolander, which explains Stiller and Dorff's appearance.
    Fred Durst has a small cameo in that film.
- text: 'Maze Runner: The Death Cure On April 22, 2017, the studio delayed the release
    date once again, to February 9, 2018, in order to allow more time for post-production;
    months later, on August 25, the studio moved the release forward two weeks.[17]
    The film will premiere on January 26, 2018 in 3D, IMAX and IMAX 3D.[18][19]'
- text: who played the dj in the movie the warriors
- text: Lionel Messi Born and raised in central Argentina, Messi was diagnosed with
    a growth hormone deficiency as a child. At age 13, he relocated to Spain to join
    Barcelona, who agreed to pay for his medical treatment. After a fast progression
    through Barcelona's youth academy, Messi made his competitive debut aged 17 in
    October 2004. Despite being injury-prone during his early career, he established
    himself as an integral player for the club within the next three years, finishing
    2007 as a finalist for both the Ballon d'Or and FIFA World Player of the Year
    award, a feat he repeated the following year. His first uninterrupted campaign
    came in the 2008–09 season, during which he helped Barcelona achieve the first
    treble in Spanish football. At 22 years old, Messi won the Ballon d'Or and FIFA
    World Player of the Year award by record voting margins.
- text: 'Send In the Clowns "Send In the Clowns" is a song written by Stephen Sondheim
    for the 1973 musical A Little Night Music, an adaptation of Ingmar Bergman''s
    film Smiles of a Summer Night. It is a ballad from Act Two, in which the character
    Desirée reflects on the ironies and disappointments of her life. Among other things,
    she looks back on an affair years earlier with the lawyer Fredrik, who was deeply
    in love with her but whose marriage proposals she had rejected. Meeting him after
    so long, she realizes she is in love with him and finally ready to marry him,
    but now it is he who rejects her: he is in an unconsummated marriage with a much
    younger woman. Desirée proposes marriage to rescue him from this situation, but
    he declines, citing his dedication to his bride. Reacting to his rejection, Desirée
    sings this song. The song is later reprised as a coda after Fredrik''s young wife
    runs away with his son, and Fredrik is finally free to accept Desirée''s offer.[1]'
datasets:
- sentence-transformers/natural-questions
pipeline_tag: feature-extraction
library_name: sentence-transformers
metrics:
- dot_accuracy@1
- dot_accuracy@3
- dot_accuracy@5
- dot_accuracy@10
- dot_precision@1
- dot_precision@3
- dot_precision@5
- dot_precision@10
- dot_recall@1
- dot_recall@3
- dot_recall@5
- dot_recall@10
- dot_ndcg@10
- dot_mrr@10
- dot_map@100
- query_active_dims
- query_sparsity_ratio
- corpus_active_dims
- corpus_sparsity_ratio
co2_eq_emissions:
  emissions: 36.35355068873359
  energy_consumed: 0.0935255045992395
  source: codecarbon
  training_type: fine-tuning
  on_cloud: false
  cpu_model: 13th Gen Intel(R) Core(TM) i7-13700K
  ram_total_size: 31.777088165283203
  hours_used: 0.252
  hardware_used: 1 x NVIDIA GeForce RTX 3090
model-index:
- name: DistilBERT base trained on Natural-Questions tuples
  results:
  - task:
      type: sparse-information-retrieval
      name: Sparse Information Retrieval
    dataset:
      name: NanoMSMARCO
      type: NanoMSMARCO
    metrics:
    - type: dot_accuracy@1
      value: 0.28
      name: Dot Accuracy@1
    - type: dot_accuracy@3
      value: 0.4
      name: Dot Accuracy@3
    - type: dot_accuracy@5
      value: 0.62
      name: Dot Accuracy@5
    - type: dot_accuracy@10
      value: 0.74
      name: Dot Accuracy@10
    - type: dot_precision@1
      value: 0.28
      name: Dot Precision@1
    - type: dot_precision@3
      value: 0.13333333333333333
      name: Dot Precision@3
    - type: dot_precision@5
      value: 0.124
      name: Dot Precision@5
    - type: dot_precision@10
      value: 0.07400000000000001
      name: Dot Precision@10
    - type: dot_recall@1
      value: 0.28
      name: Dot Recall@1
    - type: dot_recall@3
      value: 0.4
      name: Dot Recall@3
    - type: dot_recall@5
      value: 0.62
      name: Dot Recall@5
    - type: dot_recall@10
      value: 0.74
      name: Dot Recall@10
    - type: dot_ndcg@10
      value: 0.48417691239896954
      name: Dot Ndcg@10
    - type: dot_mrr@10
      value: 0.40474603174603174
      name: Dot Mrr@10
    - type: dot_map@100
      value: 0.4165931820854422
      name: Dot Map@100
    - type: query_active_dims
      value: 68.80000305175781
      name: Query Active Dims
    - type: query_sparsity_ratio
      value: 0.9977458881117962
      name: Query Sparsity Ratio
    - type: corpus_active_dims
      value: 135.5758514404297
      name: Corpus Active Dims
    - type: corpus_sparsity_ratio
      value: 0.9955580941143952
      name: Corpus Sparsity Ratio
  - task:
      type: sparse-information-retrieval
      name: Sparse Information Retrieval
    dataset:
      name: NanoNFCorpus
      type: NanoNFCorpus
    metrics:
    - type: dot_accuracy@1
      value: 0.42
      name: Dot Accuracy@1
    - type: dot_accuracy@3
      value: 0.5
      name: Dot Accuracy@3
    - type: dot_accuracy@5
      value: 0.52
      name: Dot Accuracy@5
    - type: dot_accuracy@10
      value: 0.56
      name: Dot Accuracy@10
    - type: dot_precision@1
      value: 0.42
      name: Dot Precision@1
    - type: dot_precision@3
      value: 0.36
      name: Dot Precision@3
    - type: dot_precision@5
      value: 0.308
      name: Dot Precision@5
    - type: dot_precision@10
      value: 0.234
      name: Dot Precision@10
    - type: dot_recall@1
      value: 0.024688245739830684
      name: Dot Recall@1
    - type: dot_recall@3
      value: 0.05757259881654739
      name: Dot Recall@3
    - type: dot_recall@5
      value: 0.07457503506379409
      name: Dot Recall@5
    - type: dot_recall@10
      value: 0.09455914797791706
      name: Dot Recall@10
    - type: dot_ndcg@10
      value: 0.2854029431260111
      name: Dot Ndcg@10
    - type: dot_mrr@10
      value: 0.46341269841269833
      name: Dot Mrr@10
    - type: dot_map@100
      value: 0.11792914877304508
      name: Dot Map@100
    - type: query_active_dims
      value: 79.31999969482422
      name: Query Active Dims
    - type: query_sparsity_ratio
      value: 0.9974012188030004
      name: Query Sparsity Ratio
    - type: corpus_active_dims
      value: 184.8435516357422
      name: Corpus Active Dims
    - type: corpus_sparsity_ratio
      value: 0.9939439240011879
      name: Corpus Sparsity Ratio
  - task:
      type: sparse-information-retrieval
      name: Sparse Information Retrieval
    dataset:
      name: NanoNQ
      type: NanoNQ
    metrics:
    - type: dot_accuracy@1
      value: 0.4
      name: Dot Accuracy@1
    - type: dot_accuracy@3
      value: 0.64
      name: Dot Accuracy@3
    - type: dot_accuracy@5
      value: 0.72
      name: Dot Accuracy@5
    - type: dot_accuracy@10
      value: 0.76
      name: Dot Accuracy@10
    - type: dot_precision@1
      value: 0.4
      name: Dot Precision@1
    - type: dot_precision@3
      value: 0.21333333333333332
      name: Dot Precision@3
    - type: dot_precision@5
      value: 0.14400000000000002
      name: Dot Precision@5
    - type: dot_precision@10
      value: 0.07600000000000001
      name: Dot Precision@10
    - type: dot_recall@1
      value: 0.38
      name: Dot Recall@1
    - type: dot_recall@3
      value: 0.61
      name: Dot Recall@3
    - type: dot_recall@5
      value: 0.68
      name: Dot Recall@5
    - type: dot_recall@10
      value: 0.7
      name: Dot Recall@10
    - type: dot_ndcg@10
      value: 0.562112822249959
      name: Dot Ndcg@10
    - type: dot_mrr@10
      value: 0.535079365079365
      name: Dot Mrr@10
    - type: dot_map@100
      value: 0.5164611300715877
      name: Dot Map@100
    - type: query_active_dims
      value: 54.099998474121094
      name: Query Active Dims
    - type: query_sparsity_ratio
      value: 0.9982275080769897
      name: Query Sparsity Ratio
    - type: corpus_active_dims
      value: 133.11419677734375
      name: Corpus Active Dims
    - type: corpus_sparsity_ratio
      value: 0.9956387459282701
      name: Corpus Sparsity Ratio
  - task:
      type: sparse-nano-beir
      name: Sparse Nano BEIR
    dataset:
      name: NanoBEIR mean
      type: NanoBEIR_mean
    metrics:
    - type: dot_accuracy@1
      value: 0.3666666666666667
      name: Dot Accuracy@1
    - type: dot_accuracy@3
      value: 0.5133333333333333
      name: Dot Accuracy@3
    - type: dot_accuracy@5
      value: 0.62
      name: Dot Accuracy@5
    - type: dot_accuracy@10
      value: 0.6866666666666666
      name: Dot Accuracy@10
    - type: dot_precision@1
      value: 0.3666666666666667
      name: Dot Precision@1
    - type: dot_precision@3
      value: 0.23555555555555552
      name: Dot Precision@3
    - type: dot_precision@5
      value: 0.19200000000000003
      name: Dot Precision@5
    - type: dot_precision@10
      value: 0.12800000000000003
      name: Dot Precision@10
    - type: dot_recall@1
      value: 0.22822941524661022
      name: Dot Recall@1
    - type: dot_recall@3
      value: 0.35585753293884914
      name: Dot Recall@3
    - type: dot_recall@5
      value: 0.45819167835459806
      name: Dot Recall@5
    - type: dot_recall@10
      value: 0.511519715992639
      name: Dot Recall@10
    - type: dot_ndcg@10
      value: 0.4438975592583132
      name: Dot Ndcg@10
    - type: dot_mrr@10
      value: 0.46774603174603174
      name: Dot Mrr@10
    - type: dot_map@100
      value: 0.350327820310025
      name: Dot Map@100
    - type: query_active_dims
      value: 67.4066670735677
      name: Query Active Dims
    - type: query_sparsity_ratio
      value: 0.9977915383305954
      name: Query Sparsity Ratio
    - type: corpus_active_dims
      value: 145.78942579758726
      name: Corpus Active Dims
    - type: corpus_sparsity_ratio
      value: 0.9952234641963965
      name: Corpus Sparsity Ratio
---

# DistilBERT base trained on Natural-Questions tuples

This is a [SPLADE Sparse Encoder](https://www.sbert.net/docs/sparse_encoder/usage/usage.html) model finetuned from [distilbert/distilbert-base-uncased](https://huggingface.co/distilbert/distilbert-base-uncased) on the [natural-questions](https://huggingface.co/datasets/sentence-transformers/natural-questions) dataset using the [sentence-transformers](https://www.SBERT.net) library. It maps sentences & paragraphs to a 30522-dimensional sparse vector space and can be used for semantic search and sparse retrieval.

## Model Details

### Model Description
- **Model Type:** SPLADE Sparse Encoder
- **Base model:** [distilbert/distilbert-base-uncased](https://huggingface.co/distilbert/distilbert-base-uncased) <!-- at revision 12040accade4e8a0f71eabdb258fecc2e7e948be -->
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 30522 dimensions
- **Similarity Function:** Dot Product
- **Training Dataset:**
    - [natural-questions](https://huggingface.co/datasets/sentence-transformers/natural-questions)
- **Language:** en
- **License:** apache-2.0

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Documentation:** [Sparse Encoder Documentation](https://www.sbert.net/docs/sparse_encoder/usage/usage.html)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sparse Encoders on Hugging Face](https://huggingface.co/models?library=sentence-transformers&other=sparse-encoder)

### Full Model Architecture

```
SparseEncoder(
  (0): MLMTransformer({'max_seq_length': 512, 'do_lower_case': False}) with MLMTransformer model: DistilBertForMaskedLM
  (1): SpladePooling({'pooling_strategy': 'max', 'activation_function': 'relu', 'word_embedding_dimension': 30522})
)
```

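The `1_SpladePooling/config.json` in this repository mirrors module (1): the MaskedLM logits are activated and collapsed into one vocabulary-sized vector per input. As a rough illustration of the SPLADE formulation (a sketch, not the library's internal code), `pooling_strategy: "max"` with `activation_function: "relu"` amounts to:

```python
import torch

def splade_pool(mlm_logits: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    # mlm_logits: (batch, seq_len, 30522) raw DistilBertForMaskedLM logits
    # Log-saturated ReLU keeps weights non-negative and dampens extreme logits
    scores = torch.log1p(torch.relu(mlm_logits))
    # Zero out padding positions so they cannot win the max
    scores = scores * attention_mask.unsqueeze(-1)
    # "max" pooling: per vocabulary dimension, keep the strongest position
    return scores.max(dim=1).values  # (batch, 30522), mostly zeros
```
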
## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.

```python
from sentence_transformers import SparseEncoder

# Download from the 🤗 Hub
model = SparseEncoder("tomaarsen/splade-distilbert-base-uncased-nq")
# Run inference
queries = [
    "is send in the clowns from a musical",
]
documents = [
    'Send In the Clowns "Send In the Clowns" is a song written by Stephen Sondheim for the 1973 musical A Little Night Music, an adaptation of Ingmar Bergman\'s film Smiles of a Summer Night. It is a ballad from Act Two, in which the character Desirée reflects on the ironies and disappointments of her life. Among other things, she looks back on an affair years earlier with the lawyer Fredrik, who was deeply in love with her but whose marriage proposals she had rejected. Meeting him after so long, she realizes she is in love with him and finally ready to marry him, but now it is he who rejects her: he is in an unconsummated marriage with a much younger woman. Desirée proposes marriage to rescue him from this situation, but he declines, citing his dedication to his bride. Reacting to his rejection, Desirée sings this song. The song is later reprised as a coda after Fredrik\'s young wife runs away with his son, and Fredrik is finally free to accept Desirée\'s offer.[1]',
    'The Suite Life on Deck The Suite Life on Deck is an American sitcom that aired on Disney Channel from September 26, 2008 to May 6, 2011. It is a sequel/spin-off of the Disney Channel Original Series The Suite Life of Zack & Cody. The series follows twin brothers Zack and Cody Martin and hotel heiress London Tipton in a new setting, the SS Tipton, where they attend classes at "Seven Seas High School" and meet Bailey Pickett while Mr. Moseby manages the ship. The ship travels around the world to nations such as Italy, France, Greece, India, Sweden and the United Kingdom where the characters experience different cultures, adventures, and situations.[1]',
    'Money in the Bank ladder match The first match was contested in 2005 at WrestleMania 21, after being invented (in kayfabe) by Chris Jericho.[1] At the time, it was exclusive to wrestlers of the Raw brand, and Edge won the inaugural match.[1] From then until 2010, the Money in the Bank ladder match, now open to all WWE brands, became a WrestleMania mainstay. 2010 saw a second and third Money in the Bank ladder match when the Money in the Bank pay-per-view debuted in July. Unlike the matches at WrestleMania, this new event featured two such ladder matches – one each for a contract for the WWE Championship and World Heavyweight Championship, respectively.',
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# [1, 30522] [3, 30522]

# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[27.6088, 3.8288, 3.8780]])
```
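
Since the embedding dimensions correspond to vocabulary entries, the nonzero weights can be mapped back to tokens. A small follow-up sketch (assuming `model` and `query_embeddings` from the snippet above; `SparseEncoder` returns torch tensors, which may be sparse and need densifying first):

```python
import torch

embedding = query_embeddings[0]
if embedding.is_sparse:
    embedding = embedding.to_dense()
# Show the ten highest-weighted vocabulary tokens for the query
values, indices = torch.topk(embedding, k=10)
tokens = model.tokenizer.convert_ids_to_tokens(indices.tolist())
for token, value in zip(tokens, values.tolist()):
    print(f"{token:>12}  {value:.2f}")
```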

<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

## Evaluation

### Metrics

#### Sparse Information Retrieval

* Datasets: `NanoMSMARCO`, `NanoNFCorpus` and `NanoNQ`
* Evaluated with [<code>SparseInformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sparse_encoder/evaluation.html#sentence_transformers.sparse_encoder.evaluation.SparseInformationRetrievalEvaluator)

| Metric                | NanoMSMARCO | NanoNFCorpus | NanoNQ     |
|:----------------------|:------------|:-------------|:-----------|
| dot_accuracy@1        | 0.28        | 0.42         | 0.4        |
| dot_accuracy@3        | 0.4         | 0.5          | 0.64       |
| dot_accuracy@5        | 0.62        | 0.52         | 0.72       |
| dot_accuracy@10       | 0.74        | 0.56         | 0.76       |
| dot_precision@1       | 0.28        | 0.42         | 0.4        |
| dot_precision@3       | 0.1333      | 0.36         | 0.2133     |
| dot_precision@5       | 0.124       | 0.308        | 0.144      |
| dot_precision@10      | 0.074       | 0.234        | 0.076      |
| dot_recall@1          | 0.28        | 0.0247       | 0.38       |
| dot_recall@3          | 0.4         | 0.0576       | 0.61       |
| dot_recall@5          | 0.62        | 0.0746       | 0.68       |
| dot_recall@10         | 0.74        | 0.0946       | 0.7        |
| **dot_ndcg@10**       | **0.4842**  | **0.2854**   | **0.5621** |
| dot_mrr@10            | 0.4047      | 0.4634       | 0.5351     |
| dot_map@100           | 0.4166      | 0.1179       | 0.5165     |
| query_active_dims     | 68.8        | 79.32        | 54.1       |
| query_sparsity_ratio  | 0.9977      | 0.9974       | 0.9982     |
| corpus_active_dims    | 135.5759    | 184.8436     | 133.1142   |
| corpus_sparsity_ratio | 0.9956      | 0.9939       | 0.9956     |

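The sparsity metrics follow directly from the active-dimension counts: each ratio is `1 - active_dims / 30522`. For NanoMSMARCO queries, for instance, `1 - 68.8 / 30522 ≈ 0.9977`, i.e. on average only about 69 of the 30522 vocabulary dimensions are nonzero per query embedding.
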
#### Sparse Nano BEIR

* Dataset: `NanoBEIR_mean`
* Evaluated with [<code>SparseNanoBEIREvaluator</code>](https://sbert.net/docs/package_reference/sparse_encoder/evaluation.html#sentence_transformers.sparse_encoder.evaluation.SparseNanoBEIREvaluator) with these parameters:
  ```json
  {
      "dataset_names": [
          "msmarco",
          "nfcorpus",
          "nq"
      ]
  }
  ```

| Metric                | Value      |
|:----------------------|:-----------|
| dot_accuracy@1        | 0.3667     |
| dot_accuracy@3        | 0.5133     |
| dot_accuracy@5        | 0.62       |
| dot_accuracy@10       | 0.6867     |
| dot_precision@1       | 0.3667     |
| dot_precision@3       | 0.2356     |
| dot_precision@5       | 0.192      |
| dot_precision@10      | 0.128      |
| dot_recall@1          | 0.2282     |
| dot_recall@3          | 0.3559     |
| dot_recall@5          | 0.4582     |
| dot_recall@10         | 0.5115     |
| **dot_ndcg@10**       | **0.4439** |
| dot_mrr@10            | 0.4677     |
| dot_map@100           | 0.3503     |
| query_active_dims     | 67.4067    |
| query_sparsity_ratio  | 0.9978     |
| corpus_active_dims    | 145.7894   |
| corpus_sparsity_ratio | 0.9952     |

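These means can be recomputed with the evaluator linked above. A brief sketch, using the `dataset_names` from the JSON parameters; the exact result key is an assumption, chosen to match the training-log column names below:

```python
from sentence_transformers import SparseEncoder
from sentence_transformers.sparse_encoder.evaluation import SparseNanoBEIREvaluator

model = SparseEncoder("tomaarsen/splade-distilbert-base-uncased-nq")
evaluator = SparseNanoBEIREvaluator(dataset_names=["msmarco", "nfcorpus", "nq"])
results = evaluator(model)
print(results["NanoBEIR_mean_dot_ndcg@10"])  # expected ≈ 0.4439
```
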
<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Dataset

#### natural-questions

* Dataset: [natural-questions](https://huggingface.co/datasets/sentence-transformers/natural-questions) at [f9e894e](https://huggingface.co/datasets/sentence-transformers/natural-questions/tree/f9e894e1081e206e577b4eaa9ee6de2b06ae6f17)
* Size: 99,000 training samples
* Columns: <code>query</code> and <code>answer</code>
* Approximate statistics based on the first 1000 samples:
  |         | query                                                                              | answer                                                                              |
  |:--------|:-----------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|
  | type    | string                                                                              | string                                                                               |
  | details | <ul><li>min: 10 tokens</li><li>mean: 11.71 tokens</li><li>max: 26 tokens</li></ul>  | <ul><li>min: 4 tokens</li><li>mean: 131.81 tokens</li><li>max: 450 tokens</li></ul>  |
* Samples:
  | query                                                          | answer                                                                                                                                                                                                                                                                                                                                                                                                                                     |
  |:---------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
  | <code>who played the father in papa don't preach</code>        | <code>Alex McArthur Alex McArthur (born March 6, 1957) is an American actor.</code>                                                                                                                                                                                                                                                                                                                                                        |
  | <code>where was the location of the battle of hastings</code>  | <code>Battle of Hastings The Battle of Hastings[a] was fought on 14 October 1066 between the Norman-French army of William, the Duke of Normandy, and an English army under the Anglo-Saxon King Harold Godwinson, beginning the Norman conquest of England. It took place approximately 7 miles (11 kilometres) northwest of Hastings, close to the present-day town of Battle, East Sussex, and was a decisive Norman victory.</code>   |
  | <code>how many puppies can a dog give birth to</code>          | <code>Canine reproduction The largest litter size to date was set by a Neapolitan Mastiff in Manea, Cambridgeshire, UK on November 29, 2004; the litter was 24 puppies.[22]</code>                                                                                                                                                                                                                                                         |
* Loss: [<code>SpladeLoss</code>](https://sbert.net/docs/package_reference/sparse_encoder/losses.html#spladeloss) with these parameters:
  ```json
  {
      "loss": "SparseMultipleNegativesRankingLoss(scale=1.0, similarity_fct='dot_score')",
      "lambda_corpus": 3e-05,
      "lambda_query": 5e-05
  }
  ```
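
The objective above therefore combines the ranking loss with FLOPS regularizers on the query and document embeddings, weighted by `lambda_query` and `lambda_corpus`. A minimal sketch of that composition (illustrative only; the actual `SpladeLoss` implementation lives in sentence-transformers):

```python
import torch

def flops(embeddings: torch.Tensor) -> torch.Tensor:
    # FLOPS regularizer (Paria et al., 2020): mean absolute activation per
    # vocabulary dimension, squared, then summed over the vocabulary
    return torch.sum(torch.mean(torch.abs(embeddings), dim=0) ** 2)

def splade_loss(ranking_loss, query_emb, doc_emb, lambda_query=5e-05, lambda_corpus=3e-05):
    # Ranking loss plus sparsity pressure on both embedding sides
    return ranking_loss + lambda_query * flops(query_emb) + lambda_corpus * flops(doc_emb)
```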

### Evaluation Dataset

#### natural-questions

* Dataset: [natural-questions](https://huggingface.co/datasets/sentence-transformers/natural-questions) at [f9e894e](https://huggingface.co/datasets/sentence-transformers/natural-questions/tree/f9e894e1081e206e577b4eaa9ee6de2b06ae6f17)
* Size: 1,000 evaluation samples
* Columns: <code>query</code> and <code>answer</code>
* Approximate statistics based on the first 1000 samples:
  |         | query                                                                              | answer                                                                               |
  |:--------|:-----------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------|
  | type    | string                                                                              | string                                                                                 |
  | details | <ul><li>min: 10 tokens</li><li>mean: 11.69 tokens</li><li>max: 23 tokens</li></ul>  | <ul><li>min: 15 tokens</li><li>mean: 134.01 tokens</li><li>max: 512 tokens</li></ul>   |
* Samples:
  | query                                                   | answer                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
  |:--------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
  | <code>where is the tiber river located in italy</code>  | <code>Tiber The Tiber (/ˈtaɪbər/, Latin: Tiberis,[1] Italian: Tevere [ˈteːvere])[2] is the third-longest river in Italy, rising in the Apennine Mountains in Emilia-Romagna and flowing 406 kilometres (252 mi) through Tuscany, Umbria and Lazio, where it is joined by the river Aniene, to the Tyrrhenian Sea, between Ostia and Fiumicino.[3] It drains a basin estimated at 17,375 square kilometres (6,709 sq mi). The river has achieved lasting fame as the main watercourse of the city of Rome, founded on its eastern banks.</code> |
  | <code>what kind of car does jay gatsby drive</code>     | <code>Jay Gatsby At the Buchanan home, Jordan Baker, Nick, Jay, and the Buchanans decide to visit New York City. Tom borrows Gatsby's yellow Rolls Royce to drive up to the city. On the way to New York City, Tom makes a detour at a gas station in "the Valley of Ashes", a run-down part of Long Island. The owner, George Wilson, shares his concern that his wife, Myrtle, may be having an affair. This unnerves Tom, who has been having an affair with Myrtle, and he leaves in a hurry.</code>                                       |
  | <code>who sings if i can dream about you</code>         | <code>I Can Dream About You "I Can Dream About You" is a song performed by American singer Dan Hartman on the soundtrack album of the film Streets of Fire. Released in 1984 as a single from the soundtrack, and included on Hartman's album I Can Dream About You, it reached number 6 on the Billboard Hot 100.[1]</code>                                                                                                                                                                                                                   |
* Loss: [<code>SpladeLoss</code>](https://sbert.net/docs/package_reference/sparse_encoder/losses.html#spladeloss) with these parameters:
  ```json
  {
      "loss": "SparseMultipleNegativesRankingLoss(scale=1.0, similarity_fct='dot_score')",
      "lambda_corpus": 3e-05,
      "lambda_query": 5e-05
  }
  ```

### Training Hyperparameters
#### Non-Default Hyperparameters

- `eval_strategy`: steps
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `learning_rate`: 2e-05
- `num_train_epochs`: 1
- `warmup_ratio`: 0.1
- `fp16`: True
- `batch_sampler`: no_duplicates

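Taken together, these settings correspond roughly to the training setup sketched below. This is a hedged reconstruction from the hyperparameters and loss configuration above, not the author's actual script; the `SpladeLoss` keyword names mirror the JSON shown earlier and may differ in other sentence-transformers releases:

```python
from datasets import load_dataset
from sentence_transformers.sparse_encoder import (
    SparseEncoder,
    SparseEncoderTrainer,
    SparseEncoderTrainingArguments,
)
from sentence_transformers.sparse_encoder.losses import (
    SpladeLoss,
    SparseMultipleNegativesRankingLoss,
)

# Loading a plain MLM checkpoint builds the MLMTransformer + SpladePooling stack
model = SparseEncoder("distilbert/distilbert-base-uncased")
train_dataset = load_dataset("sentence-transformers/natural-questions", split="train")

loss = SpladeLoss(
    model,
    loss=SparseMultipleNegativesRankingLoss(model),
    lambda_query=5e-05,   # values from the loss configuration above
    lambda_corpus=3e-05,
)
args = SparseEncoderTrainingArguments(
    output_dir="splade-distilbert-base-uncased-nq",
    num_train_epochs=1,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    learning_rate=2e-05,
    warmup_ratio=0.1,
    fp16=True,
    batch_sampler="no_duplicates",
)
trainer = SparseEncoderTrainer(model=model, args=args, train_dataset=train_dataset, loss=loss)
trainer.train()
```
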
#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 2e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 1
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: True
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional
- `router_mapping`: {}
- `learning_rate_mapping`: {}

</details>

### Training Logs
| Epoch  | Step | Training Loss | Validation Loss | NanoMSMARCO_dot_ndcg@10 | NanoNFCorpus_dot_ndcg@10 | NanoNQ_dot_ndcg@10 | NanoBEIR_mean_dot_ndcg@10 |
|:------:|:----:|:-------------:|:---------------:|:-----------------------:|:------------------------:|:------------------:|:-------------------------:|
| 0.0323 | 200  | 139.5463      | -               | -                       | -                        | -                  | -                         |
| 0.0646 | 400  | 0.3152        | -               | -                       | -                        | -                  | -                         |
| 0.0970 | 600  | 0.1291        | -               | -                       | -                        | -                  | -                         |
| 0.1293 | 800  | 0.0783        | -               | -                       | -                        | -                  | -                         |
| 0.1616 | 1000 | 0.0311        | 0.0839          | 0.4749                  | 0.2698                   | 0.5106             | 0.4184                    |
| 0.1939 | 1200 | 0.0427        | -               | -                       | -                        | -                  | -                         |
| 0.2262 | 1400 | 0.0368        | -               | -                       | -                        | -                  | -                         |
| 0.2586 | 1600 | 0.042         | -               | -                       | -                        | -                  | -                         |
| 0.2909 | 1800 | 0.0384        | -               | -                       | -                        | -                  | -                         |
| 0.3232 | 2000 | 0.0429        | 0.0632          | 0.4251                  | 0.2626                   | 0.5297             | 0.4058                    |
| 0.3555 | 2200 | 0.0304        | -               | -                       | -                        | -                  | -                         |
| 0.3878 | 2400 | 0.0357        | -               | -                       | -                        | -                  | -                         |
| 0.4202 | 2600 | 0.0294        | -               | -                       | -                        | -                  | -                         |
| 0.4525 | 2800 | 0.0289        | -               | -                       | -                        | -                  | -                         |
| 0.4848 | 3000 | 0.0287        | 0.0563          | 0.4496                  | 0.2417                   | 0.5590             | 0.4168                    |
| 0.5171 | 3200 | 0.0269        | -               | -                       | -                        | -                  | -                         |
| 0.5495 | 3400 | 0.0395        | -               | -                       | -                        | -                  | -                         |
| 0.5818 | 3600 | 0.0191        | -               | -                       | -                        | -                  | -                         |
| 0.6141 | 3800 | 0.0328        | -               | -                       | -                        | -                  | -                         |
| 0.6464 | 4000 | 0.0295        | 0.0502          | 0.4882                  | 0.2537                   | 0.5795             | 0.4405                    |
| 0.6787 | 4200 | 0.0155        | -               | -                       | -                        | -                  | -                         |
| 0.7111 | 4400 | 0.0274        | -               | -                       | -                        | -                  | -                         |
| 0.7434 | 4600 | 0.0324        | -               | -                       | -                        | -                  | -                         |
| 0.7757 | 4800 | 0.0197        | -               | -                       | -                        | -                  | -                         |
| 0.8080 | 5000 | 0.0178        | 0.0417          | 0.4871                  | 0.2599                   | 0.5651             | 0.4374                    |
| 0.8403 | 5200 | 0.0296        | -               | -                       | -                        | -                  | -                         |
| 0.8727 | 5400 | 0.0194        | -               | -                       | -                        | -                  | -                         |
| 0.9050 | 5600 | 0.0235        | -               | -                       | -                        | -                  | -                         |
| 0.9373 | 5800 | 0.0191        | -               | -                       | -                        | -                  | -                         |
| 0.9696 | 6000 | 0.0173        | 0.0390          | 0.4837                  | 0.2866                   | 0.5574             | 0.4425                    |
| -1     | -1   | -             | -               | 0.4842                  | 0.2854                   | 0.5621             | 0.4439                    |

### Environmental Impact
Carbon emissions were measured using [CodeCarbon](https://github.com/mlco2/codecarbon).
- **Energy Consumed**: 0.094 kWh
- **Carbon Emitted**: 0.036 kg of CO2
- **Hours Used**: 0.252 hours

### Training Hardware
- **On Cloud**: No
- **GPU Model**: 1 x NVIDIA GeForce RTX 3090
- **CPU Model**: 13th Gen Intel(R) Core(TM) i7-13700K
- **RAM Size**: 31.78 GB

### Framework Versions
- Python: 3.11.6
- Sentence Transformers: 4.2.0.dev0
- Transformers: 4.52.4
- PyTorch: 2.6.0+cu124
- Accelerate: 1.5.1
- Datasets: 2.21.0
- Tokenizers: 0.21.1

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### SpladeLoss
```bibtex
@misc{formal2022distillationhardnegativesampling,
    title={From Distillation to Hard Negative Sampling: Making Sparse Neural IR Models More Effective},
    author={Thibault Formal and Carlos Lassance and Benjamin Piwowarski and Stéphane Clinchant},
    year={2022},
    eprint={2205.04733},
    archivePrefix={arXiv},
    primaryClass={cs.IR},
    url={https://arxiv.org/abs/2205.04733},
}
```

#### SparseMultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```

#### FlopsLoss
```bibtex
@article{paria2020minimizing,
    title={Minimizing flops to learn efficient sparse representations},
    author={Paria, Biswajit and Yeh, Chih-Kuan and Yen, Ian EH and Xu, Ning and Ravikumar, Pradeep and P{\'o}czos, Barnab{\'a}s},
    journal={arXiv preprint arXiv:2004.05665},
    year={2020}
}
```

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->
config.json ADDED
@@ -0,0 +1,23 @@
{
  "activation": "gelu",
  "architectures": [
    "DistilBertForMaskedLM"
  ],
  "attention_dropout": 0.1,
  "dim": 768,
  "dropout": 0.1,
  "hidden_dim": 3072,
  "initializer_range": 0.02,
  "max_position_embeddings": 512,
  "model_type": "distilbert",
  "n_heads": 12,
  "n_layers": 6,
  "pad_token_id": 0,
  "qa_dropout": 0.1,
  "seq_classif_dropout": 0.2,
  "sinusoidal_pos_embds": false,
  "tie_weights_": true,
  "torch_dtype": "float32",
  "transformers_version": "4.52.4",
  "vocab_size": 30522
}
config_sentence_transformers.json ADDED
@@ -0,0 +1,14 @@
{
  "model_type": "SparseEncoder",
  "__version__": {
    "sentence_transformers": "4.2.0.dev0",
    "transformers": "4.52.4",
    "pytorch": "2.6.0+cu124"
  },
  "prompts": {
    "query": "",
    "document": ""
  },
  "default_prompt_name": null,
  "similarity_fn_name": "dot"
}
model.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:231e0d3e18c2b8a3325b96beebe83c36ad1c761c54dbcb98868e0ca9c4e56677
size 267954768
modules.json ADDED
@@ -0,0 +1,14 @@
[
  {
    "idx": 0,
    "name": "0",
    "path": "",
    "type": "sentence_transformers.sparse_encoder.models.MLMTransformer"
  },
  {
    "idx": 1,
    "name": "1",
    "path": "1_SpladePooling",
    "type": "sentence_transformers.sparse_encoder.models.SpladePooling"
  }
]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
{
  "max_seq_length": 512,
  "do_lower_case": false
}
special_tokens_map.json ADDED
@@ -0,0 +1,7 @@
{
  "cls_token": "[CLS]",
  "mask_token": "[MASK]",
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "unk_token": "[UNK]"
}
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,56 @@
{
  "added_tokens_decoder": {
    "0": {
      "content": "[PAD]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "100": {
      "content": "[UNK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "101": {
      "content": "[CLS]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "102": {
      "content": "[SEP]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "103": {
      "content": "[MASK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "clean_up_tokenization_spaces": false,
  "cls_token": "[CLS]",
  "do_lower_case": true,
  "extra_special_tokens": {},
  "mask_token": "[MASK]",
  "model_max_length": 512,
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "strip_accents": null,
  "tokenize_chinese_chars": true,
  "tokenizer_class": "DistilBertTokenizer",
  "unk_token": "[UNK]"
}
vocab.txt ADDED
The diff for this file is too large to render. See raw diff