rasyosef committed (verified)
Commit 220a2b3 · 1 Parent(s): e05f7dd

Update README.md

Files changed (1): README.md (+72 −39)

README.md CHANGED
@@ -9,21 +9,25 @@ tags:
   - loss:SpladeLoss
   - loss:SparseMarginMSELoss
   - loss:FlopsLoss
- base_model: yosefw/SPLADE-BERT-Small-BS256
  widget:
  - text: leagues, define
- - text: WATCH HOW YOU WANT. STARZ lets you stream hit original series and movies on
-   your favorite devices. Plus you can get the STARZ app on your smartphone or tablet
-   and download full movies and shows to watch off-line, anytime, anywhere. START
-   YOUR FREE TRIAL NOW.
- - text: Furthermore, priority must be given to national jurisdiction. Pointing out
-   that States applied universal jurisdiction differently, he expressed concern at
-   the abuse of its application by some national courts, which rendered it a source
-   of international conflict.
- - text: My sil tells me that my mil cooked the eggplant at high heat for a very long
-   time until it was almost burned. Is it possible that cooking it in such a way
-   gets rid of the bitterness? My mil bought her eggplants at the chain grocery store-
-   so this is not a freshness issue. Thanks for any ideas.
  - text: how many tablespoons of garlic powder are in an ounce
  pipeline_tag: feature-extraction
  library_name: sentence-transformers
@@ -114,38 +118,35 @@ model-index:
  - type: corpus_sparsity_ratio
    value: 0.9943693333766356
    name: Corpus Sparsity Ratio
  ---
- # SPLADE Sparse Encoder

- This is a [SPLADE Sparse Encoder](https://www.sbert.net/docs/sparse_encoder/usage/usage.html) model finetuned from [yosefw/SPLADE-BERT-Small-BS256](https://huggingface.co/yosefw/SPLADE-BERT-Small-BS256) using the [sentence-transformers](https://www.SBERT.net) library. It maps sentences & paragraphs to a 30522-dimensional sparse vector space and can be used for semantic search and sparse retrieval.
- ## Model Details

- ### Model Description
- - **Model Type:** SPLADE Sparse Encoder
- - **Base model:** [yosefw/SPLADE-BERT-Small-BS256](https://huggingface.co/yosefw/SPLADE-BERT-Small-BS256) <!-- at revision 43b8c4a930896cdbab236b2a46fe1b762216df1a -->
- - **Maximum Sequence Length:** 512 tokens
- - **Output Dimensionality:** 30522 dimensions
- - **Similarity Function:** Dot Product
- <!-- - **Training Dataset:** Unknown -->
- <!-- - **Language:** Unknown -->
- <!-- - **License:** Unknown -->

- ### Model Sources

- - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- - **Documentation:** [Sparse Encoder Documentation](https://www.sbert.net/docs/sparse_encoder/usage/usage.html)
- - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- - **Hugging Face:** [Sparse Encoders on Hugging Face](https://huggingface.co/models?library=sentence-transformers&other=sparse-encoder)

- ### Full Model Architecture

- ```
- SparseEncoder(
-   (0): MLMTransformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'BertForMaskedLM'})
-   (1): SpladePooling({'pooling_strategy': 'max', 'activation_function': 'relu', 'word_embedding_dimension': 30522})
- )
- ```
  ## Usage
@@ -162,7 +163,7 @@ Then you can load this model and run inference.
  from sentence_transformers import SparseEncoder

  # Download from the 🤗 Hub
- model = SparseEncoder("yosefw/SPLADE-BERT-Small-BS256-distil")
  # Run inference
  queries = [
      "how many tablespoons of garlic powder are in an ounce",
@@ -183,6 +184,34 @@ print(similarities)
  # tensor([[26.3104, 20.4381, 15.5539]])
  ```
  <!--
  ### Direct Usage (Transformers)
@@ -207,6 +236,9 @@ You can finetune this model on your own dataset.
  *List how the model may foreseeably be misused and address what users ought not to do with the model.*
  -->
  ## Evaluation

  ### Metrics
@@ -499,4 +531,5 @@ You can finetune this model on your own dataset.
  ## Model Card Contact

  *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
- -->
  - loss:SpladeLoss
  - loss:SparseMarginMSELoss
  - loss:FlopsLoss
+ base_model:
+ - prajjwal1/bert-small
  widget:
  - text: leagues, define
+ - text: >-
+   WATCH HOW YOU WANT. STARZ lets you stream hit original series and movies on
+   your favorite devices. Plus you can get the STARZ app on your smartphone or
+   tablet and download full movies and shows to watch off-line, anytime,
+   anywhere. START YOUR FREE TRIAL NOW.
+ - text: >-
+   Furthermore, priority must be given to national jurisdiction. Pointing out
+   that States applied universal jurisdiction differently, he expressed concern
+   at the abuse of its application by some national courts, which rendered it a
+   source of international conflict.
+ - text: >-
+   My sil tells me that my mil cooked the eggplant at high heat for a very long
+   time until it was almost burned. Is it possible that cooking it in such a
+   way gets rid of the bitterness? My mil bought her eggplants at the chain
+   grocery store- so this is not a freshness issue. Thanks for any ideas.
  - text: how many tablespoons of garlic powder are in an ounce
  pipeline_tag: feature-extraction
  library_name: sentence-transformers
 
  - type: corpus_sparsity_ratio
    value: 0.9943693333766356
    name: Corpus Sparsity Ratio
+ license: mit
+ datasets:
+ - microsoft/ms_marco
+ language:
+ - en
  ---
+ # SPLADE-BERT-Small-Distil
+
+ This is a SPLADE sparse retrieval model based on BERT-Small (29M params) that was trained by distilling a cross-encoder on the MSMARCO dataset. The cross-encoder used was [ms-marco-MiniLM-L6-v2](https://huggingface.co/cross-encoder/ms-marco-MiniLM-L6-v2); a sketch of the distillation objective follows the links below.
+
+ This SPLADE model is `2x` smaller than Naver's official `splade-v3-distilbert` while retaining `91%` of its performance on the MSMARCO benchmark, and it is small enough to run without a GPU on a corpus of a few thousand documents.
+
+ - `Collection:` https://huggingface.co/collections/rasyosef/splade-tiny-msmarco-687c548c0691d95babf65b70
+ - `Distillation Dataset:` https://huggingface.co/datasets/yosefw/msmarco-train-distil-v2
+ - `Code:` https://github.com/rasyosef/splade-tiny-msmarco
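The `loss:SparseMarginMSELoss` tag in the metadata is the MarginMSE distillation objective: the student is trained so that its score margin between a positive and a negative passage matches the teacher cross-encoder's margin, while the `loss:FlopsLoss` tag is the FLOPS regularizer that keeps the vectors sparse. A minimal PyTorch sketch of these objectives, with made-up scores; the actual training used the sentence-transformers `SpladeLoss`/`SparseMarginMSELoss`/`FlopsLoss` classes:

```python
import torch
import torch.nn.functional as F

def margin_mse(s_pos, s_neg, t_pos, t_neg):
    # Match the student's (positive - negative) score margin
    # to the cross-encoder teacher's margin.
    return F.mse_loss(s_pos - s_neg, t_pos - t_neg)

def flops_reg(reps):
    # FLOPS regularizer: squared mean activation per vocabulary dimension,
    # summed; drives the expected number of non-zero terms down.
    return (reps.mean(dim=0) ** 2).sum()

# Made-up scores for two (query, positive, negative) triples:
s_pos = torch.tensor([25.1, 19.4])  # student dot products, query vs. positive
s_neg = torch.tensor([14.2, 16.0])  # student dot products, query vs. negative
t_pos = torch.tensor([9.8, 7.5])    # teacher logits, e.g. ms-marco-MiniLM-L6-v2
t_neg = torch.tensor([1.1, 4.0])
print(margin_mse(s_pos, s_neg, t_pos, t_neg))  # distillation term of the loss
```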
+
+ ## Performance
+
+ The SPLADE models were evaluated on 55 thousand queries and 8.84 million documents from the [MSMARCO](https://huggingface.co/datasets/microsoft/ms_marco) dataset.
+
+ |Model|Size (# Params)|MRR@10 (MS MARCO dev)|
+ |:---|:---|:---|
+ |`BM25`|-|18.0|
+ |`rasyosef/splade-tiny`|4.4M|30.9|
+ |`rasyosef/splade-mini`|11.2M|34.1|
+ |`rasyosef/splade-small`|28.8M|35.4|
+ |`naver/splade-v3-distilbert`|67.0M|38.7|
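MRR@10, the metric reported above, is the mean over all queries of the reciprocal rank of the first relevant passage within the top 10 retrieved results. A minimal sketch, with illustrative doc ids:

```python
def mrr_at_10(ranked_doc_ids, relevant_ids):
    # Reciprocal rank of the first relevant document in the top 10, else 0.
    for rank, doc_id in enumerate(ranked_doc_ids[:10], start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0

# Toy example: the relevant passage "d3" shows up at rank 2 -> RR = 0.5.
print(mrr_at_10(["d7", "d3", "d1"], {"d3"}))  # 0.5
# The reported MRR@10 is this value averaged over the ~55k dev queries.
```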
  ## Usage
  from sentence_transformers import SparseEncoder

  # Download from the 🤗 Hub
+ model = SparseEncoder("rasyosef/splade-small")
  # Run inference
  queries = [
      "how many tablespoons of garlic powder are in an ounce",

  # tensor([[26.3104, 20.4381, 15.5539]])
  ```
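Since the embeddings live in the 30522-dimensional vocabulary space, their active dimensions can be mapped back to tokens. A short sketch continuing from the snippet above, using the `decode` helper from recent sentence-transformers releases (output values are illustrative; verify against your installed version):

```python
# Continues from the snippet above: `model` and `queries` are already defined.
query_embeddings = model.encode_query(queries)

# Map the non-zero dimensions of the first query vector back to vocabulary
# tokens and weights, strongest first.
decoded = model.decode(query_embeddings[0], top_k=10)
print(decoded)  # e.g. [('garlic', ...), ('ounce', ...), ...] -- illustrative
```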
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** SPLADE Sparse Encoder
+ - **Base model:** [prajjwal1/bert-small](https://huggingface.co/prajjwal1/bert-small)
+ - **Maximum Sequence Length:** 512 tokens
+ - **Output Dimensionality:** 30522 dimensions
+ - **Similarity Function:** Dot Product
+ <!-- - **Training Dataset:** Unknown -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Documentation:** [Sparse Encoder Documentation](https://www.sbert.net/docs/sparse_encoder/usage/usage.html)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sparse Encoders on Hugging Face](https://huggingface.co/models?library=sentence-transformers&other=sparse-encoder)
+
+ ### Full Model Architecture
+
+ ```
+ SparseEncoder(
+   (0): MLMTransformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'BertForMaskedLM'})
+   (1): SpladePooling({'pooling_strategy': 'max', 'activation_function': 'relu', 'word_embedding_dimension': 30522})
+ )
+ ```
+
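The two modules above amount to: run the BERT masked-language-model head over the input, apply `log(1 + relu(logits))`, and max-pool over the sequence positions with padding masked out. A minimal sketch of that computation in plain `transformers`, offered as an approximation of what the sentence-transformers modules do (the checkpoint id is the one from the usage snippet):

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_id = "rasyosef/splade-small"  # same checkpoint as the usage snippet
tokenizer = AutoTokenizer.from_pretrained(model_id)
mlm = AutoModelForMaskedLM.from_pretrained(model_id)

batch = tokenizer(
    ["how many tablespoons of garlic powder are in an ounce"],
    padding=True, truncation=True, return_tensors="pt",
)
with torch.no_grad():
    logits = mlm(**batch).logits          # (batch, seq_len, 30522)

# SpladePooling with relu activation + max pooling:
# weight_j = max_i log(1 + relu(logit_{i,j})), padding masked out.
scores = torch.log1p(torch.relu(logits))
scores = scores * batch["attention_mask"].unsqueeze(-1)
sparse_vec = scores.max(dim=1).values     # (batch, 30522) sparse vector
print((sparse_vec > 0).sum().item(), "active dimensions")
```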
  <!--
  ### Direct Usage (Transformers)

  *List how the model may foreseeably be misused and address what users ought not to do with the model.*
  -->
+ ## More
+ <details><summary>Click to expand</summary>
+
  ## Evaluation

  ### Metrics
  ## Model Card Contact

  *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
+ </details>