tomaarsen (HF Staff) committed
Commit 95d2043 · verified · 1 Parent(s): 3acabcd

Add new SparseEncoder model

1_SpladePooling/config.json ADDED
@@ -0,0 +1,5 @@
+ {
+   "pooling_strategy": "max",
+   "activation_function": "relu",
+   "word_embedding_dimension": 30522
+ }
README.md ADDED
@@ -0,0 +1,597 @@
+ ---
+ language:
+ - en
+ license: apache-2.0
+ tags:
+ - sentence-transformers
+ - sparse-encoder
+ - sparse
+ - splade
+ - generated_from_trainer
+ - dataset_size:10000
+ - loss:SpladeLoss
+ - loss:SparseMarginMSELoss
+ - loss:FlopsLoss
+ widget:
+ - text: There are 25.15 miles from Miami to Fort Lauderdale in north direction and
+     29.11 miles (46.85 kilometers) by car, following the I-95 route. Miami and Fort
+     Lauderdale are 31 minutes far apart, if you drive non-stop. This is the fastest
+     route from Miami, FL to Fort Lauderdale, FL. The halfway point is Aventura, FL.
+ - text: Free Universal VIN decoder to check vehicle data and history. This is a universal
+     VIN decoder. Every car has a unique identifier code called a VIN. This number
+     contains vital information about the car, such as its manufacturer, year of production,
+     the plant it was produced in, type of engine, model and more.
+ - text: Various vascular tissues in the root allow for transportation of water and
+     nutrients to the rest of theplant.Plant cells have a cell wall to provide support,
+     a large vacuole for storage of minerals, food, andchloroplasts where photosynthesis
+     takes place.
+ - text: 'The name Julia is an American baby name. In American the meaning of the name
+     Julia is: Youthful. Swedish Meaning: The name Julia is a Swedish baby name. In
+     Swedish the meaning of the name Julia is: Youth.Greek Meaning: The name Julia
+     is a Greek baby name. In Greek the meaning of the name Julia is: Downy. Hairy.
+     Derived from the clan name of Roman dictator Gaius Julius Caesar.Latin Meaning:
+     The name Julia is a Latin baby name.In Latin the meaning of the name Julia is:
+     Young. The feminine form of Julius. A character in Shakespeare''s play ''Two Gentlemen
+     of Verona''. Shakespearean Meaning: The name Julia is a Shakespearean baby name.he
+     name Julia is a Latin baby name. In Latin the meaning of the name Julia is: Young.
+     The feminine form of Julius. A character in Shakespeare''s play ''Two Gentlemen
+     of Verona''.'
+ - text: Usually, an LFT blood test measures the amount of bilirubin in the blood.
+     Bilirubin is released when red blood cells breakdown, and it is the liver that
+     detoxifies the bilirubin and helps to eliminate it from the body. Bilirubin is
+     a part of the digestive juice, bile, which the liver produces.
+ datasets:
+ - tomaarsen/msmarco-Qwen3-Reranker-0.6B
+ pipeline_tag: feature-extraction
+ library_name: sentence-transformers
+ metrics:
+ - dot_accuracy@1
+ - dot_accuracy@3
+ - dot_accuracy@5
+ - dot_accuracy@10
+ - dot_precision@1
+ - dot_precision@3
+ - dot_precision@5
+ - dot_precision@10
+ - dot_recall@1
+ - dot_recall@3
+ - dot_recall@5
+ - dot_recall@10
+ - dot_ndcg@10
+ - dot_mrr@10
+ - dot_map@100
+ - query_active_dims
+ - query_sparsity_ratio
+ - corpus_active_dims
+ - corpus_sparsity_ratio
+ co2_eq_emissions:
+   emissions: 32.851387284675674
+   energy_consumed: 0.08451561166311383
+   source: codecarbon
+   training_type: fine-tuning
+   on_cloud: false
+   cpu_model: 13th Gen Intel(R) Core(TM) i7-13700K
+   ram_total_size: 31.777088165283203
+   hours_used: 0.274
+   hardware_used: 1 x NVIDIA GeForce RTX 3090
+ model-index:
+ - name: splade-co-condenser-marco-greedy trained on MS MARCO hard negatives with distillation
+   results:
+   - task:
+       type: sparse-information-retrieval
+       name: Sparse Information Retrieval
+     dataset:
+       name: msmarco eval 1kq 1kd
+       type: msmarco-eval-1kq-1kd
+     metrics:
+     - type: dot_accuracy@1
+       value: 0.951
+       name: Dot Accuracy@1
+     - type: dot_accuracy@3
+       value: 0.985
+       name: Dot Accuracy@3
+     - type: dot_accuracy@5
+       value: 0.989
+       name: Dot Accuracy@5
+     - type: dot_accuracy@10
+       value: 0.992
+       name: Dot Accuracy@10
+     - type: dot_precision@1
+       value: 0.951
+       name: Dot Precision@1
+     - type: dot_precision@3
+       value: 0.32833333333333325
+       name: Dot Precision@3
+     - type: dot_precision@5
+       value: 0.19780000000000003
+       name: Dot Precision@5
+     - type: dot_precision@10
+       value: 0.09920000000000001
+       name: Dot Precision@10
+     - type: dot_recall@1
+       value: 0.951
+       name: Dot Recall@1
+     - type: dot_recall@3
+       value: 0.985
+       name: Dot Recall@3
+     - type: dot_recall@5
+       value: 0.989
+       name: Dot Recall@5
+     - type: dot_recall@10
+       value: 0.992
+       name: Dot Recall@10
+     - type: dot_ndcg@10
+       value: 0.9744135371427797
+       name: Dot Ndcg@10
+     - type: dot_mrr@10
+       value: 0.9684499999999999
+       name: Dot Mrr@10
+     - type: dot_map@100
+       value: 0.9686698065770335
+       name: Dot Map@100
+     - type: query_active_dims
+       value: 21.079999923706055
+       name: Query Active Dims
+     - type: query_sparsity_ratio
+       value: 0.9993093506348304
+       name: Query Sparsity Ratio
+     - type: corpus_active_dims
+       value: 108.0469970703125
+       name: Corpus Active Dims
+     - type: corpus_sparsity_ratio
+       value: 0.9964600289276485
+       name: Corpus Sparsity Ratio
+ ---
+
+ # splade-co-condenser-marco-greedy trained on MS MARCO hard negatives with distillation
+
+ This is a [SPLADE Sparse Encoder](https://www.sbert.net/docs/sparse_encoder/usage/usage.html) model trained on the [msmarco-Qwen3-Reranker-0.6B](https://huggingface.co/datasets/tomaarsen/msmarco-Qwen3-Reranker-0.6B) dataset using the [sentence-transformers](https://www.SBERT.net) library. It maps sentences & paragraphs to a 30522-dimensional sparse vector space and can be used for semantic search and sparse retrieval.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** SPLADE Sparse Encoder
+ <!-- - **Base model:** [Unknown](https://huggingface.co/unknown) -->
+ - **Maximum Sequence Length:** 256 tokens
+ - **Output Dimensionality:** 30522 dimensions
+ - **Similarity Function:** Dot Product
+ - **Training Dataset:**
+     - [msmarco-Qwen3-Reranker-0.6B](https://huggingface.co/datasets/tomaarsen/msmarco-Qwen3-Reranker-0.6B)
+ - **Language:** en
+ - **License:** apache-2.0
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Documentation:** [Sparse Encoder Documentation](https://www.sbert.net/docs/sparse_encoder/usage/usage.html)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sparse Encoders on Hugging Face](https://huggingface.co/models?library=sentence-transformers&other=sparse-encoder)
+
+ ### Full Model Architecture
+
+ ```
+ SparseEncoder(
+   (0): MLMTransformer({'max_seq_length': 256, 'do_lower_case': False, 'architecture': 'BertForMaskedLM'})
+   (1): SpladePooling({'pooling_strategy': 'max', 'activation_function': 'relu', 'word_embedding_dimension': 30522})
+ )
+ ```
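+
+ Conceptually, the `SpladePooling` module turns the per-token MLM logits into a single sparse vector over the 30522-token vocabulary. The following is a minimal sketch of the SPLADE formulation (log-saturated ReLU followed by max pooling, matching `pooling_strategy: max` and `activation_function: relu` above), not the library's exact implementation:
+
+ ```python
+ import torch
+
+ def splade_pool(mlm_logits: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
+     # mlm_logits: (batch, seq_len, vocab_size=30522); attention_mask: (batch, seq_len)
+     # Log-saturated ReLU from the SPLADE papers: log(1 + relu(logit))
+     weights = torch.log1p(torch.relu(mlm_logits))
+     # Zero out padding positions, then max-pool over the sequence dimension
+     weights = weights * attention_mask.unsqueeze(-1)
+     return weights.max(dim=1).values  # (batch, vocab_size), mostly zeros
+ ```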
+
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference:
+
+ ```python
+ from sentence_transformers import SparseEncoder
+
+ # Download from the 🤗 Hub
+ model = SparseEncoder("tomaarsen/splade-co-condenser-marco-greedy-msmarco-hard-negatives-v5")
+ # Run inference
+ queries = [
+     "what does ly mean in a blood test",
+ ]
+ documents = [
+     'According to the Hormone-Refractory Prostate Cancer Association, LY on a blood test stands for lymphocytes. The number in the results represents the percentage of lymphocytes in the white blood count. Lymphocytes should count for 15 to 46.8 percent of white blood cells. Continue Reading.',
+     "FROM OUR COMMUNITY. Hi Terry, The LY (Lymphocytes) in your blood test is; the type of white blood cell found in the blood and lymph systems; part of the immune system. BUN/CREAT - Bun and Creatinine are tests done to monitor kidney function. I'm sorry, but I've never heard of the other 2.",
+     'FROM OUR EXPERTS. Trace lysed blood refers to a finding that is usually reported from a urinary dip stick analysis. It implies that there is a small quantity of red cells in the urine that have broken open. The developer on the dip stick reacts with the hemoglobin that is released when the red cells are lysed.',
+ ]
+ query_embeddings = model.encode_query(queries)
+ document_embeddings = model.encode_document(documents)
+ print(query_embeddings.shape, document_embeddings.shape)
+ # [1, 30522] [3, 30522]
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(query_embeddings, document_embeddings)
+ print(similarities)
+ # tensor([[10.1186,  9.8597,  9.8374]])
+ ```
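+
+ Because each output dimension corresponds to a vocabulary token, the embeddings can be inspected directly. A small follow-up sketch, assuming the `decode` helper of `SparseEncoder` in sentence-transformers v5 (the printed tokens and weights are illustrative, not actual model output):
+
+ ```python
+ # Show the highest-weighted vocabulary tokens in the query embedding
+ decoded = model.decode(query_embeddings[0], top_k=10)
+ for token, weight in decoded:
+     print(f"{token}: {weight:.2f}")
+ # Expect the query terms ("ly", "blood", "test") plus expansion terms
+ ```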
+
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ ## Evaluation
+
+ ### Metrics
+
+ #### Sparse Information Retrieval
+
+ * Dataset: `msmarco-eval-1kq-1kd`
+ * Evaluated with [<code>SparseInformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sparse_encoder/evaluation.html#sentence_transformers.sparse_encoder.evaluation.SparseInformationRetrievalEvaluator)
+
+ | Metric                | Value      |
+ |:----------------------|:-----------|
+ | dot_accuracy@1        | 0.951      |
+ | dot_accuracy@3        | 0.985      |
+ | dot_accuracy@5        | 0.989      |
+ | dot_accuracy@10       | 0.992      |
+ | dot_precision@1       | 0.951      |
+ | dot_precision@3       | 0.3283     |
+ | dot_precision@5       | 0.1978     |
+ | dot_precision@10      | 0.0992     |
+ | dot_recall@1          | 0.951      |
+ | dot_recall@3          | 0.985      |
+ | dot_recall@5          | 0.989      |
+ | dot_recall@10         | 0.992      |
+ | **dot_ndcg@10**       | **0.9744** |
+ | dot_mrr@10            | 0.9684     |
+ | dot_map@100           | 0.9687     |
+ | query_active_dims     | 21.08      |
+ | query_sparsity_ratio  | 0.9993     |
+ | corpus_active_dims    | 108.047    |
+ | corpus_sparsity_ratio | 0.9965     |
+
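+ Note that `query_sparsity_ratio` is simply `1 - query_active_dims / 30522` (e.g. 1 - 21.08 / 30522 ≈ 0.9993), so queries activate only ~21 of the 30522 dimensions on average, and documents ~108. To run this style of evaluation on your own data, a minimal sketch follows (the three toy entries are invented placeholders; the evaluator is assumed to accept the same `queries`/`corpus`/`relevant_docs` mappings as the dense `InformationRetrievalEvaluator`):
+
+ ```python
+ from sentence_transformers.sparse_encoder.evaluation import SparseInformationRetrievalEvaluator
+
+ queries = {"q1": "what does ly mean in a blood test"}
+ corpus = {
+     "d1": "LY on a blood test stands for lymphocytes.",
+     "d2": "Oolong tea is partially fermented and contains caffeine.",
+ }
+ relevant_docs = {"q1": {"d1"}}
+
+ evaluator = SparseInformationRetrievalEvaluator(
+     queries=queries, corpus=corpus, relevant_docs=relevant_docs, name="toy-eval"
+ )
+ results = evaluator(model)  # `model` as loaded in the Usage section
+ print(results)  # accuracy@k, precision@k, ndcg@10, sparsity statistics, ...
+ ```
+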
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Dataset
+
+ #### msmarco-Qwen3-Reranker-0.6B
+
+ * Dataset: [msmarco-Qwen3-Reranker-0.6B](https://huggingface.co/datasets/tomaarsen/msmarco-Qwen3-Reranker-0.6B) at [20c25c8](https://huggingface.co/datasets/tomaarsen/msmarco-Qwen3-Reranker-0.6B/tree/20c25c858f80ba96bdb58f1558746e077001303a)
+ * Size: 10,000 training samples
+ * Columns: <code>query</code>, <code>positive</code>, <code>negative_1</code>, <code>negative_2</code>, <code>negative_3</code>, <code>negative_4</code>, <code>negative_5</code>, <code>negative_6</code>, <code>negative_7</code>, <code>negative_8</code>, and <code>score</code>
+ * Approximate statistics based on the first 1000 samples:
+ | | query | positive | negative_1 | negative_2 | negative_3 | negative_4 | negative_5 | negative_6 | negative_7 | negative_8 | score |
+ |:--------|:---------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:-----------------------------------|
+ | type | string | string | string | string | string | string | string | string | string | string | list |
+ | details | <ul><li>min: 4 tokens</li><li>mean: 9.18 tokens</li><li>max: 45 tokens</li></ul> | <ul><li>min: 17 tokens</li><li>mean: 80.31 tokens</li><li>max: 256 tokens</li></ul> | <ul><li>min: 14 tokens</li><li>mean: 68.3 tokens</li><li>max: 197 tokens</li></ul> | <ul><li>min: 12 tokens</li><li>mean: 70.27 tokens</li><li>max: 209 tokens</li></ul> | <ul><li>min: 14 tokens</li><li>mean: 70.16 tokens</li><li>max: 241 tokens</li></ul> | <ul><li>min: 16 tokens</li><li>mean: 71.17 tokens</li><li>max: 211 tokens</li></ul> | <ul><li>min: 14 tokens</li><li>mean: 71.8 tokens</li><li>max: 190 tokens</li></ul> | <ul><li>min: 11 tokens</li><li>mean: 72.04 tokens</li><li>max: 194 tokens</li></ul> | <ul><li>min: 14 tokens</li><li>mean: 73.0 tokens</li><li>max: 203 tokens</li></ul> | <ul><li>min: 15 tokens</li><li>mean: 71.02 tokens</li><li>max: 206 tokens</li></ul> | <ul><li>size: 9 elements</li></ul> |
+ * Samples:
+ | query | positive | negative_1 | negative_2 | negative_3 | negative_4 | negative_5 | negative_6 | negative_7 | negative_8 | score |
+ |:------|:---------|:-----------|:-----------|:-----------|:-----------|:-----------|:-----------|:-----------|:-----------|:------|
+ | <code>what is clomiphene</code> | <code>Uses of This Medicine. Clomiphene is used as a fertility medicine in some women who are unable to become pregnant. Clomiphene probably works by changing the hormone balance of the body. In women, this causes ovulation to occur and prepares the body for pregnancy.ses of This Medicine. Clomiphene is used as a fertility medicine in some women who are unable to become pregnant. Clomiphene probably works by changing the hormone balance of the body. In women, this causes ovulation to occur and prepares the body for pregnancy.</code> | <code>Clomiphene citrate, a synthetic hormone commonly used to induce or regulate ovulation, is the most often prescribed fertility pill. Brand names for clomiphene citrate include Clomid and Serophene. Clomiphene works indirectly to stimulate ovulation.</code> | <code>Occasionally, clomiphene can stimulate the ovaries too much, causing multiple eggs to be released, which can result in multiple births, such as twins or triplets (see Clomid and Twins) . Clomiphene is one of the least expensive and easiest-to-use fertility drugs. However, it will not work for all types of infertility. Your healthcare provider needs to try to find your cause of infertility before you try clomiphene.</code> | <code>Clomiphene Citrate offers two benefits to the performance enhancing athlete with one being primary. Most commonly, this SERM is used for post cycle recovery purposes; specifically to stimulate natural testosterone production that has been suppressed due to the use of anabolic steroids.</code> | <code>PCOS and ovulation problems and Clomid treatment. Clomid (clomiphene citrate or Serophene) is an oral medication that is commonly used for the treatment of infertility. It is often given to try to induce ovulation in women that do not develop and release an egg (ovulate) on their own.</code> | <code>Indication: Clomid (clomiphene citrate) is often the first choice for treating infertility, because it's effective and been used for more than 40 years.</code> | <code>Clomid Description. 1 Clomid (clomiphene citrate tablets USP) is an orally administered, nonsteroidal, ovulatory stimulant designated chemically as 2-[p-(2-chloro-1,2-diphenylvinyl)phenoxy] triethylamine citrate (1:1). It has the molecular formula of C26H28ClNO • C6H8O7 and a molecular weight of 598.09.</code> | <code>PCOS and ovulation problems and Clomid treatment. Clomid (clomiphene citrate or Serophene) is an oral medication that is commonly used for the treatment of infertility. 1 It is often given to try to induce ovulation in women that do not develop and release an egg (ovulate) on their own. Clomid is started early in the menstrual cycle and is taken for five days either from cycle days 3 through 7, or from day 5 through 9. 2 Clomid is usually started at a dose of one tablet (50mg) daily-taken any time of day.</code> | <code>Clomid is taken as a pill. This is unlike the stronger fertility drugs, which require injection. Clomid is also very effective, stimulating ovulation 80 percent of the time. Clomid may also be marketed under the name Serophene, or you may see it sold under its generic name, clomiphene citrate. Note: Clomid can also be used as a treatment for male infertility. This article focuses on Clomid treatment in women.</code> | <code>[4.75390625, 6.9375, 3.92578125, 1.0400390625, 5.61328125, ...]</code> |
+ | <code>typical accountant cost for it contractor</code> | <code>In the current market, we’ve seen rates as low as £50 +VAT, and as high as £180 +VAT for dedicated contractor accountants. Interestingly, the average cost of contractor accounting has not risen in line with inflation over the past decade.</code> | <code>So, how much does a contractor cost, anywhere from 5% to 25% of the total project cost, with the average ranging 10-15%.ypically the contractor' s crew will be general carpentry trades people, some who may have more specialized skills. Exactly how a general contractor charges for a project depends on the type of contract you agree to. There are three common types of cost contracts, fixed price, time & materials and cost plus a fee.</code> | <code>1 Accountants charge $150-$400 or more an hour, depending on the type of work, the size of the firm and its location. 2 You'll pay lower rates for routine work done by a less-experienced associate or lesser-trained employee, such as $30-$50 for bookkeeping services. 3 An accountant's total fee depends on the project. For a simple start-up, expect a minimum of 0.5-1.5 hours of consultation ($75-$600) to go over your business structure and basic tax issues.</code> | <code>So, how much does a contractor cost, anywhere from 5% to 25% of the total project cost, with the average ranging 10-15%.xactly how a general contractor charges for a project depends on the type of contract you agree to. There are three common types of cost contracts, fixed price, time & materials and cost plus a fee. Each contract type has pros and cons for both the consumer and for the contractor.</code> | <code>1 Accountants charge $150-$400 or more an hour, depending on the type of work, the size of the firm and its location. 2 You'll pay lower rates for routine work done by a less-experienced associate or lesser-trained employee, such as $30-$50 for bookkeeping services. 3 An accountant's total fee depends on the project.</code> | <code>average data entry keystrokes per hour salaries the average salary for data entry keystrokes per hour jobs is $ 20000</code> | <code>Accounting services are typically $250 to $400 per month, or $350 to $500 per quarter. Sales tax and bank recs included. We do all the processing, filing and tax deposits. 5 employees, bi-weekly payroll, direct deposit, $135 per month.</code> | <code>The less that is outsourced, the cheaper it will be for you. A bookkeeper should be paid between $15 and $18 per hour. An accountant with a undergraduate degree (4-years) should be paid somewhere around $20/hour but that still depends on what you're having them do. An accountant with a graduate degree (masters) should be paid between $25 and $30 per hour.</code> | <code>Pay by Experience Level for Intelligence Analyst. Median of all compensation (including tips, bonus, and overtime) by years of experience. Intelligence Analysts with a lot of experience tend to enjoy higher earnings.</code> | <code>[7.44921875, 3.271484375, 5.859375, 3.234375, 5.421875, ...]</code> |
+ | <code>what is mch on a blood test</code> | <code>What High Levels Mean. MCH levels in blood tests are considered high if they are 35 or higher. A normal hemoglobin level is considered to be in the range between 26 and 33 picograms per red blood cell. High MCH levels can indicate macrocytic anemia, which can be caused by insufficient vitamin B12.acrocytic RBCs are large so tend to have a higher MCH, while microcytic red cells would have a lower value.”. MCH is one of three red blood cell indices (MCHC and MCV are the other two). The measurements are done by machine and can help with diagnosis of medical problems.</code> | <code>MCH stands for mean corpuscular hemoglobin. It estimates the average amount of hemoglobin in each red blood cell, measured in picograms (a trillionth of a gram). Automated cell counters calculate the MCH, which is reported as part of a complete blood count (CBC) test. MCH may be low in iron-deficiency anemia, and may be high in anemia due to vitamin B12 or folate deficiency. Other forms of anemia can also cause MCH to be abnormal. Doctors only use the MCH as supporting information, not to make a diagnosis.</code> | <code>A. MCH stands for mean corpuscular hemoglobin. It estimates the average amount of hemoglobin in each red blood cell, measured in picograms (a trillionth of a gram). Automated cell counters calculate the MCH, which is reported as part of a complete blood count (CBC) test. MCH may be low in iron-deficiency anemia, and may be high in anemia due to vitamin B12 or folate deficiency. Other forms of anemia can also cause MCH to be abnormal.</code> | <code>The test used to determine the quantity of hemoglobin in the blood is known as the MCH blood test. The full form of MCH is Mean Corpuscular Hemoglobin. This test is therefore used to determine the average amount of hemoglobin per red blood cell in the body. The results of the MCH blood test are therefore reported in picograms, a tiny measure of weight.</code> | <code>MCH blood test high indicates that there is a poor supply of oxygen to the blood where as MCH blood test low mean that hemoglobin is too little in the cells indicating a lack of iron. It is important that iron is maintained at a certain level as too much or too little iron can be dangerous to your body.</code> | <code>slide 1 of 7. What Is MCH? MCH is the initialism for Mean Corpuscular Hemoglobin. Taken from Latin, the term refers to the average amount of hemoglobin found in red blood cells. A CBC (complete blood count) blood test can be used to monitor MCH levels in the blood. Lab Tests Online explains that the MCH aspect of a CBC test “is a measurement of the average amount of oxygen-carrying hemoglobin inside a red blood cell. Macrocytic RBCs are large so tend to have a higher MCH, while microcytic red cells would have a lower value..</code> | <code>The test used to determine the quantity of hemoglobin in the blood is known as the MCH blood test. The full form of MCH is Mean Corpuscular Hemoglobin. This test is therefore used to determine the average amount of hemoglobin per red blood cell in the body. The results of the MCH blood test are therefore reported in picograms, a tiny measure of weight. The normal range of the MCH blood test is between 26 and 33 pg per cell.</code> | <code>A MCHC test is a test that is carried out to test a person for anemia. The MCHC in a MCHC test stands for Mean Corpuscular Hemoglobin Concentration. MCHC is the calculation of the average hemoglobin inside a red blood cell.<br>A MCHC test can be performed along with a MCV test (Mean Corpuscular Volume).Both levels are used to test people for anemia.The MCHC test is also known as the MCH blood test which tests the levels of hemoglobin in the blood. The MCHC test can be ordered as part of a complete blood count (CBC) test.CHC is measured in grams per deciliter. Normal readings for MCHC are 31 grams per deciliter to 35 grams per deciliter. A MCHC blood test may be ordered when a person is showing signs of fatigue or weakness, when there is an infection, is bleeding or bruising easily or when there is an inflammation.</code> | <code>The test looks at the average amount of hemoglobin per red cell. So MCHC = the amount of hemoglobin present in each red blood cell. A MCHC blood test could be ordered for someone who has signs of fatigue or weakness, when there is an infection, is bleeding or bruising easily or when there is noticeable inflammation.</code> | <code>[6.44921875, 7.05078125, 7.2109375, 8.40625, 6.53515625, ...]</code> |
+ * Loss: [<code>SpladeLoss</code>](https://sbert.net/docs/package_reference/sparse_encoder/losses.html#spladeloss) with these parameters:
+   ```json
+   {
+       "loss": "SparseMarginMSELoss",
+       "document_regularizer_weight": 0.08,
+       "query_regularizer_weight": 0.1
+   }
+   ```
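+
+ A sketch of how this configuration could be instantiated in code (assuming the sentence-transformers v5 loss API; `model` is the `SparseEncoder` being trained):
+
+ ```python
+ from sentence_transformers.sparse_encoder.losses import SpladeLoss, SparseMarginMSELoss
+
+ # MarginMSE distillation wrapped with FLOPS regularization on documents
+ # (weight 0.08) and queries (weight 0.1), matching the JSON above
+ loss = SpladeLoss(
+     model=model,
+     loss=SparseMarginMSELoss(model),
+     document_regularizer_weight=0.08,
+     query_regularizer_weight=0.1,
+ )
+ ```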
+
+ ### Evaluation Dataset
+
+ #### msmarco-Qwen3-Reranker-0.6B
+
+ * Dataset: [msmarco-Qwen3-Reranker-0.6B](https://huggingface.co/datasets/tomaarsen/msmarco-Qwen3-Reranker-0.6B) at [20c25c8](https://huggingface.co/datasets/tomaarsen/msmarco-Qwen3-Reranker-0.6B/tree/20c25c858f80ba96bdb58f1558746e077001303a)
+ * Size: 1,000 evaluation samples
+ * Columns: <code>query</code>, <code>positive</code>, <code>negative_1</code>, <code>negative_2</code>, <code>negative_3</code>, <code>negative_4</code>, <code>negative_5</code>, <code>negative_6</code>, <code>negative_7</code>, <code>negative_8</code>, and <code>score</code>
+ * Approximate statistics based on the first 1000 samples:
+ | | query | positive | negative_1 | negative_2 | negative_3 | negative_4 | negative_5 | negative_6 | negative_7 | negative_8 | score |
+ |:--------|:---------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:-----------------------------------|
+ | type | string | string | string | string | string | string | string | string | string | string | list |
+ | details | <ul><li>min: 4 tokens</li><li>mean: 9.05 tokens</li><li>max: 23 tokens</li></ul> | <ul><li>min: 20 tokens</li><li>mean: 81.61 tokens</li><li>max: 244 tokens</li></ul> | <ul><li>min: 11 tokens</li><li>mean: 69.2 tokens</li><li>max: 231 tokens</li></ul> | <ul><li>min: 11 tokens</li><li>mean: 68.76 tokens</li><li>max: 198 tokens</li></ul> | <ul><li>min: 15 tokens</li><li>mean: 70.99 tokens</li><li>max: 225 tokens</li></ul> | <ul><li>min: 14 tokens</li><li>mean: 70.7 tokens</li><li>max: 236 tokens</li></ul> | <ul><li>min: 14 tokens</li><li>mean: 72.51 tokens</li><li>max: 256 tokens</li></ul> | <ul><li>min: 16 tokens</li><li>mean: 68.95 tokens</li><li>max: 203 tokens</li></ul> | <ul><li>min: 13 tokens</li><li>mean: 71.68 tokens</li><li>max: 220 tokens</li></ul> | <ul><li>min: 11 tokens</li><li>mean: 70.18 tokens</li><li>max: 213 tokens</li></ul> | <ul><li>size: 9 elements</li></ul> |
+ * Samples:
+ | query | positive | negative_1 | negative_2 | negative_3 | negative_4 | negative_5 | negative_6 | negative_7 | negative_8 | score |
+ |:------|:---------|:-----------|:-----------|:-----------|:-----------|:-----------|:-----------|:-----------|:-----------|:------|
+ | <code>how many people employed by shell</code> | <code>Shell worldwide. Royal Dutch Shell was formed in 1907, although our history dates back to the early 19th century, to a small shop in London where the Samuel family sold sea shells. Today, Shell is one of the world’s major energy companies, employing an average of 93,000 people and operating in more than 70 countries. Our headquarters are in The Hague, the Netherlands, and our Chief Executive Officer is Ben van Beurden.</code> | <code>Show sources information. This statistic shows the number of employees at SeaWorld Entertainment, Inc. in the United States, by type. As of December 2016, SeaWorld employed 5,000 full-time employees and counted approximately 13,000 seasonal employees during their peak operating season.</code> | <code>Jobs, companies, people, and articles for LinkedIn’s Payroll Specialist - Addus Homecare, Inc. members. Insights about Payroll Specialist - Addus Homecare, Inc. members on LinkedIn. Median salary $31,300.</code> | <code>As of July 2014, there are 139 million people employed in the United States. This number is up by 209,000 employees from June and by 1.47 million from the beginning of 2014.</code> | <code>average data entry keystrokes per hour salaries the average salary for data entry keystrokes per hour jobs is $ 20000</code> | <code>Research and review Plano Synergy jobs. Learn more about a career with Plano Synergy including all recent jobs, hiring trends, salaries, work environment and more. Find Jobs Company Reviews Find Salaries Find Resumes Employers / Post Job Upload your resume Sign in</code> | <code>From millions of real job salary data. 13 Customer Support Specialist salary data. Average Customer Support Specialist salary is $59,032 Detailed Customer Support Specialist starting salary, median salary, pay scale, bonus data report Register & Know how much $ you can earn | Sign In</code> | <code>From millions of real job salary data. 1 Ceo Ally salary data. Average Ceo Ally salary is $55,000 Detailed Ceo Ally starting salary, median salary, pay scale, bonus data report</code> | <code>HelpSystems benefits and perks, including insurance benefits, retirement benefits, and vacation policy. Reported anonymously by HelpSystems employees. Glassdoor uses cookies to improve your site experience.</code> | <code>[6.265625, -1.3671875, -6.91796875, 1.111328125, -7.96875, ...]</code> |
+ | <code>what is a lcsw</code> | <code>LCSW is an acronym for licensed clinical social worker, and people with this title are skilled professionals who meet certain requirements and work in a variety of fields. The term social worker is not always synonymous with licensed clinical social worker.</code> | <code>LISW means the person is a Licensed Independent Social Worker. LCSW means the person is a Licensed Clinical Social Worker. Source(s): Introduction to Social Work 101 at University of Nevada, Las Vega (UNLV) Dorothy K. · 1 decade ago.</code> | <code>An LCSW is a licensed clinical social worker. A LMHC is the newest addition to the field of mental health. They are highly similar and can do most of the same things with few exceptions. One thing to keep in mind is that because the LMHC lincense is so new, there are fewer in number in the field.n LCSW is a licensed clinical social worker. A LMHC is the newest addition to the field of mental health. They are highly similar and can do most of the same things with few exceptions. One thing to keep in mind is that because the LMHC lincense is so new, there are fewer in number in the field.</code> | <code>The Licensed Clinical Social Worker or LCSW, is a sub-sector within the field of Social Work. They work with clients in order to help them deal with issues involving their mental and emotional health. This could be related to substance abuse, past trauma or mental illness.</code> | <code>Licensed Clinical Social Worker | LCSW. The Licensed Clinical Social Worker or LCSW, is a sub-sector within the field of Social Work. LCSW's work with clients in order to help deal with issues involving mental and emotional health. There are a wide variety of specializations the Licensed Clinical Social Worker can focus on.</code> | <code>The LMSW exam is a computer-based test containing 170 multiple-choice questions designed to measure minimum competencies in four categories of social work practice: Human development, diversity, and behavior in the environment. Assessment and intervention planning.</code> | <code>The Licensed Clinical Social Worker, also known as the LCSW, is a branch of social work that specializes in mental health therapy in a counseling format. Becoming an LCSW requires a significant degree of training, including having earned a Master of Social Work (MSW) degree from a Council on Social Work Education (CSWE) accredited program.</code> | <code>a. The examination requirements for licensure as an LCSW include passing the Clinical Examination of the ASWB or the Clinical Social Workers Examination of the State of California. Scope of practice-Limitations. a.To the extent they are prepared through education and training, an LCSW can engage in all acts and practices defined as the practice of clinical social work. Certified Social Work (CSW): CSW means a licensed certified social worker. A CSW must have a master s degree.</code> | <code>The LTCM Client is a way for companies to stay in touch with you, their customers, in a way that is unobtrusive and completely under the users' control. It's an application that runs quietly on the computer. Users can and should customize the client to match their desired preferences.</code> | <code>[7.34375, 6.046875, 7.09765625, 6.46484375, 7.28515625, ...]</code> |
+ | <code>does oolong tea have much caffeine?</code> | <code>At a given weight, tea contains more caffeine than coffee, but this doesn’t mean that a usual portion of tea contains more caffeine than coffee because tea is usually brewed in a weak way. Some kinds of tea, such as oolong and black tea, contain higher level of caffeine than most other teas. Among six basic teas (green, black, yellow, white, oolong, dark), green tea contains less caffeine than black tea and white tea contains less than green tea. But many studies found that the caffeine content varies more among individual teas than it does among broad categories.</code> | <code>Actually, oolong tea has less caffeine than coffee and black tea. A cup of oolong tea only has about 1/3 of caffeine of a cup of coffee. According to a research conducted by HICKS M.B, the caffeine decreases whenever the tea leaves go through the process of brewing.</code> | <code>Oolong tea contains caffeine. Caffeine works by stimulating the central nervous system (CNS), heart, and muscles. Oolong tea also contains theophylline and theobromine, which are chemicals similar to caffeine. Too much oolong tea, more than five cups per day, can cause side effects because of the caffeine.</code> | <code>Oolong tea, made from more mature leaves, usually have less caffeine than green tea. On the flip side, mature leaves contain less theanine, a sweet, natural relaxant that makes a tea much less caffeinated than it actually is. That is the theory, anyway.</code> | <code>Oolong tea is a product made from the leaves, buds, and stems of the Camellia sinensis plant. This is the same plant that is also used to make black tea and green tea. The difference is in the processing.Oolong tea is partially fermented, black tea is fully fermented, and green tea is unfermented. Oolong tea is used to sharpen thinking skills and improve mental alertness. It is also used to prevent cancer, tooth decay, osteoporosis, and heart disease.owever, do not drink more than 2 cups a day of oolong tea. That amount of tea contains about 200 mg of caffeine. Too much caffeine during pregnancy might cause premature delivery, low birth weight, and harm to the baby.</code> | <code>A Department of Nutritional Services report provides the following ranges of caffeine content for a cup of tea made with loose leaves: 1 Black Tea: 23 - 110 mg. 2 Oolong Tea: 12 - 55 mg. Green Tea: 8 - 36 mg.</code> | <code>Oolong tea is a product made from the leaves, buds, and stems of the Camellia sinensis plant. This is the same plant that is also used to make black tea and green tea. The difference is in the processing. Oolong tea is partially fermented, black tea is fully fermented, and green tea is unfermented. Oolong tea is used to sharpen thinking skills and improve mental alertness. It is also used to prevent cancer, tooth decay, osteoporosis, and heart disease.</code> | <code>Health Effects of Tea – Caffeine. In dry form, a kilogram of black tea has twice the caffeine as a kilogram of coffee…. But one kilogram of black tea makes about 450 cups of tea and one kilogram of coffee makes about 100 cups of coffee, so…. There is less caffeine in a cup of tea than in a cup of coffee. Green teas have less caffeine than black teas, and white teas have even less caffeine than green teas. Oolong teas fall between black and green teas. Herbal tea, because it is not made from the same tea plant, is caffeine-free, naturally.<br>Here is a graphical representation of their respective caffeine content.</code> | <code>The average 8-ounce serving of brewed black tea contains 14 to 70 mg of caffeine. This compares to 24 to 45 mg of caffeine found in green tea. An 8-ounce glass of instant iced tea prepared with water contains 11 to 47 mg of caffeine. Most ready-to-drink bottled teas contain 5 to 40 mg of caffeine. Just as with coffee, decaffeinated tea still contains 5 to 10 mg of caffeine per cup.</code> | <code>[7.60546875, 8.78125, 9.109375, 8.609375, 7.984375, ...]</code> |
+ * Loss: [<code>SpladeLoss</code>](https://sbert.net/docs/package_reference/sparse_encoder/losses.html#spladeloss) with these parameters:
+   ```json
+   {
+       "loss": "SparseMarginMSELoss",
+       "document_regularizer_weight": 0.08,
+       "query_regularizer_weight": 0.1
+   }
+   ```
+
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+
+ - `eval_strategy`: steps
+ - `learning_rate`: 4e-05
+ - `num_train_epochs`: 1
+ - `warmup_ratio`: 0.1
+ - `bf16`: True
+ - `load_best_model_at_end`: True
+
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: steps
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 8
+ - `per_device_eval_batch_size`: 8
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 1
+ - `eval_accumulation_steps`: None
+ - `torch_empty_cache_steps`: None
+ - `learning_rate`: 4e-05
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1.0
+ - `num_train_epochs`: 1
+ - `max_steps`: -1
+ - `lr_scheduler_type`: linear
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.1
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: True
+ - `fp16`: False
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: True
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: None
+ - `hub_always_push`: False
+ - `hub_revision`: None
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `include_for_metrics`: []
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`: 
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `eval_on_start`: False
+ - `use_liger_kernel`: False
+ - `liger_kernel_config`: None
+ - `eval_use_gather_object`: False
+ - `average_tokens_across_devices`: False
+ - `prompts`: None
+ - `batch_sampler`: batch_sampler
+ - `multi_dataset_batch_sampler`: proportional
+ - `router_mapping`: {}
+ - `learning_rate_mapping`: {}
+
+ </details>
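+
+ A condensed sketch of a matching training run (assuming the sentence-transformers v5 `SparseEncoderTrainer` API; the base checkpoint below is a hypothetical placeholder, since this card leaves the base model unspecified, and the train/eval split is illustrative):
+
+ ```python
+ from datasets import load_dataset
+ from sentence_transformers import SparseEncoder
+ from sentence_transformers.sparse_encoder import SparseEncoderTrainer, SparseEncoderTrainingArguments
+ from sentence_transformers.sparse_encoder.losses import SpladeLoss, SparseMarginMSELoss
+
+ model = SparseEncoder("Luyu/co-condenser-marco")  # hypothetical base checkpoint
+ dataset = load_dataset("tomaarsen/msmarco-Qwen3-Reranker-0.6B", split="train")
+ dataset = dataset.select(range(11_000)).train_test_split(test_size=1_000, seed=42)
+
+ loss = SpladeLoss(model, loss=SparseMarginMSELoss(model),
+                   document_regularizer_weight=0.08, query_regularizer_weight=0.1)
+
+ args = SparseEncoderTrainingArguments(
+     output_dir="outputs", num_train_epochs=1, learning_rate=4e-5, warmup_ratio=0.1,
+     bf16=True, eval_strategy="steps", eval_steps=200, save_steps=200,
+     load_best_model_at_end=True,
+ )
+ trainer = SparseEncoderTrainer(model=model, args=args, train_dataset=dataset["train"],
+                                eval_dataset=dataset["test"], loss=loss)
+ trainer.train()
+ ```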
+
+ ### Training Logs
+ | Epoch    | Step     | Training Loss | Validation Loss | msmarco-eval-1kq-1kd_dot_ndcg@10 |
+ |:--------:|:--------:|:-------------:|:---------------:|:--------------------------------:|
+ | 0.032    | 40       | 528227.75     | -               | -                                |
+ | 0.064    | 80       | 344.8533      | -               | -                                |
+ | 0.096    | 120      | 37.9373       | -               | -                                |
+ | 0.128    | 160      | 27.212        | -               | -                                |
+ | 0.16     | 200      | 23.5608       | 34.3980         | 0.8927                           |
+ | 0.192    | 240      | 20.3595       | -               | -                                |
+ | 0.224    | 280      | 17.452        | -               | -                                |
+ | 0.256    | 320      | 18.7535       | -               | -                                |
+ | 0.288    | 360      | 17.4456       | -               | -                                |
+ | 0.32     | 400      | 20.1141       | 17.0940         | 0.9421                           |
+ | 0.352    | 440      | 17.7786       | -               | -                                |
+ | 0.384    | 480      | 17.3557       | -               | -                                |
+ | 0.416    | 520      | 15.705        | -               | -                                |
+ | 0.448    | 560      | 15.4653       | -               | -                                |
+ | 0.48     | 600      | 17.6259       | 22.5805         | 0.9658                           |
+ | 0.512    | 640      | 16.5805       | -               | -                                |
+ | 0.544    | 680      | 16.7836       | -               | -                                |
+ | 0.576    | 720      | 14.8795       | -               | -                                |
+ | 0.608    | 760      | 14.4493       | -               | -                                |
+ | 0.64     | 800      | 16.3067       | 13.9569         | 0.9627                           |
+ | 0.672    | 840      | 16.0679       | -               | -                                |
+ | 0.704    | 880      | 14.6039       | -               | -                                |
+ | 0.736    | 920      | 13.2862       | -               | -                                |
+ | 0.768    | 960      | 13.032        | -               | -                                |
+ | 0.8      | 1000     | 14.1847       | 13.6973         | 0.9700                           |
+ | 0.832    | 1040     | 13.7911       | -               | -                                |
+ | 0.864    | 1080     | 13.4031       | -               | -                                |
+ | 0.896    | 1120     | 13.4924       | -               | -                                |
+ | 0.928    | 1160     | 11.8654       | -               | -                                |
+ | **0.96** | **1200** | **12.6416**   | **13.4699**     | **0.9744**                       |
+ | 0.992    | 1240     | 12.6136       | -               | -                                |
+ | -1       | -1       | -             | -               | 0.9744                           |
+
+ * The bold row denotes the saved checkpoint.
+
+ ### Environmental Impact
+ Carbon emissions were measured using [CodeCarbon](https://github.com/mlco2/codecarbon).
+ - **Energy Consumed**: 0.085 kWh
+ - **Carbon Emitted**: 0.033 kg of CO2
+ - **Hours Used**: 0.274 hours
+
+ ### Training Hardware
+ - **On Cloud**: No
+ - **GPU Model**: 1 x NVIDIA GeForce RTX 3090
+ - **CPU Model**: 13th Gen Intel(R) Core(TM) i7-13700K
+ - **RAM Size**: 31.78 GB
+
+ ### Framework Versions
+ - Python: 3.11.6
+ - Sentence Transformers: 5.0.0
+ - Transformers: 4.55.0.dev0
+ - PyTorch: 2.7.1+cu126
+ - Accelerate: 1.6.0
+ - Datasets: 3.6.0
+ - Tokenizers: 0.21.1
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ #### SpladeLoss
+ ```bibtex
+ @misc{formal2022distillationhardnegativesampling,
+     title={From Distillation to Hard Negative Sampling: Making Sparse Neural IR Models More Effective},
+     author={Thibault Formal and Carlos Lassance and Benjamin Piwowarski and Stéphane Clinchant},
+     year={2022},
+     eprint={2205.04733},
+     archivePrefix={arXiv},
+     primaryClass={cs.IR},
+     url={https://arxiv.org/abs/2205.04733},
+ }
+ ```
+
+ #### SparseMarginMSELoss
+ ```bibtex
+ @misc{hofstätter2021improving,
+     title={Improving Efficient Neural Ranking Models with Cross-Architecture Knowledge Distillation},
+     author={Sebastian Hofstätter and Sophia Althammer and Michael Schröder and Mete Sertkan and Allan Hanbury},
+     year={2021},
+     eprint={2010.02666},
+     archivePrefix={arXiv},
+     primaryClass={cs.IR}
+ }
+ ```
+
+ #### FlopsLoss
+ ```bibtex
+ @article{paria2020minimizing,
+     title={Minimizing flops to learn efficient sparse representations},
+     author={Paria, Biswajit and Yeh, Chih-Kuan and Yen, Ian EH and Xu, Ning and Ravikumar, Pradeep and P{\'o}czos, Barnab{\'a}s},
+     journal={arXiv preprint arXiv:2004.05665},
+     year={2020}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,25 @@
+ {
+   "architectures": [
+     "BertForMaskedLM"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "gradient_checkpointing": false,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "torch_dtype": "float32",
+   "transformers_version": "4.55.0.dev0",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 30522
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,14 @@
+ {
+   "model_type": "SparseEncoder",
+   "__version__": {
+     "sentence_transformers": "5.0.0",
+     "transformers": "4.55.0.dev0",
+     "pytorch": "2.7.1+cu126"
+   },
+   "prompts": {
+     "query": "",
+     "document": ""
+   },
+   "default_prompt_name": null,
+   "similarity_fn_name": "dot"
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:00830fc77c7706771bdc6eccaec28a4841e0c1d80be2cad87f43463c8d4b9833
+ size 438080896
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.sparse_encoder.models.MLMTransformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_SpladePooling",
+     "type": "sentence_transformers.sparse_encoder.models.SpladePooling"
+   }
+ ]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 256,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,58 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "100": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "extra_special_tokens": {},
+   "mask_token": "[MASK]",
+   "model_max_length": 512,
+   "never_split": null,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "unk_token": "[UNK]"
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff