TatvaRA committed
Commit 58da7ec · verified · 1 Parent(s): 002cf6e

Add new SentenceTransformer model

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
{
  "word_embedding_dimension": 768,
  "pooling_mode_cls_token": false,
  "pooling_mode_mean_tokens": true,
  "pooling_mode_max_tokens": false,
  "pooling_mode_mean_sqrt_len_tokens": false,
  "pooling_mode_weightedmean_tokens": false,
  "pooling_mode_lasttoken": false,
  "include_prompt": true
}
README.md ADDED
@@ -0,0 +1,843 @@
---
language:
- en
license: apache-2.0
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:73
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
base_model: thenlper/gte-base
widget:
- source_sentence: What is the maximum value of equipment that can be purchased with a CUE Student Research Project Grant?
  sentences:
  - Equipment costs (valued up to $1000).
  - Variable awards to recognize and reward academic achievement at the senior high school level and to encourage students to pursue post -secondary studies.
  - The Amazon Future Engineer Scholarship provides students with an opportunity to upgrade their careers with a $7,500 CAD/year scholarship available for up to four years.
- source_sentence: What is the minimum distance a recipient's hometown must be from Concordia University of Edmonton to be eligible for the Alberta Blue Cross Away from Home Scholarship?
  sentences:
  - Three awards are available
  - The recipient’s hometown must be at least 100 kilometres from Concordia University of Edmonton.
  - 'Application Deadline: September 1'
- source_sentence: According to the selection criteria, what level of subjects are used to determine the academic standing of a potential Alberta Blue Cross Away from Home Scholarship recipient?
  sentences:
  - Selection is ba sed on the academic standing of 30 -level subjects used for admission.
  - 'These eligible and ineligible lists are not exhaustive. Doubts about the eligibility of expenses should be directed to the ORI’s Research Administration Service s (RAS): [email protected] .'
  - '*Value: $11000 Master’s; $14,000 Doctoral'
- source_sentence: According to the text, how many days does a grant recipient have to submit a final report after the grant ends?
  sentences:
  - All Fall grant recipients are expected to submit an abstract to present an oral and/or poster presentation of their work, either in its progression or final stage.
  - a business program offered by an Alberta college, polytechnic, or university that offers the prerequisite courses required for entrance into the CPA Professional Education Program (CPA PEP).
  - The applicant is required to complete and submit a final report within 5 days of the end of the grant.
- source_sentence: In what format should applicants acknowledge the funding provided by Concordia University of Edmonton for their Student Project Grant?
  sentences:
  - All oral or poster presentations, publications, including public messages, arising from research supported by CUE grants must acknowledge the support of the institution. Acknowledgement can be in the written format, such as " This research is funded by the generous support of Concordia University of Edmonton through their CUE Student Research Project Grants program ", or similar phrasing.
  - This $1,000 scholarship is awarded to post -secondary students who have completed at least one year towards their Bachelor of Science with a focus on Computer Science, achieved an average GPA of 3.5 or higher, and are still enrolled in post -secondary studie s.
  - The recipient will be selected based on the highest grade in MARK320. In the event of a tie, preference will be given to the student with the highest cumulative GPA.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
model-index:
- name: GTE base Scholarships Matryoshka
  results:
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 768
      type: dim_768
    metrics:
    - type: cosine_accuracy@1
      value: 0.5555555555555556
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 1.0
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 1.0
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 1.0
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.5555555555555556
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.3333333333333333
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.2
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.1
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.5555555555555556
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 1.0
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 1.0
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 1.0
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.8214210289682637
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.7592592592592592
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.7592592592592592
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 512
      type: dim_512
    metrics:
    - type: cosine_accuracy@1
      value: 0.4444444444444444
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.8888888888888888
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 1.0
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 1.0
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.4444444444444444
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.2962962962962963
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.2
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.1
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.4444444444444444
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.8888888888888888
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 1.0
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 1.0
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.7678413135022636
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.6888888888888889
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.6888888888888889
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 256
      type: dim_256
    metrics:
    - type: cosine_accuracy@1
      value: 0.4444444444444444
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 1.0
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 1.0
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 1.0
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.4444444444444444
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.3333333333333333
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.2
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.1
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.4444444444444444
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 1.0
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 1.0
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 1.0
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.7658654734127082
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.6851851851851851
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.6851851851851851
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 128
      type: dim_128
    metrics:
    - type: cosine_accuracy@1
      value: 0.4444444444444444
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.8888888888888888
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.8888888888888888
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.8888888888888888
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.4444444444444444
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.2962962962962963
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.17777777777777778
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.08888888888888889
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.4444444444444444
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.8888888888888888
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.8888888888888888
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.8888888888888888
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.7103099178571526
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.6481481481481483
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.6521164021164021
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 64
      type: dim_64
    metrics:
    - type: cosine_accuracy@1
      value: 0.6666666666666666
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.6666666666666666
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.7777777777777778
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.8888888888888888
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.6666666666666666
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.2222222222222222
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.15555555555555556
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.08888888888888889
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.6666666666666666
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.6666666666666666
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.7777777777777778
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.8888888888888888
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.7515566546007473
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.7103174603174602
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.71494708994709
      name: Cosine Map@100
---

# GTE base Scholarships Matryoshka

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [thenlper/gte-base](https://huggingface.co/thenlper/gte-base) on the json dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [thenlper/gte-base](https://huggingface.co/thenlper/gte-base) <!-- at revision c078288308d8dee004ab72c6191778064285ec0c -->
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
- **Training Dataset:**
    - json
- **Language:** en
- **License:** apache-2.0

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
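
For reference, the same stack can be assembled explicitly from its three modules. This is an illustrative sketch (not part of the original training script) using the standard `sentence_transformers.models` API; in practice the published checkpoint should simply be loaded by name as shown in the Usage section below.

```python
from sentence_transformers import SentenceTransformer, models

# (0) Transformer backbone: thenlper/gte-base, truncating inputs at 512 tokens
word_embedding_model = models.Transformer("thenlper/gte-base", max_seq_length=512)

# (1) Mean pooling over token embeddings (matches 1_Pooling/config.json)
pooling_model = models.Pooling(
    word_embedding_model.get_word_embedding_dimension(),  # 768
    pooling_mode="mean",
)

# (2) L2-normalize the sentence embedding so dot product equals cosine similarity
normalize = models.Normalize()

model = SentenceTransformer(modules=[word_embedding_model, pooling_model, normalize])
```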

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("TatvaRA/gte-base-finetuned-schorlaships-matryonshka")
# Run inference
sentences = [
    'In what format should applicants acknowledge the funding provided by Concordia University of Edmonton for their Student Project Grant?',
    'All oral or poster presentations, publications, including public messages, arising from research supported by CUE grants must acknowledge the support of the institution. Acknowledgement can be in the written format, such as " This research is funded by the generous support of Concordia University of Edmonton through their CUE Student Research Project Grants program ", or similar phrasing.',
    'The recipient will be selected based on the highest grade in MARK320. In the event of a tie, preference will be given to the student with the highest cumulative GPA.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
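
Because the model was trained with a Matryoshka objective at 768, 512, 256, 128 and 64 dimensions, the embeddings can also be truncated to a smaller size at load time. The snippet below is an illustrative sketch rather than part of the original card; it assumes the `truncate_dim` argument available in recent sentence-transformers releases.

```python
from sentence_transformers import SentenceTransformer

# Load the same checkpoint but keep only the first 256 embedding dimensions
model_256 = SentenceTransformer(
    "TatvaRA/gte-base-finetuned-schorlaships-matryonshka",
    truncate_dim=256,
)

sentences = [
    "What is the maximum value of equipment that can be purchased with a CUE Student Research Project Grant?",
    "Equipment costs (valued up to $1000).",
]
embeddings = model_256.encode(sentences)
print(embeddings.shape)
# (2, 256)
```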

<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

## Evaluation

### Metrics

#### Information Retrieval

* Dataset: `dim_768`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator) with these parameters:
  ```json
  {
      "truncate_dim": 768
  }
  ```

| Metric | Value |
|:--------------------|:-----------|
| cosine_accuracy@1 | 0.5556 |
| cosine_accuracy@3 | 1.0 |
| cosine_accuracy@5 | 1.0 |
| cosine_accuracy@10 | 1.0 |
| cosine_precision@1 | 0.5556 |
| cosine_precision@3 | 0.3333 |
| cosine_precision@5 | 0.2 |
| cosine_precision@10 | 0.1 |
| cosine_recall@1 | 0.5556 |
| cosine_recall@3 | 1.0 |
| cosine_recall@5 | 1.0 |
| cosine_recall@10 | 1.0 |
| **cosine_ndcg@10** | **0.8214** |
| cosine_mrr@10 | 0.7593 |
| cosine_map@100 | 0.7593 |

#### Information Retrieval

* Dataset: `dim_512`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator) with these parameters:
  ```json
  {
      "truncate_dim": 512
  }
  ```

| Metric | Value |
|:--------------------|:-----------|
| cosine_accuracy@1 | 0.4444 |
| cosine_accuracy@3 | 0.8889 |
| cosine_accuracy@5 | 1.0 |
| cosine_accuracy@10 | 1.0 |
| cosine_precision@1 | 0.4444 |
| cosine_precision@3 | 0.2963 |
| cosine_precision@5 | 0.2 |
| cosine_precision@10 | 0.1 |
| cosine_recall@1 | 0.4444 |
| cosine_recall@3 | 0.8889 |
| cosine_recall@5 | 1.0 |
| cosine_recall@10 | 1.0 |
| **cosine_ndcg@10** | **0.7678** |
| cosine_mrr@10 | 0.6889 |
| cosine_map@100 | 0.6889 |

#### Information Retrieval

* Dataset: `dim_256`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator) with these parameters:
  ```json
  {
      "truncate_dim": 256
  }
  ```

| Metric | Value |
|:--------------------|:-----------|
| cosine_accuracy@1 | 0.4444 |
| cosine_accuracy@3 | 1.0 |
| cosine_accuracy@5 | 1.0 |
| cosine_accuracy@10 | 1.0 |
| cosine_precision@1 | 0.4444 |
| cosine_precision@3 | 0.3333 |
| cosine_precision@5 | 0.2 |
| cosine_precision@10 | 0.1 |
| cosine_recall@1 | 0.4444 |
| cosine_recall@3 | 1.0 |
| cosine_recall@5 | 1.0 |
| cosine_recall@10 | 1.0 |
| **cosine_ndcg@10** | **0.7659** |
| cosine_mrr@10 | 0.6852 |
| cosine_map@100 | 0.6852 |

#### Information Retrieval

* Dataset: `dim_128`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator) with these parameters:
  ```json
  {
      "truncate_dim": 128
  }
  ```

| Metric | Value |
|:--------------------|:-----------|
| cosine_accuracy@1 | 0.4444 |
| cosine_accuracy@3 | 0.8889 |
| cosine_accuracy@5 | 0.8889 |
| cosine_accuracy@10 | 0.8889 |
| cosine_precision@1 | 0.4444 |
| cosine_precision@3 | 0.2963 |
| cosine_precision@5 | 0.1778 |
| cosine_precision@10 | 0.0889 |
| cosine_recall@1 | 0.4444 |
| cosine_recall@3 | 0.8889 |
| cosine_recall@5 | 0.8889 |
| cosine_recall@10 | 0.8889 |
| **cosine_ndcg@10** | **0.7103** |
| cosine_mrr@10 | 0.6481 |
| cosine_map@100 | 0.6521 |

#### Information Retrieval

* Dataset: `dim_64`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator) with these parameters:
  ```json
  {
      "truncate_dim": 64
  }
  ```

| Metric | Value |
|:--------------------|:-----------|
| cosine_accuracy@1 | 0.6667 |
| cosine_accuracy@3 | 0.6667 |
| cosine_accuracy@5 | 0.7778 |
| cosine_accuracy@10 | 0.8889 |
| cosine_precision@1 | 0.6667 |
| cosine_precision@3 | 0.2222 |
| cosine_precision@5 | 0.1556 |
| cosine_precision@10 | 0.0889 |
| cosine_recall@1 | 0.6667 |
| cosine_recall@3 | 0.6667 |
| cosine_recall@5 | 0.7778 |
| cosine_recall@10 | 0.8889 |
| **cosine_ndcg@10** | **0.7516** |
| cosine_mrr@10 | 0.7103 |
| cosine_map@100 | 0.7149 |

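The tables above come from `InformationRetrievalEvaluator` runs at each truncation dimension. The sketch below shows how such an evaluation can be set up; the query/corpus dictionaries here are small placeholders drawn from the widget examples, not the actual held-out evaluation split.

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("TatvaRA/gte-base-finetuned-schorlaships-matryonshka")

# Placeholder retrieval data: query id -> text, doc id -> text, query id -> relevant doc ids
queries = {"q1": "What is the maximum value of equipment that can be purchased with a CUE Student Research Project Grant?"}
corpus = {"d1": "Equipment costs (valued up to $1000)."}
relevant_docs = {"q1": {"d1"}}

# Evaluate once per Matryoshka dimension, mirroring the dim_768 ... dim_64 tables
for dim in (768, 512, 256, 128, 64):
    evaluator = InformationRetrievalEvaluator(
        queries=queries,
        corpus=corpus,
        relevant_docs=relevant_docs,
        name=f"dim_{dim}",
        truncate_dim=dim,
    )
    results = evaluator(model)
    print(dim, results)
```
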
<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Dataset

#### json

* Dataset: json
* Size: 73 training samples
* Columns: <code>anchor</code> and <code>positive</code>
* Approximate statistics based on the first 73 samples:

  | | anchor | positive |
  |:--------|:------------------|:------------------|
  | type | string | string |
  | details | <ul><li>min: 14 tokens</li><li>mean: 23.0 tokens</li><li>max: 41 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 32.74 tokens</li><li>max: 346 tokens</li></ul> |

* Samples:

  | anchor | positive |
  |:-------|:---------|
  | <code>What specific type of students are the Alberta Innovates Graduate Student Scholarships designed to support?</code> | <code>The Alberta Innovates Graduate Student Scholarships support academically superior graduate students <br>who are receiving training and conducting research in areas that are strategically important to Alberta’s <br>economy.</code> |
  | <code>What is the specific date by which students must submit their reports for the Spring 2025 grant period?</code> | <code>Report due date April 20th (5 days post grant closure)</code> |
  | <code>In what format should applicants acknowledge the funding provided by Concordia University of Edmonton for their Student Project Grant?</code> | <code>All oral or poster presentations, publications, including public messages, arising from research supported by CUE grants must acknowledge the support of the institution. Acknowledgement can be in the written format, such as " This research is funded by the generous support of Concordia University of Edmonton through their CUE Student Research Project Grants program ", or similar phrasing.</code> |

* Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
  ```json
  {
      "loss": "MultipleNegativesRankingLoss",
      "matryoshka_dims": [
          768,
          512,
          256,
          128,
          64
      ],
      "matryoshka_weights": [
          1,
          1,
          1,
          1,
          1
      ],
      "n_dims_per_step": -1
  }
  ```
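
In code, this loss configuration corresponds to wrapping a `MultipleNegativesRankingLoss` in a `MatryoshkaLoss` over the five dimensions listed above. The following is a sketch of that setup, not the exact training script used for this model.

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss

# Start from the base checkpoint that was fine-tuned
model = SentenceTransformer("thenlper/gte-base")

# In-batch negatives ranking loss on (anchor, positive) pairs
base_loss = MultipleNegativesRankingLoss(model)

# Apply the same loss at every truncated embedding size, with equal weights
loss = MatryoshkaLoss(
    model,
    base_loss,
    matryoshka_dims=[768, 512, 256, 128, 64],
    matryoshka_weights=[1, 1, 1, 1, 1],
)
```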

### Training Hyperparameters
#### Non-Default Hyperparameters

- `eval_strategy`: epoch
- `per_device_train_batch_size`: 32
- `per_device_eval_batch_size`: 16
- `gradient_accumulation_steps`: 16
- `learning_rate`: 2e-05
- `num_train_epochs`: 4
- `lr_scheduler_type`: cosine
- `warmup_ratio`: 0.1
- `fp16`: True
- `load_best_model_at_end`: True
- `optim`: adamw_torch_fused
- `batch_sampler`: no_duplicates

#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: epoch
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 32
- `per_device_eval_batch_size`: 16
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 16
- `eval_accumulation_steps`: None
- `learning_rate`: 2e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 4
- `max_steps`: -1
- `lr_scheduler_type`: cosine
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: True
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: True
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch_fused
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `prompts`: None
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional

</details>
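
The non-default hyperparameters above map onto `SentenceTransformerTrainingArguments` roughly as sketched below. The `output_dir` and `save_strategy` values are assumptions for illustration (they are not listed on this card); everything else mirrors the list above.

```python
from sentence_transformers import SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="gte-base-finetuned-schorlaships-matryonshka",  # assumed, not from the card
    num_train_epochs=4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=16,
    learning_rate=2e-5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    fp16=True,
    eval_strategy="epoch",
    save_strategy="epoch",  # assumed; required when load_best_model_at_end=True
    load_best_model_at_end=True,
    optim="adamw_torch_fused",
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)
```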

### Training Logs
| Epoch | Step | dim_768_cosine_ndcg@10 | dim_512_cosine_ndcg@10 | dim_256_cosine_ndcg@10 | dim_128_cosine_ndcg@10 | dim_64_cosine_ndcg@10 |
|:-------:|:-----:|:----------------------:|:----------------------:|:----------------------:|:----------------------:|:---------------------:|
| 1.0 | 1 | 0.7249 | 0.7249 | 0.7473 | 0.7026 | 0.6686 |
| 2.0 | 2 | 0.7619 | 0.7249 | 0.7533 | 0.7026 | 0.7480 |
| **3.0** | **3** | **0.7804** | **0.7619** | **0.7659** | **0.7103** | **0.7496** |
| 4.0 | 4 | 0.8214 | 0.7678 | 0.7659 | 0.7103 | 0.7516 |

* The bold row denotes the saved checkpoint.

### Framework Versions
- Python: 3.11.12
- Sentence Transformers: 4.1.0
- Transformers: 4.41.2
- PyTorch: 2.1.2+cu121
- Accelerate: 1.5.2
- Datasets: 2.19.1
- Tokenizers: 0.19.1

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MatryoshkaLoss
```bibtex
@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
```

#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->
config.json ADDED
@@ -0,0 +1,26 @@
{
  "_name_or_path": "thenlper/gte-base",
  "architectures": [
    "BertModel"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "torch_dtype": "float32",
  "transformers_version": "4.41.2",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30522
}
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
{
  "__version__": {
    "sentence_transformers": "4.1.0",
    "transformers": "4.41.2",
    "pytorch": "2.1.2+cu121"
  },
  "prompts": {},
  "default_prompt_name": null,
  "similarity_fn_name": "cosine"
}
model.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8a7ad835eda1ebecdf38d2d5a1e53b801cc477f069351ab937ca61ad67a5993c
size 437951328
modules.json ADDED
@@ -0,0 +1,20 @@
[
  {
    "idx": 0,
    "name": "0",
    "path": "",
    "type": "sentence_transformers.models.Transformer"
  },
  {
    "idx": 1,
    "name": "1",
    "path": "1_Pooling",
    "type": "sentence_transformers.models.Pooling"
  },
  {
    "idx": 2,
    "name": "2",
    "path": "2_Normalize",
    "type": "sentence_transformers.models.Normalize"
  }
]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
{
  "max_seq_length": 512,
  "do_lower_case": false
}
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
{
  "cls_token": {
    "content": "[CLS]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "mask_token": {
    "content": "[MASK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "[PAD]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "sep_token": {
    "content": "[SEP]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "[UNK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,62 @@
{
  "added_tokens_decoder": {
    "0": {
      "content": "[PAD]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "100": {
      "content": "[UNK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "101": {
      "content": "[CLS]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "102": {
      "content": "[SEP]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "103": {
      "content": "[MASK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "clean_up_tokenization_spaces": true,
  "cls_token": "[CLS]",
  "do_lower_case": true,
  "mask_token": "[MASK]",
  "max_length": 128,
  "model_max_length": 512,
  "pad_to_multiple_of": null,
  "pad_token": "[PAD]",
  "pad_token_type_id": 0,
  "padding_side": "right",
  "sep_token": "[SEP]",
  "stride": 0,
  "strip_accents": null,
  "tokenize_chinese_chars": true,
  "tokenizer_class": "BertTokenizer",
  "truncation_side": "right",
  "truncation_strategy": "longest_first",
  "unk_token": "[UNK]"
}
vocab.txt ADDED
The diff for this file is too large to render. See raw diff