Matjac5 committed
Commit 765e39a · verified · 1 Parent(s): a738d7a

Upload rag SentenceTransformer

Files changed (3)
  1. README.md +94 -57
  2. config.json +1 -1
  3. config_sentence_transformers.json +1 -1
README.md CHANGED
@@ -4,36 +4,57 @@ tags:
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
- - dataset_size:286834
+ - dataset_size:268861
  - loss:MultipleNegativesRankingLoss
  base_model: Qwen/Qwen3-0.6B-Base
  widget:
- - source_sentence: What causes orbits?
+ - source_sentence: 'There are seven thieves. They stole diamonds from a diamond merchant
+   and ran away. While running, night sets in and they decide to rest in the jungle.
+
+   When everybody was sleeping, two of them woke up and decided to divide the diamonds
+   equally among themselves. But when they divided the diamonds equally, one diamond
+   is left.
+
+   So they woke up the 3rd thief and tried to divide the diamonds equally again but
+   still one diamond was left. Then they woke up the 4th thief to divide the diamonds
+   equally again, and again one diamond was left. This happened with the 5th and
+   6th thief – one diamond was still left.
+
+   Finally, they woke up the 7th thief and this time the diamonds were divided equally.
+
+   How many diamonds did they steal in total?'
   sentences:
- - p
- - I
- - C
- - source_sentence: Select the liquid.
+ - ''''
+ - ''''
+ - e
+ - source_sentence: 'praveen starts business with rs . 3220 and after 5 months , hari
+   joins with praveen as his partner . after a year , the profit is divided in the
+   ratio 2 : 3 . what is hari ’ s contribution in the capital ?'
   sentences:
+ - s
+ - '5'
   - '['
- - '['
- - '['
- - source_sentence: how many times digit 6 is used while writing numbers from 100 to
-   1100 ?
+ - source_sentence: 'Which of the following is material of choice in class V
+
+   cavity with abfraction?'
   sentences:
- - A
- - ''''
- - ''''
- - source_sentence: 'True about quinsy is:'
+ - '['
+ - t
+ - G
+ - source_sentence: A right circular cylinder has a height of 25 and a radius of 5.
+   A rectangular solid with a height of 15 and a square base, is placed in the cylinder
+   such that each of the corners of the solid is tangent to the cylinder wall. Liquid
+   is then poured into the cylinder such that it reaches the rim. What is the volume
+   of the liquid?
   sentences:
- - P
- - /
+ - '5'
   - '['
- - source_sentence: Which is not the indication of CT in head trauma
+ - '2'
+ - source_sentence: Cerebral angiography was performed by -
   sentences:
- - A
   - S
- - '8'
+ - t
+ - '2'
  pipeline_tag: sentence-similarity
  library_name: sentence-transformers
  ---
@@ -87,9 +108,9 @@ from sentence_transformers import SentenceTransformer
  model = SentenceTransformer("sentence_transformers_model_id")
  # Run inference
  sentences = [
- 'Which is not the indication of CT in head trauma',
+ 'Cerebral angiography was performed by -',
  'S',
- 'A',
+ '2',
  ]
  embeddings = model.encode(sentences)
  print(embeddings.shape)
@@ -143,19 +164,19 @@ You can finetune this model on your own dataset.

  #### Unnamed Dataset

- * Size: 286,834 training samples
+ * Size: 268,861 training samples
  * Columns: <code>sentence_0</code> and <code>sentence_1</code>
  * Approximate statistics based on the first 1000 samples:
- |         | sentence_0 | sentence_1 |
- |:--------|:-----------|:-----------|
- | type    | string     | string     |
- | details | <ul><li>min: 3 tokens</li><li>mean: 32.29 tokens</li><li>max: 128 tokens</li></ul> | <ul><li>min: 0 tokens</li><li>mean: 0.98 tokens</li><li>max: 1 tokens</li></ul> |
+ |         | sentence_0 | sentence_1 |
+ |:--------|:-----------|:-----------|
+ | type    | string     | string     |
+ | details | <ul><li>min: 5 tokens</li><li>mean: 48.3 tokens</li><li>max: 128 tokens</li></ul> | <ul><li>min: 0 tokens</li><li>mean: 0.97 tokens</li><li>max: 1 tokens</li></ul> |
  * Samples:
- | sentence_0 | sentence_1 |
- |:-----------|:-----------|
- | <code>A, B and C rents a pasture for Rs.435. A put in 12 horses for 8 months, B 16 horses for 9 months and 18 horses for 6 months. How much should C pay?</code> | <code>[</code> |
- | <code>mr . kutty has only hens and sheep . if the total number of their heads is 38 and the total number of legs is 100 then what is the ratio between the numbers of hens and sheep ?</code> | <code> </code> |
- | <code>A fruit seller had some Mangoes. He sells 50% oranges and still has 500 Mangoes. How many Mangoes he had originally?</code> | <code>)</code> |
+ | sentence_0 | sentence_1 |
+ |:-----------|:-----------|
+ | <code>A 1200 m long train crosses a tree in 120 sec, how much time will I take to pass a platform 1100 m long?</code> | <code>'</code> |
+ | <code>What is the opposite of rarefaction zones, where air molecules in waves are loosely packed?</code> | <code>[</code> |
+ | <code>if w is 40 percent less than e , e is 40 percent less than y , and z is 46 percent less than y , then z is greater than w by what percent of w ?</code> | <code>%</code> |
  * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
  ```json
  {
@@ -167,9 +188,9 @@ You can finetune this model on your own dataset.
  ### Training Hyperparameters
  #### Non-Default Hyperparameters

- - `per_device_train_batch_size`: 32
- - `per_device_eval_batch_size`: 32
- - `num_train_epochs`: 1
+ - `per_device_train_batch_size`: 64
+ - `per_device_eval_batch_size`: 64
+ - `num_train_epochs`: 4
  - `fp16`: True
  - `multi_dataset_batch_sampler`: round_robin

@@ -180,8 +201,8 @@ You can finetune this model on your own dataset.
  - `do_predict`: False
  - `eval_strategy`: no
  - `prediction_loss_only`: True
- - `per_device_train_batch_size`: 32
- - `per_device_eval_batch_size`: 32
+ - `per_device_train_batch_size`: 64
+ - `per_device_eval_batch_size`: 64
  - `per_gpu_train_batch_size`: None
  - `per_gpu_eval_batch_size`: None
  - `gradient_accumulation_steps`: 1
@@ -193,7 +214,7 @@ You can finetune this model on your own dataset.
  - `adam_beta2`: 0.999
  - `adam_epsilon`: 1e-08
  - `max_grad_norm`: 1
- - `num_train_epochs`: 1
+ - `num_train_epochs`: 4
  - `max_steps`: -1
  - `lr_scheduler_type`: linear
  - `lr_scheduler_kwargs`: {}
@@ -293,31 +314,47 @@ You can finetune this model on your own dataset.
  </details>

  ### Training Logs
- | Epoch  | Step | Training Loss |
- |:------:|:----:|:-------------:|
- | 0.0558 | 500  | 3.4895        |
- | 0.1116 | 1000 | 3.1888        |
- | 0.1673 | 1500 | 3.2117        |
- | 0.2231 | 2000 | 0.0           |
- | 0.2789 | 2500 | 0.0           |
- | 0.3347 | 3000 | 0.0           |
- | 0.3905 | 3500 | 0.0           |
- | 0.4462 | 4000 | 0.0           |
- | 0.5020 | 4500 | 0.0           |
- | 0.5578 | 5000 | 0.0           |
- | 0.6136 | 5500 | 0.0           |
- | 0.6693 | 6000 | 0.0           |
- | 0.7251 | 6500 | 0.0           |
- | 0.7809 | 7000 | 0.0           |
- | 0.8367 | 7500 | 0.0           |
- | 0.8925 | 8000 | 0.0           |
- | 0.9482 | 8500 | 0.0           |
+ | Epoch  | Step  | Training Loss |
+ |:------:|:-----:|:-------------:|
+ | 0.1190 | 500   | 4.0939        |
+ | 0.2380 | 1000  | 3.7716        |
+ | 0.3571 | 1500  | 0.0           |
+ | 0.4761 | 2000  | 0.0           |
+ | 0.5951 | 2500  | 0.0           |
+ | 0.7141 | 3000  | 0.0           |
+ | 0.8331 | 3500  | 0.0           |
+ | 0.9522 | 4000  | 0.0           |
+ | 1.0712 | 4500  | 0.0           |
+ | 1.1902 | 5000  | 0.0           |
+ | 1.3092 | 5500  | 0.0           |
+ | 1.4282 | 6000  | 0.0           |
+ | 1.5473 | 6500  | 0.0           |
+ | 1.6663 | 7000  | 0.0           |
+ | 1.7853 | 7500  | 0.0           |
+ | 1.9043 | 8000  | 0.0           |
+ | 2.0233 | 8500  | 0.0           |
+ | 2.1423 | 9000  | 0.0           |
+ | 2.2614 | 9500  | 0.0           |
+ | 2.3804 | 10000 | 0.0           |
+ | 2.4994 | 10500 | 0.0           |
+ | 2.6184 | 11000 | 0.0           |
+ | 2.7374 | 11500 | 0.0           |
+ | 2.8565 | 12000 | 0.0           |
+ | 2.9755 | 12500 | 0.0           |
+ | 3.0945 | 13000 | 0.0           |
+ | 3.2135 | 13500 | 0.0           |
+ | 3.3325 | 14000 | 0.0           |
+ | 3.4516 | 14500 | 0.0           |
+ | 3.5706 | 15000 | 0.0           |
+ | 3.6896 | 15500 | 0.0           |
+ | 3.8086 | 16000 | 0.0           |
+ | 3.9276 | 16500 | 0.0           |


  ### Framework Versions
  - Python: 3.11.13
  - Sentence Transformers: 4.1.0
- - Transformers: 4.52.3
+ - Transformers: 4.52.4
  - PyTorch: 2.6.0+cu124
  - Accelerate: 1.7.0
  - Datasets: 3.6.0
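Note on the inference snippet changed above: after `model.encode`, the usual next step is to score the embeddings against each other to rank the candidate strings. A minimal sketch of that step, assuming sentence-transformers >= 3.0, where `SentenceTransformer.similarity` is available; `"sentence_transformers_model_id"` is the card's placeholder, not this repo's actual id:

```python
from sentence_transformers import SentenceTransformer

# Placeholder id copied from the card; substitute the uploaded repo id.
model = SentenceTransformer("sentence_transformers_model_id")

sentences = [
    'Cerebral angiography was performed by -',
    'S',
    '2',
]

# One embedding per input text.
embeddings = model.encode(sentences)
print(embeddings.shape)  # (3, hidden_dim)

# Pairwise similarity matrix; cosine similarity unless the model
# config specifies a different similarity function.
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)  # (3, 3)
```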
 
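Note on the training changes above: the diff moves the run from 1 epoch at batch size 32 to 4 epochs at batch size 64, keeping `MultipleNegativesRankingLoss`, which treats every other `sentence_1` in a batch as an in-batch negative. A rough sketch of how such a run is typically wired up with the sentence-transformers 4.x trainer API; the two-pair dataset and output directory are illustrative stand-ins, not the card's actual 268,861-pair dataset:

```python
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MultipleNegativesRankingLoss

# Base checkpoint named in the card; sentence-transformers adds a
# pooling layer automatically when no ST modules are present.
model = SentenceTransformer("Qwen/Qwen3-0.6B-Base")

# Illustrative stand-in for the unnamed (sentence_0, sentence_1) pairs.
train_dataset = Dataset.from_dict({
    "sentence_0": ["What causes orbits?", "Select the liquid."],
    "sentence_1": ["p", "["],
})

# In-batch negatives: each sentence_1 is the positive for its own
# sentence_0 and a negative for every other sentence_0 in the batch.
loss = MultipleNegativesRankingLoss(model)

args = SentenceTransformerTrainingArguments(
    output_dir="mnrl-checkpoints",  # hypothetical path
    num_train_epochs=4,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    fp16=True,  # as in the card; requires a CUDA device
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()
```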
config.json CHANGED
@@ -23,7 +23,7 @@
   "sliding_window": null,
   "tie_word_embeddings": true,
   "torch_dtype": "float32",
-  "transformers_version": "4.52.3",
+  "transformers_version": "4.52.4",
   "use_cache": true,
   "use_sliding_window": false,
   "vocab_size": 151936
config_sentence_transformers.json CHANGED
@@ -1,7 +1,7 @@
  {
    "__version__": {
      "sentence_transformers": "4.1.0",
-     "transformers": "4.52.3",
+     "transformers": "4.52.4",
      "pytorch": "2.6.0+cu124"
    },
    "prompts": {},