Gurveer05 committed (verified)
Commit 50bb180 · Parent(s): e137031

Add new SentenceTransformer model.

README.md CHANGED
@@ -7,54 +7,50 @@ tags:
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
- - dataset_size:2940
  - loss:MultipleNegativesSymmetricRankingLoss
  widget:
- - source_sentence: Enlarge a shape, with a centre of enlargement given, by a positive
-   scale factor bigger than 1, where the centre of enlargement lies on the edge or
-   outside of the object The triangle is enlarged by scale factor 3, with the centre
-   of enlargement at (1,0). What are the new coordinates of the point marked T ?
-   ![A coordinate grid with the x-axis going from -1 to 10 and the y-axis going from
-   -1 to 7. 3 points are plotted and joined with straight lines to form a triangle.
-   The points are (1,1), (1,4) and (3,1). Point (3,1) is labelled as T. Point (1,0)
-   is also plotted.]() (9,3)
  sentences:
- - Confuses powers and multiples
- - Enlarges by the wrong centre of enlargement
- - When asked for factors of an algebraic expression, thinks any part of a term will
-   be a factor
- - source_sentence: 'Identify a right-angled triangle from a description of the properties
-   A triangle has the following angles: 90^, 45^, 45^ Statement 1. It must be a right
-   angled triangle Statement 2. It must be an isosceles triangle Which is true? Statement
-   1'
  sentences:
- - When solving a problem using written division (bus-stop method), does the calculation
-   from right to left
- - Thinks finding a fraction of an amount means subtracting from that amount
- - Believes isosceles triangles cannot have right angles
- - source_sentence: Convert from kilometers to miles 1 km≈ 0.6 miles 4 km≈□ miles 0.24
  sentences:
- - Believes multiplying two negatives gives a negative answer
- - Believes two lines of the same length are parallel
- - When multiplying decimals, ignores place value and just multiplies the digits
- - source_sentence: Identify the order of rotational symmetry of a shape Which shape
-   has rotational symmetry order 4 ? ![Trapezium]()
  sentences:
- - Believes the whole and remainder are the other way when changing an improper fraction
-   to a mixed number
- - Does not know how to find order of rotational symmetry
- - Fails to reflect across mirror line
- - source_sentence: Identify whether two shapes are similar or not Tom and Katie are
-   discussing similarity. Who is correct? Tom says these two rectangles are similar
-   ![Two rectangles of different sizes. One rectangle has width 2cm and height 3cm.
-   The other rectangle has width 4cm and height 9cm. ]() Katie says these two rectangles
-   are similar ![Two rectangles of different sizes. One rectangle has width 4cm and
-   height 6cm. The other rectangle has width 7cm and height 9cm. ]() Only Katie
  sentences:
- - Does not recognise when one part of a fraction is the negative of the other
- - When solving simultaneous equations, thinks they can't multiply each equation
-   by a different number
- - Thinks adding the same value to each side makes shapes similar
  ---

  # SentenceTransformer based on BAAI/bge-large-en-v1.5
@@ -108,9 +104,9 @@ from sentence_transformers import SentenceTransformer
  model = SentenceTransformer("Gurveer05/bge-large-eedi-2024")
  # Run inference
  sentences = [
-     'Identify whether two shapes are similar or not Tom and Katie are discussing similarity. Who is correct? Tom says these two rectangles are similar ![Two rectangles of different sizes. One rectangle has width 2cm and height 3cm. The other rectangle has width 4cm and height 9cm. ]() Katie says these two rectangles are similar ![Two rectangles of different sizes. One rectangle has width 4cm and height 6cm. The other rectangle has width 7cm and height 9cm. ]() Only Katie',
-     'Thinks adding the same value to each side makes shapes similar',
-     "When solving simultaneous equations, thinks they can't multiply each equation by a different number",
  ]
  embeddings = model.encode(sentences)
  print(embeddings.shape)
@@ -165,19 +161,19 @@ You can finetune this model on your own dataset.
  #### csv

  * Dataset: csv
- * Size: 2,940 training samples
  * Columns: <code>sentence1</code> and <code>sentence2</code>
  * Approximate statistics based on the first 1000 samples:
    |         | sentence1 | sentence2 |
    |:--------|:----------|:----------|
    | type    | string    | string    |
-   | details | <ul><li>min: 13 tokens</li><li>mean: 56.03 tokens</li><li>max: 249 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 15.19 tokens</li><li>max: 39 tokens</li></ul> |
  * Samples:
-   | sentence1 | sentence2 |
-   |:----------|:----------|
-   | <code>Read a fraction on a scale where the required number is marked by a dash between two numbers What fraction is the arrow pointing to? ![An image of a numberline with 5 dashes. On the leftmost dash is the number 1/6. On the rightmost dash is the number 3/6. An arrow points to the 4th dash from the left]() 3/4</code> | <code>When reading a dash on a number line does not take into account the number at the start or the width of each division</code> |
-   | <code>Substitute positive non-integer values into expressions involving powers or roots Jo and Paul are discussing quadratic equations. Jo says there is no value of x that can make (1-x)^2 negative. Paul says there is no value of x that can make 1-x^2 positive. Who is correct? Both Jo and Paul</code> | <code>Assumes a fact without considering enough examples</code> |
-   | <code>Recognise and use efficient methods for mental multiplication Tom and Katie are discussing mental multiplication strategies. Tom says 15 × 42=154 × 2 Katie says 15 × 42=(15 × 4)+(15 × 2) Who is correct? Only Tom</code> | <code>Does not correctly apply the commutative property of multiplication</code> |
  * Loss: [<code>MultipleNegativesSymmetricRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativessymmetricrankingloss) with these parameters:
    ```json
    {
@@ -193,6 +189,7 @@ You can finetune this model on your own dataset.
  - `per_device_train_batch_size`: 16
  - `per_device_eval_batch_size`: 16
  - `num_train_epochs`: 20
  - `fp16`: True
  - `load_best_model_at_end`: True
  - `batch_sampler`: no_duplicates
@@ -221,7 +218,7 @@ You can finetune this model on your own dataset.
  - `max_steps`: -1
  - `lr_scheduler_type`: linear
  - `lr_scheduler_kwargs`: {}
- - `warmup_ratio`: 0.0
  - `warmup_steps`: 0
  - `log_level`: passive
  - `log_level_replica`: warning
@@ -315,52 +312,44 @@ You can finetune this model on your own dataset.
  </details>

  ### Training Logs
- | Epoch   | Step    | Training Loss |
- |:-------:|:-------:|:-------------:|
- | 0.25    | 23      | 1.0714        |
- | 0.5     | 46      | 0.9487        |
- | 0.75    | 69      | 0.8001        |
- | 1.0     | 92      | 0.7443        |
- | 1.25    | 115     | 0.3951        |
- | 1.5     | 138     | 0.3903        |
- | 1.75    | 161     | 0.3867        |
- | 2.0     | 184     | 0.3386        |
- | 2.25    | 207     | 0.2206        |
- | 2.5     | 230     | 0.2051        |
- | 2.75    | 253     | 0.2098        |
- | 3.0     | 276     | 0.1989        |
- | 3.25    | 299     | 0.1486        |
- | 3.5     | 322     | 0.1463        |
- | 3.75    | 345     | 0.1453        |
- | 4.0     | 368     | 0.1237        |
- | 4.25    | 391     | 0.0956        |
- | 4.5     | 414     | 0.0939        |
- | 4.75    | 437     | 0.1115        |
- | 5.0     | 460     | 0.0925        |
- | 5.25    | 483     | 0.0778        |
- | 5.5     | 506     | 0.0744        |
- | 5.75    | 529     | 0.09          |
- | 6.0     | 552     | 0.0782        |
- | 6.25    | 575     | 0.0454        |
- | 6.5     | 598     | 0.0697        |
- | 6.75    | 621     | 0.059         |
- | 7.0     | 644     | 0.033         |
- | 7.25    | 667     | 0.0309        |
- | 7.5     | 690     | 0.0548        |
- | 7.75    | 713     | 0.0605        |
- | **8.0** | **736** | **0.0431**    |
- | 8.25    | 759     | 0.0224        |
- | 8.5     | 782     | 0.0381        |
- | 8.75    | 805     | 0.0451        |
- | 9.0     | 828     | 0.0169        |
- | 9.25    | 851     | 0.0228        |
- | 9.5     | 874     | 0.0257        |

  * The bold row denotes the saved checkpoint.

  ### Framework Versions
  - Python: 3.10.14
- - Sentence Transformers: 3.1.0
  - Transformers: 4.44.0
  - PyTorch: 2.4.0
  - Accelerate: 0.33.0
 
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
+ - dataset_size:2442
  - loss:MultipleNegativesSymmetricRankingLoss
  widget:
+ - source_sentence: Carry out a subtraction problem with positive integers where the
+   answer is less than 0 598-1000= This problem cannot be solved
  sentences:
+ - Rounds to the wrong degree of accuracy (rounds too much)
+ - When subtracting fractions, subtracts the numerators and denominators
+ - Believes it is impossible to subtract a bigger number from a smaller number
+ - source_sentence: Given the sketch of a curve in the form (x + a)(x + b), work out
+   its factorised form Which of the following could be the equation of this curve?
+   ![A graph of a quadratic curve that crosses the x axis at (1,0) and (3,0) and
+   crosses the y axis at (0,3).]() y=(x+1)(x+3)
  sentences:
+ - Does not use the associative property of multiplication to find other factors
+   of a number
+ - Believes they only need to multiply the first and last pairs of terms when expanding
+   double brackets
+ - Forgets to swap the sign of roots when placing into brackets
+ - source_sentence: For a given output find the input of a function machine ![Image
+   of a function machine. The function is add one third, and the output is 7]() What
+   is the input of this function machine? 7 1/3
  sentences:
+ - When finding an input of a function machine thinks you apply the operations given
+   rather than the inverse operation.
+ - Believes the solution to mx + c = a is the y intercept of y = mx +c
+ - Squares when asked to find the square root
+ - source_sentence: Count a number of objects 1,3,5,7, … ? Which pattern matches the
+   sequence above? ![A sequence of 4 patterns. The first pattern is 1 green dot.
+   The second pattern is green dots arranged in a 2 by 2 square shape. The third
+   pattern is green dots arranged in a 3 by 3 square shape. The fourth pattern is
+   green dots arranged in a 4 by 4 square shape. ]()
  sentences:
+ - 'Subtracts instead of adds when answering worded problems '
+ - When multiplying a decimal less than 1 by an integer, gives an answer 10 times
+   smaller than it should be
+ - When given a linear sequence, cannot match it to a visual pattern
+ - source_sentence: Express one quantity as a fraction of another A group of 8 friends
+   share £6 equally. What fraction of the money do they each get? 1/8
  sentences:
+ - Thinks the fraction 1/n can express sharing any number of items between n people
+ - 'Does not understand that in the ratio 1:n the total number of parts would be
+   1+n '
+ - Does not recognise the distributive property
  ---

  # SentenceTransformer based on BAAI/bge-large-en-v1.5
 
  model = SentenceTransformer("Gurveer05/bge-large-eedi-2024")
  # Run inference
  sentences = [
+     'Express one quantity as a fraction of another A group of 8 friends share £6 equally. What fraction of the money do they each get? 1/8',
+     'Thinks the fraction 1/n can express sharing any number of items between n people',
+     'Does not recognise the distributive property',
  ]
  embeddings = model.encode(sentences)
  print(embeddings.shape)
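
The card's inference snippet stops at printing the embedding shape. As a hedged follow-up (not part of the committed card), the question embedding can be scored against the two candidate misconceptions with the `similarity` helper available in Sentence Transformers 3.x, continuing from the variables defined in that snippet:

```python
# Continues the card's snippet: `model` and `embeddings` are already defined there.
# Sketch only -- `model.similarity` uses cosine similarity by default in
# Sentence Transformers 3.x; the row/column split below is an illustrative choice.
query_embedding = embeddings[:1]        # the maths question
candidate_embeddings = embeddings[1:]   # the two candidate misconceptions
scores = model.similarity(query_embedding, candidate_embeddings)
print(scores)  # shape (1, 2); the higher score marks the closer misconception
```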
 
  #### csv

  * Dataset: csv
+ * Size: 2,442 training samples
  * Columns: <code>sentence1</code> and <code>sentence2</code>
  * Approximate statistics based on the first 1000 samples:
    |         | sentence1 | sentence2 |
    |:--------|:----------|:----------|
    | type    | string    | string    |
+   | details | <ul><li>min: 13 tokens</li><li>mean: 56.55 tokens</li><li>max: 306 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 15.13 tokens</li><li>max: 40 tokens</li></ul> |
  * Samples:
+   | sentence1 | sentence2 |
+   |:----------|:----------|
+   | <code>Calculate the distance travelled using a speed-time graph Here is a speed-time graph for a car. Which of the following gives the best estimate for the distance travelled between 8 and 10 seconds? ![A graph showing time in seconds on the x axis and speed in metres per second on the y axis. The curve passes through the points (8,15) and (10,24)]() 48 m</code> | <code>Believes that when finding area under graph you can use the upper y value rather than average of upper and lower</code> |
+   | <code>Add proper fractions with the same denominator Work out: 4/11+7/11 Write your answer in its simplest form. 11/11</code> | <code>Forgot to simplify the fraction</code> |
+   | <code>Count a number of objects 1,3,5,7, ? Which pattern matches the sequence above? ![A sequence of 4 patterns. The first pattern is 1 green dot. The second pattern is green dots arranged in a 2 by 2 square shape. The third pattern is green dots arranged in a 3 by 3 square shape. The fourth pattern is green dots arranged in a 4 by 4 square shape. ]()</code> | <code>When given a linear sequence, cannot match it to a visual pattern</code> |
  * Loss: [<code>MultipleNegativesSymmetricRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativessymmetricrankingloss) with these parameters:
    ```json
    {
 
  - `per_device_train_batch_size`: 16
  - `per_device_eval_batch_size`: 16
  - `num_train_epochs`: 20
+ - `warmup_ratio`: 0.1
  - `fp16`: True
  - `load_best_model_at_end`: True
  - `batch_sampler`: no_duplicates
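
The hyperparameters in this hunk (and the `warmup_ratio: 0.1` repeated in the next one) map directly onto the Sentence Transformers 3.x training API. The sketch below is not taken from the commit; it only illustrates how a comparable run could be wired up. The csv path, output directory, and the commented eval-related line are assumptions, and the loss is constructed with its library defaults because the card's JSON parameter block is truncated in this diff.

```python
# Hypothetical re-creation of the training setup described in the card.
# Assumed: a local "train.csv" with `sentence1`/`sentence2` columns, and the output_dir name.
from datasets import load_dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MultipleNegativesSymmetricRankingLoss
from sentence_transformers.training_args import BatchSamplers

model = SentenceTransformer("BAAI/bge-large-en-v1.5")
train_dataset = load_dataset("csv", data_files="train.csv", split="train")

# Loss named in the card; scale/similarity_fct left at the library defaults here.
loss = MultipleNegativesSymmetricRankingLoss(model)

args = SentenceTransformerTrainingArguments(
    output_dir="bge-large-eedi-2024",
    num_train_epochs=20,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    warmup_ratio=0.1,
    fp16=True,
    batch_sampler=BatchSamplers.NO_DUPLICATES,  # `no_duplicates`, as listed above
    # load_best_model_at_end=True,  # as in the card; also needs an eval split and
    #                               # matching eval/save strategies, omitted here
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()
```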
 
  - `max_steps`: -1
  - `lr_scheduler_type`: linear
  - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.1
  - `warmup_steps`: 0
  - `log_level`: passive
  - `log_level_replica`: warning
 
  </details>

  ### Training Logs
+ | Epoch     | Step    | Training Loss |
+ |:---------:|:-------:|:-------------:|
+ | 0.3766    | 29      | 1.4411        |
+ | 0.7532    | 58      | 1.0084        |
+ | 1.1299    | 87      | 0.7363        |
+ | 1.5065    | 116     | 0.5658        |
+ | 1.8831    | 145     | 0.4697        |
+ | 2.2597    | 174     | 0.307         |
+ | 2.6364    | 203     | 0.2828        |
+ | 3.0130    | 232     | 0.1616        |
+ | 3.3896    | 261     | 0.1542        |
+ | 3.7662    | 290     | 0.1315        |
+ | 4.1429    | 319     | 0.0984        |
+ | 4.5195    | 348     | 0.1066        |
+ | 4.8961    | 377     | 0.0768        |
+ | 5.2727    | 406     | 0.0641        |
+ | 5.6494    | 435     | 0.0558        |
+ | 6.0260    | 464     | 0.0495        |
+ | 6.4026    | 493     | 0.0459        |
+ | 6.7792    | 522     | 0.0397        |
+ | 7.1558    | 551     | 0.0255        |
+ | 7.5325    | 580     | 0.0278        |
+ | 7.9091    | 609     | 0.0237        |
+ | 8.2857    | 638     | 0.0238        |
+ | 8.6623    | 667     | 0.0248        |
+ | **9.039** | **696** | **0.0158**    |
+ | 9.4156    | 725     | 0.0176        |
+ | 9.7922    | 754     | 0.017         |
+ | 10.1688   | 783     | 0.0116        |
+ | 10.5455   | 812     | 0.0192        |
+ | 10.9221   | 841     | 0.0076        |
+ | 11.2987   | 870     | 0.009         |

  * The bold row denotes the saved checkpoint.

  ### Framework Versions
  - Python: 3.10.14
+ - Sentence Transformers: 3.1.1
  - Transformers: 4.44.0
  - PyTorch: 2.4.0
  - Accelerate: 0.33.0
config_sentence_transformers.json CHANGED
@@ -1,6 +1,6 @@
  {
    "__version__": {
-     "sentence_transformers": "3.1.0",
      "transformers": "4.44.0",
      "pytorch": "2.4.0"
    },

  {
    "__version__": {
+     "sentence_transformers": "3.1.1",
      "transformers": "4.44.0",
      "pytorch": "2.4.0"
    },
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:b22e8ee80d2523e7569113f4093e1b74199500b33f7ac4e7f69b23a04e6cdaac
  size 1340612432

  version https://git-lfs.github.com/spec/v1
+ oid sha256:0a05fe01c79e9d58438063e8a0f24a4341a0671378aaa11eee7fa7a304ce60e5
  size 1340612432