labdmitriy committed
Commit 7c7e274 · verified · 1 Parent(s): 141b86c

Add new SentenceTransformer model.

Files changed (3)
  1. README.md +58 -31
  2. config.json +1 -1
  3. model.safetensors +1 -1
README.md CHANGED
@@ -6,7 +6,7 @@ tags:
  - generated_from_trainer
  - dataset_size:208
  - loss:BatchSemiHardTripletLoss
- base_model: BAAI/bge-base-en
+ base_model: BAAI/bge-base-en-v1.5
  widget:
  - source_sentence: '

@@ -362,7 +362,7 @@ metrics:
  - euclidean_accuracy
  - max_accuracy
  model-index:
- - name: SentenceTransformer based on BAAI/bge-base-en
+ - name: SentenceTransformer based on BAAI/bge-base-en-v1.5
  results:
  - task:
  type: triplet
@@ -372,19 +372,19 @@ model-index:
  type: bge-base-en-v1.5-train
  metrics:
  - type: cosine_accuracy
- value: 0.8269230769230769
+ value: 0.8461538461538461
  name: Cosine Accuracy
  - type: dot_accuracy
- value: 0.17307692307692307
+ value: 0.15384615384615385
  name: Dot Accuracy
  - type: manhattan_accuracy
- value: 0.8269230769230769
+ value: 0.8509615384615384
  name: Manhattan Accuracy
  - type: euclidean_accuracy
- value: 0.8269230769230769
+ value: 0.8461538461538461
  name: Euclidean Accuracy
  - type: max_accuracy
- value: 0.8269230769230769
+ value: 0.8509615384615384
  name: Max Accuracy
  - task:
  type: triplet
@@ -394,31 +394,46 @@ model-index:
  type: bge-base-en-v1.5-eval
  metrics:
  - type: cosine_accuracy
- value: 0.9848484848484849
+ value: 1.0
  name: Cosine Accuracy
  - type: dot_accuracy
- value: 0.015151515151515152
+ value: 0.0
  name: Dot Accuracy
  - type: manhattan_accuracy
- value: 0.9696969696969697
+ value: 1.0
  name: Manhattan Accuracy
  - type: euclidean_accuracy
- value: 0.9848484848484849
+ value: 1.0
  name: Euclidean Accuracy
  - type: max_accuracy
- value: 0.9848484848484849
+ value: 1.0
+ name: Max Accuracy
+ - type: cosine_accuracy
+ value: 1.0
+ name: Cosine Accuracy
+ - type: dot_accuracy
+ value: 0.0
+ name: Dot Accuracy
+ - type: manhattan_accuracy
+ value: 1.0
+ name: Manhattan Accuracy
+ - type: euclidean_accuracy
+ value: 1.0
+ name: Euclidean Accuracy
+ - type: max_accuracy
+ value: 1.0
  name: Max Accuracy
  ---

- # SentenceTransformer based on BAAI/bge-base-en
+ # SentenceTransformer based on BAAI/bge-base-en-v1.5

- This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-base-en](https://huggingface.co/BAAI/bge-base-en). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

  ## Model Details

  ### Model Description
  - **Model Type:** Sentence Transformer
- - **Base model:** [BAAI/bge-base-en](https://huggingface.co/BAAI/bge-base-en) <!-- at revision b737bf5dcc6ee8bdc530531266b4804a5d77b5d8 -->
+ - **Base model:** [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) <!-- at revision a5beb1e3e68b9ab74eb54cfd186867f64f240e1a -->
  - **Maximum Sequence Length:** 512 tokens
  - **Output Dimensionality:** 768 tokens
  - **Similarity Function:** Cosine Similarity
@@ -506,25 +521,37 @@ You can finetune this model on your own dataset.
  * Dataset: `bge-base-en-v1.5-train`
  * Evaluated with [<code>TripletEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.TripletEvaluator)

- | Metric | Value |
- |:-------------------|:-----------|
- | cosine_accuracy | 0.8269 |
- | dot_accuracy | 0.1731 |
- | manhattan_accuracy | 0.8269 |
- | euclidean_accuracy | 0.8269 |
- | **max_accuracy** | **0.8269** |
+ | Metric | Value |
+ |:-------------------|:----------|
+ | cosine_accuracy | 0.8462 |
+ | dot_accuracy | 0.1538 |
+ | manhattan_accuracy | 0.851 |
+ | euclidean_accuracy | 0.8462 |
+ | **max_accuracy** | **0.851** |
+
+ #### Triplet
+ * Dataset: `bge-base-en-v1.5-eval`
+ * Evaluated with [<code>TripletEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.TripletEvaluator)
+
+ | Metric | Value |
+ |:-------------------|:--------|
+ | cosine_accuracy | 1.0 |
+ | dot_accuracy | 0.0 |
+ | manhattan_accuracy | 1.0 |
+ | euclidean_accuracy | 1.0 |
+ | **max_accuracy** | **1.0** |

  #### Triplet
  * Dataset: `bge-base-en-v1.5-eval`
  * Evaluated with [<code>TripletEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.TripletEvaluator)

- | Metric | Value |
- |:-------------------|:-----------|
- | cosine_accuracy | 0.9848 |
- | dot_accuracy | 0.0152 |
- | manhattan_accuracy | 0.9697 |
- | euclidean_accuracy | 0.9848 |
- | **max_accuracy** | **0.9848** |
+ | Metric | Value |
+ |:-------------------|:--------|
+ | cosine_accuracy | 1.0 |
+ | dot_accuracy | 0.0 |
+ | manhattan_accuracy | 1.0 |
+ | euclidean_accuracy | 1.0 |
+ | **max_accuracy** | **1.0** |

  <!--
  ## Bias, Risks and Limitations
@@ -713,8 +740,8 @@ You can finetune this model on your own dataset.
  ### Training Logs
  | Epoch | Step | bge-base-en-v1.5-eval_max_accuracy | bge-base-en-v1.5-train_max_accuracy |
  |:-----:|:----:|:----------------------------------:|:-----------------------------------:|
- | 0 | 0 | - | 0.8269 |
- | 5.0 | 65 | 0.9848 | - |
+ | 0 | 0 | - | 0.8510 |
+ | 5.0 | 65 | 1.0 | - |


  ### Framework Versions
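Note on the README.md metrics: the accuracy values changed in this diff are triplet accuracies, meaning the share of (anchor, positive, negative) triplets in which the anchor is closer to the positive than to the negative under a given distance. The following is a minimal sketch of that computation in plain Python, not the `TripletEvaluator` implementation itself; the helper names and toy vectors are hypothetical. The final lines check that the new train values are consistent with 176 and 177 correct triplets out of 208, which is an inference from `dataset_size:208` and not something the commit states.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def euclidean_distance(u, v):
    """Return the L2 distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_accuracy(triplets, distance):
    """Fraction of triplets where the anchor is closer to the positive."""
    correct = sum(1 for a, p, n in triplets if distance(a, p) < distance(a, n))
    return correct / len(triplets)


# Hypothetical 2-D "embeddings" as (anchor, positive, negative) triplets.
toy_triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),    # positive is closer
    ([0.0, 1.0], [0.1, 0.9], [1.0, 0.0]),    # positive is closer
    ([1.0, 1.0], [-1.0, -1.0], [1.0, 0.9]),  # negative is closer
]

for name, fn in (("cosine", cosine_distance), ("euclidean", euclidean_distance)):
    print(name, triplet_accuracy(toy_triplets, fn))

# Assumed counts: 176/208 and 177/208 match the reported train values.
print(176 / 208, 177 / 208)
```

Each metric in the model card uses the same counting rule with a different distance function, and `max_accuracy` is the largest of the per-distance values.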
config.json CHANGED
@@ -1,5 +1,5 @@
  {
- "_name_or_path": "BAAI/bge-base-en",
+ "_name_or_path": "BAAI/bge-base-en-v1.5",
  "architectures": [
  "BertModel"
  ],
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:77c3b053b4a39609fc2ccc2808f96eac1aae385bf74c3147f5532c3e80fdf055
+ oid sha256:4226550437f27f985a4aaa7684a4bfcf05baedd330b64315cbdf0882a4d02c57
  size 437951328
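Note on model.safetensors: the file is tracked as a Git LFS pointer, so the diff shows only a new SHA-256 object ID and an unchanged size, not the weights themselves. Below is a minimal sketch, assuming the three-field pointer layout shown above, of parsing such a pointer and checking a byte payload against it. `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the payload is a toy stand-in; a file of roughly 438 MB would normally be hashed in chunks instead of being loaded into memory.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data, pointer):
    """Return True if the bytes have the pointer's size and digest."""
    if len(data) != pointer["size"]:
        return False
    return hashlib.new(pointer["algo"], data).hexdigest() == pointer["digest"]


# Pointer content as it appears on the "+" side of the diff.
NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:4226550437f27f985a4aaa7684a4bfcf05baedd330b64315cbdf0882a4d02c57
size 437951328
"""

pointer = parse_lfs_pointer(NEW_POINTER)
print(pointer["algo"], pointer["size"])

# Toy payload (not the real weights) with a pointer built to match it.
payload = b"toy payload, not the real weights"
toy_pointer = {
    "version": pointer["version"],
    "algo": "sha256",
    "digest": hashlib.sha256(payload).hexdigest(),
    "size": len(payload),
}
print(matches_pointer(payload, toy_pointer))
```

Because the size is identical before and after, the object ID is the only indication in this diff that the weights were replaced.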