---
language:
  - en
license: apache-2.0
tags:
  - biencoder
  - sentence-transformers
  - text-classification
  - sentence-pair-classification
  - semantic-similarity
  - semantic-search
  - retrieval
  - reranking
  - generated_from_trainer
  - dataset_size:483820
  - loss:MultipleNegativesSymmetricRankingLoss
base_model: Alibaba-NLP/gte-modernbert-base
widget:
  - source_sentence: >-
      See Precambrian time scale # Proposed Geologic timeline for another set of
      periods 4600 -- 541 MYA .
    sentences:
      - >-
        In 2014 election , Biju Janata Dal candidate Tathagat Satapathy
        Bharatiya Janata party candidate Rudra Narayan Pany defeated with a
        margin of 1.37,340 votes .
      - >-
        In Scotland , the Strathclyde Partnership for Transport , formerly known
        as Strathclyde Passenger Transport Executive , comprises the former
        Strathclyde region , which includes the urban area around Glasgow .
      - >-
        See Precambrian Time Scale # Proposed Geological Timeline for another
        set of periods of 4600 -- 541 MYA .
  - source_sentence: >-
      It is also 5 kilometers northeast of Tamaqua , 27 miles south of Allentown
      and 9 miles northwest of Hazleton .
    sentences:
      - In 1948 he moved to Massachusetts , and eventually settled in Vermont .
      - >-
        Suddenly I remembered that I was a New Zealander , I caught the first
        plane home and came back .
      - >-
        It is also 5 miles northeast of Tamaqua , 27 miles south of Allentown ,
        and 9 miles northwest of Hazleton .
  - source_sentence: >-
      The party has a Member of Parliament , a member of the House of Lords ,
      three members of the London Assembly and two Members of the European
      Parliament .
    sentences:
      - >-
        The party has one Member of Parliament , one member of the House of
        Lords , three Members of the London Assembly and two Members of the
        European Parliament .
      - >-
        Grapsid crabs dominate in Australia , Malaysia and Panama , while
        gastropods Cerithidea scalariformis and Melampus coeffeus are important
        seed predators in Florida mangroves .
      - >-
        Music Story is a music service website and international music data
        provider that curates , aggregates and analyses metadata for digital
        music services .
  - source_sentence: >-
      The play received two 1969 Tony Award nominations : Best Actress in a Play
      ( Michael Annals ) and Best Costume Design ( Charlotte Rae ) .
    sentences:
      - >-
        Ravishanker is a fellow of the International Statistical Institute and
        an elected member of the American Statistical Association .
      - >-
        In 1969 , the play received two Tony - Award nominations : Best Actress
        in a Theatre Play ( Michael Annals ) and Best Costume Design ( Charlotte
        Rae ) .
      - >-
        AMD and Nvidia both have proprietary methods of scaling , CrossFireX for
        AMD , and SLI for Nvidia .
  - source_sentence: >-
      He was a close friend of Ángel Cabrera and is a cousin of golfer Tony
      Croatto .
    sentences:
      - >-
        He was a close friend of Ángel Cabrera , and is a cousin of golfer Tony
        Croatto .
      - >-
        Eugenijus Bartulis ( born December 7 , 1949 in Kaunas ) is a Lithuanian
        Roman Catholic priest , and Bishop of Šiauliai .
      - >-
        UWIRE also distributes its members content to professional media outlets
        , including Yahoo , CNN and CBS News .
datasets:
  - redis/langcache-sentencepairs-v1
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
  - cosine_accuracy
  - cosine_accuracy_threshold
  - cosine_f1
  - cosine_f1_threshold
  - cosine_precision
  - cosine_recall
  - cosine_ap
  - cosine_mcc
model-index:
  - name: Redis fine-tuned BiEncoder model for semantic caching on LangCache
    results:
      - task:
          type: binary-classification
          name: Binary Classification
        dataset:
          name: test
          type: test
        metrics:
          - type: cosine_accuracy
            value: 0.7276245142774221
            name: Cosine Accuracy
          - type: cosine_accuracy_threshold
            value: 0.8017503619194031
            name: Cosine Accuracy Threshold
          - type: cosine_f1
            value: 0.723032161181329
            name: Cosine F1
          - type: cosine_f1_threshold
            value: 0.7345461845397949
            name: Cosine F1 Threshold
          - type: cosine_precision
            value: 0.6233076217703221
            name: Cosine Precision
          - type: cosine_recall
            value: 0.8607448789571694
            name: Cosine Recall
          - type: cosine_ap
            value: 0.7251364855292874
            name: Cosine Ap
          - type: cosine_mcc
            value: 0.4684913821533736
            name: Cosine Mcc
---

Redis fine-tuned BiEncoder model for semantic caching on LangCache

This is a sentence-transformers model finetuned from Alibaba-NLP/gte-modernbert-base on the LangCache Sentence Pairs (all) dataset. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for sentence-pair similarity tasks such as semantic caching, semantic search, and paraphrase detection.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: Alibaba-NLP/gte-modernbert-base
  • Maximum Sequence Length: 100 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset: redis/langcache-sentencepairs-v1
  • Language: en
  • License: apache-2.0

Model Sources

  • Documentation: Sentence Transformers Documentation (https://www.sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: https://huggingface.co/redis/langcache-embed-v3

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 100, 'do_lower_case': False, 'architecture': 'ModernBertModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
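
The module list above shows a ModernBERT encoder followed by CLS-token pooling and no normalization layer, so a sentence embedding is simply the final hidden state of the first ([CLS]) token. As a hedged illustration of that pipeline (assuming the repository exposes the underlying ModernBERT weights to AutoModel, as Sentence Transformers checkpoints normally do), the same vectors can be computed with the lower-level transformers API:

import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("redis/langcache-embed-v3")
model = AutoModel.from_pretrained("redis/langcache-embed-v3")

inputs = tokenizer(
    ["He was a close friend of Ángel Cabrera ."],
    padding=True,
    truncation=True,
    max_length=100,  # matches max_seq_length in the Transformer module above
    return_tensors="pt",
)
with torch.no_grad():
    outputs = model(**inputs)

# CLS pooling: the sentence embedding is the hidden state of token 0.
embedding = outputs.last_hidden_state[:, 0]
print(embedding.shape)  # torch.Size([1, 768])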

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("redis/langcache-embed-v3")
# Run inference
sentences = [
    'He was a close friend of Ángel Cabrera and is a cousin of golfer Tony Croatto .',
    'He was a close friend of Ángel Cabrera , and is a cousin of golfer Tony Croatto .',
    'UWIRE also distributes its members content to professional media outlets , including Yahoo , CNN and CBS News .',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[0.9961, 0.9961, 0.1250],
#         [0.9961, 0.9961, 0.1162],
#         [0.1250, 0.1162, 1.0078]], dtype=torch.bfloat16)
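
Because the model targets semantic caching, a common pattern is to embed an incoming query, compare it against previously cached queries, and serve the cached response when cosine similarity clears a threshold. Below is a minimal sketch of such a lookup; cache_lookup is a hypothetical helper, and the 0.7345 cutoff is simply the cosine_f1_threshold reported in the Evaluation section, which you should re-tune for your own precision/recall trade-off:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("redis/langcache-embed-v3")
THRESHOLD = 0.7345  # cosine_f1_threshold from the Evaluation section below

def cache_lookup(query: str, cached_queries: list[str]) -> int | None:
    """Return the index of the best cached match, or None on a cache miss."""
    if not cached_queries:
        return None
    embeddings = model.encode([query] + cached_queries)
    scores = model.similarity(embeddings[:1], embeddings[1:])[0]
    best = int(scores.argmax())
    return best if float(scores[best]) >= THRESHOLD else None

print(cache_lookup("How do I reset my password?", ["How can I change my password?"]))
# e.g. 0 on a hit, None on a miss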

Evaluation

Metrics

Binary Classification

Metric Value
cosine_accuracy 0.7276
cosine_accuracy_threshold 0.8018
cosine_f1 0.723
cosine_f1_threshold 0.7345
cosine_precision 0.6233
cosine_recall 0.8607
cosine_ap 0.7251
cosine_mcc 0.4685
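
These metrics are computed by thresholding cosine similarity over labeled sentence pairs. A hedged sketch of recomputing them with the library's built-in evaluator follows; the two pairs are placeholders rather than the actual test split:

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import BinaryClassificationEvaluator

model = SentenceTransformer("redis/langcache-embed-v3")

evaluator = BinaryClassificationEvaluator(
    sentences1=["How do I reset my password?", "What is the capital of France?"],
    sentences2=["How can I change my password?", "Paris is in France."],
    labels=[1, 0],  # 1 = paraphrase/duplicate, 0 = distinct
    name="test",
)
results = evaluator(model)
print(results)  # includes test_cosine_accuracy, test_cosine_f1, test_cosine_ap, ...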

Training Details

Training Dataset

LangCache Sentence Pairs (all)

  • Dataset: LangCache Sentence Pairs (all)
  • Size: 26,850 training samples
  • Columns: sentence1, sentence2, and label
  • Approximate statistics based on the first 1000 samples:
    • sentence1 (string): min 8 tokens, mean 27.35 tokens, max 53 tokens
    • sentence2 (string): min 8 tokens, mean 27.27 tokens, max 52 tokens
    • label (int): 1: 100.00%
  • Samples (sentence1 | sentence2 | label):
    • The newer Punts are still very much in existence today and race in the same fleets as the older boats . | The newer punts are still very much in existence today and run in the same fleets as the older boats . | 1
    • After losing his second election , he resigned as opposition leader and was replaced by Geoff Pearsall . | Max Bingham resigned as opposition leader after losing his second election , and was replaced by Geoff Pearsall . | 1
    • The 12F was officially homologated on August 21 , 1929 and exhibited at the Paris Salon in 1930 . | The 12F was officially homologated on 21 August 1929 and displayed at the 1930 Paris Salon . | 1
  • Loss: MultipleNegativesSymmetricRankingLoss with these parameters (see the instantiation sketch after this list):
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
    
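
A hedged sketch of instantiating this loss for your own fine-tuning; util.cos_sim is the library default, written out to mirror the configuration above:

from sentence_transformers import SentenceTransformer, util
from sentence_transformers.losses import MultipleNegativesSymmetricRankingLoss

model = SentenceTransformer("Alibaba-NLP/gte-modernbert-base")
# Symmetric variant of multiple-negatives ranking: in-batch negatives
# are scored in both directions (anchor -> positive and positive -> anchor).
loss = MultipleNegativesSymmetricRankingLoss(
    model,
    scale=20.0,
    similarity_fct=util.cos_sim,
)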

Evaluation Dataset

LangCache Sentence Pairs (all)

  • Dataset: LangCache Sentence Pairs (all)
  • Size: 26,850 evaluation samples
  • Columns: sentence1, sentence2, and label
  • Approximate statistics based on the first 1000 samples:
    • sentence1 (string): min 8 tokens, mean 27.35 tokens, max 53 tokens
    • sentence2 (string): min 8 tokens, mean 27.27 tokens, max 52 tokens
    • label (int): 1: 100.00%
  • Samples (sentence1 | sentence2 | label):
    • The newer Punts are still very much in existence today and race in the same fleets as the older boats . | The newer punts are still very much in existence today and run in the same fleets as the older boats . | 1
    • After losing his second election , he resigned as opposition leader and was replaced by Geoff Pearsall . | Max Bingham resigned as opposition leader after losing his second election , and was replaced by Geoff Pearsall . | 1
    • The 12F was officially homologated on August 21 , 1929 and exhibited at the Paris Salon in 1930 . | The 12F was officially homologated on 21 August 1929 and displayed at the 1930 Paris Salon . | 1
  • Loss: MultipleNegativesSymmetricRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 100
  • per_device_eval_batch_size: 100
  • learning_rate: 0.0001
  • adam_beta2: 0.98
  • adam_epsilon: 1e-06
  • max_steps: 200000
  • warmup_steps: 1000
  • load_best_model_at_end: True
  • optim: adamw_torch
  • ddp_find_unused_parameters: False
  • push_to_hub: True
  • hub_model_id: redis/langcache-embed-v3
  • batch_sampler: no_duplicates
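
The non-default hyperparameters above translate into the following hedged training sketch. The split names and the label-column handling are assumptions; consult the redis/langcache-sentencepairs-v1 dataset card for its actual layout:

from datasets import load_dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MultipleNegativesSymmetricRankingLoss
from sentence_transformers.training_args import BatchSamplers

model = SentenceTransformer("Alibaba-NLP/gte-modernbert-base")
dataset = load_dataset("redis/langcache-sentencepairs-v1")  # split names assumed below
# All labels in the card's samples are 1 (positives), so the pairs can be
# fed to the ranking loss directly; dropping "label" is an assumption.
dataset = dataset.remove_columns("label")
loss = MultipleNegativesSymmetricRankingLoss(model, scale=20.0)

args = SentenceTransformerTrainingArguments(
    output_dir="langcache-embed-v3",
    per_device_train_batch_size=100,
    per_device_eval_batch_size=100,
    learning_rate=1e-4,
    adam_beta2=0.98,
    adam_epsilon=1e-6,
    max_steps=200_000,
    warmup_steps=1000,
    eval_strategy="steps",
    eval_steps=1000,   # assumption; the card only states eval_strategy: steps
    save_steps=1000,   # must match the eval cadence for load_best_model_at_end
    load_best_model_at_end=True,
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)
trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],  # assumed split name
    loss=loss,
)
trainer.train()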

All Hyperparameters

Click to expand
  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 100
  • per_device_eval_batch_size: 100
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 0.0001
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.98
  • adam_epsilon: 1e-06
  • max_grad_norm: 1.0
  • num_train_epochs: 3.0
  • max_steps: 200000
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 1000
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: False
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: True
  • resume_from_checkpoint: None
  • hub_model_id: redis/langcache-embed-v3
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Click to expand
Epoch Step Training Loss Validation Loss test_cosine_ap
-1 -1 - - 0.6476
0.2067 1000 0.0165 0.1033 0.6705
0.4133 2000 0.0067 0.0977 0.6597
0.6200 3000 0.0061 0.0955 0.6670
0.8266 4000 0.0063 0.0945 0.6678
1.0333 5000 0.0059 0.0950 0.6786
1.2399 6000 0.0054 0.0880 0.6779
1.4466 7000 0.0054 0.0876 0.6791
1.6532 8000 0.0054 0.0833 0.6652
1.8599 9000 0.0051 0.0821 0.6760
2.0665 10000 0.0048 0.0818 0.6767
2.2732 11000 0.0044 0.0796 0.6732
2.4799 12000 0.0048 0.0790 0.6717
2.6865 13000 0.0043 0.0804 0.6748
2.8932 14000 0.0048 0.0790 0.6745
3.0998 15000 0.0033 0.0775 0.6693
3.3065 16000 0.0044 0.0769 0.6767
3.5131 17000 0.005 0.0770 0.6768
3.7198 18000 0.0044 0.0760 0.6761
3.9264 19000 0.0039 0.0741 0.6799
4.1331 20000 0.0044 0.0750 0.6888
4.3397 21000 0.0041 0.0751 0.7019
4.5464 22000 0.0044 0.0707 0.7009
4.7530 23000 0.0039 0.0726 0.7041
4.9597 24000 0.0042 0.0712 0.6971
5.1664 25000 0.0038 0.0718 0.6978
5.3730 26000 0.004 0.0703 0.7035
5.5797 27000 0.004 0.0706 0.6976
5.7863 28000 0.0042 0.0699 0.6964
5.9930 29000 0.0044 0.0699 0.6911
6.1996 30000 0.0035 0.0702 0.6791
6.4063 31000 0.0035 0.0690 0.6955
6.6129 32000 0.0037 0.0693 0.6917
6.8196 33000 0.0035 0.0691 0.6972
7.0262 34000 0.004 0.0695 0.7083
7.2329 35000 0.0037 0.0690 0.6994
7.4396 36000 0.0036 0.0670 0.7060
7.6462 37000 0.0042 0.0682 0.6963
7.8529 38000 0.0037 0.0678 0.7049
8.0595 39000 0.0039 0.0682 0.7014
8.2662 40000 0.0039 0.0684 0.6969
8.4728 41000 0.0041 0.0677 0.7007
8.6795 42000 0.0038 0.0671 0.7126
8.8861 43000 0.0035 0.0684 0.7150
9.0928 44000 0.0035 0.0671 0.7043
9.2994 45000 0.0038 0.0681 0.7021
9.5061 46000 0.0038 0.0687 0.7129
9.7128 47000 0.0038 0.0684 0.7215
9.9194 48000 0.0039 0.0668 0.7179
10.1261 49000 0.0031 0.0661 0.7129
10.3327 50000 0.0033 0.0664 0.7119
10.5394 51000 0.0034 0.0668 0.7162
10.7460 52000 0.0038 0.0666 0.7181
10.9527 53000 0.0034 0.0674 0.7046
11.1593 54000 0.0034 0.0657 0.7100
11.3660 55000 0.0035 0.0656 0.7163
11.5726 56000 0.0034 0.0656 0.7003
11.7793 57000 0.0036 0.0643 0.7009
11.9859 58000 0.0038 0.0649 0.7166
12.1926 59000 0.0039 0.0659 0.7168
12.3993 60000 0.0039 0.0647 0.7080
12.6059 61000 0.0032 0.0649 0.7114
12.8126 62000 0.0034 0.0646 0.7165
13.0192 63000 0.0034 0.0654 0.7197
13.2259 64000 0.0035 0.0657 0.7179
13.4325 65000 0.0031 0.0652 0.7107
13.6392 66000 0.0032 0.0649 0.7089
13.8458 67000 0.0034 0.0655 0.7089
14.0525 68000 0.0031 0.0668 0.7163
14.2591 69000 0.0035 0.0644 0.7213
14.4658 70000 0.0035 0.0634 0.7057
14.6725 71000 0.0035 0.0635 0.7049
14.8791 72000 0.0033 0.0627 0.7094
15.0858 73000 0.0037 0.0620 0.7140
15.2924 74000 0.0035 0.0628 0.7237
15.4991 75000 0.003 0.0625 0.7127
15.7057 76000 0.0036 0.0635 0.7127
15.9124 77000 0.0037 0.0621 0.7104
16.1190 78000 0.0033 0.0624 0.7132
16.3257 79000 0.0035 0.0632 0.7132
16.5323 80000 0.003 0.0626 0.7193
16.7390 81000 0.0033 0.0628 0.7179
16.9456 82000 0.0036 0.0630 0.7210
17.1523 83000 0.0033 0.0628 0.7222
17.3590 84000 0.0034 0.0629 0.7226
17.5656 85000 0.0029 0.0621 0.7207
17.7723 86000 0.0032 0.0618 0.7182
17.9789 87000 0.0034 0.0620 0.7177
18.1856 88000 0.0034 0.0625 0.7148
18.3922 89000 0.0032 0.0624 0.7131
18.5989 90000 0.0032 0.0622 0.7126
18.8055 91000 0.0031 0.0617 0.7185
19.0122 92000 0.0032 0.0620 0.7231
19.2188 93000 0.0028 0.0623 0.7202
19.4255 94000 0.003 0.0625 0.7194
19.6322 95000 0.003 0.0619 0.7139
19.8388 96000 0.0031 0.0621 0.7151
20.0455 97000 0.0031 0.0617 0.7188
20.2521 98000 0.0031 0.0619 0.7161
20.4588 99000 0.0027 0.0612 0.7164
20.6654 100000 0.0033 0.0616 0.7173
20.8721 101000 0.0033 0.0614 0.7182
21.0787 102000 0.003 0.0611 0.7194
21.2854 103000 0.0031 0.0614 0.7191
21.4920 104000 0.0031 0.0615 0.7187
21.6987 105000 0.0035 0.0609 0.7143
21.9054 106000 0.0033 0.0614 0.7180
22.1120 107000 0.0029 0.0608 0.7215
22.3187 108000 0.0032 0.0609 0.7250
22.5253 109000 0.0029 0.0611 0.7248
22.7320 110000 0.003 0.0612 0.7224
22.9386 111000 0.0029 0.0612 0.7180
23.1453 112000 0.0032 0.0610 0.7169
23.3519 113000 0.0032 0.0609 0.7174
23.5586 114000 0.0028 0.0613 0.7204
23.7652 115000 0.0033 0.0613 0.7222
23.9719 116000 0.0033 0.0613 0.7240
24.1785 117000 0.003 0.0610 0.7244
24.3852 118000 0.0027 0.0613 0.7239
24.5919 119000 0.0028 0.0615 0.7248
24.7985 120000 0.003 0.0608 0.7259
25.0052 121000 0.0033 0.0605 0.7270
25.2118 122000 0.0035 0.0604 0.7240
25.4185 123000 0.003 0.0607 0.7245
25.6251 124000 0.003 0.0608 0.7238
25.8318 125000 0.0032 0.0605 0.7208
26.0384 126000 0.0029 0.0605 0.7208
26.2451 127000 0.0034 0.0603 0.7212
26.4517 128000 0.003 0.0605 0.7222
26.6584 129000 0.003 0.0604 0.7236
26.8651 130000 0.003 0.0608 0.7271
27.0717 131000 0.0028 0.0608 0.7242
27.2784 132000 0.0028 0.0612 0.7239
27.4850 133000 0.0025 0.0609 0.7270
27.6917 134000 0.0026 0.0607 0.7277
27.8983 135000 0.003 0.0608 0.7263
28.1050 136000 0.003 0.0609 0.7250
28.3116 137000 0.0029 0.0607 0.7262
28.5183 138000 0.0029 0.0609 0.7269
28.7249 139000 0.0029 0.0607 0.7250
28.9316 140000 0.0025 0.0608 0.7254
29.1383 141000 0.0031 0.0609 0.7262
29.3449 142000 0.0027 0.0606 0.7247
29.5516 143000 0.003 0.0607 0.7244
29.7582 144000 0.0028 0.0606 0.7240
29.9649 145000 0.0028 0.0605 0.7228
30.1715 146000 0.0032 0.0604 0.7251
30.3782 147000 0.0033 0.0603 0.7240
30.5848 148000 0.0029 0.0604 0.7242
30.7915 149000 0.0032 0.0603 0.7241
30.9981 150000 0.0028 0.0602 0.7246
31.2048 151000 0.0029 0.0602 0.7261
31.4114 152000 0.003 0.0602 0.7258
31.6181 153000 0.0031 0.0603 0.7253
31.8248 154000 0.003 0.0602 0.7250
32.0314 155000 0.0033 0.0602 0.7248
32.2381 156000 0.0031 0.0601 0.7248
32.4447 157000 0.0027 0.0602 0.7240
32.6514 158000 0.0026 0.0602 0.7243
32.8580 159000 0.0028 0.0602 0.7249
33.0647 160000 0.0033 0.0602 0.7251
33.2713 161000 0.0031 0.0602 0.7252
33.4780 162000 0.0027 0.0600 0.7247
33.6846 163000 0.0031 0.0601 0.7247
33.8913 164000 0.0032 0.0601 0.7251
34.0980 165000 0.0026 0.0602 0.7252
34.3046 166000 0.0034 0.0602 0.7252
34.5113 167000 0.0028 0.0602 0.7250
34.7179 168000 0.0029 0.0601 0.7249
34.9246 169000 0.0028 0.0602 0.7253
35.1312 170000 0.0026 0.0601 0.7249
35.3379 171000 0.0027 0.0601 0.7247
35.5445 172000 0.0031 0.0601 0.7245
35.7512 173000 0.003 0.0600 0.7245
35.9578 174000 0.003 0.0601 0.7250
36.1645 175000 0.0027 0.0600 0.7246
36.3712 176000 0.0028 0.0601 0.7248
36.5778 177000 0.0027 0.0601 0.7250
36.7845 178000 0.0028 0.0601 0.7252
36.9911 179000 0.0029 0.0601 0.7252
37.1978 180000 0.0029 0.0602 0.7251
37.4044 181000 0.0025 0.0601 0.7250
37.6111 182000 0.003 0.0601 0.7250
37.8177 183000 0.0028 0.0601 0.7251
38.0244 184000 0.0028 0.0601 0.7252
38.2310 185000 0.0034 0.0600 0.7251
38.4377 186000 0.0028 0.0601 0.7251
38.6443 187000 0.0035 0.0601 0.7250
38.8510 188000 0.003 0.0600 0.7250
39.0577 189000 0.0028 0.0601 0.7252
39.2643 190000 0.0027 0.0600 0.7250
39.4710 191000 0.0026 0.0601 0.7250
39.6776 192000 0.0028 0.0600 0.7251
39.8843 193000 0.0027 0.0600 0.7251
40.0909 194000 0.0031 0.0601 0.7252
40.2976 195000 0.0031 0.0600 0.7252
40.5042 196000 0.0029 0.0601 0.7251
40.7109 197000 0.0032 0.0600 0.7251
40.9175 198000 0.0028 0.0600 0.7251
41.1242 199000 0.0029 0.0600 0.7252
41.3309 200000 0.003 0.0600 0.7251
  • The bold row denotes the saved checkpoint.

Framework Versions

  • Python: 3.12.3
  • Sentence Transformers: 5.1.0
  • Transformers: 4.56.0
  • PyTorch: 2.8.0+cu128
  • Accelerate: 1.10.1
  • Datasets: 4.0.0
  • Tokenizers: 0.22.0
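
To approximate this environment, the versions above can be pinned directly. The package names below are the standard PyPI distributions (an assumption), and the +cu128 PyTorch build comes from the CUDA 12.8 wheel index:

pip install sentence-transformers==5.1.0 transformers==4.56.0 accelerate==1.10.1 datasets==4.0.0 tokenizers==0.22.0
pip install torch==2.8.0 --index-url https://download.pytorch.org/whl/cu128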

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}