SentenceTransformer based on BAAI/bge-large-en-v1.5

This is a sentence-transformers model finetuned from BAAI/bge-large-en-v1.5. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-large-en-v1.5
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
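
The same three-module stack can be rebuilt explicitly. The sketch below is illustrative (module classes are from the sentence-transformers library; loading the published checkpoint directly, as shown under Usage, is the normal path):

from sentence_transformers import SentenceTransformer, models

# Transformer backbone with the 512-token limit listed above
word_embedding = models.Transformer("BAAI/bge-large-en-v1.5", max_seq_length=512)
# CLS-token pooling over the 1024-dimensional word embeddings
pooling = models.Pooling(
    word_embedding.get_word_embedding_dimension(),
    pooling_mode_cls_token=True,
    pooling_mode_mean_tokens=False,
)
# L2-normalization, so dot product equals cosine similarity
normalize = models.Normalize()

model = SentenceTransformer(modules=[word_embedding, pooling, normalize])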

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("noystl/recomb-pred-bge-large-en")
# Run inference
sentences = [
    "Represent this sentence for searching relevant passages: The study addresses the need for effective time series forecasting methods to estimate the spread of epidemics, particularly in light of the resurgence of COVID-19 cases. It highlights the importance of accurately modeling both linear and non-linear features of epidemic data to provide state authorities and health officials with reliable short-term forecasts and strategies.We suggest combining 'ARIMA' and ",
    'Transformers',
    'the human brain is able to efficiently learn effective control strategies using limited resources',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
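
Because training used a query prompt (see the prompts hyperparameter under Training Hyperparameters), retrieval-style usage typically prepends that prompt to queries only, not to passages. A minimal sketch with illustrative inputs:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("noystl/recomb-pred-bge-large-en")

# Illustrative query and candidate passages (not from the training data)
query = "We suggest combining 'ARIMA' and"
passages = [
    "Transformers",
    "the human brain is able to efficiently learn effective control strategies using limited resources",
]

# Prefix only the query with the training prompt
query_embedding = model.encode(
    "Represent this sentence for searching relevant passages: " + query
)
passage_embeddings = model.encode(passages)

# Embeddings are L2-normalized, so cosine similarity ranks the candidates
scores = model.similarity(query_embedding, passage_embeddings)
print(scores.shape)
# [1, 2]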

Training Details

Training Dataset

Unnamed Dataset

  • Size: 784,827 training samples
  • Columns: query, answer, and label
  • Approximate statistics based on the first 1000 samples:
    • query: string, min: 66 tokens, mean: 83.86 tokens, max: 99 tokens
    • answer: string, min: 3 tokens, mean: 8.63 tokens, max: 49 tokens
    • label: int, 0: ~96.70%, 1: ~3.30%
  • Samples:
    The three samples below share the same query; the answers and labels differ:
    • query: Represent this sentence for searching relevant passages: The study addresses the challenge of action segmentation under weak supervision, where the available ground truth only indicates the presence of actions without providing their temporal ordering or occurrence timing in training videos. This limitation necessitates the development of a method to generate pseudo-ground truth for effective training and improve performance in action segmentation and alignment tasks.We suggest combining 'a Hidden Markov Model' and
    • answer: a multilayer perceptron, label: 1
    • answer: global expression information, label: 0
    • answer: some relevant physical parameters, label: 0
  • Loss: ContrastiveLoss with these parameters:
    {
        "distance_metric": "SiameseDistanceMetric.COSINE_DISTANCE",
        "margin": 0.5,
        "size_average": true
    }
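
As a sketch of how this loss and the (query, answer, label) layout fit together during training (the column values below are illustrative; ContrastiveLoss and SiameseDistanceMetric come from sentence-transformers):

from datasets import Dataset
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import ContrastiveLoss, SiameseDistanceMetric

model = SentenceTransformer("BAAI/bge-large-en-v1.5")

# Three-column layout matching the dataset description above:
# label 1 marks a matching (query, answer) pair, label 0 a non-matching one.
train_dataset = Dataset.from_dict({
    "query": ["We suggest combining 'a Hidden Markov Model' and"] * 2,
    "answer": ["a multilayer perceptron", "global expression information"],
    "label": [1, 0],
})

# The parameters listed above
loss = ContrastiveLoss(
    model=model,
    distance_metric=SiameseDistanceMetric.COSINE_DISTANCE,
    margin=0.5,
    size_average=True,
)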
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 64
  • learning_rate: 2.3351317368662443e-06
  • warmup_ratio: 0.11883406097525227
  • bf16: True
  • prompts: {'query': 'Represent this sentence for searching relevant passages: '}
  • batch_sampler: no_duplicates
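
A minimal sketch of wiring these values into the sentence-transformers v3 trainer (the output directory is an arbitrary assumption; model, train_dataset, and loss as in the sketch above):

from sentence_transformers import SentenceTransformerTrainer, SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="recomb-pred-bge-large-en",  # assumption: any local path
    num_train_epochs=3,
    per_device_train_batch_size=64,
    learning_rate=2.3351317368662443e-06,
    warmup_ratio=0.11883406097525227,
    bf16=True,
    prompts={"query": "Represent this sentence for searching relevant passages: "},
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()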

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2.3351317368662443e-06
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.11883406097525227
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: {'query': 'Represent this sentence for searching relevant passages: '}
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss
0.0082 100 0.0051
0.0163 200 0.0038
0.0245 300 0.0037
0.0326 400 0.0036
0.0408 500 0.0046
0.0489 600 0.0035
0.0571 700 0.0035
0.0652 800 0.0034
0.0734 900 0.0044
0.0815 1000 0.0034
0.0897 1100 0.0035
0.0979 1200 0.0034
0.1060 1300 0.0034
0.1142 1400 0.0045
0.1223 1500 0.0034
0.1305 1600 0.0034
0.1386 1700 0.0033
0.1468 1800 0.0043
0.1549 1900 0.0034
0.1631 2000 0.0033
0.1712 2100 0.0032
0.1794 2200 0.0033
0.1876 2300 0.0044
0.1957 2400 0.0033
0.2039 2500 0.0034
0.2120 2600 0.0033
0.2202 2700 0.0042
0.2283 2800 0.0034
0.2365 2900 0.0033
0.2446 3000 0.0033
0.2528 3100 0.0036
0.2609 3200 0.0039
0.2691 3300 0.0033
0.2773 3400 0.0032
0.2854 3500 0.0034
0.2936 3600 0.0041
0.3017 3700 0.0031
0.3099 3800 0.0032
0.3180 3900 0.0031
0.3262 4000 0.004
0.3343 4100 0.0034
0.3425 4200 0.003
0.3506 4300 0.0032
0.3588 4400 0.0032
0.3670 4500 0.004
0.3751 4600 0.0031
0.3833 4700 0.0033
0.3914 4800 0.0031
0.3996 4900 0.004
0.4077 5000 0.0032
0.4159 5100 0.0031
0.4240 5200 0.0031
0.4322 5300 0.0031
0.4403 5400 0.0039
0.4485 5500 0.003
0.4567 5600 0.003
0.4648 5700 0.0031
0.4730 5800 0.0038
0.4811 5900 0.0031
0.4893 6000 0.0032
0.4974 6100 0.0031
0.5056 6200 0.0033
0.5137 6300 0.0035
0.5219 6400 0.0031
0.5300 6500 0.0031
0.5382 6600 0.0031
0.5464 6700 0.0038
0.5545 6800 0.0031
0.5627 6900 0.003
0.5708 7000 0.0029
0.5790 7100 0.0037
0.5871 7200 0.0033
0.5953 7300 0.0031
0.6034 7400 0.003
0.6116 7500 0.003
0.6198 7600 0.004
0.6279 7700 0.0031
0.6361 7800 0.0031
0.6442 7900 0.0031
0.6524 8000 0.0039
0.6605 8100 0.0029
0.6687 8200 0.003
0.6768 8300 0.0029
0.6850 8400 0.0028
0.6931 8500 0.0036
0.7013 8600 0.0031
0.7095 8700 0.0029
0.7176 8800 0.0028
0.7258 8900 0.0035
0.7339 9000 0.0033
0.7421 9100 0.003
0.7502 9200 0.0028
0.7584 9300 0.0029
0.7665 9400 0.0035
0.7747 9500 0.003
0.7828 9600 0.0028
0.7910 9700 0.0027
0.7992 9800 0.0034
0.8073 9900 0.0032
0.8155 10000 0.003
0.8236 10100 0.0029
0.8318 10200 0.0032
0.8399 10300 0.0032
0.8481 10400 0.003
0.8562 10500 0.0029
0.8644 10600 0.0029
0.8725 10700 0.0033
0.8807 10800 0.003
0.8889 10900 0.0029
0.8970 11000 0.0028
0.9052 11100 0.0035
0.9133 11200 0.003
0.9215 11300 0.0029
0.9296 11400 0.0029
0.9378 11500 0.0029
0.9459 11600 0.0034
0.9541 11700 0.0031
0.9622 11800 0.0028
0.9704 11900 0.003
0.9786 12000 0.0035
0.9867 12100 0.0032
0.9949 12200 0.003
1.0030 12300 0.0033
1.0112 12400 0.0029
1.0193 12500 0.003
1.0275 12600 0.0029
1.0356 12700 0.0036
1.0438 12800 0.003
1.0519 12900 0.0027
1.0601 13000 0.0028
1.0683 13100 0.0028
1.0764 13200 0.0036
1.0846 13300 0.0027
1.0927 13400 0.0028
1.1009 13500 0.0029
1.1090 13600 0.0037
1.1172 13700 0.0029
1.1253 13800 0.0029
1.1335 13900 0.0027
1.1416 14000 0.0033
1.1498 14100 0.003
1.1580 14200 0.0027
1.1661 14300 0.0028
1.1743 14400 0.0026
1.1824 14500 0.0036
1.1906 14600 0.0028
1.1987 14700 0.0027
1.2069 14800 0.0029
1.2150 14900 0.0035
1.2232 15000 0.0027
1.2313 15100 0.0027
1.2395 15200 0.0027
1.2477 15300 0.0028
1.2558 15400 0.0035
1.2640 15500 0.0027
1.2721 15600 0.0027
1.2803 15700 0.0027
1.2884 15800 0.0037
1.2966 15900 0.0027
1.3047 16000 0.0027
1.3129 16100 0.0027
1.3210 16200 0.0028
1.3292 16300 0.0033
1.3374 16400 0.0026
1.3455 16500 0.0025
1.3537 16600 0.0028
1.3618 16700 0.0034
1.3700 16800 0.0027
1.3781 16900 0.0026
1.3863 17000 0.0027
1.3944 17100 0.0033
1.4026 17200 0.0027
1.4107 17300 0.0027
1.4189 17400 0.0026
1.4271 17500 0.0027
1.4352 17600 0.0034
1.4434 17700 0.0027
1.4515 17800 0.0025
1.4597 17900 0.0027
1.4678 18000 0.0031
1.4760 18100 0.0027
1.4841 18200 0.0027
1.4923 18300 0.0027
1.5004 18400 0.0027
1.5086 18500 0.0031
1.5168 18600 0.0025
1.5249 18700 0.0026
1.5331 18800 0.0027
1.5412 18900 0.0035
1.5494 19000 0.0025
1.5575 19100 0.0027
1.5657 19200 0.0026
1.5738 19300 0.0028
1.5820 19400 0.0032
1.5901 19500 0.0025
1.5983 19600 0.0027
1.6065 19700 0.0026
1.6146 19800 0.0034
1.6228 19900 0.0027
1.6309 20000 0.0027
1.6391 20100 0.0028
1.6472 20200 0.0031
1.6554 20300 0.0028
1.6635 20400 0.0025
1.6717 20500 0.0025
1.6798 20600 0.0026
1.6880 20700 0.003
1.6962 20800 0.0029
1.7043 20900 0.0027
1.7125 21000 0.0025
1.7206 21100 0.0029
1.7288 21200 0.0029
1.7369 21300 0.0027
1.7451 21400 0.0026
1.7532 21500 0.0025
1.7614 21600 0.003
1.7696 21700 0.0028
1.7777 21800 0.0024
1.7859 21900 0.0025
1.7940 22000 0.003
1.8022 22100 0.0026
1.8103 22200 0.0027
1.8185 22300 0.0027
1.8266 22400 0.0026
1.8348 22500 0.003
1.8429 22600 0.0029
1.8511 22700 0.0025
1.8593 22800 0.0026
1.8674 22900 0.0031
1.8756 23000 0.0027
1.8837 23100 0.0026
1.8919 23200 0.0025
1.9000 23300 0.0028
1.9082 23400 0.0027
1.9163 23500 0.0027
1.9245 23600 0.0027
1.9326 23700 0.0026
1.9408 23800 0.0031
1.9490 23900 0.0027
1.9571 24000 0.0027
1.9653 24100 0.0026
1.9734 24200 0.0032
1.9816 24300 0.0029
1.9897 24400 0.0026
1.9979 24500 0.0028
2.0060 24600 0.0029
2.0142 24700 0.0026
2.0223 24800 0.0027
2.0305 24900 0.0033
2.0387 25000 0.0026
2.0468 25100 0.0026
2.0550 25200 0.0024
2.0631 25300 0.0026
2.0713 25400 0.0033
2.0794 25500 0.0025
2.0876 25600 0.0026
2.0957 25700 0.0026
2.1039 25800 0.0033
2.1120 25900 0.0025
2.1202 26000 0.0026
2.1284 26100 0.0026
2.1365 26200 0.0025
2.1447 26300 0.0031
2.1528 26400 0.0026
2.1610 26500 0.0025
2.1691 26600 0.0026
2.1773 26700 0.0032
2.1854 26800 0.0026
2.1936 26900 0.0026
2.2017 27000 0.0025
2.2099 27100 0.0032
2.2181 27200 0.0025
2.2262 27300 0.0025
2.2344 27400 0.0024
2.2425 27500 0.0025
2.2507 27600 0.0033
2.2588 27700 0.0024
2.2670 27800 0.0024
2.2751 27900 0.0024
2.2833 28000 0.0033
2.2914 28100 0.0025
2.2996 28200 0.0024
2.3078 28300 0.0026
2.3159 28400 0.0024
2.3241 28500 0.0032
2.3322 28600 0.0025
2.3404 28700 0.0024
2.3485 28800 0.0024
2.3567 28900 0.0032
2.3648 29000 0.0025
2.3730 29100 0.0024
2.3811 29200 0.0024
2.3893 29300 0.0028
2.3975 29400 0.003
2.4056 29500 0.0023
2.4138 29600 0.0025
2.4219 29700 0.0024
2.4301 29800 0.0032
2.4382 29900 0.0025
2.4464 30000 0.0024
2.4545 30100 0.0023
2.4627 30200 0.003
2.4708 30300 0.0024
2.4790 30400 0.0025
2.4872 30500 0.0025
2.4953 30600 0.0025
2.5035 30700 0.0031
2.5116 30800 0.0022
2.5198 30900 0.0024
2.5279 31000 0.0024
2.5361 31100 0.0032
2.5442 31200 0.0024
2.5524 31300 0.0023
2.5605 31400 0.0025
2.5687 31500 0.0024
2.5769 31600 0.0031
2.5850 31700 0.0024
2.5932 31800 0.0024
2.6013 31900 0.0024
2.6095 32000 0.0031
2.6176 32100 0.0025
2.6258 32200 0.0025
2.6339 32300 0.0025
2.6421 32400 0.0027
2.6502 32500 0.0029
2.6584 32600 0.0024
2.6666 32700 0.0023
2.6747 32800 0.0025
2.6829 32900 0.0028
2.6910 33000 0.0026
2.6992 33100 0.0025
2.7073 33200 0.0024
2.7155 33300 0.0025
2.7236 33400 0.0026
2.7318 33500 0.0027
2.7399 33600 0.0025
2.7481 33700 0.0024
2.7563 33800 0.0028
2.7644 33900 0.0025
2.7726 34000 0.0024
2.7807 34100 0.0023
2.7889 34200 0.0027
2.7970 34300 0.0024
2.8052 34400 0.0025
2.8133 34500 0.0024
2.8215 34600 0.0024
2.8297 34700 0.0029
2.8378 34800 0.0027
2.8460 34900 0.0025
2.8541 35000 0.0023
2.8623 35100 0.0029
2.8704 35200 0.0025
2.8786 35300 0.0024
2.8867 35400 0.0024
2.8949 35500 0.0024
2.9030 35600 0.0028
2.9112 35700 0.0026
2.9194 35800 0.0023
2.9275 35900 0.0024
2.9357 36000 0.003
2.9438 36100 0.0025
2.9520 36200 0.0025
2.9601 36300 0.0024
2.9683 36400 0.0028
2.9764 36500 0.0027
2.9846 36600 0.0027
2.9927 36700 0.0025

Framework Versions

  • Python: 3.11.2
  • Sentence Transformers: 3.3.1
  • Transformers: 4.49.0
  • PyTorch: 2.5.1+cu124
  • Accelerate: 1.0.1
  • Datasets: 3.1.0
  • Tokenizers: 0.21.0

Citation

BibTeX

@misc{sternlicht2025chimeraknowledgebaseidea,
      title={CHIMERA: A Knowledge Base of Idea Recombination in Scientific Literature}, 
      author={Noy Sternlicht and Tom Hope},
      year={2025},
      eprint={2505.20779},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2505.20779}, 
}

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

ContrastiveLoss

@inproceedings{hadsell2006dimensionality,
    author={Hadsell, R. and Chopra, S. and LeCun, Y.},
    booktitle={2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)},
    title={Dimensionality Reduction by Learning an Invariant Mapping},
    year={2006},
    volume={2},
    number={},
    pages={1735-1742},
    doi={10.1109/CVPR.2006.100}
}
