CrossEncoder based on Alibaba-NLP/gte-multilingual-base

This is a Cross Encoder model fine-tuned from Alibaba-NLP/gte-multilingual-base on the msmarco dataset using the sentence-transformers library. It computes a relevance score for each pair of texts, which can be used for text reranking and semantic search.

Model Details

Model Description

Model Sources

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import CrossEncoder

# Download from the 🤗 Hub
model = CrossEncoder("skfrost19/reranker-gte-multilingual-base-msmarco-bce")
# Get scores for pairs of texts
pairs = [
    ['what symptoms might a patient with a tmd have', 'TMD sufferers have a long list of symptoms, including chronic pain (https://youtu.be/SvMaJb8o2RI), many of which are in common with Parkinson’s disease (PD) symptoms.'],
    ['what is a thermal protector', 'The word hero comes from the Greek ἥρως (hērōs), hero, warrior, particularly one such as Heracles with divine ancestry or later given divine honors. literally protector or defender.'],
    ['how many copies of call of duty wwii sold', 'Call of Duty 3. Call of Duty 3 is a World War II first-person shooter and the third installment in the Call of Duty video game series. Released on November 7, 2006, the game was developed by Treyarch, and was the first major installment in the Call of Duty series not to be developed by Infinity Ward. It was also the first not to be released on the PC platform. It was released on the PlayStation 2, PlayStation 3, Wii, Xbox, and Xbox 360.'],
    ['what is the desired temperature for the fresh food compartment in a refrigerator', 'A refrigerator maintains a temperature a few degrees above the freezing point of water. Optimum temperature range for perishable food storage is 3 to 5 °C (37 to 41 °F).emperature settings for refrigerator and freezer compartments are often given arbitrary numbers by manufacturers (for example, 1 through 9, warmest to coldest), but generally 3 to 5 °C (37 to 41 °F) is ideal for the refrigerator compartment and −18 °C (0 °F) for the freezer.'],
    ['what is gsm alarm system', 'I’m sure you would have these questions in your mind when you heard GSM alarm system at the first time. GSM alarm system is an alarm system that operating through GSM (global system for mobile communications) network; not requiring a telephone line.urthermore, in the case of burglar entering the premises and cutting the telephone line, the GSM alarm would not be affected and still work as it does not require the use of a fixed phone line. So this security alarm is ideal for the place where no fixed phone line or hard to get one.'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)

# Or rank different texts based on similarity to a single text
ranks = model.rank(
    'what symptoms might a patient with a tmd have',
    [
        'TMD sufferers have a long list of symptoms, including chronic pain (https://youtu.be/SvMaJb8o2RI), many of which are in common with Parkinson’s disease (PD) symptoms.',
        'The word hero comes from the Greek ἥρως (hērōs), hero, warrior, particularly one such as Heracles with divine ancestry or later given divine honors. literally protector or defender.',
        'Call of Duty 3. Call of Duty 3 is a World War II first-person shooter and the third installment in the Call of Duty video game series. Released on November 7, 2006, the game was developed by Treyarch, and was the first major installment in the Call of Duty series not to be developed by Infinity Ward. It was also the first not to be released on the PC platform. It was released on the PlayStation 2, PlayStation 3, Wii, Xbox, and Xbox 360.',
        'A refrigerator maintains a temperature a few degrees above the freezing point of water. Optimum temperature range for perishable food storage is 3 to 5 °C (37 to 41 °F).emperature settings for refrigerator and freezer compartments are often given arbitrary numbers by manufacturers (for example, 1 through 9, warmest to coldest), but generally 3 to 5 °C (37 to 41 °F) is ideal for the refrigerator compartment and −18 °C (0 °F) for the freezer.',
        'I’m sure you would have these questions in your mind when you heard GSM alarm system at the first time. GSM alarm system is an alarm system that operating through GSM (global system for mobile communications) network; not requiring a telephone line.urthermore, in the case of burglar entering the premises and cutting the telephone line, the GSM alarm would not be affected and still work as it does not require the use of a fixed phone line. So this security alarm is ideal for the place where no fixed phone line or hard to get one.',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
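
The list returned by model.rank is ordered from the highest score to the lowest. As a minimal sketch of that ordering, the helper below builds the same kind of list from an array of predict-style scores using only NumPy. The function name rank_from_scores and the score values are hypothetical and chosen for illustration; they are not outputs of this model.

```python
import numpy as np

def rank_from_scores(scores):
    """Return [{'corpus_id': ..., 'score': ...}, ...] sorted by descending score."""
    scores = np.asarray(scores, dtype=float)
    # Negate so argsort yields highest score first; "stable" keeps input order on ties.
    order = np.argsort(-scores, kind="stable")
    return [{"corpus_id": int(i), "score": float(scores[i])} for i in order]

# Hypothetical scores for five candidate passages (illustrative values only)
example_scores = [0.91, 0.02, 0.05, 0.40, 0.11]
ranking = rank_from_scores(example_scores)
print([r["corpus_id"] for r in ranking])
# [0, 3, 4, 2, 1]
```

Each corpus_id is the position of the passage in the input list, so the top entry can be mapped back to its text by indexing into that list.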

Evaluation

Metrics

Cross Encoder Reranking

  • Datasets: NanoMSMARCO_R100, NanoNFCorpus_R100 and NanoNQ_R100
  • Evaluated with CrossEncoderRerankingEvaluator with these parameters:
    {
        "at_k": 10,
        "always_rerank_positives": true
    }
    
Metric  | NanoMSMARCO_R100 | NanoNFCorpus_R100 | NanoNQ_R100
--------|------------------|-------------------|-----------------
map     | 0.6138 (+0.1242) | 0.3423 (+0.0813)  | 0.5809 (+0.1613)
mrr@10  | 0.6029 (+0.1254) | 0.5771 (+0.0772)  | 0.5987 (+0.1720)
ndcg@10 | 0.6561 (+0.1157) | 0.3777 (+0.0527)  | 0.6548 (+0.1541)
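
For readers unfamiliar with the metrics, here is a minimal sketch of how mrr@10 and ndcg@10 can be computed for a single query from the relevance labels of its reranked candidates. It assumes binary relevance and the standard log2 discount; the evaluator's own implementation may differ in details such as how the ideal ranking is formed, and the function names are illustrative.

```python
import math

def mrr_at_k(relevance, k=10):
    """Reciprocal rank of the first relevant item within the top k, else 0."""
    for rank, rel in enumerate(relevance[:k], start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0

def ndcg_at_k(relevance, k=10):
    """DCG of the top k divided by the DCG of the ideal ordering of the same items."""
    def dcg(rels):
        return sum(rel / math.log2(pos + 1) for pos, rel in enumerate(rels, start=1))
    ideal = dcg(sorted(relevance, reverse=True)[:k])
    return dcg(relevance[:k]) / ideal if ideal > 0 else 0.0

# Relevance labels in ranked order: relevant passages at positions 2 and 4
ranked_relevance = [0, 1, 0, 1, 0]
print(mrr_at_k(ranked_relevance))             # 0.5
print(round(ndcg_at_k(ranked_relevance), 4))  # 0.6509
```

The reported figures are averages of such per-query values over each dataset.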

Cross Encoder Nano BEIR

  • Dataset: NanoBEIR_R100_mean
  • Evaluated with CrossEncoderNanoBEIREvaluator with these parameters:
    {
        "dataset_names": [
            "msmarco",
            "nfcorpus",
            "nq"
        ],
        "rerank_k": 100,
        "at_k": 10,
        "always_rerank_positives": true
    }
    
Metric  | Value
--------|-----------------
map     | 0.5123 (+0.1223)
mrr@10  | 0.5929 (+0.1249)
ndcg@10 | 0.5629 (+0.1075)
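
The NanoBEIR_R100_mean values are consistent with an unweighted mean of the three per-dataset scores reported under Cross Encoder Reranking. A short check, using only the numbers from this card:

```python
# Per-dataset scores in the order NanoMSMARCO_R100, NanoNFCorpus_R100, NanoNQ_R100
per_dataset = {
    "map": [0.6138, 0.3423, 0.5809],
    "mrr@10": [0.6029, 0.5771, 0.5987],
    "ndcg@10": [0.6561, 0.3777, 0.6548],
}

# Unweighted mean across the three datasets, rounded to four decimals
means = {metric: round(sum(vals) / len(vals), 4) for metric, vals in per_dataset.items()}
print(means)
# {'map': 0.5123, 'mrr@10': 0.5929, 'ndcg@10': 0.5629}
```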

Training Details

Training Dataset

msmarco

  • Dataset: msmarco at 9e329ed
  • Size: 1,990,000 training samples
  • Columns: query, passage, and score
  • Approximate statistics based on the first 1000 samples:
    Column  | Type   | Min           | Mean              | Max
    --------|--------|---------------|-------------------|----------------
    query   | string | 11 characters | 34.61 characters  | 124 characters
    passage | string | 82 characters | 357.43 characters | 1034 characters
    score   | float  | 0.0           | 0.49              | 1.0
  • Samples:
    query | passage | score
    ------|---------|------
    what causes your tailbone to hurt | A coccyx injury results in pain and discomfort in the tailbone area (the condition is called coccydynia). These injuries may result in a bruise, dislocation, or fracture (break) of the coccyx. Although they may be slow to heal, the majority of coccyx injuries can be managed with cautious treatment.ost tailbone injuries are caused by trauma to the coccyx area. 1 A fall onto the tailbone in the seated position, usually against a hard surface, is the most common cause of coccyx injuries. 2 A direct blow to the tailbone, such as those that occur during contact sports, can injure the coccyx. | 1.0
    what muscles do trunk lateral flexion | It’s the same with the External Obliques, but unlike the External Obliques, they are not visible when fully developed. Action: 1 Supports abdominal wall, assists forced respiration, aids raising intra-abdominal pressure and, with muscles of other side, abducts and rotates trunk. 2 Contraction of one side alone laterally bends the trunk to that side and rotates the trunk to the other side. | 0.0
    brake horsepower definition | When the brake lights will not come on, the first thing to check is the third-brake light. If it too is not working, the brake-light switch, a bad fuse or an unplugged harness is likely.ull up on the brake pedal and if the lights go out, switch mis-alignment or pedal position error is the likely cause. The final possibility is a wire shorted to power. Unplug the brake-light switch and if the lights stay on, a short circuit is the case. | 0.0
  • Loss: BinaryCrossEntropyLoss with these parameters:
    {
        "activation_fn": "torch.nn.modules.linear.Identity",
        "pos_weight": null
    }
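
With an Identity activation the loss receives the model's raw logits, and a null pos_weight means positive and negative pairs are weighted equally. The sketch below shows the per-example quantity under the assumption that the loss behaves like a standard sigmoid followed by binary cross-entropy, computed in the numerically stable form commonly used for logits. It is written in plain Python for illustration and is not the library's code; the function names are hypothetical.

```python
import math

def bce_with_logits(logit, label):
    """Stable form of -[y*log(sigmoid(x)) + (1-y)*log(1-sigmoid(x))] for a raw logit x."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))

def mean_bce(logits, labels):
    """Average the per-example loss over a batch (equal weight for both classes)."""
    return sum(bce_with_logits(x, y) for x, y in zip(logits, labels)) / len(logits)

# A logit of 0 corresponds to a predicted probability of 0.5, so the loss is ln 2
print(round(bce_with_logits(0.0, 1.0), 4))
# 0.6931
```

A model that predicts 0.5 for every pair would therefore sit near 0.69, which is close to the training loss of 0.702 logged at step 1.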
    

Evaluation Dataset

msmarco

  • Dataset: msmarco at 9e329ed
  • Size: 10,000 evaluation samples
  • Columns: query, passage, and score
  • Approximate statistics based on the first 1000 samples:
    Column  | Type   | Min           | Mean              | Max
    --------|--------|---------------|-------------------|---------------
    query   | string | 9 characters  | 33.72 characters  | 193 characters
    passage | string | 55 characters | 353.35 characters | 895 characters
    score   | float  | 0.0           | 0.5               | 1.0
  • Samples:
    query | passage | score
    ------|---------|------
    what symptoms might a patient with a tmd have | TMD sufferers have a long list of symptoms, including chronic pain (https://youtu.be/SvMaJb8o2RI), many of which are in common with Parkinson’s disease (PD) symptoms. | 1.0
    what is a thermal protector | The word hero comes from the Greek ἥρως (hērōs), hero, warrior, particularly one such as Heracles with divine ancestry or later given divine honors. literally protector or defender. | 0.0
    how many copies of call of duty wwii sold | Call of Duty 3. Call of Duty 3 is a World War II first-person shooter and the third installment in the Call of Duty video game series. Released on November 7, 2006, the game was developed by Treyarch, and was the first major installment in the Call of Duty series not to be developed by Infinity Ward. It was also the first not to be released on the PC platform. It was released on the PlayStation 2, PlayStation 3, Wii, Xbox, and Xbox 360. | 0.0
  • Loss: BinaryCrossEntropyLoss with these parameters:
    {
        "activation_fn": "torch.nn.modules.linear.Identity",
        "pos_weight": null
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 128
  • per_device_eval_batch_size: 128
  • learning_rate: 2e-05
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • seed: 12
  • bf16: True
  • dataloader_num_workers: 4
  • load_best_model_at_end: True
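
To make these settings concrete, the sketch below estimates the number of optimizer steps and the learning rate at a given step under a linear schedule with warmup. It assumes a single device, no gradient accumulation, and that the last partial batch is kept; the constant and function names are illustrative, and the trainer's exact rounding may differ.

```python
import math

TRAIN_SAMPLES = 1_990_000   # training set size from this card
BATCH_SIZE = 128            # per_device_train_batch_size
BASE_LR = 2e-5              # learning_rate
WARMUP_RATIO = 0.1          # warmup_ratio

# One epoch on one device (assumption): one optimizer step per batch
total_steps = math.ceil(TRAIN_SAMPLES / BATCH_SIZE)
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)

def lr_at(step):
    """Linear warmup from 0 to BASE_LR, then linear decay back to 0."""
    if step < warmup_steps:
        return BASE_LR * step / warmup_steps
    return BASE_LR * max(0.0, (total_steps - step) / (total_steps - warmup_steps))

print(total_steps, warmup_steps)
# 15547 1555
```

The step count agrees with the training logs, where step 4000 corresponds to epoch 0.2573 (4000 / 15547 ≈ 0.2573).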

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 128
  • per_device_eval_batch_size: 128
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 12
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 4
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

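Epoch  | Step  | Training Loss | Validation Loss | NanoMSMARCO_R100_ndcg@10 | NanoNFCorpus_R100_ndcg@10 | NanoNQ_R100_ndcg@10 | NanoBEIR_R100_mean_ndcg@10
-------|-------|---------------|-----------------|--------------------------|---------------------------|---------------------|---------------------------
-1     | -1    | -             | -               | 0.0063 (-0.5341)         | 0.2009 (-0.1241)          | 0.0649 (-0.4357)    | 0.0907 (-0.3646)
0.0001 | 1     | 0.702         | -               | -                        | -                         | -                   | -
0.2573 | 4000  | 0.2125        | -               | -                        | -                         | -                   | -
0.5146 | 8000  | 0.1655        | -               | -                        | -                         | -                   | -
0.6432 | 10000 | -             | 0.1367          | 0.6561 (+0.1157)         | 0.3777 (+0.0527)          | 0.6548 (+0.1541)    | 0.5629 (+0.1075)
0.7719 | 12000 | 0.1411        | -               | -                        | -                         | -                   | -
-1     | -1    | -             | -               | 0.6561 (+0.1157)         | 0.3777 (+0.0527)          | 0.6548 (+0.1541)    | 0.5629 (+0.1075)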
  • The bold row denotes the saved checkpoint.
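
The values in parentheses read as changes relative to a baseline ordering of the same candidates. Assuming that reading, subtracting each delta from its score recovers the implied baseline ndcg@10, and the first and last rows of the log give the same baseline to within rounding. The variable names below are illustrative.

```python
# (score, delta) pairs for ndcg@10 from the first and last rows of the training log
before = {"NanoMSMARCO_R100": (0.0063, -0.5341), "NanoNFCorpus_R100": (0.2009, -0.1241),
          "NanoNQ_R100": (0.0649, -0.4357), "NanoBEIR_R100_mean": (0.0907, -0.3646)}
after = {"NanoMSMARCO_R100": (0.6561, 0.1157), "NanoNFCorpus_R100": (0.3777, 0.0527),
         "NanoNQ_R100": (0.6548, 0.1541), "NanoBEIR_R100_mean": (0.5629, 0.1075)}

# Implied baseline = score - delta, under the assumption stated above
baseline_before = {name: round(score - delta, 4) for name, (score, delta) in before.items()}
baseline_after = {name: round(score - delta, 4) for name, (score, delta) in after.items()}

for name in after:
    print(name, baseline_before[name], baseline_after[name])
```

Under this reading, the untrained model ranks far below the baseline on every dataset, while the saved checkpoint ranks above it.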

Framework Versions

  • Python: 3.11.5
  • Sentence Transformers: 4.0.1
  • Transformers: 4.50.3
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.6.0
  • Datasets: 3.5.0
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

Model size: 306M params (Safetensors, tensor type F32)