CrossEncoder based on cross-encoder/ms-marco-MiniLM-L12-v2

This is a Cross Encoder model finetuned from cross-encoder/ms-marco-MiniLM-L12-v2 using the sentence-transformers library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.

Model Details

Model Description

  • Model Type: Cross Encoder
  • Base model: cross-encoder/ms-marco-MiniLM-L12-v2
  • Number of Parameters: 33.4M (F32 tensors)
  • Output: a single relevance score for each pair of input texts

Model Sources

  • Documentation: Sentence Transformers Documentation (https://www.sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import CrossEncoder

# Download from the 🤗 Hub
model = CrossEncoder("Davidsamuel101/ft-ms-marco-MiniLM-L12-v2-claims-reranker-v2")
# Get scores for pairs of texts
pairs = [
    ['Not only is there no scientific evidence that CO2 is a pollutant, higher CO2 concentrations actually help ecosystems support more plant and animal life.', 'At very high concentrations (100 times atmospheric concentration, or greater), carbon dioxide can be toxic to animal life, so raising the concentration to 10,000 ppm (1%) or higher for several hours will eliminate pests such as whiteflies and spider mites in a greenhouse.'],
    ['Not only is there no scientific evidence that CO2 is a pollutant, higher CO2 concentrations actually help ecosystems support more plant and animal life.', 'Plants can grow as much as 50 percent faster in concentrations of 1,000 ppm CO2 when compared with ambient conditions, though this assumes no change in climate and no limitation on other nutrients.'],
    ['Not only is there no scientific evidence that CO2 is a pollutant, higher CO2 concentrations actually help ecosystems support more plant and animal life.', 'Higher carbon dioxide concentrations will favourably affect plant growth and demand for water.'],
    ['Not only is there no scientific evidence that CO2 is a pollutant, higher CO2 concentrations actually help ecosystems support more plant and animal life.', "Carbon dioxide in the Earth's atmosphere is essential to life and to most of the planetary biosphere."],
    ['Not only is there no scientific evidence that CO2 is a pollutant, higher CO2 concentrations actually help ecosystems support more plant and animal life.', 'Rennie 2009: "Claim 1: Anthropogenic CO2 can\'t be changing climate, because CO2 is only a trace gas in the atmosphere and the amount produced by humans is dwarfed by the amount from volcanoes and other natural sources.'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)

# Or rank different texts based on similarity to a single text
ranks = model.rank(
    'Not only is there no scientific evidence that CO2 is a pollutant, higher CO2 concentrations actually help ecosystems support more plant and animal life.',
    [
        'At very high concentrations (100 times atmospheric concentration, or greater), carbon dioxide can be toxic to animal life, so raising the concentration to 10,000 ppm (1%) or higher for several hours will eliminate pests such as whiteflies and spider mites in a greenhouse.',
        'Plants can grow as much as 50 percent faster in concentrations of 1,000 ppm CO2 when compared with ambient conditions, though this assumes no change in climate and no limitation on other nutrients.',
        'Higher carbon dioxide concentrations will favourably affect plant growth and demand for water.',
        "Carbon dioxide in the Earth's atmosphere is essential to life and to most of the planetary biosphere.",
        'Rennie 2009: "Claim 1: Anthropogenic CO2 can\'t be changing climate, because CO2 is only a trace gas in the atmosphere and the amount produced by humans is dwarfed by the amount from volcanoes and other natural sources.',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
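
Because a cross encoder scores every (query, passage) pair, it is usually combined with a fast bi-encoder retriever: the bi-encoder narrows the corpus down to a handful of candidates, and this model reranks them. The sketch below illustrates that retrieve-then-rerank pattern; the bi-encoder checkpoint (sentence-transformers/all-MiniLM-L6-v2) and the toy corpus are illustrative choices, not part of this model's training setup.

from sentence_transformers import CrossEncoder, SentenceTransformer, util

# Hypothetical corpus; replace with your own evidence passages.
corpus = [
    "Higher carbon dioxide concentrations will favourably affect plant growth and demand for water.",
    "Carbon dioxide in the Earth's atmosphere is essential to life and to most of the planetary biosphere.",
    "The Eiffel Tower is located in Paris.",
]
query = "Higher CO2 concentrations help ecosystems support more plant and animal life."

# Stage 1: cheap candidate retrieval with a bi-encoder (illustrative checkpoint).
retriever = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
corpus_embeddings = retriever.encode(corpus, convert_to_tensor=True)
query_embedding = retriever.encode(query, convert_to_tensor=True)
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=3)[0]

# Stage 2: precise reranking of the retrieved candidates with this cross encoder.
reranker = CrossEncoder("Davidsamuel101/ft-ms-marco-MiniLM-L12-v2-claims-reranker-v2")
candidates = [corpus[hit["corpus_id"]] for hit in hits]
scores = reranker.predict([(query, passage) for passage in candidates])
for passage, score in sorted(zip(candidates, scores), key=lambda item: item[1], reverse=True):
    print(f"{score:.4f}  {passage}")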

Evaluation

Metrics

Cross Encoder Reranking

Evaluated on the claims-evidence-dev set; values in parentheses show the change relative to the candidate order before reranking.

  Metric   Value
  map      0.9904 (-0.0096)
  mrr@5    1.0000 (+0.0000)
  ndcg@5   0.9882 (-0.0118)
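
These figures come from sentence-transformers' reranking evaluation. A minimal sketch of how such numbers can be reproduced follows, assuming the CrossEncoderRerankingEvaluator API from sentence-transformers 4.x; the samples shown are illustrative, since the actual claims-evidence-dev data is not distributed with this card.

from sentence_transformers import CrossEncoder
from sentence_transformers.cross_encoder.evaluation import CrossEncoderRerankingEvaluator

model = CrossEncoder("Davidsamuel101/ft-ms-marco-MiniLM-L12-v2-claims-reranker-v2")

# Illustrative dev samples; each entry pairs a claim with relevant and irrelevant evidence.
samples = [
    {
        "query": "Higher CO2 concentrations help ecosystems support more plant and animal life.",
        "positive": ["Higher carbon dioxide concentrations will favourably affect plant growth and demand for water."],
        "negative": ["The Eiffel Tower is located in Paris."],
    },
]

evaluator = CrossEncoderRerankingEvaluator(samples=samples, at_k=5, name="claims-evidence-dev")
results = evaluator(model)
print(results)  # {'claims-evidence-dev_map': ..., 'claims-evidence-dev_mrr@5': ..., 'claims-evidence-dev_ndcg@5': ...}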

Training Details

Training Dataset

Unnamed Dataset

  • Size: 23,770 training samples
  • Columns: text1, text2, and label
  • Approximate statistics based on the first 1000 samples:

      Column  Type    Details
      text1   string  min: 38 characters, mean: 118.57 characters, max: 226 characters
      text2   string  min: 14 characters, mean: 144.96 characters, max: 1176 characters
      label   int     0: ~83.70%, 1: ~16.30%
  • Samples:

      text1: Not only is there no scientific evidence that CO2 is a pollutant, higher CO2 concentrations actually help ecosystems support more plant and animal life.
      text2: At very high concentrations (100 times atmospheric concentration, or greater), carbon dioxide can be toxic to animal life, so raising the concentration to 10,000 ppm (1%) or higher for several hours will eliminate pests such as whiteflies and spider mites in a greenhouse.
      label: 1

      text1: Not only is there no scientific evidence that CO2 is a pollutant, higher CO2 concentrations actually help ecosystems support more plant and animal life.
      text2: Plants can grow as much as 50 percent faster in concentrations of 1,000 ppm CO2 when compared with ambient conditions, though this assumes no change in climate and no limitation on other nutrients.
      label: 1

      text1: Not only is there no scientific evidence that CO2 is a pollutant, higher CO2 concentrations actually help ecosystems support more plant and animal life.
      text2: Higher carbon dioxide concentrations will favourably affect plant growth and demand for water.
      label: 1
  • Loss: MultipleNegativesRankingLoss, with the parameters below (see the sketch after this list):
    {
        "scale": 10.0,
        "num_negatives": 4,
        "activation_fn": "torch.nn.modules.activation.Sigmoid"
    }
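
A hedged reconstruction of the dataset schema and loss configuration above, using the column names (text1, text2, label) and loss parameters reported in this card; the rows are toy examples, as the real 23,770-sample training set is not distributed here.

import torch
from datasets import Dataset
from sentence_transformers import CrossEncoder
from sentence_transformers.cross_encoder.losses import MultipleNegativesRankingLoss

# Toy rows mirroring the text1 / text2 / label schema shown above.
train_dataset = Dataset.from_dict({
    "text1": [
        "Higher CO2 concentrations help ecosystems support more plant and animal life.",
        "Higher CO2 concentrations help ecosystems support more plant and animal life.",
    ],
    "text2": [
        "Higher carbon dioxide concentrations will favourably affect plant growth and demand for water.",
        "The Eiffel Tower is located in Paris.",
    ],
    "label": [1, 0],
})

model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L12-v2")

# Loss instantiated with the parameters listed above.
loss = MultipleNegativesRankingLoss(
    model=model,
    scale=10.0,
    num_negatives=4,
    activation_fn=torch.nn.Sigmoid(),
)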
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • learning_rate: 3e-06
  • num_train_epochs: 5
  • bf16: True
  • load_best_model_at_end: True
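
A minimal sketch of how the non-default hyperparameters above map onto a sentence-transformers 4.x training run; output_dir is an illustrative path, and model, train_dataset, loss, and evaluator refer to the objects from the earlier sketches.

from sentence_transformers.cross_encoder import (
    CrossEncoderTrainer,
    CrossEncoderTrainingArguments,
)

args = CrossEncoderTrainingArguments(
    output_dir="ft-ms-marco-MiniLM-L12-v2-claims-reranker-v2",  # illustrative
    eval_strategy="steps",
    per_device_train_batch_size=16,
    learning_rate=3e-6,
    num_train_epochs=5,
    bf16=True,
    load_best_model_at_end=True,
)

trainer = CrossEncoderTrainer(
    model=model,            # CrossEncoder from the previous sketch
    args=args,
    train_dataset=train_dataset,
    loss=loss,
    evaluator=evaluator,    # reranking evaluator from the Evaluation section
)
trainer.train()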

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 3e-06
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 5
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss claims-evidence-dev_ndcg@5
0.0336 50 1.2496 -
0.0673 100 1.2605 0.9523 (-0.0477)
0.1009 150 1.1969 -
0.1346 200 1.2353 0.9529 (-0.0471)
0.1682 250 1.2114 -
0.2019 300 1.1438 0.9551 (-0.0449)
0.2355 350 1.2062 -
0.2692 400 1.1631 0.9568 (-0.0432)
0.3028 450 1.115 -
0.3365 500 1.2029 0.9582 (-0.0418)
0.3701 550 1.0615 -
0.4038 600 1.185 0.9649 (-0.0351)
0.4374 650 1.0651 -
0.4711 700 1.0951 0.9682 (-0.0318)
0.5047 750 1.1267 -
0.5384 800 1.0822 0.9727 (-0.0273)
0.5720 850 1.0658 -
0.6057 900 1.0113 0.9785 (-0.0215)
0.6393 950 1.0578 -
0.6729 1000 1.074 0.9829 (-0.0171)
0.7066 1050 1.0287 -
0.7402 1100 0.9337 0.9873 (-0.0127)
0.7739 1150 0.9798 -
0.8075 1200 0.9697 0.9899 (-0.0101)
0.8412 1250 0.984 -
0.8748 1300 0.9913 0.9898 (-0.0102)
0.9085 1350 1.0126 -
0.9421 1400 0.9458 0.9897 (-0.0103)
0.9758 1450 0.9594 -
1.0094 1500 0.9798 0.9896 (-0.0104)
1.0431 1550 0.9599 -
1.0767 1600 0.9485 0.9887 (-0.0113)
1.1104 1650 0.9021 -
1.1440 1700 0.9778 0.9887 (-0.0113)
1.1777 1750 0.9836 -
1.2113 1800 0.939 0.9912 (-0.0088)
1.2450 1850 0.9476 -
1.2786 1900 0.964 0.9914 (-0.0086)
1.3122 1950 0.9238 -
1.3459 2000 0.9811 0.9895 (-0.0105)
1.3795 2050 0.905 -
1.4132 2100 0.8979 0.9896 (-0.0104)
1.4468 2150 0.8998 -
1.4805 2200 0.9016 0.9896 (-0.0104)
1.5141 2250 0.9183 -
1.5478 2300 0.8805 0.9896 (-0.0104)
1.5814 2350 0.8672 -
1.6151 2400 0.8822 0.9896 (-0.0104)
1.6487 2450 0.8724 -
1.6824 2500 0.9397 0.9883 (-0.0117)
1.7160 2550 0.8903 -
1.7497 2600 0.9305 0.9882 (-0.0118)
1.7833 2650 0.8741 -
1.8170 2700 0.8951 0.9874 (-0.0126)
1.8506 2750 0.8958 -
1.8843 2800 0.8529 0.9873 (-0.0127)
1.9179 2850 0.9468 -
1.9515 2900 0.8683 0.9882 (-0.0118)
1.9852 2950 0.9145 -
2.0188 3000 0.9137 0.9883 (-0.0117)
2.0525 3050 0.8175 -
2.0861 3100 0.911 0.9883 (-0.0117)
2.1198 3150 0.8749 -
2.1534 3200 0.8491 0.9883 (-0.0117)
2.1871 3250 0.9057 -
2.2207 3300 0.9034 0.9882 (-0.0118)
2.2544 3350 0.8505 -
2.2880 3400 0.8762 0.9883 (-0.0117)
2.3217 3450 0.8974 -
2.3553 3500 0.8832 0.9884 (-0.0116)
2.3890 3550 0.851 -
2.4226 3600 0.8584 0.9890 (-0.0110)
2.4563 3650 0.9032 -
2.4899 3700 0.8963 0.9893 (-0.0107)
2.5236 3750 0.8756 -
2.5572 3800 0.843 0.9882 (-0.0118)
2.5908 3850 0.8778 -
2.6245 3900 0.8434 0.9882 (-0.0118)
2.6581 3950 0.9193 -
2.6918 4000 0.8724 0.9875 (-0.0125)
2.7254 4050 0.9062 -
2.7591 4100 0.8807 0.9875 (-0.0125)
2.7927 4150 0.8252 -
2.8264 4200 0.8725 0.9875 (-0.0125)
2.8600 4250 0.9094 -
2.8937 4300 0.8589 0.9874 (-0.0126)
2.9273 4350 0.8625 -
2.9610 4400 0.8138 0.9874 (-0.0126)
2.9946 4450 0.9217 -
3.0283 4500 0.8871 0.9872 (-0.0128)
3.0619 4550 0.8504 -
3.0956 4600 0.944 0.9873 (-0.0127)
3.1292 4650 0.8258 -
3.1629 4700 0.9054 0.9874 (-0.0126)
3.1965 4750 0.8297 -
3.2301 4800 0.8483 0.9875 (-0.0125)
3.2638 4850 0.909 -
3.2974 4900 0.8486 0.9892 (-0.0108)
3.3311 4950 0.8937 -
3.3647 5000 0.8821 0.9874 (-0.0126)
3.3984 5050 0.873 -
3.4320 5100 0.8773 0.9874 (-0.0126)
3.4657 5150 0.8592 -
3.4993 5200 0.8449 0.9882 (-0.0118)
3.5330 5250 0.8651 -
3.5666 5300 0.8943 0.9882 (-0.0118)
3.6003 5350 0.8535 -
3.6339 5400 0.8687 0.9882 (-0.0118)
3.6676 5450 0.9213 -
3.7012 5500 0.887 0.9882 (-0.0118)
3.7349 5550 0.8787 -
3.7685 5600 0.8466 0.9882 (-0.0118)
3.8022 5650 0.8517 -
3.8358 5700 0.8349 0.9883 (-0.0117)
3.8694 5750 0.8647 -
3.9031 5800 0.8406 0.9882 (-0.0118)
3.9367 5850 0.8385 -
3.9704 5900 0.8631 0.9882 (-0.0118)
4.0040 5950 0.823 -
4.0377 6000 0.9163 0.9881 (-0.0119)
4.0713 6050 0.8373 -
4.1050 6100 0.892 0.9882 (-0.0118)
4.1386 6150 0.8666 -
4.1723 6200 0.8536 0.9882 (-0.0118)
4.2059 6250 0.8784 -
4.2396 6300 0.9616 0.9882 (-0.0118)
4.2732 6350 0.8464 -
4.3069 6400 0.865 0.9882 (-0.0118)
4.3405 6450 0.8411 -
4.3742 6500 0.8943 0.9882 (-0.0118)
4.4078 6550 0.8577 -
4.4415 6600 0.8683 0.9882 (-0.0118)
4.4751 6650 0.8706 -
4.5087 6700 0.8645 0.9882 (-0.0118)
4.5424 6750 0.8899 -
4.5760 6800 0.8593 0.9882 (-0.0118)
4.6097 6850 0.8838 -
4.6433 6900 0.8379 0.9882 (-0.0118)
4.6770 6950 0.8759 -
4.7106 7000 0.8608 0.9882 (-0.0118)
4.7443 7050 0.8858 -
4.7779 7100 0.8594 0.9882 (-0.0118)
4.8116 7150 0.8403 -
4.8452 7200 0.8898 0.9882 (-0.0118)
4.8789 7250 0.8382 -
4.9125 7300 0.8307 0.9882 (-0.0118)
4.9462 7350 0.8601 -
4.9798 7400 0.8076 0.9882 (-0.0118)
  • The bold row denotes the saved checkpoint.

Framework Versions

  • Python: 3.13.2
  • Sentence Transformers: 4.1.0
  • Transformers: 4.51.3
  • PyTorch: 2.7.0+cu128
  • Accelerate: 1.6.0
  • Datasets: 3.6.0
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}