CrossEncoder based on cross-encoder/ms-marco-MiniLM-L12-v2

This is a Cross Encoder model finetuned from cross-encoder/ms-marco-MiniLM-L12-v2 on the climate-cross-encoder-mixed-neg-v3 dataset using the sentence-transformers library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.

Model Details

Model Description

Model Sources

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import CrossEncoder

# Download from the 🤗 Hub
model = CrossEncoder("CharlesPing/finetuned-ce-climate-multineg-v1")
# Get scores for pairs of texts
pairs = [
    ['Scientific analysis of past climates\xa0shows that greenhouse gasses, principally CO2,\xa0have controlled most ancient\xa0climate changes.', 'Greenhouse gases, in particular carbon dioxide and methane, played a significant role during the Eocene in controlling the surface temperature.'],
    ['Scientific analysis of past climates\xa0shows that greenhouse gasses, principally CO2,\xa0have controlled most ancient\xa0climate changes.', 'Climatic geomorphology is of limited use to study recent (Quaternary, Holocene) large climate changes since there are seldom discernible in the geomorphological record.'],
    ['Scientific analysis of past climates\xa0shows that greenhouse gasses, principally CO2,\xa0have controlled most ancient\xa0climate changes.', 'There is also a close correlation between CO2 and temperature, where CO2 has a strong control over global temperatures in Earth history.'],
    ['Scientific analysis of past climates\xa0shows that greenhouse gasses, principally CO2,\xa0have controlled most ancient\xa0climate changes.', 'While scientists knew of past climate change such as the ice ages, the concept of climate as unchanging was useful in the development of a general theory of what determines climate.'],
    ['Scientific analysis of past climates\xa0shows that greenhouse gasses, principally CO2,\xa0have controlled most ancient\xa0climate changes.', 'Some long term modifications along the history of the planet have been significant, such as the incorporation of oxygen to the atmosphere.'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)

# Or rank different texts based on similarity to a single text
ranks = model.rank(
    'Scientific analysis of past climates\xa0shows that greenhouse gasses, principally CO2,\xa0have controlled most ancient\xa0climate changes.',
    [
        'Greenhouse gases, in particular carbon dioxide and methane, played a significant role during the Eocene in controlling the surface temperature.',
        'Climatic geomorphology is of limited use to study recent (Quaternary, Holocene) large climate changes since there are seldom discernible in the geomorphological record.',
        'There is also a close correlation between CO2 and temperature, where CO2 has a strong control over global temperatures in Earth history.',
        'While scientists knew of past climate change such as the ice ages, the concept of climate as unchanging was useful in the development of a general theory of what determines climate.',
        'Some long term modifications along the history of the planet have been significant, such as the incorporation of oxygen to the atmosphere.',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
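
The list returned by `rank` is ordered from highest to lowest score. As a rough illustration of that ordering step only, the sketch below sorts a list of scores and builds the same `corpus_id`/`score` dictionaries. The helper name `rank_from_scores` and the score values are hypothetical and are not produced by this model.

```python
def rank_from_scores(scores, top_k=None):
    """Return corpus_id/score dicts sorted by score, highest first.

    Hypothetical helper for illustration; not part of sentence-transformers.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    if top_k is not None:
        order = order[:top_k]
    return [{"corpus_id": i, "score": float(scores[i])} for i in order]


# Placeholder scores for five documents (made up, not model output).
example_scores = [2.1, -3.4, 1.7, -0.5, -2.8]
example_ranks = rank_from_scores(example_scores)
print([r["corpus_id"] for r in example_ranks])
# [0, 2, 3, 4, 1]
```

In practice `model.rank` scores the pairs and sorts them in one call, so a helper like this is only useful when the scores from `model.predict` are already available.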

Evaluation

Metrics

Cross Encoder Reranking

  • Dataset: climate-rerank-multineg
  • Evaluated with CrossEncoderRerankingEvaluator with these parameters:
    {
        "at_k": 1,
        "always_rerank_positives": false
    }
    
Metric    Value
map       0.6809 (-0.3191)
mrr@1     0.6748 (-0.3252)
ndcg@1    0.6748 (-0.3252)
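
For readers unfamiliar with these metrics, the following is a minimal sketch of how MAP, MRR@k and NDCG@k can be computed for one query from the relevance labels of its documents in ranked order. It assumes binary labels and the standard textbook definitions; the evaluator's own implementation may differ in detail, and the function names are illustrative. The reported numbers are averages of such per-query values.

```python
import math


def mrr_at_k(labels, k):
    """Reciprocal rank of the first relevant document in the top k, else 0."""
    for rank, label in enumerate(labels[:k], start=1):
        if label > 0:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(labels, k):
    """DCG of the top k divided by the DCG of the ideal ordering."""
    def dcg(items):
        return sum(l / math.log2(i + 2) for i, l in enumerate(items[:k]))

    ideal = dcg(sorted(labels, reverse=True))
    return dcg(labels) / ideal if ideal > 0 else 0.0


def average_precision(labels):
    """Mean of precision values at each rank where a relevant document appears."""
    hits, total = 0, 0.0
    for rank, label in enumerate(labels, start=1):
        if label > 0:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


# Labels of four documents in ranked order (1 = relevant, 0 = not relevant).
ranked_labels = [0, 1, 0, 1]
print(mrr_at_k(ranked_labels, 1), ndcg_at_k(ranked_labels, 1), average_precision(ranked_labels))
# 0.0 0.0 0.5
```

With `at_k` set to 1 and binary labels, MRR@1 and NDCG@1 both reduce to whether the top-ranked document is relevant, which is consistent with the two metrics having the same value in the table.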

Training Details

Training Dataset

climate-cross-encoder-mixed-neg-v3

  • Dataset: climate-cross-encoder-mixed-neg-v3 at cd49b57
  • Size: 41,052 training samples
  • Columns: query, doc, and label
  • Approximate statistics based on the first 1000 samples:
    column  type    details
    query   string  min: 49 characters; mean: 140.03 characters; max: 306 characters
    doc     string  min: 4 characters; mean: 136.03 characters; max: 731 characters
    label   float   min: 0.0; mean: 0.09; max: 1.0
  • Samples:
    query: “A leading Canadian authority on polar bears, Mitch Taylor, said: ‘We’re seeing an increase in bears that’s really unprecedented, and in places where we’re seeing a decrease in the population
    doc: Warnings about the future of the polar bear are often contrasted with the fact that worldwide population estimates have increased over the past 50 years and are relatively stable today.
    label: 1.0

    query: “A leading Canadian authority on polar bears, Mitch Taylor, said: ‘We’re seeing an increase in bears that’s really unprecedented, and in places where we’re seeing a decrease in the population
    doc: Species distribution models of recent years indicate that the deer tick, known as "I. scapularis," is pushing its distribution to higher latitudes of the Northeastern United States and Canada, as well as pushing and maintaining populations in the South Central and Northern Midwest regions of the United States.
    label: 0.0

    query: “A leading Canadian authority on polar bears, Mitch Taylor, said: ‘We’re seeing an increase in bears that’s really unprecedented, and in places where we’re seeing a decrease in the population
    doc: Bear and deer are among the animals present.
    label: 0.0
  • Loss: BinaryCrossEntropyLoss with these parameters:
    {
        "activation_fn": "torch.nn.modules.linear.Identity",
        "pos_weight": null
    }
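
As a rough illustration of what this loss configuration computes, the sketch below evaluates binary cross-entropy directly on a raw logit, which is what an identity activation followed by a logits-based binary cross-entropy amounts to. It is written in plain Python for clarity and is not the library's code; the function names are illustrative, and `pos_weight=1.0` stands in for the unweighted case (`"pos_weight": null`).

```python
import math


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def bce_with_logits(logit, label, pos_weight=1.0):
    """Binary cross-entropy on a raw logit; pos_weight scales the positive term."""
    positive_term = pos_weight * label * log_sigmoid(logit)
    negative_term = (1.0 - label) * log_sigmoid(-logit)
    return -(positive_term + negative_term)


# A logit of 0 corresponds to a probability of 0.5, so the loss is log(2).
print(round(bce_with_logits(0.0, 1.0), 4))
# 0.6931
```

With a label mean of about 0.09 in the first 1000 samples, positives are a small minority of pairs; a `pos_weight` above 1 would be one way to upweight them, although this training run did not set one.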
    

Evaluation Dataset

climate-cross-encoder-mixed-neg-v3

  • Dataset: climate-cross-encoder-mixed-neg-v3 at cd49b57
  • Size: 4,290 evaluation samples
  • Columns: query, doc, and label
  • Approximate statistics based on the first 1000 samples:
    column  type    details
    query   string  min: 39 characters; mean: 116.67 characters; max: 240 characters
    doc     string  min: 18 characters; mean: 132.92 characters; max: 731 characters
    label   float   min: 0.0; mean: 0.09; max: 1.0
  • Samples:
    query: Scientific analysis of past climates shows that greenhouse gasses, principally CO2, have controlled most ancient climate changes.
    doc: Greenhouse gases, in particular carbon dioxide and methane, played a significant role during the Eocene in controlling the surface temperature.
    label: 1.0

    query: Scientific analysis of past climates shows that greenhouse gasses, principally CO2, have controlled most ancient climate changes.
    doc: Climatic geomorphology is of limited use to study recent (Quaternary, Holocene) large climate changes since there are seldom discernible in the geomorphological record.
    label: 0.0

    query: Scientific analysis of past climates shows that greenhouse gasses, principally CO2, have controlled most ancient climate changes.
    doc: There is also a close correlation between CO2 and temperature, where CO2 has a strong control over global temperatures in Earth history.
    label: 0.0
  • Loss: BinaryCrossEntropyLoss with these parameters:
    {
        "activation_fn": "torch.nn.modules.linear.Identity",
        "pos_weight": null
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 32
  • learning_rate: 2e-05
  • warmup_ratio: 0.1
  • fp16: True
  • load_best_model_at_end: True
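
To relate these settings to the step and epoch numbers in the training logs, the sketch below works out the step counts and a linear warmup/decay learning-rate schedule. It assumes a single device, no gradient accumulation, 3 epochs and a linear scheduler (as listed under All Hyperparameters), and that warmup steps are rounded up. The variable and function names are illustrative.

```python
import math

train_samples = 41_052   # training set size
batch_size = 16          # per_device_train_batch_size
epochs = 3               # num_train_epochs
warmup_ratio = 0.1
peak_lr = 2e-5           # learning_rate

steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)
print(steps_per_epoch, total_steps, warmup_steps)
# 2566 7698 770


def linear_schedule_lr(step):
    """Learning rate at a given optimizer step: linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Step 100 corresponds to this fraction of the first epoch.
print(round(100 / steps_per_epoch, 4))
# 0.039
```

Under these assumptions the learning rate peaks at 2e-05 around step 770 and decays to zero by step 7698, and step 100 falls at epoch 0.0390, matching the first row of the training logs.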

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss climate-rerank-multineg_ndcg@1
0.0390 100 0.5097 - -
0.0779 200 0.3662 - -
0.1169 300 0.3034 - -
0.1559 400 0.2655 - -
0.1949 500 0.2651 0.2262 0.6585 (-0.3415)
0.2338 600 0.2161 - -
0.2728 700 0.227 - -
0.3118 800 0.235 - -
0.3507 900 0.2243 - -
0.3897 1000 0.2081 0.2174 0.6992 (-0.3008)
0.4287 1100 0.1961 - -
0.4677 1200 0.207 - -
0.5066 1300 0.2375 - -
0.5456 1400 0.2117 - -
0.5846 1500 0.2058 0.2253 0.6748 (-0.3252)
0.6235 1600 0.2163 - -
0.6625 1700 0.2235 - -
0.7015 1800 0.2193 - -
0.7405 1900 0.1924 - -
0.7794 2000 0.2084 0.2095 0.6748 (-0.3252)
0.8184 2100 0.2113 - -
0.8574 2200 0.2276 - -
0.8963 2300 0.2071 - -
0.9353 2400 0.2374 - -
0.9743 2500 0.2173 0.2172 0.6667 (-0.3333)
1.0133 2600 0.2011 - -
1.0522 2700 0.1634 - -
1.0912 2800 0.1807 - -
1.1302 2900 0.1878 - -
1.1691 3000 0.2037 0.2147 0.6911 (-0.3089)
1.2081 3100 0.1904 - -
1.2471 3200 0.1911 - -
1.2860 3300 0.1828 - -
1.3250 3400 0.1686 - -
1.3640 3500 0.1892 0.2179 0.6992 (-0.3008)
1.4030 3600 0.188 - -
1.4419 3700 0.1691 - -
1.4809 3800 0.1946 - -
1.5199 3900 0.1938 - -
1.5588 4000 0.211 0.2088 0.6992 (-0.3008)
1.5978 4100 0.1826 - -
1.6368 4200 0.1608 - -
1.6758 4300 0.1782 - -
1.7147 4400 0.1803 - -
1.7537 4500 0.1804 0.2160 0.6911 (-0.3089)
1.7927 4600 0.1823 - -
1.8316 4700 0.1844 - -
1.8706 4800 0.1727 - -
1.9096 4900 0.1937 - -
1.9486 5000 0.1662 0.2219 0.6829 (-0.3171)
1.9875 5100 0.1653 - -
2.0265 5200 0.1658 - -
2.0655 5300 0.1316 - -
2.1044 5400 0.1379 - -
2.1434 5500 0.152 0.2513 0.6504 (-0.3496)
2.1824 5600 0.1848 - -
2.2214 5700 0.1507 - -
2.2603 5800 0.1495 - -
2.2993 5900 0.1469 - -
2.3383 6000 0.1596 0.2407 0.6585 (-0.3415)
2.3772 6100 0.1518 - -
2.4162 6200 0.1351 - -
2.4552 6300 0.1706 - -
2.4942 6400 0.1538 - -
2.5331 6500 0.1329 0.2505 0.6911 (-0.3089)
2.5721 6600 0.147 - -
2.6111 6700 0.1289 - -
2.6500 6800 0.1698 - -
2.6890 6900 0.1456 - -
2.7280 7000 0.141 0.2618 0.6748 (-0.3252)
2.7670 7100 0.1413 - -
2.8059 7200 0.1474 - -
2.8449 7300 0.1381 - -
2.8839 7400 0.1252 - -
2.9228 7500 0.1384 0.2608 0.6748 (-0.3252)
2.9618 7600 0.1826 - -
  • The bold row denotes the saved checkpoint.
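
One way to read the evaluation rows of this log is to look for the step with the lowest validation loss and the step with the highest ndcg@1. The sketch below does this over the values copied from the table. It does not determine which checkpoint was saved, since that depends on the trainer's best-model metric, which is not listed here.

```python
# (step, validation loss, ndcg@1) for each evaluation row in the training logs
eval_rows = [
    (500, 0.2262, 0.6585), (1000, 0.2174, 0.6992), (1500, 0.2253, 0.6748),
    (2000, 0.2095, 0.6748), (2500, 0.2172, 0.6667), (3000, 0.2147, 0.6911),
    (3500, 0.2179, 0.6992), (4000, 0.2088, 0.6992), (4500, 0.2160, 0.6911),
    (5000, 0.2219, 0.6829), (5500, 0.2513, 0.6504), (6000, 0.2407, 0.6585),
    (6500, 0.2505, 0.6911), (7000, 0.2618, 0.6748), (7500, 0.2608, 0.6748),
]

lowest_loss_row = min(eval_rows, key=lambda row: row[1])
highest_ndcg_row = max(eval_rows, key=lambda row: row[2])  # first row on ties

print(lowest_loss_row)
# (4000, 0.2088, 0.6992)
print(highest_ndcg_row)
# (1000, 0.2174, 0.6992)
```

The validation loss rises from step 5500 onward while the training loss keeps falling, which suggests that the third epoch mostly overfits the training data.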

Framework Versions

  • Python: 3.11.12
  • Sentence Transformers: 4.1.0
  • Transformers: 4.51.3
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.6.0
  • Datasets: 3.6.0
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

Model size: 33.4M params (Safetensors, tensor type F32)