ModernBERT-base trained on GooAQ

This is a Cross Encoder model finetuned from answerdotai/ModernBERT-base using the sentence-transformers library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.

See training_gooaq_bce.py for the training script. The script is also described in the "Cross Encoder > Training Overview" documentation and in the blog post "Training and Finetuning Reranker Models with Sentence Transformers v4".

Model Details

Model Description

  • Model Type: Cross Encoder
  • Base model: answerdotai/ModernBERT-base
  • Maximum Sequence Length: 8192 tokens
  • Number of Output Labels: 1 label
  • Language: en
  • License: apache-2.0

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import CrossEncoder

# Download from the 🤗 Hub
model = CrossEncoder("tomaarsen/reranker-ModernBERT-base-gooaq-bce")
# Get scores for pairs of texts
pairs = [
    ['why are rye chips so good?', "It makes them taste that much better! The rye chips are tasty because they stand out--they're the saltiest thing in the bag. It's not because rye bread is inherently awesome. ... You could just buy a bag of rye chips."],
    ['why are rye chips so good?', 'There are no substantial technical, nutritional or performance issues associated with rye that would limit its use for pets. Rye is a fairly common ingredient in human foods and beverages. The most prevalent occurrence is in crackers and breads.'],
    ['why are rye chips so good?', 'Bread made wholly from rye flour is made in Germany and called pumpernickel. Rye is unique among grains for having a high level of fibre in its endosperm – not just in its bran. As such, the glycemic index (GI) of rye products is generally lower than products made from wheat and most other grains.'],
    ['why are rye chips so good?', 'KFC Chips – The salt mix on the seasoned chips and the actual chips do not contain any animal products. Our supplier/s of chips and seasoning have confirmed they are suitable for vegans.'],
    ['why are rye chips so good?', 'A study in the American Journal of Clinical Nutrition found that eating rye leads to better blood-sugar control compared to wheat. Rye bread is packed with magnesium, which helps control blood pressure and optimize heart health. Its high levels of soluble fibre can also reduce cholesterol.'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)

# Or rank different texts based on similarity to a single text
ranks = model.rank(
    'why are rye chips so good?',
    [
        "It makes them taste that much better! The rye chips are tasty because they stand out--they're the saltiest thing in the bag. It's not because rye bread is inherently awesome. ... You could just buy a bag of rye chips.",
        'There are no substantial technical, nutritional or performance issues associated with rye that would limit its use for pets. Rye is a fairly common ingredient in human foods and beverages. The most prevalent occurrence is in crackers and breads.',
        'Bread made wholly from rye flour is made in Germany and called pumpernickel. Rye is unique among grains for having a high level of fibre in its endosperm – not just in its bran. As such, the glycemic index (GI) of rye products is generally lower than products made from wheat and most other grains.',
        'KFC Chips – The salt mix on the seasoned chips and the actual chips do not contain any animal products. Our supplier/s of chips and seasoning have confirmed they are suitable for vegans.',
        'A study in the American Journal of Clinical Nutrition found that eating rye leads to better blood-sugar control compared to wheat. Rye bread is packed with magnesium, which helps control blood pressure and optimize heart health. Its high levels of soluble fibre can also reduce cholesterol.',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
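
As a rough illustration of the structure that `rank` returns, the sketch below sorts a list of scores into `corpus_id`/`score` entries, highest score first. The scores are made-up placeholders rather than real model outputs, and `rank_from_scores` is a hypothetical helper written for this example, not a library function.

```python
def rank_from_scores(scores):
    """Sort document indices by score, highest first (hypothetical helper)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [{"corpus_id": i, "score": scores[i]} for i in order]

# Placeholder scores for five candidate answers (not real model outputs).
example_scores = [0.91, 0.12, 0.35, 0.05, 0.48]
example_ranks = rank_from_scores(example_scores)

# Each entry points back to the position of the text in the input list.
for entry in example_ranks:
    print(entry["corpus_id"], entry["score"])
```

Here `corpus_id` is the index of the candidate in the list that was passed in, so the top-ranked text can be looked up directly from the original list.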

Evaluation

Metrics

Cross Encoder Reranking

  • Dataset: gooaq-dev

Metric   Value
map      0.7308 (+0.1997)
mrr@10   0.7292 (+0.2052)
ndcg@10  0.7713 (+0.1801)

Cross Encoder Reranking

Metric   Value
map      0.7908 (+0.2597)
mrr@10   0.7890 (+0.2650)
ndcg@10  0.8351 (+0.2439)

Cross Encoder Reranking

  • Datasets: NanoMSMARCO_R100, NanoNFCorpus_R100 and NanoNQ_R100
  • Evaluated with CrossEncoderRerankingEvaluator with these parameters:
    {
        "at_k": 10,
        "always_rerank_positives": true
    }
    
Metric   NanoMSMARCO_R100  NanoNFCorpus_R100  NanoNQ_R100
map      0.4579 (-0.0317)  0.3414 (+0.0804)   0.3932 (-0.0264)
mrr@10   0.4479 (-0.0296)  0.5340 (+0.0342)   0.3918 (-0.0349)
ndcg@10  0.5275 (-0.0130)  0.3821 (+0.0571)   0.4630 (-0.0377)
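
For readers unfamiliar with the three metrics reported above, here is a minimal sketch of how they can be computed for a single query from binary relevance labels listed in ranked order. It uses the common textbook definitions with a cutoff of 10, matching `at_k`; the evaluator's exact normalization and tie handling may differ, and the function names and example labels are illustrative only.

```python
import math

def average_precision(labels):
    """Average precision for one ranked list of 0/1 relevance labels."""
    hits, total = 0, 0.0
    for rank, label in enumerate(labels, start=1):
        if label:
            hits += 1
            total += hits / rank  # precision at this relevant position
    return total / hits if hits else 0.0

def mrr_at_k(labels, k=10):
    """Reciprocal rank of the first relevant item within the top k."""
    for rank, label in enumerate(labels[:k], start=1):
        if label:
            return 1.0 / rank
    return 0.0

def ndcg_at_k(labels, k=10):
    """DCG of the top k divided by the DCG of an ideal ordering."""
    dcg = sum(l / math.log2(r + 1) for r, l in enumerate(labels[:k], start=1))
    ideal = sorted(labels, reverse=True)[:k]
    idcg = sum(l / math.log2(r + 1) for r, l in enumerate(ideal, start=1))
    return dcg / idcg if idcg else 0.0

# Illustrative ranking: relevant answers sit at positions 2 and 4.
ranked_labels = [0, 1, 0, 1, 0]
print(average_precision(ranked_labels), mrr_at_k(ranked_labels), ndcg_at_k(ranked_labels))
```

The reported `map` and `mrr@10` values are these per-query quantities averaged over all queries in a dataset.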

Cross Encoder Nano BEIR

  • Dataset: NanoBEIR_R100_mean
  • Evaluated with CrossEncoderNanoBEIREvaluator with these parameters:
    {
        "dataset_names": [
            "msmarco",
            "nfcorpus",
            "nq"
        ],
        "rerank_k": 100,
        "at_k": 10,
        "always_rerank_positives": true
    }
    
Metric   Value
map      0.3975 (+0.0074)
mrr@10   0.4579 (-0.0101)
ndcg@10  0.4575 (+0.0022)
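
The NanoBEIR_R100_mean values appear to be the unweighted average of the three per-dataset results listed earlier. The short check below reproduces them from the per-dataset numbers in this card; treating the aggregate as a simple arithmetic mean is an assumption that the arithmetic happens to confirm.

```python
# Per-dataset results copied from the Cross Encoder Reranking table,
# in the order NanoMSMARCO_R100, NanoNFCorpus_R100, NanoNQ_R100.
per_dataset = {
    "map": [0.4579, 0.3414, 0.3932],
    "mrr@10": [0.4479, 0.5340, 0.3918],
    "ndcg@10": [0.5275, 0.3821, 0.4630],
}

# Unweighted mean over the three datasets (assumed aggregation).
nano_beir_mean = {
    metric: round(sum(values) / len(values), 4)
    for metric, values in per_dataset.items()
}
print(nano_beir_mean)
```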

Training Details

Training Dataset

Unnamed Dataset

  • Size: 578,402 training samples
  • Columns: question, answer, and label
  • Approximate statistics based on the first 1000 samples:
    Column    Type    Details
    question  string  min: 19 characters, mean: 45.14 characters, max: 85 characters
    answer    string  min: 65 characters, mean: 254.8 characters, max: 379 characters
    label     int     0: ~82.90%, 1: ~17.10%
  • Samples:
    • question: why are rye chips so good?
      answer: It makes them taste that much better! The rye chips are tasty because they stand out--they're the saltiest thing in the bag. It's not because rye bread is inherently awesome. ... You could just buy a bag of rye chips.
      label: 1
    • question: why are rye chips so good?
      answer: There are no substantial technical, nutritional or performance issues associated with rye that would limit its use for pets. Rye is a fairly common ingredient in human foods and beverages. The most prevalent occurrence is in crackers and breads.
      label: 0
    • question: why are rye chips so good?
      answer: Bread made wholly from rye flour is made in Germany and called pumpernickel. Rye is unique among grains for having a high level of fibre in its endosperm – not just in its bran. As such, the glycemic index (GI) of rye products is generally lower than products made from wheat and most other grains.
      label: 0
  • Loss: BinaryCrossEntropyLoss with these parameters:
    {
        "activation_fct": "torch.nn.modules.linear.Identity",
        "pos_weight": 5
    }
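
The `pos_weight` of 5 lines up with the label statistics above: roughly 82.9% negatives against 17.1% positives is close to five negatives per positive, so up-weighting positives by 5 roughly balances the two classes. The sketch below writes out a positively weighted binary cross-entropy on a raw logit, following the standard definition used by PyTorch's `BCEWithLogitsLoss`. That the library loss reduces to exactly this formula is an assumption, and `weighted_bce_with_logits` is a hypothetical name.

```python
import math

def weighted_bce_with_logits(logit, label, pos_weight=5.0):
    """Binary cross-entropy on a raw logit, with positives up-weighted."""
    # log(sigmoid(x)), computed in a numerically stable way.
    if logit >= 0:
        log_sig = -math.log1p(math.exp(-logit))
    else:
        log_sig = logit - math.log1p(math.exp(logit))
    # log(1 - sigmoid(x)) equals log(sigmoid(x)) - x.
    log_one_minus_sig = log_sig - logit
    return -(pos_weight * label * log_sig + (1 - label) * log_one_minus_sig)

# With an uninformative logit of 0, a positive costs five times a negative.
loss_pos = weighted_bce_with_logits(0.0, 1)
loss_neg = weighted_bce_with_logits(0.0, 0)
print(loss_pos / loss_neg)

# Negatives per positive implied by the approximate label statistics.
print(82.90 / 17.10)
```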
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 64
  • learning_rate: 2e-05
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • seed: 12
  • bf16: True
  • dataloader_num_workers: 4
  • load_best_model_at_end: True
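
To see how these settings translate into optimizer steps, the sketch below derives the step count and a linear warmup/decay learning-rate curve from the values above. It assumes a single device and no gradient accumulation, and it approximates the `linear` scheduler rather than calling the training library; the variable and function names are illustrative.

```python
import math

num_samples = 578_402   # training set size from the Training Dataset section
batch_size = 64         # per_device_train_batch_size
warmup_ratio = 0.1
peak_lr = 2e-5          # learning_rate

# Assumption: one device, no gradient accumulation, last partial batch kept.
total_steps = math.ceil(num_samples / batch_size)
warmup_steps = math.ceil(total_steps * warmup_ratio)

def linear_schedule_lr(step):
    """Linear warmup to peak_lr, then linear decay to zero (sketch)."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))

print(total_steps, warmup_steps)
print(linear_schedule_lr(452), linear_schedule_lr(warmup_steps), linear_schedule_lr(total_steps))
```

Under these assumptions one epoch is 9038 steps, which is consistent with the training logs, where step 9000 corresponds to epoch 0.9958.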

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 64
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 12
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 4
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch   Step  Training Loss  gooaq-dev_ndcg@10  NanoMSMARCO_R100_ndcg@10  NanoNFCorpus_R100_ndcg@10  NanoNQ_R100_ndcg@10  NanoBEIR_R100_mean_ndcg@10
-1      -1    -              0.1288 (-0.4624)   0.0149 (-0.5255)          0.2278 (-0.0972)           0.0229 (-0.4777)     0.0885 (-0.3668)
0.0001  1     1.0435         -                  -                         -                          -                    -
0.0221  200   1.1924         -                  -                         -                          -                    -
0.0443  400   1.1531         -                  -                         -                          -                    -
0.0664  600   0.9371         -                  -                         -                          -                    -
0.0885  800   0.6993         -                  -                         -                          -                    -
0.1106  1000  0.669          0.7042 (+0.1130)   0.4353 (-0.1051)          0.3289 (+0.0039)           0.4250 (-0.0757)     0.3964 (-0.0590)
0.1328  1200  0.6257         -                  -                         -                          -                    -
0.1549  1400  0.6283         -                  -                         -                          -                    -
0.1770  1600  0.6014         -                  -                         -                          -                    -
0.1992  1800  0.5888         -                  -                         -                          -                    -
0.2213  2000  0.5493         0.7425 (+0.1513)   0.4947 (-0.0457)          0.3568 (+0.0318)           0.4634 (-0.0373)     0.4383 (-0.0171)
0.2434  2200  0.5479         -                  -                         -                          -                    -
0.2655  2400  0.5329         -                  -                         -                          -                    -
0.2877  2600  0.5208         -                  -                         -                          -                    -
0.3098  2800  0.5259         -                  -                         -                          -                    -
0.3319  3000  0.5221         0.7479 (+0.1567)   0.5146 (-0.0258)          0.3710 (+0.0460)           0.4846 (-0.0160)     0.4568 (+0.0014)
0.3541  3200  0.4977         -                  -                         -                          -                    -
0.3762  3400  0.4965         -                  -                         -                          -                    -
0.3983  3600  0.4985         -                  -                         -                          -                    -
0.4204  3800  0.4907         -                  -                         -                          -                    -
0.4426  4000  0.5058         0.7624 (+0.1712)   0.5166 (-0.0238)          0.3665 (+0.0415)           0.4868 (-0.0138)     0.4567 (+0.0013)
0.4647  4200  0.4885         -                  -                         -                          -                    -
0.4868  4400  0.495          -                  -                         -                          -                    -
0.5090  4600  0.4839         -                  -                         -                          -                    -
0.5311  4800  0.4983         -                  -                         -                          -                    -
0.5532  5000  0.4778         0.7603 (+0.1691)   0.5110 (-0.0294)          0.3540 (+0.0290)           0.4809 (-0.0197)     0.4487 (-0.0067)
0.5753  5200  0.4726         -                  -                         -                          -                    -
0.5975  5400  0.477          -                  -                         -                          -                    -
0.6196  5600  0.4613         -                  -                         -                          -                    -
0.6417  5800  0.4492         -                  -                         -                          -                    -
0.6639  6000  0.4506         0.7643 (+0.1731)   0.5275 (-0.0129)          0.3639 (+0.0389)           0.4913 (-0.0094)     0.4609 (+0.0055)
0.6860  6200  0.4618         -                  -                         -                          -                    -
0.7081  6400  0.463          -                  -                         -                          -                    -
0.7303  6600  0.4585         -                  -                         -                          -                    -
0.7524  6800  0.4612         -                  -                         -                          -                    -
0.7745  7000  0.4621         0.7649 (+0.1736)   0.5105 (-0.0299)          0.3688 (+0.0437)           0.4552 (-0.0454)     0.4448 (-0.0105)
0.7966  7200  0.4536         -                  -                         -                          -                    -
0.8188  7400  0.4515         -                  -                         -                          -                    -
0.8409  7600  0.4396         -                  -                         -                          -                    -
0.8630  7800  0.4542         -                  -                         -                          -                    -
0.8852  8000  0.4332         0.7669 (+0.1757)   0.5247 (-0.0157)          0.3794 (+0.0544)           0.4370 (-0.0637)     0.4470 (-0.0083)
0.9073  8200  0.447          -                  -                         -                          -                    -
0.9294  8400  0.4335         -                  -                         -                          -                    -
0.9515  8600  0.4179         -                  -                         -                          -                    -
0.9737  8800  0.4459         -                  -                         -                          -                    -
0.9958  9000  0.4196         0.7713 (+0.1801)   0.5275 (-0.0130)          0.3821 (+0.0571)           0.4630 (-0.0377)     0.4575 (+0.0022)
-1      -1    -              0.7713 (+0.1801)   0.5275 (-0.0130)          0.3821 (+0.0571)           0.4630 (-0.0377)     0.4575 (+0.0022)
  • The saved checkpoint is the row at step 9000; the final evaluation row (Epoch -1, Step -1) reports the same metrics.
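
Each evaluation cell in the logs pairs a metric value with a signed number in parentheses. Assuming that number is the change relative to the ranking before reranking, the baseline can be recovered by subtracting it from the value. The sketch below does this for two gooaq-dev cells from the table; `split_metric` is a hypothetical helper, and the interpretation of the parenthesized number is an assumption.

```python
import re

def split_metric(cell):
    """Parse a cell such as '0.7713 (+0.1801)' into (value, delta)."""
    match = re.fullmatch(r"\s*(\d+\.\d+)\s*\(([+-]\d+\.\d+)\)\s*", cell)
    if match is None:
        raise ValueError(f"unexpected cell format: {cell!r}")
    return float(match.group(1)), float(match.group(2))

# gooaq-dev ndcg@10 before training and at the final evaluation.
baselines = []
for cell in ["0.1288 (-0.4624)", "0.7713 (+0.1801)"]:
    value, delta = split_metric(cell)
    baselines.append(round(value - delta, 4))
print(baselines)
```

Both cells yield the same implied baseline, which supports reading the parenthesized numbers as differences against a fixed reference ranking.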

Framework Versions

  • Python: 3.11.10
  • Sentence Transformers: 3.5.0.dev0
  • Transformers: 4.49.0
  • PyTorch: 2.5.1+cu124
  • Accelerate: 1.5.2
  • Datasets: 2.21.0
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}