CrossEncoder based on microsoft/MiniLM-L12-H384-uncased
This is a Cross Encoder model fine-tuned from microsoft/MiniLM-L12-H384-uncased on the ms_marco dataset using the sentence-transformers library. It computes a score for each pair of texts; these scores can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Cross Encoder
- Base model: microsoft/MiniLM-L12-H384-uncased
- Maximum Sequence Length: 512 tokens
- Number of Output Labels: 1 label
- Training Dataset: ms_marco
- Language: en
Model Sources
- Documentation: Sentence Transformers Documentation
- Documentation: Cross Encoder Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Cross Encoders on Hugging Face
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import CrossEncoder

# Download from the 🤗 Hub
model = CrossEncoder("tomaarsen/reranker-msmarco-v1.1-MiniLM-L12-H384-uncased-listnet")

# Get scores for pairs of texts
pairs = [
    ['How many calories in an egg', 'There are on average between 55 and 80 calories in an egg depending on its size.'],
    ['How many calories in an egg', 'Egg whites are very low in calories, have no fat, no cholesterol, and are loaded with protein.'],
    ['How many calories in an egg', 'Most of the calories in an egg come from the yellow yolk in the center.'],
]
scores = model.predict(pairs)
print(scores.shape)
# (3,)

# Or rank different texts based on similarity to a single text
ranks = model.rank(
    'How many calories in an egg',
    [
        'There are on average between 55 and 80 calories in an egg depending on its size.',
        'Egg whites are very low in calories, have no fat, no cholesterol, and are loaded with protein.',
        'Most of the calories in an egg come from the yellow yolk in the center.',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
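To make the shape of the `rank` output concrete, the sketch below sorts a list of scores into the same `corpus_id`/`score` records, highest score first. The scores are made-up illustrative values, not outputs of this model, and `rank_from_scores` is a hypothetical helper rather than part of the library.

```python
def rank_from_scores(scores):
    """Return one record per text, ordered from highest to lowest score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [{"corpus_id": i, "score": scores[i]} for i in order]


# Illustrative scores only (assumed values, not produced by the model).
example_scores = [0.91, 0.12, 0.47]
example_ranks = rank_from_scores(example_scores)
print(example_ranks)
```

Each `corpus_id` is the position of the text in the input list, so the ranked records can be mapped back to the original documents.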
Evaluation
Metrics
Cross Encoder Reranking
- Datasets: NanoMSMARCO, NanoNFCorpus and NanoNQ
- Evaluated with CERerankingEvaluator
Metric | NanoMSMARCO | NanoNFCorpus | NanoNQ |
---|---|---|---|
map | 0.5020 (+0.0124) | 0.3389 (+0.0684) | 0.5833 (+0.1626) |
mrr@10 | 0.4884 (+0.0109) | 0.5581 (+0.0582) | 0.5848 (+0.1581) |
ndcg@10 | 0.5545 (+0.0141) | 0.3595 (+0.0345) | 0.6487 (+0.1481) |
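For readers unfamiliar with the three metrics, here is a minimal sketch of their textbook definitions for a single query with binary relevance labels. It is not the evaluator's own code: the function names are hypothetical, and the ideal ranking for NDCG is assumed to be built from the same candidate list. The reported `map` is the mean of average precision over all queries.

```python
import math


def mrr_at_k(relevance, k=10):
    """Reciprocal rank of the first relevant item within the top k."""
    for rank, rel in enumerate(relevance[:k], start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(relevance, k=10):
    """DCG of the given order divided by the DCG of the ideal order."""
    def dcg(rels):
        return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(rels, start=1))

    ideal = dcg(sorted(relevance, reverse=True)[:k])
    return dcg(relevance[:k]) / ideal if ideal > 0 else 0.0


def average_precision(relevance):
    """Mean of precision values taken at each relevant position."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel > 0:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


# Relevance of documents in ranked order (1 = relevant); illustrative only.
example = [0, 1, 0, 1]
print(mrr_at_k(example), ndcg_at_k(example), average_precision(example))
```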
Cross Encoder Nano BEIR
- Dataset: NanoBEIR_mean
- Evaluated with CENanoBEIREvaluator
Metric | Value |
---|---|
map | 0.4747 (+0.0812) |
mrr@10 | 0.5437 (+0.0757) |
ndcg@10 | 0.5209 (+0.0655) |
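The NanoBEIR_mean values appear to be the unweighted average of the three per-dataset scores reported in the reranking table. The sketch below checks that assumption against the published numbers; small differences in the last digit are expected because the inputs are already rounded.

```python
# Per-dataset scores in the order NanoMSMARCO, NanoNFCorpus, NanoNQ.
per_dataset = {
    "map": [0.5020, 0.3389, 0.5833],
    "mrr@10": [0.4884, 0.5581, 0.5848],
    "ndcg@10": [0.5545, 0.3595, 0.6487],
}
reported_mean = {"map": 0.4747, "mrr@10": 0.5437, "ndcg@10": 0.5209}

computed_mean = {name: sum(vals) / len(vals) for name, vals in per_dataset.items()}
for name, value in computed_mean.items():
    print(f"{name}: computed {value:.4f}, reported {reported_mean[name]:.4f}")
```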
Training Details
Training Dataset
ms_marco
- Dataset: ms_marco at a47ee7a
- Size: 82,326 training samples
- Columns: query, docs, and labels
- Approximate statistics based on the first 1000 samples:
| | query | docs | labels |
|---|---|---|---|
| type | string | list | list |
| details | min: 11 characters, mean: 33.24 characters, max: 101 characters | size: 10 elements | size: 10 elements |
- Samples (each sample lists the query, then the docs, then the labels):
what are fiber lasers
['From Wikipedia, the free encyclopedia. A fiber laser or fibre laser is a laser in which the active gain medium is an optical fiber doped with rare-earth elements such as erbium, ytterbium, neodymium, dysprosium, praseodymium, and thulium. They are related to doped fiber amplifiers, which provide light amplification without lasing. Many high-power fiber lasers are based on double-clad fiber. The gain medium forms the core of the fiber, which is surrounded by two layers of cladding. The lasing mode propagates in the core, while a multimode pump beam propagates in the inner cladding layer. The outer cladding keeps this pump light confined.', 'The fiber laser is a variation on the standard solid-state laser, with the medium being a clad fiber rather than a rod, a slab, or a disk. Laser light is emitted by a dopant in the central core of the fiber, and the core structure can range from simple to fairly complex. The doped fiber has a cavity mirror on each end; in practice, these are fiber ...
[1, 0, 0, 0, 0, ...]
fast can boar run
['A wild boar can run at speeds of 30-35mph which is about 48.3-56.3km/h. As for weight, a wild boar weighs around 52-91kg which is about 115-200 pounds. Wild boars are native to Europe, Africa, and some parts of Asia. The body of a wild boar is around 0.8-2 meters long which is about 2.6-6.6 feet long.', 'Wild Turkeys can run at speeds up to 25 mph, and they can fly up to 55 mph. However, if being hunted by someone for the Thanksgiving or Christmas table-Who know how fast the … y will run or fly!', 'A wild hog can reach speeds of up to 35 mph when running at full speed. A hippo can run over 30 mph! report this answer. Updated on Wednesday, February 01 2012 at 03:09PM EST. Source: www.texasboars.com/...', "Les. Brown bears-are extremely fast, capable of running in short bursts as high as of 40 mph (64 km/h). Polar bears-have been clocked at a top speed of 35 mph (56 km/h), along a a road in Churchill, Canada. Grizzly bears-can reach top speeds of up to 30 mph (48km/h), but they can't m...
[1, 0, 0, 0, 0, ...]
what plant would grow in shade
['Hostas are among the showiest and easy-to-grow perennial plants that grow in shade. They also offer the most variety of any of the multiple shade plants. Choose from miniatures that stay only a couple of inches wide or giants that sprawl 6 feet across or more. Japanese forestgrass (Hakonechloa macra) is a wonderful grass for plants that grow in shade. It offers a lovely waterfall-like habit and variegated varieties have bight gold, yellow, or white in the foliage.', 'Lilyturf (Liriope) is an easy-to-grow favorite shade plant. Loved for its grassy foliage and spikes of blue or white flowers in late summer, as well as its resistance to deer and rabbits, lilyturf is practically a plant-it-and-forget garden resident. It grows best in Zones 5-10 and grows a foot tall. Japanese forestgrass (Hakonechloa macra) is a wonderful grass for plants that grow in shade. It offers a lovely waterfall-like habit and variegated varieties have bight gold, yellow, or white in the foliage.', "Gardening in ...
[1, 1, 0, 0, 0, ...]
- Loss: ListNetLoss with these parameters: { "eps": 1e-10, "pad_value": -1 }
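As a rough illustration of what a ListNet-style objective computes, the sketch below follows the top-one formulation from Cao et al. (2007): a cross-entropy between the softmax of the labels and the softmax of the predicted scores for one query. It is not the library's implementation. The way `eps` (added inside the logarithm) and `pad_value` (labels equal to it are skipped) are used here is an assumption, and `listnet_loss` is a hypothetical name.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def listnet_loss(scores, labels, eps=1e-10, pad_value=-1):
    """Top-one ListNet cross-entropy for a single query (sketch)."""
    # Assumption: documents whose label equals pad_value are padding.
    kept = [(s, l) for s, l in zip(scores, labels) if l != pad_value]
    if not kept:
        return 0.0
    predicted = softmax([s for s, _ in kept])
    target = softmax([float(l) for _, l in kept])
    return -sum(t * math.log(p + eps) for t, p in zip(target, predicted))


labels = [1, 0, 0]
good = listnet_loss([2.0, 0.0, 0.0], labels)  # relevant doc scored highest
bad = listnet_loss([0.0, 2.0, 0.0], labels)   # irrelevant doc scored highest
print(good, bad)
```

Under these assumptions, scoring the relevant document highest yields a lower loss than scoring an irrelevant one highest.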
Evaluation Dataset
ms_marco
- Dataset: ms_marco at a47ee7a
- Size: 82,326 evaluation samples
- Columns: query, docs, and labels
- Approximate statistics based on the first 1000 samples:
| | query | docs | labels |
|---|---|---|---|
| type | string | list | list |
| details | min: 11 characters, mean: 33.6 characters, max: 97 characters | size: 10 elements | size: 10 elements |
- Samples (each sample lists the query, then the docs, then the labels):
can blue cheese cause mold allergic reaction
['Mold Allergy. The blue spots found in blue cheese are mold. If you’ve been diagnosed with a mold allergy, eating blue cheese can trigger common mold allergic reaction symptoms. Mold allergies commonly arise from airborne spores during the spring, summer and fall months. Inhaled mold spores cause inflammation in the eyes, throat and sinuses. If eating blue cheese causes inflammation to develop anywhere in your body, make an appointment with your doctor because you may have an allergy to one or more of its ingredients. Blue cheese contains two highly allergenic substances: milk and mold. Most symptoms caused by an allergic reaction are the result of inflammation in soft tissue in different parts of the body. Your doctor may recommend allergy testing to determine the cause of the inflammation', 'Blue cheese allergy is a condition that has puzzled food experts quite a bit. The unique gourmet cheese with a mottled appearance can cause your body to swell up making you feel extremely uncomf...
[1, 0, 0, 0, 0, ...]
what does it cost for a facebook ad
['Contributed by Jason Alleger. The cost of Facebook ads depends on a few factors, but generally ranges from $.05 – $5 per click. Facebook increases the cost of ads based on (a) targeting, (b) bids and (c) engagement. The more targeted your ads are, the more expensive they become. If you were to target ads to all Facebook users (all 1.06 billion), then you would pay just pennies. Sponsored Stories: 400 clicks to Facebook page – $200 ($.50 per click). Promoted Posts: 20,000 views – $100 ($5 per 1,000 views). It takes a lot of work to keep the cost-per-click down, as the advertiser needs to constantly be updating their ads to keep the cost low.', 'Can anyone who has advertised on facebook describe how much it cost you overall? Also, is there anyone who can mention if facebook advertising (and the specific type of facebook ad-social ad/etc, age group) was positive or negative for them in their ventures? Best Answer: Setting up an ad account and advertising on Facebook is easy. You can do ...
[1, 0, 0, 0, 0, ...]
how can ants get in dishwasher
["Full Answer. Ants usually find their way into a dishwasher through the dryer vents or the drain. Although most people's first reaction is to turn to pesticides to solve the problem, the chemicals contained in pesticides can be harmful for children and pets.", "No ants in the house. I've used traps on both sides of dishwasher and under the sink where the drain and supply holes are. We have put vinegar in the dishwasher drain & have let it sit there for three days and the ants still come back. They are only in side the dishwasher never on the counter ,floor, sink.", '1 Then leave them alone for a number of weeks. 2 Exterior: Sprinkle granular ant bait around ant hills, along ant trails; again, anywhere they appear. 3 Pets will not be injured by these baits. 4 The ants quickly take the bait below ground to the queen, destroying the colony.', "A: Empty the dishwasher completely, and pour 1 gallon of vinegar down the dishwasher's drain. Leave this for a few minutes so any ants appearin...
[1, 0, 0, 0, 0, ...]
- Loss: ListNetLoss with these parameters: { "eps": 1e-10, "pad_value": -1 }
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- learning_rate: 2e-05
- num_train_epochs: 1
- warmup_ratio: 0.1
- seed: 12
- bf16: True
- load_best_model_at_end: True
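To show how `learning_rate: 2e-05`, `warmup_ratio: 0.1` and the linear scheduler interact, here is a minimal sketch of a linear warmup followed by linear decay to zero. The total of 1000 steps is a hypothetical figure chosen for readability, not the actual length of this run, and `linear_schedule_lr` is an illustrative helper rather than a library function.

```python
import math


def linear_schedule_lr(step, total_steps, base_lr=2e-05, warmup_ratio=0.1):
    """Learning rate at a given step: linear warmup, then linear decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


TOTAL_STEPS = 1000  # hypothetical run length, for illustration only
for step in (0, 50, 100, 550, 1000):
    print(step, linear_schedule_lr(step, TOTAL_STEPS))
```

With these settings the rate climbs from zero to its peak over the first tenth of training and then falls linearly back to zero.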
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 8
- per_device_eval_batch_size: 8
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 12
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
Training Logs
Epoch | Step | Training Loss | Validation Loss | NanoMSMARCO_ndcg@10 | NanoNFCorpus_ndcg@10 | NanoNQ_ndcg@10 | NanoBEIR_mean_ndcg@10 |
---|---|---|---|---|---|---|---|
-1 | -1 | - | - | 0.0444 (-0.4960) | 0.2663 (-0.0587) | 0.0478 (-0.4528) | 0.1195 (-0.3359) |
0.0001 | 1 | 2.0806 | - | - | - | - | - |
0.0230 | 200 | 2.0875 | - | - | - | - | - |
0.0459 | 400 | 2.097 | - | - | - | - | - |
0.0689 | 600 | 2.0844 | - | - | - | - | - |
0.0918 | 800 | 2.0771 | - | - | - | - | - |
0.1148 | 1000 | 2.0699 | - | - | - | - | - |
0.1377 | 1200 | 2.0864 | - | - | - | - | - |
0.1607 | 1400 | 2.0676 | - | - | - | - | - |
0.1836 | 1600 | 2.0772 | 2.0761 | 0.5280 (-0.0125) | 0.3529 (+0.0279) | 0.5989 (+0.0983) | 0.4933 (+0.0379) |
0.2066 | 1800 | 2.0822 | - | - | - | - | - |
0.2295 | 2000 | 2.0777 | - | - | - | - | - |
0.2525 | 2200 | 2.075 | - | - | - | - | - |
0.2755 | 2400 | 2.0717 | - | - | - | - | - |
0.2984 | 2600 | 2.0854 | - | - | - | - | - |
0.3214 | 2800 | 2.0765 | - | - | - | - | - |
0.3443 | 3000 | 2.0678 | - | - | - | - | - |
0.3673 | 3200 | 2.076 | 2.0741 | 0.5368 (-0.0037) | 0.3781 (+0.0531) | 0.5847 (+0.0841) | 0.4999 (+0.0445) |
0.3902 | 3400 | 2.0749 | - | - | - | - | - |
0.4132 | 3600 | 2.0735 | - | - | - | - | - |
0.4361 | 3800 | 2.0636 | - | - | - | - | - |
0.4591 | 4000 | 2.0749 | - | - | - | - | - |
0.4820 | 4200 | 2.0745 | - | - | - | - | - |
0.5050 | 4400 | 2.0716 | - | - | - | - | - |
0.5279 | 4600 | 2.0741 | - | - | - | - | - |
0.5509 | 4800 | 2.0724 | 2.0735 | 0.5633 (+0.0229) | 0.3703 (+0.0453) | 0.6102 (+0.1095) | 0.5146 (+0.0592) |
0.5739 | 5000 | 2.0788 | - | - | - | - | - |
0.5968 | 5200 | 2.0711 | - | - | - | - | - |
0.6198 | 5400 | 2.0708 | - | - | - | - | - |
0.6427 | 5600 | 2.0645 | - | - | - | - | - |
0.6657 | 5800 | 2.0684 | - | - | - | - | - |
0.6886 | 6000 | 2.0731 | - | - | - | - | - |
0.7116 | 6200 | 2.0745 | - | - | - | - | - |
0.7345 | 6400 | 2.067 | 2.0722 | 0.5510 (+0.0105) | 0.3441 (+0.0190) | 0.5927 (+0.0921) | 0.4959 (+0.0405) |
0.7575 | 6600 | 2.0657 | - | - | - | - | - |
0.7804 | 6800 | 2.0798 | - | - | - | - | - |
0.8034 | 7000 | 2.0693 | - | - | - | - | - |
0.8264 | 7200 | 2.074 | - | - | - | - | - |
0.8493 | 7400 | 2.0744 | - | - | - | - | - |
0.8723 | 7600 | 2.0688 | - | - | - | - | - |
0.8952 | 7800 | 2.0515 | - | - | - | - | - |
0.9182 | 8000 | 2.0765 | 2.0723 | 0.5545 (+0.0141) | 0.3595 (+0.0345) | 0.6487 (+0.1481) | 0.5209 (+0.0655) |
0.9411 | 8200 | 2.0777 | - | - | - | - | - |
0.9641 | 8400 | 2.073 | - | - | - | - | - |
0.9870 | 8600 | 2.0726 | - | - | - | - | - |
-1 | -1 | - | - | 0.5545 (+0.0141) | 0.3595 (+0.0345) | 0.6487 (+0.1481) | 0.5209 (+0.0655) |
- The saved checkpoint is the row at step 8000; its scores match the final evaluation row.
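Because `load_best_model_at_end` is True, the checkpoint that gets kept depends on a comparison across the evaluation rows of the log. The sketch below picks the best row, assuming the comparison metric is NanoBEIR_mean_ndcg@10 with higher being better; the card does not state which metric was actually used for selection.

```python
# (step, NanoBEIR_mean_ndcg@10) for each evaluated step in the training log.
eval_rows = [
    (1600, 0.4933),
    (3200, 0.4999),
    (4800, 0.5146),
    (6400, 0.4959),
    (8000, 0.5209),
]

# Assumption: the best checkpoint maximizes the mean NDCG@10.
best_step, best_score = max(eval_rows, key=lambda row: row[1])
print(best_step, best_score)
```

Under that assumption the selected step agrees with the scores reported in the final row of the log.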
Environmental Impact
Carbon emissions were measured using CodeCarbon.
- Energy Consumed: 0.236 kWh
- Carbon Emitted: 0.092 kg of CO2
- Hours Used: 0.862 hours
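Two derived figures can help when comparing these numbers with other runs: the average power draw and the implied carbon intensity of the electricity. The short calculation below uses only the three values listed above; the variable names are illustrative.

```python
energy_kwh = 0.236  # Energy Consumed
carbon_kg = 0.092   # Carbon Emitted
hours = 0.862       # Hours Used

# Average power over the run, in watts.
avg_power_w = energy_kwh / hours * 1000
# Implied carbon intensity, in kg of CO2 per kWh.
carbon_intensity = carbon_kg / energy_kwh

print(f"{avg_power_w:.1f} W, {carbon_intensity:.2f} kg CO2/kWh")
```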
Training Hardware
- On Cloud: No
- GPU Model: 1 x NVIDIA GeForce RTX 3090
- CPU Model: 13th Gen Intel(R) Core(TM) i7-13700K
- RAM Size: 31.78 GB
Framework Versions
- Python: 3.11.6
- Sentence Transformers: 3.5.0.dev0
- Transformers: 4.48.3
- PyTorch: 2.5.0+cu121
- Accelerate: 1.3.0
- Datasets: 2.20.0
- Tokenizers: 0.21.0
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
ListNetLoss
@inproceedings{cao2007learning,
    title={Learning to rank: from pairwise approach to listwise approach},
    author={Cao, Zhe and Qin, Tao and Liu, Tie-Yan and Tsai, Ming-Feng and Li, Hang},
    booktitle={Proceedings of the 24th international conference on Machine learning},
    pages={129--136},
    year={2007}
}