CrossEncoder based on microsoft/MiniLM-L12-H384-uncased
This is a Cross Encoder model finetuned from microsoft/MiniLM-L12-H384-uncased on the ms_marco dataset using the sentence-transformers library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.
Model Details
Model Description
- Model Type: Cross Encoder
- Base model: microsoft/MiniLM-L12-H384-uncased
- Maximum Sequence Length: 512 tokens
- Number of Output Labels: 1 label
- Training Dataset: ms_marco
- Language: en
Model Sources
- Documentation: Sentence Transformers Documentation
- Documentation: Cross Encoder Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Cross Encoders on Hugging Face
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import CrossEncoder
# Download from the 🤗 Hub
model = CrossEncoder("yjoonjang/reranker-msmarco-v1.1-MiniLM-L12-H384-uncased-plistmle-normalize-sum")
# Get scores for pairs of texts
pairs = [
    ['How many calories in an egg', 'There are on average between 55 and 80 calories in an egg depending on its size.'],
    ['How many calories in an egg', 'Egg whites are very low in calories, have no fat, no cholesterol, and are loaded with protein.'],
    ['How many calories in an egg', 'Most of the calories in an egg come from the yellow yolk in the center.'],
]
scores = model.predict(pairs)
print(scores.shape)
# (3,)
# Or rank different texts based on similarity to a single text
ranks = model.rank(
    'How many calories in an egg',
    [
        'There are on average between 55 and 80 calories in an egg depending on its size.',
        'Egg whites are very low in calories, have no fat, no cholesterol, and are loaded with protein.',
        'Most of the calories in an egg come from the yellow yolk in the center.',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
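In a retrieve-then-rerank pipeline it is often convenient to map the `corpus_id` values returned by `rank` back to the candidate texts. The following is a minimal sketch under that assumption. `rerank_candidates` is a hypothetical helper name, not part of the library, and it relies only on the `rank` output format shown above.

```python
def rerank_candidates(model, query, candidates, top_k=None):
    """Return (text, score) pairs for `candidates`, best match first.

    `model` is any object exposing a CrossEncoder-style `rank(query, documents)`
    that returns dicts with 'corpus_id' and 'score', sorted by score.
    """
    ranked = model.rank(query, candidates)
    if top_k is not None:
        ranked = ranked[:top_k]
    # corpus_id is the index of the text in the `candidates` list.
    return [(candidates[hit["corpus_id"]], float(hit["score"])) for hit in ranked]
```

With the model loaded as above, `rerank_candidates(model, 'How many calories in an egg', passages, top_k=2)` would return the two highest-scoring passages together with their scores.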
Evaluation
Metrics
Cross Encoder Reranking
- Datasets: NanoMSMARCO_R100, NanoNFCorpus_R100 and NanoNQ_R100
- Evaluated with CrossEncoderRerankingEvaluator with these parameters: { "at_k": 10, "always_rerank_positives": true }
Metric | NanoMSMARCO_R100 | NanoNFCorpus_R100 | NanoNQ_R100 |
---|---|---|---|
map | 0.4715 (-0.0181) | 0.3131 (+0.0521) | 0.5155 (+0.0959) |
mrr@10 | 0.4636 (-0.0139) | 0.5461 (+0.0462) | 0.5212 (+0.0945) |
ndcg@10 | 0.5387 (-0.0018) | 0.3326 (+0.0075) | 0.5741 (+0.0734) |
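For readers unfamiliar with the three metrics in the table, the sketch below computes them for a single query from the relevance labels of a reranked list. It uses the textbook definitions with binary labels and linear gain. The evaluator's own implementation may differ in details such as tie handling, and `map` in the table is the mean of the per-query average precision.

```python
import math

def mrr_at_k(relevance, k=10):
    """Reciprocal rank of the first relevant item within the top k, else 0."""
    for rank, rel in enumerate(relevance[:k], start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0

def dcg_at_k(relevance, k=10):
    """Discounted cumulative gain over the top k positions."""
    return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(relevance[:k], start=1))

def ndcg_at_k(relevance, k=10):
    """DCG normalized by the DCG of the ideal (label-sorted) ordering."""
    ideal = dcg_at_k(sorted(relevance, reverse=True), k)
    return dcg_at_k(relevance, k) / ideal if ideal > 0 else 0.0

def average_precision(relevance):
    """Mean of precision values taken at each relevant position."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel > 0:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

# Labels of the documents in reranked order (illustrative values, not model output).
ranked_labels = [0, 1, 0, 1]
print(mrr_at_k(ranked_labels), ndcg_at_k(ranked_labels), average_precision(ranked_labels))
```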
Cross Encoder Nano BEIR
- Dataset: NanoBEIR_R100_mean
- Evaluated with CrossEncoderNanoBEIREvaluator with these parameters: { "dataset_names": [ "msmarco", "nfcorpus", "nq" ], "rerank_k": 100, "at_k": 10, "always_rerank_positives": true }
Metric | Value |
---|---|
map | 0.4334 (+0.0433) |
mrr@10 | 0.5103 (+0.0423) |
ndcg@10 | 0.4818 (+0.0264) |
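The NanoBEIR_R100_mean values appear to be the unweighted mean of the three per-dataset results reported in the previous table. The short check below reproduces them from those numbers, assuming the parenthesized differences are averaged in the same way.

```python
# (score, difference) per dataset, in the order NanoMSMARCO, NanoNFCorpus, NanoNQ.
per_dataset = {
    "map":     [(0.4715, -0.0181), (0.3131, 0.0521), (0.5155, 0.0959)],
    "mrr@10":  [(0.4636, -0.0139), (0.5461, 0.0462), (0.5212, 0.0945)],
    "ndcg@10": [(0.5387, -0.0018), (0.3326, 0.0075), (0.5741, 0.0734)],
}

def mean_row(pairs):
    """Average scores and differences separately, rounded to four decimals."""
    n = len(pairs)
    value = sum(v for v, _ in pairs) / n
    delta = sum(d for _, d in pairs) / n
    return round(value, 4), round(delta, 4)

nano_beir_mean = {metric: mean_row(pairs) for metric, pairs in per_dataset.items()}
for metric, (value, delta) in nano_beir_mean.items():
    print(f"{metric}: {value:.4f} ({delta:+.4f})")
```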
Training Details
Training Dataset
ms_marco
- Dataset: ms_marco at a47ee7a
- Size: 78,704 training samples
- Columns: query, docs, and labels
- Approximate statistics based on the first 1000 samples:
Column | Type | Details |
---|---|---|
query | string | min: 10 characters, mean: 34.51 characters, max: 113 characters |
docs | list | min: 3 elements, mean: 7.11 elements, max: 12 elements |
labels | list | min: 3 elements, mean: 7.11 elements, max: 12 elements |
- Samples (each sample lists its query, then its docs, then its labels):
what makes insulin
["Insulin is a hormone. It makes our body's cells absorb glucose from the blood. The glucose is stored in the liver and muscle as glycogen and stops the body from using fat as a source of energy. When there is very little insulin in the blood, or none at all, glucose is not taken up by most body cells. Insulin is also released when glucose is present in the blood. After eating carbohydrates, blood glucose levels rise. Insulin makes it possible for glucose to enter our body's cells-without glucose in our cells they would not be able to function. Without insulin the glucose cannot enter our", "Type 1 Diabetes Type 1 diabetes is a serious condition that occurs when the pancreas makes little or no insulin. Without insulin, the body is unable to take the glucose (blood sugar) it gets from food into cells to fuel the body. People with type 1 diabetes must take daily insulin or other medications daily. With the help of insulin, the body's cells take up the glucose and use it for energy. When ...
[1, 0, 0, 0, 0, ...]
what is the temperature in puerto plata dominican republic
['Puerto Plata: Annual Weather Averages. July is the hottest month in Puerto Plata with an average temperature of 27°C (81°F) and the coldest is January at 23°C (73°F) with the most daily sunshine hours at 8 in August. The wettest month is December with an average of 246mm of rain. ', 'The average daily temperature in Puerto Plata can reach highs of around 28 C, which can drop to 17 C. The temperature of the sea stays on average at around 26 C (79 F). There is an average rainfall of 148 mm over 11 days in Puerto Plata throughout this month. Puerto Plata sees 6 hours of sunshine a day during this month.', 'This report describes the typical weather at the Gregorio Luperon Luperón International (Airport Puerto, Plata Dominican) republic weather station over the course of an Average. December it is based on the historical records from 1997 to. 2012 earlier records are either unavailable or. unreliable Wind. The wind is most often out of the east (27% of the time). The wind is least often o...
[1, 0, 0, 0, 0, ...]
what nutrients are in guacamole
['Guacamole contains mashed avocados and seasonings, such as lime or lemon juice, garlic and cilantro. You can eat this Mexican food with tortilla chips or as a topping. Avocados contribute vitamins, minerals and healthy fats to the dish. In moderation, guacamole is a healthy addition to a balanced diet. Sodium and Potassium. Each serving of guacamole contains 10 milligrams of sodium and 452 milligrams of potassium. A high-sodium diet can lead to high blood pressure and cause congestive heart failure, kidney failure and stroke, according to MayoClinic.com.', "Calories and Macronutrients. A guacamole recipe made with four large avocados makes eight servings, each containing 150 calories and 2 grams of protein. Because guacamole is a calorie-dense food, watch your portion size if you're counting calories. A serving of guacamole contains 9 grams of total carbohydrates. Sodium and Potassium. Each serving of guacamole contains 10 milligrams of sodium and 452 milligrams of potassium. A high-...
[1, 1, 0, 0, 0, ...]
- Loss: PListMLELoss with these parameters: { "lambda_weight": "sentence_transformers.cross_encoder.losses.PListMLELoss.PListMLELambdaWeight", "activation_fct": "torch.nn.modules.linear.Identity", "mini_batch_size": null, "respect_input_order": true }
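Each training sample pairs one query with a list of passages and a parallel list of relevance labels, and the label lists shown above start with the positives. With respect_input_order set to true, that order plausibly serves as the target ranking. The sketch below shows one way such a sample could be prepared. The input layout (`passages`, `passage_text`, `is_selected`) is an assumption about the raw row format, and `to_listwise_sample` is a hypothetical helper, not the script used for this model.

```python
def to_listwise_sample(row):
    """Convert one MS MARCO-style row into a {query, docs, labels} sample.

    Assumed (hypothetical) input layout:
    {"query": str, "passages": {"passage_text": [...], "is_selected": [...]}}
    """
    passages = row["passages"]
    pairs = list(zip(passages["passage_text"], passages["is_selected"]))
    # Stable sort: relevant passages first, original order kept within ties.
    pairs.sort(key=lambda pair: pair[1], reverse=True)
    return {
        "query": row["query"],
        "docs": [text for text, _ in pairs],
        "labels": [label for _, label in pairs],
    }
```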
Evaluation Dataset
ms_marco
- Dataset: ms_marco at a47ee7a
- Size: 1,000 evaluation samples
- Columns: query, docs, and labels
- Approximate statistics based on the first 1000 samples:
Column | Type | Details |
---|---|---|
query | string | min: 11 characters, mean: 33.06 characters, max: 99 characters |
docs | list | min: 3 elements, mean: 6.50 elements, max: 10 elements |
labels | list | min: 3 elements, mean: 6.50 elements, max: 10 elements |
- Samples (each sample lists its query, then its docs, then its labels):
what currency is accepted in london
["The UK unit of currency is the pound sterling (£). In London we often call one pound (£1) a quid and sometimes a nicker. A lot of European countries have changed their currency to the Euro, but the UK has not yet joined. There is a lot of speculation about if and when we ever will join, but that's another story.", "The UK's currency is the pound sterling (£ / GBP). Despite being a member of the European Union, the UK has not adopted the euro. There are 100 pence (p) to the pound (£). Notes come in denominations of £5, £10, £20 and £50. Coins come in 1p, 2p, 5p, 10p, 20p, 50p, £1 and £2. Money Talks: Speak Like a Londoner. You will usually hear British people say pee rather than pence, as in 50p (50 pee). More colloquially, a pound is known as a quid, a five pound note is a fiver and a ten pound note a tenner.", 'Either change your dollars at the airport when you arrive or use your ATM. There are countless places (banks, bureaux de change) in London to change money, and some major sho...
[1, 0, 0, 0, 0, ...]
what are earwigs
['Earwigs are nocturnal insects commonly found in high moisture areas near human dwellings. They are omnivorous feeding on both plants and other insects living and dead. Contrary to the old wives tales, earwigs do not intentionally crawl into your ears and they do not eat your brains, their omnivory notwithstanding. ', 'With about 2,000 species in 12 families, they are one of the smaller insect orders. Earwigs have characteristic cerci, a pair of forceps pincers on their abdomen, and membranous wings folded underneath short forewings, hence the scientific order name, skin wings.. Earwigs rarely use their flying ability. Earwigs are mostly nocturnal and often hide in small, moist crevices during the day, and are active at night, feeding on a wide variety of insects and plants. Damage to foliage, flowers, and various crops is commonly blamed on earwigs, especially the common earwig Forficula auricularia', 'Earwigs, or pincher bugs, like to eat decomposing plants and wet leaves. They inva...
[1, 0, 0, 0, 0, ...]
what is the water temperature in daydream island
['Daydream Island weather consists of warm winters, sunny springs and autumns and hot humid summers. Water temperatures are a beautiful 25 degrees Celsius all year round. Maximum temperatures in Daydream Island rarely move out of the 31C to 24C range all year round. Daydream Island experiences higher rainfall averages in December through to February. This is also when the highest average temperatures are recorded. The tropical showers are typically heavy but brief, and there are usually plenty of sunshine periods during these months.', 'To get to Daydream Island Resort and Spa you travel by luxury launch transfer (ferry) or helicopter. If you are transferring by launch this is done from Port of Airlie (mainland Australia) or the Great Barrier Reef Airport (Hamilton Island).', 'What To Bring To Daydream Island! Daydream is the dream tropical island complete with the perfect tropical climate that you dream about. The island has a beautiful warm sub-tropical climate that is perfect for th...
[1, 0, 0, 0, 0, ...]
- Loss: PListMLELoss with these parameters: { "lambda_weight": "sentence_transformers.cross_encoder.losses.PListMLELoss.PListMLELambdaWeight", "activation_fct": "torch.nn.modules.linear.Identity", "mini_batch_size": null, "respect_input_order": true }
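To give an intuition for the position-aware ListMLE objective cited in this card, here is a plain-Python sketch for a single query. It is not the library's implementation: it assumes the scores are already listed in the target order (most relevant first), uses the weighting 2^(n - i) - 1 for 1-based position i as an assumption based on the cited paper, and leaves out padding, mini-batching and activation functions. The library's PListMLELambdaWeight may scale or normalize the weights differently.

```python
import math

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def position_weights(n):
    # Assumed weighting: alpha(i) = 2^(n - i) - 1 for 1-based position i,
    # so mistakes near the top of the list cost more.
    return [2.0 ** (n - i) - 1.0 for i in range(1, n + 1)]

def plistmle_loss(scores, weights=None):
    """scores: model scores listed in the target order (most relevant first)."""
    n = len(scores)
    if weights is None:
        weights = position_weights(n)
    loss = 0.0
    for i in range(n):
        # Negative log-probability of picking item i among the items not yet placed.
        step_nll = log_sum_exp(scores[i:]) - scores[i]
        loss += weights[i] * step_nll
    return loss

print(plistmle_loss([3.0, 2.0, 1.0]))  # scores agree with the target order
print(plistmle_loss([1.0, 2.0, 3.0]))  # scores reversed, so the loss is larger
```

Passing `weights=[1.0] * n` reduces the sketch to plain ListMLE, which makes the effect of the position weighting easy to compare.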
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- learning_rate: 2e-05
- num_train_epochs: 1
- warmup_ratio: 0.1
- seed: 12
- bf16: True
- load_best_model_at_end: True
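These settings determine the length of the run and the shape of the learning-rate schedule. The sketch below derives both from the numbers in this card, assuming a single device, no gradient accumulation, the linear scheduler, and warmup steps rounded up. Under those assumptions the derived epoch fractions agree with the Epoch column of the training log (for example, step 250 corresponds to epoch 0.0508).

```python
import math

train_samples = 78_704   # training dataset size reported in this card
batch_size = 16          # per_device_train_batch_size
num_devices = 1          # assumption: single device, no gradient accumulation
learning_rate = 2e-05
warmup_ratio = 0.1
num_epochs = 1

steps_per_epoch = math.ceil(train_samples / (batch_size * num_devices))
total_steps = steps_per_epoch * num_epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)  # assumption: rounded up

def linear_lr(step):
    """Linear warmup to the peak rate, then linear decay to zero."""
    if step < warmup_steps:
        return learning_rate * step / warmup_steps
    return learning_rate * max(0.0, (total_steps - step) / (total_steps - warmup_steps))

print(steps_per_epoch, warmup_steps)
print(round(250 / steps_per_epoch, 4))  # epoch fraction at step 250
```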
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 12
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
Training Logs
Epoch | Step | Training Loss | Validation Loss | NanoMSMARCO_R100_ndcg@10 | NanoNFCorpus_R100_ndcg@10 | NanoNQ_R100_ndcg@10 | NanoBEIR_R100_mean_ndcg@10 |
---|---|---|---|---|---|---|---|
-1 | -1 | - | - | 0.1311 (-0.4093) | 0.2772 (-0.0479) | 0.0728 (-0.4278) | 0.1604 (-0.2950) |
0.0002 | 1 | 2.1843 | - | - | - | - | - |
0.0508 | 250 | 2.0934 | - | - | - | - | - |
0.1016 | 500 | 1.9667 | 1.9470 | 0.0856 (-0.4548) | 0.2062 (-0.1189) | 0.1352 (-0.3655) | 0.1423 (-0.3130) |
0.1525 | 750 | 1.927 | - | - | - | - | - |
0.2033 | 1000 | 1.8803 | 1.8826 | 0.4032 (-0.1372) | 0.2588 (-0.0662) | 0.4783 (-0.0223) | 0.3801 (-0.0753) |
0.2541 | 1250 | 1.8766 | - | - | - | - | - |
0.3049 | 1500 | 1.8778 | 1.8625 | 0.4667 (-0.0738) | 0.2987 (-0.0263) | 0.5095 (+0.0089) | 0.4250 (-0.0304) |
0.3558 | 1750 | 1.866 | - | - | - | - | - |
0.4066 | 2000 | 1.8586 | 1.8422 | 0.5211 (-0.0193) | 0.3072 (-0.0178) | 0.5527 (+0.0521) | 0.4604 (+0.0050) |
0.4574 | 2250 | 1.8588 | - | - | - | - | - |
0.5082 | 2500 | 1.845 | 1.8368 | 0.5387 (-0.0018) | 0.3326 (+0.0075) | 0.5741 (+0.0734) | 0.4818 (+0.0264) |
0.5591 | 2750 | 1.8499 | - | - | - | - | - |
0.6099 | 3000 | 1.8396 | 1.8326 | 0.5161 (-0.0243) | 0.3296 (+0.0046) | 0.5773 (+0.0766) | 0.4743 (+0.0190) |
0.6607 | 3250 | 1.8373 | - | - | - | - | - |
0.7115 | 3500 | 1.8372 | 1.8296 | 0.5154 (-0.0250) | 0.3109 (-0.0141) | 0.5724 (+0.0717) | 0.4662 (+0.0109) |
0.7624 | 3750 | 1.8405 | - | - | - | - | - |
0.8132 | 4000 | 1.8304 | 1.8294 | 0.5389 (-0.0015) | 0.3155 (-0.0095) | 0.5748 (+0.0741) | 0.4764 (+0.0210) |
0.8640 | 4250 | 1.8292 | - | - | - | - | - |
0.9148 | 4500 | 1.8268 | 1.8217 | 0.5298 (-0.0106) | 0.3097 (-0.0154) | 0.5653 (+0.0647) | 0.4683 (+0.0129) |
0.9656 | 4750 | 1.8273 | - | - | - | - | - |
-1 | -1 | - | - | 0.5387 (-0.0018) | 0.3326 (+0.0075) | 0.5741 (+0.0734) | 0.4818 (+0.0264) |
- The saved checkpoint is the step 2500 row; its metrics match the final evaluation row.
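With load_best_model_at_end enabled, the final row of the log repeats the step 2500 metrics even though the validation loss keeps falling afterwards. That is consistent with the checkpoint being selected on NanoBEIR_R100_mean_ndcg@10 rather than on validation loss, although the card does not state the selection metric, so treat that as an assumption. The sketch below compares the two criteria on the values copied from the log.

```python
# (step, validation loss, NanoBEIR_R100_mean ndcg@10) copied from the training log.
eval_rows = [
    (500, 1.9470, 0.1423),
    (1000, 1.8826, 0.3801),
    (1500, 1.8625, 0.4250),
    (2000, 1.8422, 0.4604),
    (2500, 1.8368, 0.4818),
    (3000, 1.8326, 0.4743),
    (3500, 1.8296, 0.4662),
    (4000, 1.8294, 0.4764),
    (4500, 1.8217, 0.4683),
]

# Higher is better for ndcg@10, lower is better for the loss.
best_by_ndcg = max(eval_rows, key=lambda row: row[2])
best_by_loss = min(eval_rows, key=lambda row: row[1])

print("best mean ndcg@10 at step", best_by_ndcg[0])
print("lowest validation loss at step", best_by_loss[0])
```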
Framework Versions
- Python: 3.11.11
- Sentence Transformers: 3.5.0.dev0
- Transformers: 4.49.0
- PyTorch: 2.6.0+cu124
- Accelerate: 1.5.2
- Datasets: 3.4.0
- Tokenizers: 0.21.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
PListMLELoss
@inproceedings{lan2014position,
title={Position-Aware ListMLE: A Sequential Learning Process for Ranking.},
author={Lan, Yanyan and Zhu, Yadong and Guo, Jiafeng and Niu, Shuzi and Cheng, Xueqi},
booktitle={UAI},
volume={14},
pages={449--458},
year={2014}
}