CrossEncoder based on jhu-clsp/ettin-encoder-1b

This is a Cross Encoder model fine-tuned from jhu-clsp/ettin-encoder-1b on the ms_marco dataset using the sentence-transformers library. It computes a relevance score for each pair of texts, which can be used for text reranking and semantic search.

Model Details

Model Description

Model Type: Cross Encoder
Base model: jhu-clsp/ettin-encoder-1b
Maximum Sequence Length: 7999 tokens
Number of Output Labels: 1 label
Training Dataset:
- ms_marco
Language: en

Model Sources

Documentation: Sentence Transformers Documentation
Documentation: Cross Encoder Documentation
Repository: Sentence Transformers on GitHub
Hugging Face: Cross Encoders on Hugging Face

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import CrossEncoder

# Download from the 🤗 Hub
model = CrossEncoder("kdhole/reranker-msmarco-v1.1-ettin-encoder-1b-listnet")
# Get scores for pairs of texts
pairs = [
    ['how do you measure a horse in hands', '1 A hand is equal to 4 inches or 10.2cms. 2  You should measure your horse from the point of the withers to the ground. 3  A horse that is 61 inches tall is 15.1 hands or 15 hands and 1 inch or 15.1hh. 4  This is calculated using (61/4 = 15.25); the .25 is the decimal equivalent of one quarter and a quarter of 4 = 1; so 15.1hh.'],
    ['how do you measure a horse in hands', '1 If a measuring tape is being used, conversion of the measurement from inches to hands is required. 2  One hand equals 4 inches (10.2 cm), so divide the measurement by 4. 3  For example, if the horse measures 71 inches (180.3 cm), divide 71 by 4 inches. 4  The result is 17 hands with 3 inches (7.6 cm) left over.'],
    ['how do you measure a horse in hands', 'Record the measurement. 1  If the horse measuring stick is being used, then the measurement can be recorded in hands immediately. 2  If a measuring tape is being used, conversion of the measurement from inches to hands is required. 3  One hand equals 4 inches (10.2 cm), so divide the measurement by 4.'],
    ['how do you measure a horse in hands', 'After you have measured your horse you will need to convert the results from inches to hands.. Horse height is correctly referred to by a unit of measurement known as a hand.. One hand is equal to four inches. The gray mare in the photo above is 58 inches from the ground to the top of her withers. When 58 is divided by 4, you have 14.5.'],
    ['how do you measure a horse in hands', '1 If the horse measuring stick is being used, then the measurement can be recorded in hands immediately. 2  If a measuring tape is being used, conversion of the measurement from inches to hands is required. 3  One hand equals 4 inches (10.2 cm), so divide the measurement by 4.'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)

# Or rank different texts based on similarity to a single text
ranks = model.rank(
    'how do you measure a horse in hands',
    [
        '1 A hand is equal to 4 inches or 10.2cms. 2  You should measure your horse from the point of the withers to the ground. 3  A horse that is 61 inches tall is 15.1 hands or 15 hands and 1 inch or 15.1hh. 4  This is calculated using (61/4 = 15.25); the .25 is the decimal equivalent of one quarter and a quarter of 4 = 1; so 15.1hh.',
        '1 If a measuring tape is being used, conversion of the measurement from inches to hands is required. 2  One hand equals 4 inches (10.2 cm), so divide the measurement by 4. 3  For example, if the horse measures 71 inches (180.3 cm), divide 71 by 4 inches. 4  The result is 17 hands with 3 inches (7.6 cm) left over.',
        'Record the measurement. 1  If the horse measuring stick is being used, then the measurement can be recorded in hands immediately. 2  If a measuring tape is being used, conversion of the measurement from inches to hands is required. 3  One hand equals 4 inches (10.2 cm), so divide the measurement by 4.',
        'After you have measured your horse you will need to convert the results from inches to hands.. Horse height is correctly referred to by a unit of measurement known as a hand.. One hand is equal to four inches. The gray mare in the photo above is 58 inches from the ground to the top of her withers. When 58 is divided by 4, you have 14.5.',
        '1 If the horse measuring stick is being used, then the measurement can be recorded in hands immediately. 2  If a measuring tape is being used, conversion of the measurement from inches to hands is required. 3  One hand equals 4 inches (10.2 cm), so divide the measurement by 4.',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
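
The list returned by rank is sorted by descending score, and each corpus_id is the index of the document in the input list. A minimal sketch of that ordering step in plain Python, assuming you already have one score per document (the helper name rank_from_scores and the example scores are hypothetical and not part of the library):

```python
def rank_from_scores(scores):
    """Order document indices by descending score, mirroring the shape of rank() output."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [{"corpus_id": i, "score": float(scores[i])} for i in order]


# Hypothetical scores for three documents, for illustration only
example_ranks = rank_from_scores([0.2, 1.7, -0.4])
print([r["corpus_id"] for r in example_ranks])
# [1, 0, 2]
```

This can be handy when scores come from predict and the ranking is needed afterwards, for example to merge results from several batches.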

Evaluation

Metrics

Cross Encoder Reranking

Datasets: NanoMSMARCO_R100, NanoNFCorpus_R100 and NanoNQ_R100

Evaluated with CrossEncoderRerankingEvaluator with these parameters:

{
    "at_k": 10,
    "always_rerank_positives": true
}

Metric	NanoMSMARCO_R100	NanoNFCorpus_R100	NanoNQ_R100
map	0.5989 (+0.1094)	0.3535 (+0.0925)	0.6692 (+0.2496)
mrr@10	0.5889 (+0.1114)	0.5271 (+0.0272)	0.6896 (+0.2629)
ndcg@10	0.6445 (+0.1041)	0.3808 (+0.0558)	0.7157 (+0.2151)
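
For reference, the three metrics can be written out for a single query as follows. This is a sketch of the textbook definitions with binary relevance, not the evaluator's own code; the function names and the toy ranking are made up for illustration. The reported values are such per-query scores averaged over all queries of a dataset.

```python
import math


def mrr_at_k(relevance, k=10):
    """Reciprocal rank of the first relevant document within the top k, else 0."""
    for rank, rel in enumerate(relevance[:k], start=1):
        if rel:
            return 1.0 / rank
    return 0.0


def average_precision(relevance):
    """Mean of precision@i over the positions i that hold a relevant document."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def ndcg_at_k(relevance, k=10):
    """DCG of the ranking divided by the DCG of the ideal ordering, both cut at k."""
    def dcg(rels):
        return sum(rel / math.log2(pos + 1) for pos, rel in enumerate(rels, start=1))

    ideal = dcg(sorted(relevance, reverse=True)[:k])
    return dcg(relevance[:k]) / ideal if ideal else 0.0


# Toy ranking: relevance of each document in ranked order (1 = relevant)
ranked = [0, 1, 0, 1, 0]
print(mrr_at_k(ranked), average_precision(ranked), round(ndcg_at_k(ranked), 4))
```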

Cross Encoder Nano BEIR

Dataset: NanoBEIR_R100_mean

Evaluated with CrossEncoderNanoBEIREvaluator with these parameters:

{
    "dataset_names": [
        "msmarco",
        "nfcorpus",
        "nq"
    ],
    "rerank_k": 100,
    "at_k": 10,
    "always_rerank_positives": true
}

Metric	Value
map	0.5405 (+0.1505)
mrr@10	0.6018 (+0.1338)
ndcg@10	0.5804 (+0.1250)
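
NanoBEIR_R100_mean appears to be the plain average over the three datasets. As a quick sanity check, averaging the rounded per-dataset values from the Cross Encoder Reranking table reproduces the numbers above to within rounding (the last digit can differ by one because the inputs are already rounded):

```python
per_dataset = {
    "map": [0.5989, 0.3535, 0.6692],
    "mrr@10": [0.5889, 0.5271, 0.6896],
    "ndcg@10": [0.6445, 0.3808, 0.7157],
}
reported_mean = {"map": 0.5405, "mrr@10": 0.6018, "ndcg@10": 0.5804}

# Unweighted mean across MSMARCO, NFCorpus and NQ for each metric
means = {name: sum(vals) / len(vals) for name, vals in per_dataset.items()}
for name, value in means.items():
    print(f"{name}: computed {value:.4f}, reported {reported_mean[name]:.4f}")
```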

Training Details

Training Dataset

ms_marco

Dataset: ms_marco at a47ee7a
Size: 78,704 training samples
Columns: query, docs, and labels

Approximate statistics based on the first 1000 samples:

	query	docs	labels
type	string	list	list
details	min: 11 characters mean: 33.93 characters max: 109 characters	min: 3 elements mean: 6.50 elements max: 10 elements	min: 3 elements mean: 6.50 elements max: 10 elements

Samples:

query	docs	labels
`Hemophilia is a group of different inherited blood-clotting disorders. Which is true about hemophilia`	['Hemophilia is a hereditary bleeding disorder caused by a deficiency in one of two blood clotting factors: factor VIII or factor IX. Several different gene abnormalities can cause the disorder. People bleed unexpectedly or after minor injuries. ', 'Hemophilia is an inherited bleeding disorder that almost always affects males. A person with hemophilia has low or non-existent levels of blood clotting protein called factor. Coagulation factor is necessary for the clotting mechanism in our bodies to work. There are 13 blood clotting proteins (coagulation factor) along with platelets and fibrin necessary for clotting blood. Factor IX deficiency usually only manifests in males. Hemophilia C: This person has low levels of or is missing completely factor 11 (Also called FXI or factor XI deficiency) Hemophilia C is 10 times rarer than type A. Factor XI deficiency is different because it can show up in both males and females.', 'Hemophilia is a rare hereditary (inherited) bleeding disorder in w...	`[1, 0, 0, 0, 0, ...]`
`what is the meaning of nazia`	['Show similar names Show variant names. Name Nazia generally means Princess or Queen, is of Indian origin, Name Nazia is a Feminine (or Girl) name. Person with name Nazia are mainly Muslim by religion. Name Nazia belongs to rashi Vrushik (Scorpio) with dominant planet Mars (Mangal) ', "Nazia's are very outgoing once you get to meet her,she's also a undercover freak so you gotta watch her. Nazia's are unique you can tell by the name, she yurn for attention and always wants to be in a relationship. Nazia's never like to be alone they love to be around people. They are loyal so once you meet one keep them. Nazia's are good friends once you proove to them your not fake. When you meet Nazia, You'll Love her. A beautiful girl! The name means 'Pride' so she is hardworking to bring that status to her family. All Nazia's are fantastic and they don't open up easily so you will have to give them some time.", '(viewable to Premium Members only). Below is a brief analysis of the first name only. F...	`[1, 0, 0, 0, 0, ...]`
`how injection moulding temperature affects polystyrene`	['But melt temperature also has an influence on the final molecular weight of the polymer in the moulded part[3,4]. Keywords: Polymer nanocomposites, nano kaolin clay, injection moulding, moulding temperature. influence on the behaviour of the polymer are the 1. Material is fed into a heated barrel, mixed, and forced into a mould cavity where it cools and hardens to the configuration of the cavity[13]. In injection moulding, moulding conditions have a significant influence on the final properties of the material regardless of the part design.', 'It is very easy to forget that plastic melts are not thermally stable over long periods at, or above, melt temperature. Equally, it is as easy to forget that the molten mass is not impervious to the effects of shear. Plastic Melts are not Newtonian in their behaviour. That is they do not react in a linear fashion when exposed to shearing of the melt or changes in temperature. A Newtonian melt would show a straight line graph when plotted for sh...	`[1, 0, 0, 0, 0, ...]`
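
Each training sample groups one query with its candidate passages and a parallel list of relevance labels. The card does not say how the samples were built; the sketch below shows one plausible way to derive such a record from a raw MS MARCO row, assuming the row exposes query plus a passages dict with passage_text and is_selected lists. The function name to_listwise, the max_docs cap, and the toy row are hypothetical.

```python
def to_listwise(row, max_docs=10):
    """Turn one raw row into {query, docs, labels}, positives first; None if no positive."""
    passages = row["passages"]
    paired = sorted(
        zip(passages["passage_text"], passages["is_selected"]),
        key=lambda pair: pair[1],
        reverse=True,  # stable sort keeps the original order within each label
    )[:max_docs]
    if not paired or paired[0][1] < 1:
        return None  # no relevant passage, so nothing to learn a ranking from
    docs, labels = zip(*paired)
    return {"query": row["query"], "docs": list(docs), "labels": list(labels)}


# Toy row in the assumed raw layout (contents invented for illustration)
toy_row = {
    "query": "what is a hand in horse measurement",
    "passages": {
        "passage_text": ["Horses eat hay.", "One hand equals 4 inches.", "Ponies are small."],
        "is_selected": [0, 1, 0],
    },
}
print(to_listwise(toy_row)["labels"])
# [1, 0, 0]
```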

Loss: ListNetLoss with these parameters:

{
    "activation_fn": "torch.nn.modules.linear.Identity",
    "mini_batch_size": 16
}
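
ListNet, in its top-one form, compares two probability distributions over the documents of a query: the softmax of the labels and the softmax of the predicted scores, and takes their cross-entropy. A minimal pure-Python sketch for a single query, assuming an identity activation as configured above. The library implementation works on batched tensors and handles padding, both omitted here, and listnet_loss is an illustrative name.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def listnet_loss(scores, labels):
    """Cross-entropy between softmax(labels) and softmax(scores) for one query."""
    target = softmax([float(label) for label in labels])
    predicted = softmax(scores)
    return -sum(t * math.log(p) for t, p in zip(target, predicted))


labels = [1, 0, 0, 0, 0]
uniform = listnet_loss([0.0] * 5, labels)                 # all documents tied
matched = listnet_loss([1.0, 0.0, 0.0, 0.0, 0.0], labels)  # scores mirror the labels
print(round(uniform, 4), round(matched, 4))
```

The loss is smallest when the score distribution equals the label distribution, so its floor is the entropy of the target distribution rather than zero. That is consistent with the logged loss values staying well above zero.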

Evaluation Dataset

ms_marco

Dataset: ms_marco at a47ee7a
Size: 1,000 evaluation samples
Columns: query, docs, and labels

Approximate statistics based on the first 1000 samples:

	query	docs	labels
type	string	list	list
details	min: 11 characters mean: 34.24 characters max: 101 characters	min: 3 elements mean: 6.50 elements max: 10 elements	min: 3 elements mean: 6.50 elements max: 10 elements

Samples:

query	docs	labels
`how do you measure a horse in hands`	['1 A hand is equal to 4 inches or 10.2cms. 2 You should measure your horse from the point of the withers to the ground. 3 A horse that is 61 inches tall is 15.1 hands or 15 hands and 1 inch or 15.1hh. 4 This is calculated using (61/4 = 15.25); the .25 is the decimal equivalent of one quarter and a quarter of 4 = 1; so 15.1hh.', '1 If a measuring tape is being used, conversion of the measurement from inches to hands is required. 2 One hand equals 4 inches (10.2 cm), so divide the measurement by 4. 3 For example, if the horse measures 71 inches (180.3 cm), divide 71 by 4 inches. 4 The result is 17 hands with 3 inches (7.6 cm) left over.', 'Record the measurement. 1 If the horse measuring stick is being used, then the measurement can be recorded in hands immediately. 2 If a measuring tape is being used, conversion of the measurement from inches to hands is required. 3 One hand equals 4 inches (10.2 cm), so divide the measurement by 4.', 'After you have measured your horse you wi...	`[1, 0, 0, 0, 0, ...]`
`where is amsterdam located`	["Amsterdam is located in the western Netherlands, in the province of North Holland. The river Amstel terminates in the city centre and connects to a large number of canals that eventually terminate in the IJ. Amsterdam is situated 2 metres below sea level. The surrounding land is flat as it is formed of large polders. Amsterdam's main attractions, including its historic canals, the Rijksmuseum, the Van Gogh Museum, Stedelijk Museum, Hermitage Amsterdam, Anne Frank House, Amsterdam Museum, its red-light district, and its many cannabis coffee shops draw more than 5 million international visitors annually.", 'The Netherlands is bordered by Belgium in the South, Germany in the East and the Northsea in the North and West. Amsterdam is located in the South of the province of North Holland: Amsterdam Facts. 1 Amsterdam is the largest city in the Netherlands. 2 Amsterdam is the capital of the Netherlands (while The Hague is the seat of government). 3 Amsterdam is the financial and cultural...	`[1, 0, 0, 0, 0, ...]`
`what does affected mean`	['Effected means executed, produced, or brought about. For example, The dictatorial regime quickly effected changes to the constitution that restricted the freedom of the people. On the other hand, affected means made an impact on. It is the past tense of the verb form of affect, which means to impact.', 'Meaning of Affect and Effect. In order to understand the correct situation in which to use the word affect or effect, the first thing one must do is have a clear understanding of what each word means. 1 Affect is a verb. 2 It means to produce a change in or influence something. 3 Effect is a noun that can also be used as a verb.', "affect 2 is not used as a noun; as a verb it means “to pretend” or “to assume” (new students affecting a nonchalance they didn't feel). The verb effect means “to bring about, accomplish”: Her administration effected radical changes. The noun effect means “result, consequence”: the serious effects of the oil spill.", 'Affect means to have an influence on something. Affect is normally a verb. Effect is the result of an influence or change. Effect is normally a noun. They are related in t … hat when something affects something else, it produces an effect on it. The word affect has a noun meaning related to psychology and emotion. The word effect has a verb meaning, which is to create, bring about, or institute.', 'In order to understand the correct situation in which to use the word affect or effect, the first thing one must do is have a clear understanding of what each word means. 1 Affect is a verb. 2 It means to produce a change in or influence something. 3 Effect is a noun that can also be used as a verb.']	`[1, 0, 0, 0, 0]`

Loss: ListNetLoss with these parameters:

{
    "activation_fn": "torch.nn.modules.linear.Identity",
    "mini_batch_size": 16
}

Training Hyperparameters

Non-Default Hyperparameters

eval_strategy: steps
per_device_train_batch_size: 16
per_device_eval_batch_size: 16
learning_rate: 2e-05
num_train_epochs: 1
seed: 12
bf16: True
load_best_model_at_end: True

All Hyperparameters

overwrite_output_dir: False
do_predict: False
eval_strategy: steps
prediction_loss_only: True
per_device_train_batch_size: 16
per_device_eval_batch_size: 16
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 1
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 2e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1.0
num_train_epochs: 1
max_steps: -1
lr_scheduler_type: linear
lr_scheduler_kwargs: {}
warmup_ratio: 0.0
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 12
data_seed: None
jit_mode_eval: False
use_ipex: False
bf16: True
fp16: False
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: True
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
parallelism_config: None
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch_fused
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
hub_revision: None
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
include_for_metrics: []
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
include_tokens_per_second: False
include_num_input_tokens_seen: False
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
liger_kernel_config: None
eval_use_gather_object: False
average_tokens_across_devices: False
prompts: None
batch_sampler: batch_sampler
multi_dataset_batch_sampler: proportional
router_mapping: {}
learning_rate_mapping: {}

Training Logs

Epoch	Step	Training Loss	Validation Loss	NanoMSMARCO_R100_ndcg@10	NanoNFCorpus_R100_ndcg@10	NanoNQ_R100_ndcg@10	NanoBEIR_R100_mean_ndcg@10
-1	-1	-	-	0.0000 (-0.5404)	0.2648 (-0.0602)	0.0388 (-0.4618)	0.1012 (-0.3541)
0.0002	1	2.3028	-	-	-	-	-
0.0203	100	2.0955	2.0679	0.3022 (-0.2382)	0.2808 (-0.0442)	0.4762 (-0.0244)	0.3531 (-0.1023)
0.0407	200	2.0633	2.0643	0.5733 (+0.0329)	0.3362 (+0.0112)	0.6797 (+0.1790)	0.5297 (+0.0743)
0.0610	300	2.0738	2.0616	0.5738 (+0.0334)	0.3480 (+0.0230)	0.6018 (+0.1011)	0.5079 (+0.0525)
0.0813	400	2.0679	2.0617	0.5441 (+0.0036)	0.3162 (-0.0088)	0.6688 (+0.1681)	0.5097 (+0.0543)
0.1016	500	2.0702	2.0619	0.5566 (+0.0161)	0.3423 (+0.0172)	0.6932 (+0.1925)	0.5307 (+0.0753)
0.1220	600	2.0719	2.0602	0.5583 (+0.0179)	0.3643 (+0.0392)	0.7066 (+0.2060)	0.5431 (+0.0877)
0.1423	700	2.066	2.0600	0.5792 (+0.0388)	0.3470 (+0.0219)	0.6971 (+0.1965)	0.5411 (+0.0857)
0.1626	800	2.0704	2.0595	0.5980 (+0.0576)	0.3493 (+0.0243)	0.6749 (+0.1743)	0.5407 (+0.0854)
0.1830	900	2.0804	2.0596	0.6080 (+0.0675)	0.3557 (+0.0307)	0.6314 (+0.1307)	0.5317 (+0.0763)
0.2033	1000	2.0697	2.0590	0.5992 (+0.0587)	0.3262 (+0.0012)	0.7125 (+0.2119)	0.5460 (+0.0906)
0.2236	1100	2.0756	2.0597	0.6133 (+0.0729)	0.3890 (+0.0639)	0.6932 (+0.1926)	0.5652 (+0.1098)
0.2440	1200	2.0761	2.0592	0.5937 (+0.0533)	0.3614 (+0.0363)	0.6783 (+0.1776)	0.5445 (+0.0891)
0.2643	1300	2.0688	2.0587	0.5865 (+0.0461)	0.3562 (+0.0312)	0.6863 (+0.1856)	0.5430 (+0.0876)
0.2846	1400	2.0622	2.0588	0.6190 (+0.0786)	0.3610 (+0.0360)	0.6717 (+0.1710)	0.5506 (+0.0952)
0.3049	1500	2.0674	2.0589	0.6331 (+0.0926)	0.3719 (+0.0469)	0.7195 (+0.2189)	0.5748 (+0.1195)
0.3253	1600	2.0731	2.0590	0.6194 (+0.0790)	0.3777 (+0.0527)	0.6719 (+0.1713)	0.5564 (+0.1010)
0.3456	1700	2.0607	2.0589	0.5792 (+0.0388)	0.3991 (+0.0740)	0.6850 (+0.1843)	0.5544 (+0.0991)
0.3659	1800	2.0716	2.0593	0.6400 (+0.0996)	0.3810 (+0.0560)	0.7093 (+0.2087)	0.5768 (+0.1214)
0.3863	1900	2.065	2.0587	0.6490 (+0.1086)	0.3732 (+0.0481)	0.6862 (+0.1855)	0.5694 (+0.1141)
0.4066	2000	2.0716	2.0588	0.6336 (+0.0932)	0.3676 (+0.0426)	0.7023 (+0.2016)	0.5678 (+0.1125)
0.4269	2100	2.0755	2.0592	0.6227 (+0.0823)	0.3789 (+0.0539)	0.6523 (+0.1517)	0.5513 (+0.0959)
0.4472	2200	2.0621	2.0587	0.6296 (+0.0892)	0.3543 (+0.0292)	0.6721 (+0.1714)	0.5520 (+0.0966)
0.4676	2300	2.0733	2.0587	0.6452 (+0.1048)	0.3677 (+0.0427)	0.6939 (+0.1932)	0.5689 (+0.1136)
0.4879	2400	2.0735	2.0581	0.6360 (+0.0956)	0.3491 (+0.0240)	0.6830 (+0.1824)	0.5560 (+0.1007)
0.5082	2500	2.0681	2.0582	0.6328 (+0.0924)	0.3443 (+0.0193)	0.6792 (+0.1785)	0.5521 (+0.0967)
0.5286	2600	2.0741	2.0582	0.6618 (+0.1214)	0.3536 (+0.0286)	0.6812 (+0.1806)	0.5655 (+0.1102)
0.5489	2700	2.067	2.0587	0.6611 (+0.1207)	0.3726 (+0.0476)	0.6826 (+0.1819)	0.5721 (+0.1167)
0.5692	2800	2.0706	2.0579	0.6627 (+0.1223)	0.3736 (+0.0486)	0.6843 (+0.1836)	0.5735 (+0.1182)
0.5896	2900	2.0632	2.0580	0.6426 (+0.1022)	0.3788 (+0.0538)	0.6940 (+0.1933)	0.5718 (+0.1164)
0.6099	3000	2.0773	2.0582	0.6445 (+0.1041)	0.3808 (+0.0558)	0.7157 (+0.2151)	0.5804 (+0.1250)
0.6302	3100	2.071	2.0583	0.6354 (+0.0950)	0.3810 (+0.0559)	0.6792 (+0.1785)	0.5652 (+0.1098)
0.6505	3200	2.0678	2.0579	0.6224 (+0.0820)	0.3753 (+0.0502)	0.6622 (+0.1615)	0.5533 (+0.0979)
0.6709	3300	2.066	2.0577	0.6658 (+0.1254)	0.3761 (+0.0510)	0.6742 (+0.1735)	0.5720 (+0.1166)
0.6912	3400	2.065	2.0577	0.6525 (+0.1121)	0.3750 (+0.0500)	0.6760 (+0.1754)	0.5678 (+0.1125)
0.7115	3500	2.072	2.0580	0.6296 (+0.0892)	0.3553 (+0.0303)	0.6632 (+0.1625)	0.5494 (+0.0940)
0.7319	3600	2.065	2.0580	0.6223 (+0.0818)	0.3638 (+0.0387)	0.6762 (+0.1756)	0.5541 (+0.0987)
0.7522	3700	2.0633	2.0574	0.6400 (+0.0996)	0.3718 (+0.0468)	0.6643 (+0.1637)	0.5587 (+0.1034)
0.7725	3800	2.0655	2.0576	0.6476 (+0.1072)	0.3882 (+0.0632)	0.7001 (+0.1994)	0.5786 (+0.1233)
0.7928	3900	2.0703	2.0572	0.6385 (+0.0981)	0.3848 (+0.0597)	0.6705 (+0.1698)	0.5646 (+0.1092)
0.8132	4000	2.0741	2.0572	0.6266 (+0.0862)	0.3614 (+0.0364)	0.6759 (+0.1752)	0.5546 (+0.0993)
0.8335	4100	2.058	2.0574	0.6330 (+0.0925)	0.3750 (+0.0500)	0.6600 (+0.1593)	0.5560 (+0.1006)
0.8538	4200	2.0758	2.0574	0.6450 (+0.1046)	0.3774 (+0.0524)	0.6796 (+0.1789)	0.5673 (+0.1120)
0.8742	4300	2.0648	2.0572	0.6261 (+0.0857)	0.3681 (+0.0430)	0.6796 (+0.1789)	0.5579 (+0.1025)
0.8945	4400	2.0647	2.0573	0.6377 (+0.0973)	0.3724 (+0.0473)	0.6523 (+0.1517)	0.5541 (+0.0988)
0.9148	4500	2.0634	2.0570	0.6412 (+0.1008)	0.3738 (+0.0488)	0.6917 (+0.1911)	0.5689 (+0.1136)
0.9351	4600	2.0675	2.0570	0.6426 (+0.1022)	0.3819 (+0.0569)	0.6875 (+0.1869)	0.5707 (+0.1153)
0.9555	4700	2.061	2.0570	0.6428 (+0.1024)	0.3884 (+0.0634)	0.6929 (+0.1923)	0.5747 (+0.1194)
0.9758	4800	2.0652	2.0571	0.6462 (+0.1058)	0.3892 (+0.0641)	0.6933 (+0.1927)	0.5763 (+0.1209)
0.9961	4900	2.0636	2.0571	0.6489 (+0.1084)	0.3896 (+0.0645)	0.6889 (+0.1883)	0.5758 (+0.1204)
-1	-1	-	-	0.6445 (+0.1041)	0.3808 (+0.0558)	0.7157 (+0.2151)	0.5804 (+0.1250)

The saved checkpoint is the row at step 3000 (shown in bold in the original table); its metrics match the final row.
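
The Epoch column follows from the step count: with 78,704 training samples, a per-device batch size of 16, and no gradient accumulation, one epoch takes 78704 / 16 = 4919 optimizer steps. The check below assumes a single training device, which the card does not state explicitly:

```python
import math

train_samples = 78_704
batch_size = 16   # per_device_train_batch_size
num_devices = 1   # assumption: single device, not stated in the card
grad_accum = 1    # gradient_accumulation_steps

steps_per_epoch = math.ceil(train_samples / (batch_size * num_devices * grad_accum))
print(steps_per_epoch)  # 4919

# Fraction of the epoch completed at a few logged steps
for step in (1, 100, 3000, 4900):
    print(step, round(step / steps_per_epoch, 4))
```

Under that assumption the computed fractions agree with the Epoch values logged at steps 1, 100, 3000 and 4900.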

Framework Versions

Python: 3.9.18
Sentence Transformers: 5.1.1
Transformers: 4.56.2
PyTorch: 2.8.0+cu128
Accelerate: 1.10.1
Datasets: 4.1.1
Tokenizers: 0.22.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

ListNetLoss

@inproceedings{cao2007learning,
    title={Learning to Rank: From Pairwise Approach to Listwise Approach},
    author={Cao, Zhe and Qin, Tao and Liu, Tie-Yan and Tsai, Ming-Feng and Li, Hang},
    booktitle={Proceedings of the 24th international conference on Machine learning},
    pages={129--136},
    year={2007}
}