metadata
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:29911
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
base_model: Snowflake/snowflake-arctic-embed-m-v1.5
widget:
- source_sentence: >-
What strategies can be implemented to effectively leverage private
financing opportunities for small and medium-sized enterprises (SMEs)?
sentences:
- >-
(13) While the energy savings potential remains large in all sectors,
there is a particular challenge relating to transport, as it is
responsible for more than 30 % of final energy consumption, and to
buildings, since 75 % of the Union’s building stock has a poor energy
performance. Another increasingly important sector is the information
and communications technology (ICT) sector, which is responsible for 5
to 9 % of the world’s total electricity use and more than 2 % of global
emissions. In 2018, data centres accounted for 2,7 % of the electricity
demand in the EU-28. In that context, the Commission, in its
communication of 19 February 2020 on ‘Shaping Europe's digital future’
(the ‘Union’s Digital Strategy’), highlighted the need for highly
energy-efficient and sustainable data centres and transparency measures
for telecoms operators as regards their environmental footprint.
Furthermore, the possible increase in industry’s energy demand that may
result from its decarbonisation, particularly for energy intensive
processes, should also be taken into account.
- SMEs in order to leverage and trigger private financing for SMEs.
- >-
►M5 — ◄ K Gases (petroleum), refinery; Refinery gas (A complex
combination obtained from various petroleum refining operations. It
consists of hydrogen and hydrocarbons having carbon numbers
predominantly in the range of C1 through C3.) 649-153-00-0 272-338-9
68814-67-5 ►M5 — ◄ K Gases (petroleum), platformer products separator
off; Refinery gas (A complex combination obtained from the chemical
reforming of naphthenes to aromatics. It consists of hydrogen and
saturated aliphatic hydrocarbons having carbon numbers predominantly in
the range of C2 through C4.) 649-154-00-6 272-343-6 68814-90-4 ►M5 — ◄
K Gases (petroleum), hydrotreated sour kerosine depentaniser stabiliser
off; Refinery gas (The complex combination obtained from the
- source_sentence: >-
How can an undertaking identify and leverage opportunities related to
sustainability matters within its business model and strategy?
sentences:
- >-
i.
focusses on specific activities, business relationships, geographies or
other factors that give rise to heightened risk of adverse impacts;
ii.
considers the impacts with which the undertaking is involved through its
own operations or as a result of its business relationships;
iii.
includes consultation with affected stakeholders to understand how they
may be impacted and with external experts;
iv.
prioritises negative impacts based on their relative severity and
likelihood, (see ESRS 1 section 3.4 Impact materiality) and, if
applicable, positive impacts on their relative scale, scope and
likelihood, and determines which sustainability matters are material for
reporting purposes, including the qualitative or quantitative thresholds
and other criteria used as prescribed by ESRS 1 section 3.4 Impact
materiality;
(c)
an overview of the process used to identify, assess, prioritise and
monitor risks and opportunities that have or may have financial effects
. The disclosure shall include:
i.
how the undertaking has considered the connections of its impacts and
dependencies with the risks and opportunities that may arise from those
impacts and dependencies;
ii.
►C1 how the undertaking assesses the likelihood, magnitude, and nature
of effects of the identified risk and opportunities (such as the
qualitative or quantitative thresholds and other criteria used as
prescribed by ESRS 1 section 3.5 Financial materiality); ◄
iii.
how the undertaking prioritises sustainability-related risks relative to
other types of risks, including its use of risk-assessment tools;
(d)
a description of the decision-making process and the related internal
control procedures;
(e)
the extent to which and how the process to identify, assess and manage
impacts and risks is integrated into the undertaking’s overall risk
management process and used to evaluate the undertaking’s overall risk
profile and risk management processes;
(f)
the extent to which and how the process to identify, assess and manage
opportunities is integrated into the undertaking’s overall management
process where applicable;
(g)
the input parameters it uses (for example, data sources, the scope of
operations covered and the detail used in assumptions); and
(h)
whether and how the process has changed compared to the prior reporting
period, when the process was modified for the last time and future
revision dates of the materiality assessment.
Disclosure Requirement IRO-2 – Disclosure Requirements in ESRS covered
by the undertaking’s sustainability statement
The undertaking shall report on the Disclosure Requirements complied
with in its sustainability statements.
The objective of this Disclosure Requirement is to provide an
understanding of the Disclosure Requirements included in the
undertaking’s sustainability statement and of the topics that have been
omitted as not material, as a result of the materiality assessment.
The undertaking shall include a list of the Disclosure Requirements
complied with in preparing the sustainability statement , following the
outcome of the materiality assessment (see ESRS 1 chapter 3), including
the page numbers and/or paragraphs where the related disclosures are
located in the sustainability statement. This may be presented as a
content index. The undertaking shall also include a table of all the
datapoints that derive from other EU legislation as listed in Appendix B
of this standard, indicating where they can be found in the
sustainability statement and including those that the undertaking has
assessed as not material, in which case the undertaking shall indicate
‘Not material’ in the table in accordance with ESRS 1 paragraph 35.
If the undertaking concludes that climate change is not material and
therefore omits all disclosure requirements in ESRS E1 Climate change,
it shall disclose a detailed explanation of the conclusions of its
materiality assessment with regard to climate change (see ESRS 2 IRO-2
Disclosure Requirements in ESRS covered by the undertaking’s
sustainability statement), including a forward-looking analysis of the
conditions that could lead the undertaking to conclude that climate
change is material in the future.
If the undertaking concludes that a topic other than climate change is
not material and therefore omits all the Disclosure Requirements in the
corresponding topical ESRS, it may provide a brief explanation of the
conclusions of its materiality assessment for that topic.
- >-
(b)
the number and type of market participants, including the ratio of
market participants to traded instruments in a particular product;
(c)
the average size of spreads, where available;
(26)
‘competent authority’ means the authority, designated by each Member
State in accordance with Article 67, unless otherwise specified in this
Directive;
(27)
‘credit institution’ means a credit institution as defined in point (1)
of Article 4(1) of Regulation (EU) No 575/2013;
(28)
‘UCITS management company’ means a management company as defined in
point (b) of Article 2(1) of Directive 2009/65/EC of the European
Parliament and of the Council ( 4 );
(29)
- >-
(a)
a brief description of the undertaking’s business model and strategy,
including:
(i)
the resilience of the undertaking’s business model and strategy in
relation to risks related to sustainability matters;
(ii)
the opportunities for the undertaking related to sustainability matters;
(iii)
- source_sentence: >-
What are the conditions under which an undertaking with an average number
of 750 employees can omit certain sustainability information while still
needing to disclose the materiality assessment of those topics?
sentences:
- >-
(c)
impose restrictions on non-EU AIFMs relating to the management of an AIF
where its activities potentially constitute an important source of
counterparty risk to a credit institution or other systemically relevant
institutions.
5.
ESMA may take a decision under paragraph 4 and subject to the
requirements set out in paragraph 6 if both of the following conditions
are met:
(a)
a substantial threat exists, originating or aggravated by the activities
of AIFMs, to the orderly functioning and integrity of the financial
market or to the stability of the whole or a part of the financial
system in the Union and there are cross border implications; and
(b)
- >-
▼B
If an undertaking or group not exceeding on its balance sheet date the
average number of 750 employees during the financial year decides to
omit the information required by ESRS E4, ESRS S1, ESRS S2, ESRS S3 or
ESRS S4 in accordance with Appendix C of ESRS 1, it shall nevertheless
disclose whether the sustainability topics covered respectively by ESRS
E4, ESRS S1, ESRS S2, ESRS S3 and ESRS S4 have been assessed to be
material as a result of the undertaking’s materiality assessment. In
addition, if one or more of these topics has been assessed to be
material, the undertaking shall, for each material topic:
(a)
- >-
9.
The Commission shall establish and keep up-to-date a register of
recognised schemes. That register shall be made publicly available on a
free-access website. That website shall also allow for the collation of
feedback from all relevant stakeholders concerning the implementation of
recognised schemes. Such feedback shall be submitted to the relevant
scheme owners for consideration.
Article 31
Environmental footprint declaration
1.
- source_sentence: >-
What are the specific roles and responsibilities of the InvestEU Advisory
Hub in relation to project development assistance for public authorities
and project promoters?
sentences:
- >-
System B
Alternative characterisation Physical and chemical factors that
determine the characteristics of the coastal water and hence the
biological community structure and composition Obligatory factors
latitude longitude tidal range salinity Optional factors current
velocity wave exposure mean water temperature mixing characteristics
turbidity retention time (of enclosed bays) mean substratum composition
water temperature range
1.3. Establishment of type-specific reference conditions for surface
water body types
- >-
newly implemented since 31 December 2008 that continue to have an impact
in 2020 with respect to the obligation period referred to in paragraph
1, first subparagraph, point (a), and beyond 2020 with respect to the
period referred to in point (b)(i), of that subparagraph, and which can
be measured and verified; --- --- (e) count towards the amount of
required energy savings, energy savings that stem from policy measures,
provided that it can be demonstrated that those measures result in
individual actions carried out from 1 January 2018 to 31 December 2020
which deliver savings after 31 December 2020; --- --- (f) exclude from
the calculation of the amount of required energy savings pursuant to
paragraph 1, first subparagraph, points (a) and
- >-
Advisory initiatives shall be available as a component under each policy
window referred to in Article 8(1), covering sectors under that window.
In addition, advisory initiatives shall be available under a
cross-sectoral component.
2.
The InvestEU Advisory Hub shall in particular:
(a)
provide a central point of entry, managed and hosted by the Commission,
for project development assistance under the InvestEU Advisory Hub for
public authorities and for project promoters;
(b)
disseminate to public authorities and project promoters all available
additional information regarding the investment guidelines, including
information on their application or on the interpretation provided by
the Commission;
(c)
- source_sentence: >-
What is the definition of a preliminary economic assessment in the context
of evaluating projects for the recovery of critical raw materials?
sentences:
- >-
For the purposes of the first subparagraph of this paragraph, insurance
undertakings referred to in point (a) of the first subparagraph of
Article 1(3) of this Directive that are part of a group, on the basis of
financial relationships referred to in point (c)(ii) of Article 212(1)
of Directive 2009/138/EC, and which are subject to group supervision in
accordance with points (a) to (c) of Article 213(2) of that Directive
shall be treated as subsidiary undertakings of the parent undertaking of
that group.
9.
- >-
(a)
progress in the implementation of the Strategic Project, in particular
with regard to the permit-granting process;
(b)
where relevant, reasons for delays compared to the timetable referred to
in Article 7(1), point (c) and a plan to overcome such delays;
(c)
progress in financing the Strategic Project, including information on
public financial support.
The Commission shall submit a copy of the report referred to in the
first subparagraph of this paragraph to the Board in order to facilitate
the discussions referred to in Article 36(7), point (c).
2.
The Commission may, where necessary, request additional information from
project promoters relevant to the implementation of the Strategic
Project to ascertain the continuing fulfilment of the criteria laid down
in Article 6(1).
3.
The project promoter shall notify the Commission of:
(a)
changes to the Strategic Project affecting its fulfilment of the
criteria laid down in Article 6(1);
(b)
changes in control of the undertakings involved in the Strategic Project
on a lasting basis, compared to the information referred to in Article
7(1), point (e).
4.
The Commission may adopt implementing acts establishing a single
template to be used by project promoters to provide all the information
required for the reports referred to in paragraph 1 of this Article. The
single template may indicate how the information referred to in
paragraph 1 of this Article is to be expressed. Those implementing acts
shall be adopted in accordance with the advisory procedure referred to
in Article 39(2).
The extent of documentation required to complete the single template
referred to in the first subparagraph shall be reasonable.
5.
- >-
(39)
‘preliminary economic assessment’ means an early-stage, conceptual
assessment of the potential economic viability of a project for the
recovery of critical raw materials from extractive waste;
(40)
‘magnetic resonance imaging device’ means a non-invasive medical device
that uses magnetic fields to make anatomical images or any other device
that uses magnetic fields to make images of the inside of object;
(41)
‘wind energy generator’ means the part of an onshore or offshore wind
turbine that converts the mechanical energy of the rotor into electrical
energy;
(42)
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
model-index:
- name: SentenceTransformer based on Snowflake/snowflake-arctic-embed-m-v1.5
results:
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: Unknown
type: unknown
metrics:
- type: cosine_accuracy@1
value: 0.822517355870812
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.9526109266525807
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.9725324479323876
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.9873226682764865
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.822517355870812
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.31753697555086025
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.1945064895864775
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.09873226682764866
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.822517355870812
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.9526109266525807
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.9725324479323876
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.9873226682764865
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.9140763784801484
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.8895886335216252
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.8902791958273809
name: Cosine Map@100
SentenceTransformer based on Snowflake/snowflake-arctic-embed-m-v1.5
This is a sentence-transformers model fine-tuned from Snowflake/snowflake-arctic-embed-m-v1.5. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: Snowflake/snowflake-arctic-embed-m-v1.5
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
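The three modules listed above can be read as a small pipeline: the Transformer produces one vector per token, Pooling with `pooling_mode_cls_token: True` keeps only the first ([CLS]) token's vector, and Normalize rescales it to unit length. A minimal sketch of the last two steps follows, using toy 4-dimensional vectors in place of real 768-dimensional BERT outputs; the helper names `cls_pool` and `l2_normalize` are hypothetical and are not part of the library.

```python
import math

def cls_pool(token_embeddings):
    """Return the first token's vector ([CLS]) as the sentence embedding."""
    return list(token_embeddings[0])

def l2_normalize(vector):
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)

# Toy "token embeddings" for one sentence: 3 tokens, 4 dimensions each.
tokens = [
    [3.0, 0.0, 4.0, 0.0],  # [CLS] token, the only one CLS pooling keeps
    [1.0, 1.0, 1.0, 1.0],
    [0.5, 2.0, 0.0, 1.0],
]

sentence_embedding = l2_normalize(cls_pool(tokens))
print(sentence_embedding)
```

Because the output has unit length, the dot product of two embeddings equals their cosine similarity, which matches the similarity function listed in the model description.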
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
'What is the definition of a preliminary economic assessment in the context of evaluating projects for the recovery of critical raw materials?',
'(39)\n\n‘preliminary economic assessment’ means an early-stage, conceptual assessment of the potential economic viability of a project for the recovery of critical raw materials from extractive waste;\n\n(40)\n\n‘magnetic resonance imaging device’ means a non-invasive medical device that uses magnetic fields to make anatomical images or any other device that uses magnetic fields to make images of the inside of object;\n\n(41)\n\n‘wind energy generator’ means the part of an onshore or offshore wind turbine that converts the mechanical energy of the rotor into electrical energy;\n\n(42)',
'For the purposes of the first subparagraph of this paragraph, insurance undertakings referred to in point (a) of the first subparagraph of Article 1(3) of this Directive that are part of a group, on the basis of financial relationships referred to in point (c)(ii) of Article 212(1) of Directive 2009/138/EC, and which are subject to group supervision in accordance with points (a) to (c) of Article 213(2) of that Directive shall be treated as subsidiary undertakings of the parent undertaking of that group.\n\n9.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
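Since the model was trained with MatryoshkaLoss over the dimensions 768, 512, 256, 128 and 64 (see Training Details), it is plausible to keep only the leading dimensions of an embedding to save storage, at some cost in accuracy that this card does not quantify. The sketch below shows the idea in plain Python on toy 4-dimensional vectors: truncate, re-normalize, then compare. The helper names are hypothetical. Recent Sentence Transformers releases are understood to offer a `truncate_dim` argument when loading a model, so check the documentation for your version before relying on it.

```python
import math

def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` values, then rescale the result to unit length."""
    head = list(embedding[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head

def cosine(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy stand-ins for a query embedding and a passage embedding.
query = [0.5, 0.5, 0.5, 0.5]
passage = [0.5, 0.5, -0.5, 0.5]

full_similarity = cosine(query, passage)
truncated_similarity = cosine(
    truncate_and_normalize(query, 2),
    truncate_and_normalize(passage, 2),
)
print(full_similarity, truncated_similarity)
```

The two scores differ because truncation discards information. With a Matryoshka-trained model the expectation is that rankings stay close to the full-dimension rankings, but that should be checked on your own data.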
Evaluation
Metrics
Information Retrieval
- Evaluated with InformationRetrievalEvaluator
Metric | Value |
---|---|
cosine_accuracy@1 | 0.8225 |
cosine_accuracy@3 | 0.9526 |
cosine_accuracy@5 | 0.9725 |
cosine_accuracy@10 | 0.9873 |
cosine_precision@1 | 0.8225 |
cosine_precision@3 | 0.3175 |
cosine_precision@5 | 0.1945 |
cosine_precision@10 | 0.0987 |
cosine_recall@1 | 0.8225 |
cosine_recall@3 | 0.9526 |
cosine_recall@5 | 0.9725 |
cosine_recall@10 | 0.9873 |
cosine_ndcg@10 | 0.9141 |
cosine_mrr@10 | 0.8896 |
cosine_map@100 | 0.8903 |
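In the table above, recall@k equals accuracy@k, and precision@k is accuracy@k divided by k (for example 0.9526 / 3 ≈ 0.3175). That pattern is what one would expect if every evaluation query has exactly one relevant passage; the card does not state this, so treat it as an inference. Under that assumption, the metrics can be sketched from the rank at which each query's relevant passage was retrieved. The function `retrieval_metrics` and the toy ranks are illustrative only and are not the evaluator's implementation.

```python
def retrieval_metrics(ranks, k):
    """Compute accuracy@k, precision@k, recall@k and MRR@k.

    ranks[i] is the 1-based rank of the single relevant passage for query i,
    or None if that passage was not retrieved at all.
    """
    n = len(ranks)
    hit_ranks = [r for r in ranks if r is not None and r <= k]
    hits = len(hit_ranks)
    return {
        "accuracy": hits / n,         # share of queries with a hit in the top k
        "precision": hits / (n * k),  # at most one of the k results is relevant
        "recall": hits / n,           # one relevant passage per query
        "mrr": sum(1.0 / r for r in hit_ranks) / n,
    }

# Five toy queries: relevant passage found at ranks 1, 1, 2, 4, and not found.
ranks = [1, 1, 2, 4, None]
metrics_at_3 = retrieval_metrics(ranks, 3)
print(metrics_at_3)
```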
Training Details
Training Dataset
Unnamed Dataset
- Size: 29,911 training samples
- Columns: sentence_0 and sentence_1
- Approximate statistics based on the first 1000 samples:
Field | sentence_0 | sentence_1 |
---|---|---|
type | string | string |
details | min: 13 tokens; mean: 41.63 tokens; max: 252 tokens | min: 4 tokens; mean: 233.72 tokens; max: 512 tokens |
- Samples:
sentence_0 | sentence_1 |
---|---|
What measures must Member States take to ensure that workers who believe they have been discriminated against in terms of equal pay can establish their case before a competent authority or national court? | Article 18 Shift of burden of proof 1. Member States shall take the appropriate measures, in accordance with their national judicial systems, to ensure that, when workers who consider themselves wronged because the principle of equal pay has not been applied to them establish before a competent authority or national court facts from which it may be presumed that there has been direct or indirect discrimination, it shall be for the respondent to prove that there has been no direct or indirect discrimination in relation to pay. 2. Member States shall ensure that, in administrative procedures or court proceedings regarding alleged direct or indirect discrimination in relation to pay, where an employer has not implemented the pay transparency obligations set out in Articles 5, 6, 7, 9 and 10, it is for the employer to prove that there has been no such discrimination. The first subparagraph of this paragraph shall not apply where the employer proves that the infringement of the obligati... |
What are the key considerations for recognizing and addressing discrimination in the context of compensation and penalties, particularly in relation to the gender pay gap? | discrimination, in particular for substantive and procedural purposes, including to recognise the existence of discrimination, to decide on the appropriate comparator, to assess the proportionality, and to determine, where relevant, the level of compensation awarded or penalties imposed. An intersectional approach is important for understanding and addressing the gender pay gap. This clarification should not change the scope of employers’ obligations in regard to the pay transparency measures under this Directive. In particular, employers should not be required to gather data related to protected grounds other than sex. |
What is the process for aircraft operators and shipping companies regarding the surrendering of allowances in relation to their total emissions from the previous calendar year? | (b) each aircraft operator surrenders a number of allowances that is equal to its total emissions during the preceding calendar year, as verified in accordance with Article 15; (c) each shipping company surrenders a number of allowances that is equal to its total emissions during the preceding calendar year, as verified in accordance with Article 3ge. Member States, administering Member States and administering authorities in respect of a shipping company shall ensure that allowances surrendered in accordance with the first subparagraph are subsequently cancelled. ▼M15 3-e. |
- Loss: MatryoshkaLoss with these parameters:
{"loss": "MultipleNegativesRankingLoss", "matryoshka_dims": [768, 512, 256, 128, 64], "matryoshka_weights": [1, 1, 1, 1, 1], "n_dims_per_step": -1}
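To make the loss configuration concrete, here is a rough sketch of how a Matryoshka-weighted MultipleNegativesRankingLoss could be computed for one batch. It is not the library's code. It assumes cosine similarity scaled by 20 (understood to be the library default for this loss, but treated here as an assumption), uses every other passage in the batch as a negative, and simply sums the per-dimension losses with the listed weights. All function names are hypothetical, and the toy dimensions [4, 2] stand in for [768, 512, 256, 128, 64].

```python
import math

def unit_prefix(vector, dim):
    """Truncate to the first `dim` values and rescale to unit length."""
    head = vector[:dim]
    norm = math.sqrt(sum(x * x for x in head)) or 1.0  # avoid division by zero
    return [x / norm for x in head]

def mnr_loss(queries, passages, scale=20.0):
    """In-batch softmax cross-entropy: passages[i] is the positive for queries[i],
    and every other passage in the batch acts as a negative."""
    total = 0.0
    for i, query in enumerate(queries):
        logits = [scale * sum(a * b for a, b in zip(query, p)) for p in passages]
        top = max(logits)
        log_normalizer = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_normalizer - logits[i]
    return total / len(queries)

def matryoshka_mnr_loss(queries, passages, dims, weights):
    """Weighted sum of the ranking loss evaluated on each truncated prefix."""
    return sum(
        weight * mnr_loss(
            [unit_prefix(q, dim) for q in queries],
            [unit_prefix(p, dim) for p in passages],
        )
        for dim, weight in zip(dims, weights)
    )

# Toy batch: two queries whose positives point in exactly the same direction.
queries = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
aligned = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
swapped = [aligned[1], aligned[0]]  # positives and negatives exchanged

low_loss = matryoshka_mnr_loss(queries, aligned, dims=[4, 2], weights=[1, 1])
high_loss = matryoshka_mnr_loss(queries, swapped, dims=[4, 2], weights=[1, 1])
print(low_loss, high_loss)
```

The point of summing over prefixes is that the leading dimensions must already rank the positive passage first on their own, which is what makes truncated embeddings usable.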
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 6
- per_device_eval_batch_size: 6
- num_train_epochs: 4
- multi_dataset_batch_sampler: round_robin
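These settings tie directly to the step counts in the training logs. As a worked check, assuming a single training device (the card does not say how many were used) together with the listed gradient_accumulation_steps of 1 and dataloader_drop_last of False, the number of optimizer steps per epoch is the dataset size divided by the batch size, rounded up. The variable names below are illustrative.

```python
import math

dataset_size = 29911  # training samples, from the card
batch_size = 6        # per_device_train_batch_size
num_epochs = 4        # num_train_epochs

# The last, partially filled batch still counts as a step, hence the ceiling.
steps_per_epoch = math.ceil(dataset_size / batch_size)
total_steps = steps_per_epoch * num_epochs

def epoch_at(step):
    """Fractional epoch reached after `step` optimizer steps."""
    return round(step / steps_per_epoch, 4)

print(steps_per_epoch, total_steps, epoch_at(500))
```

The result of 4986 steps per epoch agrees with the log row that records epoch 1.0 at step 4986, and step 500 maps to epoch 0.1003 as in the logs.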
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 6
- per_device_eval_batch_size: 6
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 4
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
Training Logs
Epoch | Step | Training Loss | cosine_ndcg@10 |
---|---|---|---|
0.0201 | 100 | - | 0.6629 |
0.0401 | 200 | - | 0.7746 |
0.0602 | 300 | - | 0.8233 |
0.0802 | 400 | - | 0.8515 |
0.1003 | 500 | 0.4694 | 0.8621 |
0.1203 | 600 | - | 0.8680 |
0.1404 | 700 | - | 0.8733 |
0.1604 | 800 | - | 0.8774 |
0.1805 | 900 | - | 0.8757 |
0.2006 | 1000 | 0.1568 | 0.8795 |
0.2206 | 1100 | - | 0.8808 |
0.2407 | 1200 | - | 0.8789 |
0.2607 | 1300 | - | 0.8796 |
0.2808 | 1400 | - | 0.8822 |
0.3008 | 1500 | 0.1015 | 0.8821 |
0.3209 | 1600 | - | 0.8814 |
0.3410 | 1700 | - | 0.8756 |
0.3610 | 1800 | - | 0.8822 |
0.3811 | 1900 | - | 0.8848 |
0.4011 | 2000 | 0.0836 | 0.8843 |
0.4212 | 2100 | - | 0.8841 |
0.4412 | 2200 | - | 0.8803 |
0.4613 | 2300 | - | 0.8851 |
0.4813 | 2400 | - | 0.8818 |
0.5014 | 2500 | 0.0865 | 0.8849 |
0.5215 | 2600 | - | 0.8877 |
0.5415 | 2700 | - | 0.8806 |
0.5616 | 2800 | - | 0.8832 |
0.5816 | 2900 | - | 0.8930 |
0.6017 | 3000 | 0.0842 | 0.8928 |
0.6217 | 3100 | - | 0.8882 |
0.6418 | 3200 | - | 0.8858 |
0.6619 | 3300 | - | 0.8863 |
0.6819 | 3400 | - | 0.8828 |
0.7020 | 3500 | 0.0669 | 0.8839 |
0.7220 | 3600 | - | 0.8835 |
0.7421 | 3700 | - | 0.8854 |
0.7621 | 3800 | - | 0.8839 |
0.7822 | 3900 | - | 0.8882 |
0.8022 | 4000 | 0.0695 | 0.8871 |
0.8223 | 4100 | - | 0.8854 |
0.8424 | 4200 | - | 0.8822 |
0.8624 | 4300 | - | 0.8847 |
0.8825 | 4400 | - | 0.8863 |
0.9025 | 4500 | 0.0575 | 0.8819 |
0.9226 | 4600 | - | 0.8815 |
0.9426 | 4700 | - | 0.8836 |
0.9627 | 4800 | - | 0.8862 |
0.9828 | 4900 | - | 0.8889 |
1.0 | 4986 | - | 0.8927 |
1.0028 | 5000 | 0.0712 | 0.8935 |
1.0229 | 5100 | - | 0.8890 |
1.0429 | 5200 | - | 0.8919 |
1.0630 | 5300 | - | 0.8949 |
1.0830 | 5400 | - | 0.8950 |
1.1031 | 5500 | 0.0485 | 0.8934 |
1.1231 | 5600 | - | 0.8964 |
1.1432 | 5700 | - | 0.8953 |
1.1633 | 5800 | - | 0.8942 |
1.1833 | 5900 | - | 0.8929 |
1.2034 | 6000 | 0.0465 | 0.8912 |
1.2234 | 6100 | - | 0.8890 |
1.2435 | 6200 | - | 0.8914 |
1.2635 | 6300 | - | 0.8847 |
1.2836 | 6400 | - | 0.8873 |
1.3037 | 6500 | 0.0324 | 0.8912 |
1.3237 | 6600 | - | 0.8956 |
1.3438 | 6700 | - | 0.8954 |
1.3638 | 6800 | - | 0.8946 |
1.3839 | 6900 | - | 0.8931 |
1.4039 | 7000 | 0.0205 | 0.8951 |
1.4240 | 7100 | - | 0.8967 |
1.4440 | 7200 | - | 0.8960 |
1.4641 | 7300 | - | 0.8943 |
1.4842 | 7400 | - | 0.9003 |
1.5042 | 7500 | 0.0489 | 0.8946 |
1.5243 | 7600 | - | 0.8986 |
1.5443 | 7700 | - | 0.8945 |
1.5644 | 7800 | - | 0.8960 |
1.5844 | 7900 | - | 0.8987 |
1.6045 | 8000 | 0.039 | 0.8991 |
1.6245 | 8100 | - | 0.8959 |
1.6446 | 8200 | - | 0.8948 |
1.6647 | 8300 | - | 0.8933 |
1.6847 | 8400 | - | 0.8926 |
1.7048 | 8500 | 0.0297 | 0.8937 |
1.7248 | 8600 | - | 0.8974 |
1.7449 | 8700 | - | 0.8977 |
1.7649 | 8800 | - | 0.8973 |
1.7850 | 8900 | - | 0.8989 |
1.8051 | 9000 | 0.0248 | 0.8974 |
1.8251 | 9100 | - | 0.8980 |
1.8452 | 9200 | - | 0.8970 |
1.8652 | 9300 | - | 0.8997 |
1.8853 | 9400 | - | 0.9007 |
1.9053 | 9500 | 0.0534 | 0.9009 |
1.9254 | 9600 | - | 0.9015 |
1.9454 | 9700 | - | 0.9014 |
1.9655 | 9800 | - | 0.9008 |
1.9856 | 9900 | - | 0.9024 |
2.0 | 9972 | - | 0.9052 |
2.0056 | 10000 | 0.0295 | 0.9041 |
2.0257 | 10100 | - | 0.9009 |
2.0457 | 10200 | - | 0.9030 |
2.0658 | 10300 | - | 0.9028 |
2.0858 | 10400 | - | 0.9051 |
2.1059 | 10500 | 0.027 | 0.9063 |
2.1260 | 10600 | - | 0.9059 |
2.1460 | 10700 | - | 0.9044 |
2.1661 | 10800 | - | 0.9024 |
2.1861 | 10900 | - | 0.9005 |
2.2062 | 11000 | 0.0201 | 0.8996 |
2.2262 | 11100 | - | 0.9037 |
2.2463 | 11200 | - | 0.9029 |
2.2663 | 11300 | - | 0.9047 |
2.2864 | 11400 | - | 0.9030 |
2.3065 | 11500 | 0.0097 | 0.9041 |
2.3265 | 11600 | - | 0.9011 |
2.3466 | 11700 | - | 0.9000 |
2.3666 | 11800 | - | 0.8972 |
2.3867 | 11900 | - | 0.8985 |
2.4067 | 12000 | 0.0165 | 0.8979 |
2.4268 | 12100 | - | 0.8996 |
2.4469 | 12200 | - | 0.9026 |
2.4669 | 12300 | - | 0.9034 |
2.4870 | 12400 | - | 0.9054 |
2.5070 | 12500 | 0.0165 | 0.9029 |
2.5271 | 12600 | - | 0.9052 |
2.5471 | 12700 | - | 0.9057 |
2.5672 | 12800 | - | 0.9059 |
2.5872 | 12900 | - | 0.9092 |
2.6073 | 13000 | 0.0144 | 0.9081 |
2.6274 | 13100 | - | 0.9095 |
2.6474 | 13200 | - | 0.9102 |
2.6675 | 13300 | - | 0.9113 |
2.6875 | 13400 | - | 0.9103 |
2.7076 | 13500 | 0.0159 | 0.9105 |
2.7276 | 13600 | - | 0.9073 |
2.7477 | 13700 | - | 0.9084 |
2.7677 | 13800 | - | 0.9080 |
2.7878 | 13900 | - | 0.9083 |
2.8079 | 14000 | 0.0183 | 0.9083 |
2.8279 | 14100 | - | 0.9070 |
2.8480 | 14200 | - | 0.9085 |
2.8680 | 14300 | - | 0.9078 |
2.8881 | 14400 | - | 0.9075 |
2.9081 | 14500 | 0.0257 | 0.9073 |
2.9282 | 14600 | - | 0.9098 |
2.9483 | 14700 | - | 0.9089 |
2.9683 | 14800 | - | 0.9097 |
2.9884 | 14900 | - | 0.9079 |
3.0 | 14958 | - | 0.9081 |
3.0084 | 15000 | 0.0144 | 0.9084 |
3.0285 | 15100 | - | 0.9083 |
3.0485 | 15200 | - | 0.9078 |
3.0686 | 15300 | - | 0.9079 |
3.0886 | 15400 | - | 0.9089 |
3.1087 | 15500 | 0.0082 | 0.9093 |
3.1288 | 15600 | - | 0.9098 |
3.1488 | 15700 | - | 0.9106 |
3.1689 | 15800 | - | 0.9103 |
3.1889 | 15900 | - | 0.9110 |
3.2090 | 16000 | 0.0185 | 0.9117 |
3.2290 | 16100 | - | 0.9116 |
3.2491 | 16200 | - | 0.9125 |
3.2692 | 16300 | - | 0.9111 |
3.2892 | 16400 | - | 0.9109 |
3.3093 | 16500 | 0.0105 | 0.9125 |
3.3293 | 16600 | - | 0.9117 |
3.3494 | 16700 | - | 0.9118 |
3.3694 | 16800 | - | 0.9117 |
3.3895 | 16900 | - | 0.9137 |
3.4095 | 17000 | 0.019 | 0.9134 |
3.4296 | 17100 | - | 0.9129 |
3.4497 | 17200 | - | 0.9126 |
3.4697 | 17300 | - | 0.9133 |
3.4898 | 17400 | - | 0.9136 |
3.5098 | 17500 | 0.0109 | 0.9120 |
3.5299 | 17600 | - | 0.9124 |
3.5499 | 17700 | - | 0.9122 |
3.5700 | 17800 | - | 0.9129 |
3.5901 | 17900 | - | 0.9132 |
3.6101 | 18000 | 0.0207 | 0.9139 |
3.6302 | 18100 | - | 0.9134 |
3.6502 | 18200 | - | 0.9135 |
3.6703 | 18300 | - | 0.9139 |
3.6903 | 18400 | - | 0.9141 |
3.7104 | 18500 | 0.0105 | 0.9139 |
3.7304 | 18600 | - | 0.9138 |
3.7505 | 18700 | - | 0.9136 |
3.7706 | 18800 | - | 0.9141 |
Framework Versions
- Python: 3.10.11
- Sentence Transformers: 3.4.1
- Transformers: 4.48.1
- PyTorch: 2.4.0+cu121
- Accelerate: 1.4.0
- Datasets: 3.3.2
- Tokenizers: 0.21.0
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MatryoshkaLoss
@misc{kusupati2024matryoshka,
title={Matryoshka Representation Learning},
author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
year={2024},
eprint={2205.13147},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}