SentenceTransformer based on Snowflake/snowflake-arctic-embed-m-v1.5

This is a sentence-transformers model fine-tuned from Snowflake/snowflake-arctic-embed-m-v1.5. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: Snowflake/snowflake-arctic-embed-m-v1.5
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
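
The pooling and normalization stages listed above can be illustrated with a minimal NumPy sketch. It assumes the transformer output is an array of shape (batch, tokens, 768) with the [CLS] token at position 0; the function name `cls_pool_and_normalize` and the random input are illustrative and not part of the model code.

```python
import numpy as np

def cls_pool_and_normalize(token_embeddings: np.ndarray) -> np.ndarray:
    """Keep the first ([CLS]) token vector of each sequence, then L2-normalize it."""
    cls_vectors = token_embeddings[:, 0, :]  # (batch, hidden)
    norms = np.linalg.norm(cls_vectors, axis=1, keepdims=True)
    return cls_vectors / np.clip(norms, 1e-12, None)  # guard against zero vectors

# Random stand-in for the BertModel output: 2 sequences, 5 tokens, 768 dims.
rng = np.random.default_rng(0)
token_embeddings = rng.normal(size=(2, 5, 768))
sentence_embeddings = cls_pool_and_normalize(token_embeddings)
print(sentence_embeddings.shape)
```

Because the last stage normalizes every vector to unit length, the dot product of two output embeddings equals their cosine similarity.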

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("fjavigv/snoweu_v2")
# Run inference
sentences = [
    'What is the definition of a preliminary economic assessment in the context of evaluating projects for the recovery of critical raw materials?',
    '(39)\n\n‘preliminary economic assessment’ means an early-stage, conceptual assessment of the potential economic viability of a project for the recovery of critical raw materials from extractive waste;\n\n(40)\n\n‘magnetic resonance imaging device’ means a non-invasive medical device that uses magnetic fields to make anatomical images or any other device that uses magnetic fields to make images of the inside of object;\n\n(41)\n\n‘wind energy generator’ means the part of an onshore or offshore wind turbine that converts the mechanical energy of the rotor into electrical energy;\n\n(42)',
    'For the purposes of the first subparagraph of this paragraph, insurance undertakings referred to in point (a) of the first subparagraph of Article 1(3) of this Directive that are part of a group, on the basis of financial relationships referred to in point (c)(ii) of Article 212(1) of Directive 2009/138/EC, and which are subject to group supervision in accordance with points (a) to (c) of Article 213(2) of that Directive shall be treated as subsidiary undertakings of the parent undertaking of that group.\n\n9.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
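
For semantic search, the first sentence above plays the role of a query and the other two are candidate passages. The following is a minimal sketch of ranking passages by score. It uses small hand-written unit vectors in place of real embeddings so that it runs without downloading the model, and the helper name `rank_passages` is hypothetical.

```python
import numpy as np

def rank_passages(query_embedding, passage_embeddings, top_k=3):
    """Return (passage index, score) pairs sorted by descending dot product.

    For unit-length embeddings the dot product equals cosine similarity.
    """
    scores = passage_embeddings @ query_embedding
    order = np.argsort(-scores)[:top_k]
    return [(int(i), float(scores[i])) for i in order]

# Toy 2-dimensional unit vectors standing in for model embeddings.
passages = np.array([[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]])
query = np.array([0.8, 0.6])
ranking = rank_passages(query, passages, top_k=2)
print(ranking)
```

With real embeddings from the snippet above, the same helper could be called as `rank_passages(embeddings[0], embeddings[1:])`.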

Evaluation

Metrics

Information Retrieval

Metric Value
cosine_accuracy@1 0.8225
cosine_accuracy@3 0.9526
cosine_accuracy@5 0.9725
cosine_accuracy@10 0.9873
cosine_precision@1 0.8225
cosine_precision@3 0.3175
cosine_precision@5 0.1945
cosine_precision@10 0.0987
cosine_recall@1 0.8225
cosine_recall@3 0.9526
cosine_recall@5 0.9725
cosine_recall@10 0.9873
cosine_ndcg@10 0.9141
cosine_mrr@10 0.8896
cosine_map@100 0.8903
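
The reported values are consistent with each query having exactly one relevant passage: accuracy@k equals recall@k, and precision@k equals accuracy@k divided by k (for example, 0.9526 / 3 ≈ 0.3175). The sketch below shows how these metrics relate under that assumption. The function `retrieval_metrics` and the toy ranks are illustrative, not the evaluator that produced the table.

```python
import math

def retrieval_metrics(ranks, k):
    """Compute metrics at cutoff k, assuming one relevant passage per query.

    ranks: 1-based rank of the relevant passage for each query,
           or None if it was not retrieved at all.
    """
    n = len(ranks)
    hits = [r is not None and r <= k for r in ranks]
    accuracy = sum(hits) / n
    return {
        "accuracy": accuracy,
        "recall": accuracy,          # single relevant passage: found or not
        "precision": accuracy / k,   # at most 1 relevant item among k results
        "mrr": sum(1.0 / r for r, h in zip(ranks, hits) if h) / n,
        # Ideal DCG is 1 when the single relevant passage is ranked first.
        "ndcg": sum(1.0 / math.log2(r + 1) for r, h in zip(ranks, hits) if h) / n,
    }

# Toy example: four queries, relevant passage found at ranks 1, 2, 4, and never.
metrics = retrieval_metrics([1, 2, 4, None], k=3)
print(metrics)

# Consistency check against the table above.
print(round(0.9526 / 3, 4), round(0.9725 / 5, 4), round(0.9873 / 10, 4))
```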

Training Details

Training Dataset

Unnamed Dataset

  • Size: 29,911 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 1000 samples:
              sentence_0           sentence_1
    type      string               string
    details   min: 13 tokens       min: 4 tokens
              mean: 41.63 tokens   mean: 233.72 tokens
              max: 252 tokens      max: 512 tokens
  • Samples:
    sentence_0:
    What measures must Member States take to ensure that workers who believe they have been discriminated against in terms of equal pay can establish their case before a competent authority or national court?

    sentence_1:
    Article 18

    Shift of burden of proof

    1. Member States shall take the appropriate measures, in accordance with their national judicial systems, to ensure that, when workers who consider themselves wronged because the principle of equal pay has not been applied to them establish before a competent authority or national court facts from which it may be presumed that there has been direct or indirect discrimination, it shall be for the respondent to prove that there has been no direct or indirect discrimination in relation to pay.

    2. Member States shall ensure that, in administrative procedures or court proceedings regarding alleged direct or indirect discrimination in relation to pay, where an employer has not implemented the pay transparency obligations set out in Articles 5, 6, 7, 9 and 10, it is for the employer to prove that there has been no such discrimination.

    The first subparagraph of this paragraph shall not apply where the employer proves that the infringement of the obligati...

    sentence_0:
    What are the key considerations for recognizing and addressing discrimination in the context of compensation and penalties, particularly in relation to the gender pay gap?

    sentence_1:
    discrimination, in particular for substantive and procedural purposes, including to recognise the existence of discrimination, to decide on the appropriate comparator, to assess the proportionality, and to determine, where relevant, the level of compensation awarded or penalties imposed. An intersectional approach is important for understanding and addressing the gender pay gap. This clarification should not change the scope of employers’ obligations in regard to the pay transparency measures under this Directive. In particular, employers should not be required to gather data related to protected grounds other than sex.

    sentence_0:
    What is the process for aircraft operators and shipping companies regarding the surrendering of allowances in relation to their total emissions from the previous calendar year?

    sentence_1:
    (b)

    each aircraft operator surrenders a number of allowances that is equal to its total emissions during the preceding calendar year, as verified in accordance with Article 15;

    (c)

    each shipping company surrenders a number of allowances that is equal to its total emissions during the preceding calendar year, as verified in accordance with Article 3ge.

    Member States, administering Member States and administering authorities in respect of a shipping company shall ensure that allowances surrendered in accordance with the first subparagraph are subsequently cancelled.

    ▼M15

    3-e.
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
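
To make the loss configuration above more concrete, here is a minimal NumPy sketch of the idea: an in-batch-negatives ranking loss is evaluated on the first 768, 512, 256, 128 and 64 embedding dimensions and the weighted results are summed. The cosine scoring and the scale of 20.0 are assumptions (the card does not list the inner loss parameters), and the function names are illustrative rather than the library implementation.

```python
import numpy as np

def l2_normalize(x):
    return x / np.clip(np.linalg.norm(x, axis=1, keepdims=True), 1e-12, None)

def mnr_loss(anchors, positives, scale=20.0):
    """Cross-entropy over in-batch candidates: positives[i] is the match for
    anchors[i]; every other row in the batch acts as a negative."""
    scores = scale * l2_normalize(anchors) @ l2_normalize(positives).T
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

def matryoshka_loss(anchors, positives,
                    dims=(768, 512, 256, 128, 64),
                    weights=(1, 1, 1, 1, 1)):
    """Sum the ranking loss over truncated prefixes of the embeddings."""
    return sum(w * mnr_loss(anchors[:, :d], positives[:, :d])
               for d, w in zip(dims, weights))

# Random stand-ins: a batch of 6 pairs, positives are noisy copies of anchors.
rng = np.random.default_rng(0)
anchors = rng.normal(size=(6, 768))
positives = anchors + 0.1 * rng.normal(size=(6, 768))

matched = matryoshka_loss(anchors, positives)
mismatched = matryoshka_loss(anchors, np.roll(positives, 1, axis=0))
print(matched < mismatched)
```

Training on truncated prefixes is what should allow the embeddings to be cut down to a smaller leading dimension (followed by re-normalization) at inference time with limited quality loss.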
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 6
  • per_device_eval_batch_size: 6
  • num_train_epochs: 4
  • multi_dataset_batch_sampler: round_robin
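
The step counts in the training logs follow from these settings. A short worked calculation, assuming a single device (with gradient_accumulation_steps 1 and dataloader_drop_last False, as listed under All Hyperparameters):

```python
import math

train_samples = 29_911  # size of the training dataset
batch_size = 6          # per_device_train_batch_size
epochs = 4              # num_train_epochs

steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs
print(steps_per_epoch, total_steps)      # 4986 19944

# Epoch fraction reached after the first 100 optimizer steps.
print(round(100 / steps_per_epoch, 4))   # 0.0201
```

These figures match the log rows at epoch 1.0 (step 4986) and epoch 0.0201 (step 100).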

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 6
  • per_device_eval_batch_size: 6
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 4
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin

Training Logs

Epoch Step Training Loss cosine_ndcg@10
0.0201 100 - 0.6629
0.0401 200 - 0.7746
0.0602 300 - 0.8233
0.0802 400 - 0.8515
0.1003 500 0.4694 0.8621
0.1203 600 - 0.8680
0.1404 700 - 0.8733
0.1604 800 - 0.8774
0.1805 900 - 0.8757
0.2006 1000 0.1568 0.8795
0.2206 1100 - 0.8808
0.2407 1200 - 0.8789
0.2607 1300 - 0.8796
0.2808 1400 - 0.8822
0.3008 1500 0.1015 0.8821
0.3209 1600 - 0.8814
0.3410 1700 - 0.8756
0.3610 1800 - 0.8822
0.3811 1900 - 0.8848
0.4011 2000 0.0836 0.8843
0.4212 2100 - 0.8841
0.4412 2200 - 0.8803
0.4613 2300 - 0.8851
0.4813 2400 - 0.8818
0.5014 2500 0.0865 0.8849
0.5215 2600 - 0.8877
0.5415 2700 - 0.8806
0.5616 2800 - 0.8832
0.5816 2900 - 0.8930
0.6017 3000 0.0842 0.8928
0.6217 3100 - 0.8882
0.6418 3200 - 0.8858
0.6619 3300 - 0.8863
0.6819 3400 - 0.8828
0.7020 3500 0.0669 0.8839
0.7220 3600 - 0.8835
0.7421 3700 - 0.8854
0.7621 3800 - 0.8839
0.7822 3900 - 0.8882
0.8022 4000 0.0695 0.8871
0.8223 4100 - 0.8854
0.8424 4200 - 0.8822
0.8624 4300 - 0.8847
0.8825 4400 - 0.8863
0.9025 4500 0.0575 0.8819
0.9226 4600 - 0.8815
0.9426 4700 - 0.8836
0.9627 4800 - 0.8862
0.9828 4900 - 0.8889
1.0 4986 - 0.8927
1.0028 5000 0.0712 0.8935
1.0229 5100 - 0.8890
1.0429 5200 - 0.8919
1.0630 5300 - 0.8949
1.0830 5400 - 0.8950
1.1031 5500 0.0485 0.8934
1.1231 5600 - 0.8964
1.1432 5700 - 0.8953
1.1633 5800 - 0.8942
1.1833 5900 - 0.8929
1.2034 6000 0.0465 0.8912
1.2234 6100 - 0.8890
1.2435 6200 - 0.8914
1.2635 6300 - 0.8847
1.2836 6400 - 0.8873
1.3037 6500 0.0324 0.8912
1.3237 6600 - 0.8956
1.3438 6700 - 0.8954
1.3638 6800 - 0.8946
1.3839 6900 - 0.8931
1.4039 7000 0.0205 0.8951
1.4240 7100 - 0.8967
1.4440 7200 - 0.8960
1.4641 7300 - 0.8943
1.4842 7400 - 0.9003
1.5042 7500 0.0489 0.8946
1.5243 7600 - 0.8986
1.5443 7700 - 0.8945
1.5644 7800 - 0.8960
1.5844 7900 - 0.8987
1.6045 8000 0.039 0.8991
1.6245 8100 - 0.8959
1.6446 8200 - 0.8948
1.6647 8300 - 0.8933
1.6847 8400 - 0.8926
1.7048 8500 0.0297 0.8937
1.7248 8600 - 0.8974
1.7449 8700 - 0.8977
1.7649 8800 - 0.8973
1.7850 8900 - 0.8989
1.8051 9000 0.0248 0.8974
1.8251 9100 - 0.8980
1.8452 9200 - 0.8970
1.8652 9300 - 0.8997
1.8853 9400 - 0.9007
1.9053 9500 0.0534 0.9009
1.9254 9600 - 0.9015
1.9454 9700 - 0.9014
1.9655 9800 - 0.9008
1.9856 9900 - 0.9024
2.0 9972 - 0.9052
2.0056 10000 0.0295 0.9041
2.0257 10100 - 0.9009
2.0457 10200 - 0.9030
2.0658 10300 - 0.9028
2.0858 10400 - 0.9051
2.1059 10500 0.027 0.9063
2.1260 10600 - 0.9059
2.1460 10700 - 0.9044
2.1661 10800 - 0.9024
2.1861 10900 - 0.9005
2.2062 11000 0.0201 0.8996
2.2262 11100 - 0.9037
2.2463 11200 - 0.9029
2.2663 11300 - 0.9047
2.2864 11400 - 0.9030
2.3065 11500 0.0097 0.9041
2.3265 11600 - 0.9011
2.3466 11700 - 0.9000
2.3666 11800 - 0.8972
2.3867 11900 - 0.8985
2.4067 12000 0.0165 0.8979
2.4268 12100 - 0.8996
2.4469 12200 - 0.9026
2.4669 12300 - 0.9034
2.4870 12400 - 0.9054
2.5070 12500 0.0165 0.9029
2.5271 12600 - 0.9052
2.5471 12700 - 0.9057
2.5672 12800 - 0.9059
2.5872 12900 - 0.9092
2.6073 13000 0.0144 0.9081
2.6274 13100 - 0.9095
2.6474 13200 - 0.9102
2.6675 13300 - 0.9113
2.6875 13400 - 0.9103
2.7076 13500 0.0159 0.9105
2.7276 13600 - 0.9073
2.7477 13700 - 0.9084
2.7677 13800 - 0.9080
2.7878 13900 - 0.9083
2.8079 14000 0.0183 0.9083
2.8279 14100 - 0.9070
2.8480 14200 - 0.9085
2.8680 14300 - 0.9078
2.8881 14400 - 0.9075
2.9081 14500 0.0257 0.9073
2.9282 14600 - 0.9098
2.9483 14700 - 0.9089
2.9683 14800 - 0.9097
2.9884 14900 - 0.9079
3.0 14958 - 0.9081
3.0084 15000 0.0144 0.9084
3.0285 15100 - 0.9083
3.0485 15200 - 0.9078
3.0686 15300 - 0.9079
3.0886 15400 - 0.9089
3.1087 15500 0.0082 0.9093
3.1288 15600 - 0.9098
3.1488 15700 - 0.9106
3.1689 15800 - 0.9103
3.1889 15900 - 0.9110
3.2090 16000 0.0185 0.9117
3.2290 16100 - 0.9116
3.2491 16200 - 0.9125
3.2692 16300 - 0.9111
3.2892 16400 - 0.9109
3.3093 16500 0.0105 0.9125
3.3293 16600 - 0.9117
3.3494 16700 - 0.9118
3.3694 16800 - 0.9117
3.3895 16900 - 0.9137
3.4095 17000 0.019 0.9134
3.4296 17100 - 0.9129
3.4497 17200 - 0.9126
3.4697 17300 - 0.9133
3.4898 17400 - 0.9136
3.5098 17500 0.0109 0.9120
3.5299 17600 - 0.9124
3.5499 17700 - 0.9122
3.5700 17800 - 0.9129
3.5901 17900 - 0.9132
3.6101 18000 0.0207 0.9139
3.6302 18100 - 0.9134
3.6502 18200 - 0.9135
3.6703 18300 - 0.9139
3.6903 18400 - 0.9141
3.7104 18500 0.0105 0.9139
3.7304 18600 - 0.9138
3.7505 18700 - 0.9136
3.7706 18800 - 0.9141

Framework Versions

  • Python: 3.10.11
  • Sentence Transformers: 3.4.1
  • Transformers: 4.48.1
  • PyTorch: 2.4.0+cu121
  • Accelerate: 1.4.0
  • Datasets: 3.3.2
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

Model Repository

  • Model ID: fjavigv/snoweu_v2
  • Model size: 109M params
  • Tensor type: F32 (Safetensors)