SentenceTransformer based on cointegrated/LaBSE-en-ru

This is a sentence-transformers model fine-tuned from cointegrated/LaBSE-en-ru. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: cointegrated/LaBSE-en-ru
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Dense({'in_features': 768, 'out_features': 768, 'bias': True, 'activation_function': 'torch.nn.modules.activation.Tanh'})
  (3): Normalize()
)
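
The module stack above can be read as three steps applied to the BertModel output: take the first ([CLS]) token vector, pass it through a 768-to-768 dense layer with a tanh activation, and L2-normalize the result. The following is a minimal NumPy sketch of that flow, not the library's implementation. The names `pool_dense_normalize`, `token_embeddings`, `W`, and `b` are hypothetical, and the weights are random stand-ins for the ones stored in the checkpoint.

```python
import numpy as np

rng = np.random.default_rng(0)

def pool_dense_normalize(token_embeddings, W, b):
    """CLS pooling -> Dense(tanh) -> L2 normalization, mirroring modules (1)-(3)."""
    cls = token_embeddings[:, 0, :]      # (batch, hidden): first token of each sequence
    dense = np.tanh(cls @ W.T + b)       # dense layer with in_features == out_features
    norms = np.linalg.norm(dense, axis=1, keepdims=True)
    return dense / norms                 # unit-length sentence embeddings

# Random stand-in for the transformer output: 2 sequences, 5 tokens, hidden size 768
token_embeddings = rng.normal(size=(2, 5, 768))
W = rng.normal(scale=0.02, size=(768, 768))  # placeholder weights (assumption)
b = np.zeros(768)                            # placeholder bias (assumption)

sentence_embeddings = pool_dense_normalize(token_embeddings, W, b)
print(sentence_embeddings.shape)
```

Because the final step produces unit-length vectors, the cosine similarity between two embeddings equals their dot product.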

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Solomennikova/labse_funetuned_hoff_40_epochs")
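# Run inference: the first entry is a search query, the other two are product cards serialized as JSON
sentences = [
    'качели для дачи',  # query, roughly "garden swing"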
    '{"product_name": "Детский игровой комплекс Капризун", "Бренд": "NATIONAL TREE COMPANY", "Цвет": "белый, бирюзовый", "Материал": "массив сосны, металл, пластмасса", "description": "Детский игровой комплекс-кровать Капризун сделан из натурального дерева и рассчитан на детей в возрасте от 3 лет. В конструкции предусмотрены два спальных места, множество игровых элементов и спортивных снарядов. Игры с комплексом развивают воображение, улучшают координацию движений и ловкость, укрепляют мышцы.\\n Особенности:\\n • сделан из экологически чистого материала;\\n • поверхность дерева гладко отшлифована и покрыта краской на водной основе;\\n • текстиль и матрас в комплект не входят.", "Производитель": "Россия"}',
    '{"product_name": "Поддон универсальный MELODIA DELLA VITA Round MTYRD8080Bk 80х16 см", "Бренд": "MELODIA DELLA VITA", "Цвет": "чёрный", "Материал": "акрил", "description": "", "Производитель": "Россия"}',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
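
Since the training pairs match short queries to product cards, a typical use is ranking products for a query. The sketch below shows one way to do that with plain NumPy. The function name `rank_documents` is hypothetical, and the 3-dimensional toy vectors stand in for real `model.encode(...)` output so the example runs without downloading the model.

```python
import numpy as np

def rank_documents(query_embedding, doc_embeddings):
    """Return document indices sorted by cosine similarity to the query, best first."""
    q = query_embedding / np.linalg.norm(query_embedding)
    d = doc_embeddings / np.linalg.norm(doc_embeddings, axis=1, keepdims=True)
    scores = d @ q                      # cosine similarity of each document to the query
    order = np.argsort(-scores)         # descending by score
    return order, scores[order]

# Toy vectors standing in for model.encode(...) output (assumption)
query = np.array([1.0, 0.0, 0.0])
docs = np.array([
    [0.0, 1.0, 0.0],   # unrelated product
    [0.9, 0.1, 0.0],   # close match
    [0.5, 0.5, 0.0],   # partial match
])

order, scores = rank_documents(query, docs)
print(order.tolist())
# [1, 2, 0]
```

With the real model, the same function could be called as `rank_documents(embeddings[0], embeddings[1:])`, treating the first sentence as the query and the rest as candidate products.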

Training Details

Training Dataset

Unnamed Dataset

  • Size: 86,732 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 1000 samples:
    • sentence_0 (type: string): min 3 tokens, mean 5.25 tokens, max 18 tokens
    • sentence_1 (type: string): min 52 tokens, mean 125.86 tokens, max 512 tokens
  • Samples:
    • sentence_0: комод
      sentence_1: {"product_name": "Комплект стульев 305 54х75х54 см", "Бренд": null, "Цвет": "Коричневый", "Материал": null, "description": "", "Производитель": "Россия"}
    • sentence_0: freya
      sentence_1: {"product_name": "Светильник подвесной FREYA Modern Blossom 12.5 кв.м., 31х170х31 см, G9", "Бренд": "FREYA", "Цвет": "Белый,Золотой", "Материал": null, "description": "", "Производитель": "Китай"}
    • sentence_0: комод
      sentence_1: {"product_name": "Комод Деко", "Бренд": null, "Цвет": "Белый", "Материал": null, "description": "Комод Деко создан для тех, кто требует от мебели и функциональности, и элегантности. В конструкции модели предусмотрены выдвижные ящики различного размера и отделение с полками за распашной дверцей. В этом комоде найдётся место для самых разнообразных вещей: например, в трёх нижних ящиках будет удобно хранить домашний текстиль, одежду, коробки с обувью, в верхнем — косметику. Крышка, покрытая стеклом, идеальна как для размещения стильных интерьерных аксессуаров, так и для установки телевизионной панели. Модель изготовлена в минималистичном стиле, изысканную изюминку придаёт сочетание глянцевого фасада и сверкающей стеклянной поверхности.", "Производитель": "Россия"}
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
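
To make the loss parameters concrete, here is a small NumPy sketch of how a loss of this kind is commonly described: each sentence_0 is scored against every sentence_1 in the batch by cosine similarity, the scores are multiplied by `scale`, and cross-entropy is applied with the matching pair as the correct class. This is an illustration under that assumption, not the library's code, and the function name is hypothetical.

```python
import numpy as np

def multiple_negatives_ranking_loss(anchors, positives, scale=20.0):
    """Cross-entropy over scaled cosine similarities; row i's target is column i."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = scale * (a @ p.T)                            # (batch, batch) similarity matrix
    logits = logits - logits.max(axis=1, keepdims=True)   # shift for numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))                   # matching pairs sit on the diagonal

# Perfectly aligned pairs give a near-zero loss; mismatched pairs give a high one
eye = np.eye(4)
aligned = multiple_negatives_ranking_loss(eye, eye)
shuffled = multiple_negatives_ranking_loss(eye, np.roll(eye, 1, axis=0))
print(aligned < shuffled)
# True
```

Under this reading, a batch of 32 pairs gives each query 31 in-batch negatives, and a model that scored all candidates equally would sit near ln(32) ≈ 3.47.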
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • num_train_epochs: 40
  • multi_dataset_batch_sampler: round_robin

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 40
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
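
The combination of `lr_scheduler_type: linear`, `warmup_steps: 0`, and `learning_rate: 5e-05` implies a learning rate that starts at its peak and decays linearly to zero over training. A minimal sketch of that schedule follows. The function name `linear_lr` is hypothetical, and the total step count is an assumption derived from the dataset size, batch size, and epoch count on a single device.

```python
def linear_lr(step, total_steps, base_lr=5e-05, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Assumption: ceil(86,732 / 32) steps per epoch * 40 epochs on one device
TOTAL_STEPS = 108_440

for step in (0, 54_220, 108_440):
    print(step, linear_lr(step, TOTAL_STEPS))
```

The three printed values correspond to the start, midpoint, and end of training: the full rate, half of it, and zero.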

Training Logs

Epoch Step Training Loss
0.1844 500 3.1542
0.3689 1000 2.8525
0.5533 1500 2.7196
0.7377 2000 2.623
0.9222 2500 2.6181
1.1066 3000 2.5299
1.2910 3500 2.4974
1.4755 4000 2.4634
1.6599 4500 2.4221
1.8443 5000 2.4188
2.0288 5500 2.3779
2.2132 6000 2.3458
2.3976 6500 2.2998
2.5821 7000 2.3419
2.7665 7500 2.314
2.9509 8000 2.3115
3.1354 8500 2.2327
3.3198 9000 2.2278
3.5042 9500 2.2319
3.6887 10000 2.2344
3.8731 10500 2.2274
4.0575 11000 2.1902
4.2420 11500 2.1161
4.4264 12000 2.1232
4.6108 12500 2.1025
4.7953 13000 2.1322
4.9797 13500 2.1355
5.1641 14000 2.0072
5.3486 14500 1.9984
5.5330 15000 2.0017
5.7174 15500 2.0018
5.9019 16000 2.023
6.0863 16500 1.948
6.2707 17000 1.8868
6.4552 17500 1.8973
6.6396 18000 1.8953
6.8241 18500 1.9176
7.0085 19000 1.8969
7.1929 19500 1.7614
7.3774 20000 1.8054
7.5618 20500 1.7984
7.7462 21000 1.8033
7.9307 21500 1.7945
8.1151 22000 1.7153
8.2995 22500 1.6833
8.4840 23000 1.7055
8.6684 23500 1.7067
8.8528 24000 1.7123
9.0373 24500 1.6876
9.2217 25000 1.5714
9.4061 25500 1.5801
9.5906 26000 1.6204
9.7750 26500 1.6273
9.9594 27000 1.6214
10.1439 27500 1.5054
10.3283 28000 1.5077
10.5127 28500 1.5251
10.6972 29000 1.5242
10.8816 29500 1.55
11.0660 30000 1.4983
11.2505 30500 1.4049
11.4349 31000 1.42
11.6193 31500 1.4335
11.8038 32000 1.4651
11.9882 32500 1.4767
12.1726 33000 1.3289
12.3571 33500 1.3423
12.5415 34000 1.3575
12.7259 34500 1.3881
12.9104 35000 1.3993
13.0948 35500 1.3113
13.2792 36000 1.2785
13.4637 36500 1.2948
13.6481 37000 1.3153
13.8325 37500 1.3315
14.0170 38000 1.3091
14.2014 38500 1.1891
14.3858 39000 1.2345
14.5703 39500 1.2325
14.7547 40000 1.2673
14.9391 40500 1.2739
15.1236 41000 1.1863
15.3080 41500 1.1756
15.4924 42000 1.1876
15.6769 42500 1.1958
15.8613 43000 1.1924
16.0457 43500 1.1628
16.2302 44000 1.1002
16.4146 44500 1.1179
16.5990 45000 1.1354
16.7835 45500 1.1722
16.9679 46000 1.1719
17.1523 46500 1.0824
17.3368 47000 1.0641
17.5212 47500 1.089
17.7056 48000 1.1128
17.8901 48500 1.0993
18.0745 49000 1.0653
18.2589 49500 1.0198
18.4434 50000 1.0576
18.6278 50500 1.072
18.8122 51000 1.0679
18.9967 51500 1.0758
19.1811 52000 0.9829
19.3655 52500 0.9923
19.5500 53000 1.0242
19.7344 53500 1.0281
19.9188 54000 1.0313
20.1033 54500 0.9858
20.2877 55000 0.97
20.4722 55500 0.9693
20.6566 56000 0.9955
20.8410 56500 0.9999
21.0255 57000 0.9898
21.2099 57500 0.9394
21.3943 58000 0.9383
21.5788 58500 0.9549
21.7632 59000 0.9501
21.9476 59500 0.9594
22.1321 60000 0.902
22.3165 60500 0.9162
22.5009 61000 0.9234
22.6854 61500 0.9385
22.8698 62000 0.9353
23.0542 62500 0.9291
23.2387 63000 0.8861
23.4231 63500 0.8928
23.6075 64000 0.9109
23.7920 64500 0.9189
23.9764 65000 0.8977
24.1608 65500 0.8676
24.3453 66000 0.8629
24.5297 66500 0.8845
24.7141 67000 0.8841
24.8986 67500 0.8827
25.0830 68000 0.8837
25.2674 68500 0.848
25.4519 69000 0.8475
25.6363 69500 0.8597
25.8207 70000 0.8751
26.0052 70500 0.8536
26.1896 71000 0.8133
26.3740 71500 0.8165
26.5585 72000 0.8371
26.7429 72500 0.8712
26.9273 73000 0.8397
27.1118 73500 0.8258
27.2962 74000 0.7895
27.4806 74500 0.8153
27.6651 75000 0.8106
27.8495 75500 0.8235
28.0339 76000 0.8348
28.2184 76500 0.7915
28.4028 77000 0.797
28.5872 77500 0.7934
28.7717 78000 0.7992
28.9561 78500 0.8105
29.1405 79000 0.7642
29.3250 79500 0.7824
29.5094 80000 0.783
29.6938 80500 0.7938
29.8783 81000 0.804
30.0627 81500 0.7783
30.2471 82000 0.7529
30.4316 82500 0.7587
30.6160 83000 0.775
30.8004 83500 0.7784
30.9849 84000 0.7864
31.1693 84500 0.7371
31.3537 85000 0.7563
31.5382 85500 0.7408
31.7226 86000 0.773
31.9070 86500 0.7777
32.0915 87000 0.7466
32.2759 87500 0.7413
32.4603 88000 0.7524
32.6448 88500 0.733
32.8292 89000 0.7512
33.0136 89500 0.7538
33.1981 90000 0.7174
33.3825 90500 0.7342
33.5669 91000 0.7357
33.7514 91500 0.7309
33.9358 92000 0.7359
34.1203 92500 0.7276
34.3047 93000 0.7165
34.4891 93500 0.7081
34.6736 94000 0.73
34.8580 94500 0.7364
35.0424 95000 0.7275
35.2269 95500 0.7132
35.4113 96000 0.694
35.5957 96500 0.7029
35.7802 97000 0.709
35.9646 97500 0.732
36.1490 98000 0.7107
36.3335 98500 0.7068
36.5179 99000 0.6942
36.7023 99500 0.7128
36.8868 100000 0.7043
37.0712 100500 0.6988
37.2556 101000 0.6948
37.4401 101500 0.7133
37.6245 102000 0.6913
37.8089 102500 0.6991
37.9934 103000 0.6983
38.1778 103500 0.6929
38.3622 104000 0.6825
38.5467 104500 0.6789
38.7311 105000 0.6948
38.9155 105500 0.6807
39.1000 106000 0.6978
39.2844 106500 0.6832
39.4688 107000 0.673
39.6533 107500 0.6867
39.8377 108000 0.6946
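
The Epoch column in the log can be reproduced from the dataset size and batch size reported earlier. The short calculation below assumes a single device (so the global batch equals `per_device_train_batch_size`), with `gradient_accumulation_steps: 1` and `dataloader_drop_last: False` as listed in the hyperparameters.

```python
import math

TRAIN_SAMPLES = 86_732   # from the Training Dataset section
BATCH_SIZE = 32          # per_device_train_batch_size
EPOCHS = 40              # num_train_epochs

# Assumption: one device, so the global batch size equals the per-device one
steps_per_epoch = math.ceil(TRAIN_SAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS

def epoch_at(step):
    """Fractional epoch reached after `step` optimizer steps."""
    return step / steps_per_epoch

print(steps_per_epoch, total_steps)
# 2711 108440
print(round(epoch_at(500), 4), round(epoch_at(108_000), 4))
# 0.1844 39.8377
```

Both values match the first and last rows of the log. Under these assumptions the run ends at step 108,440, so the last entry at step 108,000 is simply the final 500-step logging point before training finished.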

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 4.0.1
  • Transformers: 4.50.1
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.5.2
  • Datasets: 3.4.1
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

Model size: 128M params (Safetensors, tensor type F32)