SentenceTransformer based on Lajavaness/bilingual-embedding-base

This is a sentence-transformers model finetuned from Lajavaness/bilingual-embedding-base. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: Lajavaness/bilingual-embedding-base
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'BilingualModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
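
In this stack, the transformer produces one contextual embedding per token, the Pooling module averages them over non-padding tokens (pooling_mode_mean_tokens), and Normalize rescales the result to unit length, so dot products between embeddings equal their cosine similarity. The snippet below is a minimal sketch of that pooling and normalization step on dummy tensors; it illustrates the configuration above and is not the model's internal code.

import torch
import torch.nn.functional as F

# Dummy token embeddings and attention mask, just to illustrate mean pooling + normalization.
token_embeddings = torch.randn(2, 10, 768)             # (batch, seq_len, hidden)
attention_mask = torch.ones(2, 10, dtype=torch.long)   # 1 = real token, 0 = padding

mask = attention_mask.unsqueeze(-1).float()             # (batch, seq_len, 1)
summed = (token_embeddings * mask).sum(dim=1)           # sum over non-padding tokens
counts = mask.sum(dim=1).clamp(min=1e-9)                # number of non-padding tokens
sentence_embeddings = F.normalize(summed / counts, p=2, dim=1)  # mean pooling, then L2 normalization
print(sentence_embeddings.shape)                        # torch.Size([2, 768]), unit-length rows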

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
# trust_remote_code is needed because the base model uses a custom BilingualModel architecture
model = SentenceTransformer("BjarneNPO/BjarneNPO-01_09_2025_15_47_05", trust_remote_code=True)
# Run inference
queries = [
    "Ein Vater taucht nicht auf bei den Eltern im Elternbeirat \r\n\r\nAu\u00dferdem auf die Kinder mit archivierten Angeh\u00f6rigen hingewiesen und ihr gezeigt",
]
documents = [
    'Weil er keinen Zugang zur EAPP hat, Außerdem auf die Kinder mit archivierten Angehörigen hingewiesen und ihr gezeigt wie sie das lösen kann',
    '1. Vorlage da. Userin auch gezeigt wie sie die verwanden kann\r\n2. Als Wunsch weitergegeben.',
    'In der Kinderliste haben Kinder gefehlt. Userin muss die Daten in der Kinderliste hinterlegen.',
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# [1, 768] [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[0.6142, 0.1926, 0.0079]])
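
Because the returned embeddings are already unit-normalized, they can be fed directly into retrieval utilities. A short follow-up sketch that ranks the documents from the example above for the query (top_k=2 is an arbitrary choice):

from sentence_transformers import util

# Reuses query_embeddings, document_embeddings and documents from the example above.
hits = util.semantic_search(query_embeddings, document_embeddings, top_k=2)
for hit in hits[0]:
    print(round(hit["score"], 4), documents[hit["corpus_id"]])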

Evaluation

Metrics

Information Retrieval

  • Dataset: Lajavaness/bilingual-embedding-base
  • Evaluated with scripts.InformationRetrievalEvaluatorCustom.InformationRetrievalEvaluatorCustom with these parameters:
    {
        "query_prompt_name": "query",
        "corpus_prompt_name": "query"
    }
    
Metric Value
cosine_accuracy@1 0.1159
cosine_accuracy@3 0.6667
cosine_accuracy@5 0.7246
cosine_accuracy@10 0.8406
cosine_precision@1 0.1159
cosine_precision@3 0.3333
cosine_precision@5 0.3188
cosine_precision@10 0.213
cosine_recall@1 0.0152
cosine_recall@3 0.0791
cosine_recall@5 0.1184
cosine_recall@10 0.1622
cosine_ndcg@10 0.2458
cosine_mrr@10 0.3924
cosine_map@100 0.1543
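
InformationRetrievalEvaluatorCustom is a project-specific script and not part of the Sentence Transformers library. A comparable evaluation can be run with the library's standard InformationRetrievalEvaluator; the sketch below uses placeholder queries, corpus entries, and relevance judgments rather than the actual evaluation data.

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("BjarneNPO/BjarneNPO-01_09_2025_15_47_05", trust_remote_code=True)

# Placeholder evaluation data: ids and texts are illustrative only.
queries = {"q1": "Ein Vater taucht nicht auf bei den Eltern im Elternbeirat"}
corpus = {
    "d1": "Weil er keinen Zugang zur EAPP hat ...",
    "d2": "In der Kinderliste haben Kinder gefehlt.",
}
relevant_docs = {"q1": {"d1"}}

evaluator = InformationRetrievalEvaluator(
    queries=queries,
    corpus=corpus,
    relevant_docs=relevant_docs,
    name="Lajavaness/bilingual-embedding-base",
    query_prompt_name="query",   # prompt names as listed in the parameters above
    corpus_prompt_name="query",
)
results = evaluator(model)
print(results)  # accuracy@k, precision@k, recall@k, ndcg@10, mrr@10, map@100, ...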

Training Details

Training Dataset

Unnamed Dataset

  • Size: 72,349 training samples
  • Columns: query and answer
  • Approximate statistics based on the first 1000 samples:
    • query: string; min: 6 tokens, mean: 39.05 tokens, max: 512 tokens
    • answer: string; min: 6 tokens, mean: 28.66 tokens, max: 238 tokens
  • Samples:
    • query: Nun ist die Monatsmeldung erfolgt, aber rote Ausrufezeichen tauchen auf.
      answer: Userin an das JA verwiesen, diese müssten ihr die Schloss-Monate zur Überarbeitung im Kibiz.web zurückgeben. Userin dazu empfohlen, die Kinder die nicht in kitaplus sind, aber in Kibiz.web - im KiBiz.web zu entfernen, wenn diese nicht vorhanden sind.
    • query: Die Feiertage in den Stammdaten stimmen nicht.
      answer: Es besteht bereits ein Ticket dafür.
    • query: Abrechnung kann nicht final freigegeben werden, es wird aber keiner Fehlermeldung angeziegt
      answer: im Hintergrund ist eine Fehlermeldung zu sehen. An Entwickler weitergeleitet. Korrektur vorgenommen.
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
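
    MultipleNegativesRankingLoss treats each query's paired answer as its positive and every other answer in the same batch as a negative: the scaled cosine similarities act as logits for a cross-entropy loss over the batch (which is also why the no_duplicates batch sampler below is used, so a batch never contains two identical answers). A conceptual sketch with dummy, already-normalized embeddings, not the library's actual implementation:

import torch
import torch.nn.functional as F

# Dummy (already normalized) embeddings for a batch of 16 (query, answer) pairs.
query_emb = F.normalize(torch.randn(16, 768), dim=1)
answer_emb = F.normalize(torch.randn(16, 768), dim=1)

scale = 20.0                                   # "scale" parameter above
scores = scale * query_emb @ answer_emb.T      # scaled cosine similarities ("cos_sim")
labels = torch.arange(scores.size(0))          # pair i's own answer is the positive
loss = F.cross_entropy(scores, labels)         # every other in-batch answer is a negative
print(loss.item())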
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • gradient_accumulation_steps: 4
  • learning_rate: 4e-05
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.08
  • bf16: True
  • tf32: True
  • load_best_model_at_end: True
  • batch_sampler: no_duplicates
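
A comparable fine-tuning run can be set up with the SentenceTransformerTrainer. The sketch below plugs the non-default hyperparameters above into the current training API; the dataset contents, output directory, and save_strategy are placeholders or assumptions, not taken from the original run.

from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer, losses
from sentence_transformers.training_args import SentenceTransformerTrainingArguments, BatchSamplers

model = SentenceTransformer("Lajavaness/bilingual-embedding-base", trust_remote_code=True)

# Placeholder (query, answer) pairs; the real dataset has 72,349 such rows.
train_dataset = Dataset.from_dict({
    "query": ["Ein Vater taucht nicht auf bei den Eltern im Elternbeirat"],
    "answer": ["Weil er keinen Zugang zur EAPP hat ..."],
})
eval_dataset = train_dataset  # placeholder; use a held-out split in practice

loss = losses.MultipleNegativesRankingLoss(model, scale=20.0)  # cos_sim is the default similarity_fct

args = SentenceTransformerTrainingArguments(
    output_dir="output",                        # placeholder
    num_train_epochs=3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=4,
    learning_rate=4e-5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.08,
    bf16=True,                                  # assumes a GPU with bfloat16 support
    tf32=True,                                  # assumes an Ampere or newer NVIDIA GPU
    eval_strategy="epoch",
    save_strategy="epoch",                      # assumption: needed so load_best_model_at_end can restore a checkpoint
    load_best_model_at_end=True,
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    loss=loss,
)
trainer.train()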

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 4
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 4e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.08
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: True
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss Lajavaness/bilingual-embedding-base_cosine_ndcg@10
0.0088 10 1.8616 -
0.0177 20 1.7114 -
0.0265 30 1.6738 -
0.0354 40 1.4837 -
0.0442 50 1.3178 -
0.0531 60 1.371 -
0.0619 70 1.3043 -
0.0708 80 1.2882 -
0.0796 90 1.2437 -
0.0885 100 1.119 -
0.0973 110 1.0922 -
0.1061 120 1.0818 -
0.1150 130 1.1949 -
0.1238 140 1.1136 -
0.1327 150 1.0514 -
0.1415 160 1.0428 -
0.1504 170 0.9703 -
0.1592 180 0.943 -
0.1681 190 0.9617 -
0.1769 200 0.9115 -
0.1858 210 0.9085 -
0.1946 220 0.965 -
0.2034 230 0.8556 -
0.2123 240 0.882 -
0.2211 250 0.8998 -
0.2300 260 0.7686 -
0.2388 270 0.7966 -
0.2477 280 0.8272 -
0.2565 290 0.7621 -
0.2654 300 0.8954 -
0.2742 310 0.8417 -
0.2831 320 0.7558 -
0.2919 330 0.7681 -
0.3008 340 0.7095 -
0.3096 350 0.8214 -
0.3184 360 0.7662 -
0.3273 370 0.7945 -
0.3361 380 0.7047 -
0.3450 390 0.7196 -
0.3538 400 0.7414 -
0.3627 410 0.8297 -
0.3715 420 0.6885 -
0.3804 430 0.7414 -
0.3892 440 0.6659 -
0.3981 450 0.6698 -
0.4069 460 0.797 -
0.4157 470 0.6491 -
0.4246 480 0.6641 -
0.4334 490 0.6027 -
0.4423 500 0.5888 -
0.4511 510 0.6977 -
0.4600 520 0.654 -
0.4688 530 0.6339 -
0.4777 540 0.6638 -
0.4865 550 0.6627 -
0.4954 560 0.646 -
0.5042 570 0.5376 -
0.5130 580 0.6076 -
0.5219 590 0.5645 -
0.5307 600 0.5923 -
0.5396 610 0.6137 -
0.5484 620 0.5811 -
0.5573 630 0.5592 -
0.5661 640 0.5504 -
0.5750 650 0.5623 -
0.5838 660 0.6452 -
0.5927 670 0.6171 -
0.6015 680 0.6278 -
0.6103 690 0.5743 -
0.6192 700 0.5452 -
0.6280 710 0.4978 -
0.6369 720 0.578 -
0.6457 730 0.6289 -
0.6546 740 0.566 -
0.6634 750 0.5309 -
0.6723 760 0.5971 -
0.6811 770 0.5543 -
0.6900 780 0.5014 -
0.6988 790 0.5802 -
0.7077 800 0.505 -
0.7165 810 0.5157 -
0.7253 820 0.5305 -
0.7342 830 0.5251 -
0.7430 840 0.5914 -
0.7519 850 0.4978 -
0.7607 860 0.564 -
0.7696 870 0.6057 -
0.7784 880 0.5818 -
0.7873 890 0.5446 -
0.7961 900 0.4906 -
0.8050 910 0.5329 -
0.8138 920 0.5824 -
0.8226 930 0.4795 -
0.8315 940 0.5256 -
0.8403 950 0.5095 -
0.8492 960 0.537 -
0.8580 970 0.5647 -
0.8669 980 0.4999 -
0.8757 990 0.5893 -
0.8846 1000 0.545 -
0.8934 1010 0.4825 -
0.9023 1020 0.4818 -
0.9111 1030 0.4534 -
0.9199 1040 0.4446 -
0.9288 1050 0.4458 -
0.9376 1060 0.4853 -
0.9465 1070 0.5503 -
0.9553 1080 0.5062 -
0.9642 1090 0.4939 -
0.9730 1100 0.5046 -
0.9819 1110 0.4483 -
0.9907 1120 0.3975 -
0.9996 1130 0.4642 -
1.0 1131 - 0.3157
1.0080 1140 0.4285 -
1.0168 1150 0.4174 -
1.0257 1160 0.3378 -
1.0345 1170 0.3701 -
1.0433 1180 0.3748 -
1.0522 1190 0.4158 -
1.0610 1200 0.4101 -
1.0699 1210 0.3781 -
1.0787 1220 0.3894 -
1.0876 1230 0.376 -
1.0964 1240 0.3801 -
1.1053 1250 0.3979 -
1.1141 1260 0.3966 -
1.1230 1270 0.3756 -
1.1318 1280 0.4148 -
1.1406 1290 0.3799 -
1.1495 1300 0.3587 -
1.1583 1310 0.3533 -
1.1672 1320 0.431 -
1.1760 1330 0.3523 -
1.1849 1340 0.3849 -
1.1937 1350 0.3786 -
1.2026 1360 0.4214 -
1.2114 1370 0.3716 -
1.2203 1380 0.3705 -
1.2291 1390 0.3937 -
1.2379 1400 0.3357 -
1.2468 1410 0.3813 -
1.2556 1420 0.3437 -
1.2645 1430 0.3964 -
1.2733 1440 0.3486 -
1.2822 1450 0.3466 -
1.2910 1460 0.4585 -
1.2999 1470 0.4045 -
1.3087 1480 0.3246 -
1.3176 1490 0.3596 -
1.3264 1500 0.463 -
1.3352 1510 0.3828 -
1.3441 1520 0.4033 -
1.3529 1530 0.3536 -
1.3618 1540 0.3519 -
1.3706 1550 0.3802 -
1.3795 1560 0.341 -
1.3883 1570 0.403 -
1.3972 1580 0.356 -
1.4060 1590 0.387 -
1.4149 1600 0.2879 -
1.4237 1610 0.3129 -
1.4326 1620 0.3645 -
1.4414 1630 0.3047 -
1.4502 1640 0.3532 -
1.4591 1650 0.3941 -
1.4679 1660 0.3864 -
1.4768 1670 0.3459 -
1.4856 1680 0.3508 -
1.4945 1690 0.4104 -
1.5033 1700 0.3375 -
1.5122 1710 0.3382 -
1.5210 1720 0.3999 -
1.5299 1730 0.3569 -
1.5387 1740 0.3038 -
1.5475 1750 0.4384 -
1.5564 1760 0.3983 -
1.5652 1770 0.2834 -
1.5741 1780 0.3116 -
1.5829 1790 0.3986 -
1.5918 1800 0.3071 -
1.6006 1810 0.3731 -
1.6095 1820 0.3758 -
1.6183 1830 0.3577 -
1.6272 1840 0.3512 -
1.6360 1850 0.3402 -
1.6448 1860 0.304 -
1.6537 1870 0.4238 -
1.6625 1880 0.3789 -
1.6714 1890 0.3876 -
1.6802 1900 0.3903 -
1.6891 1910 0.3227 -
1.6979 1920 0.3305 -
1.7068 1930 0.3499 -
1.7156 1940 0.3752 -
1.7245 1950 0.3484 -
1.7333 1960 0.3431 -
1.7421 1970 0.3493 -
1.7510 1980 0.3575 -
1.7598 1990 0.3271 -
1.7687 2000 0.3677 -
1.7775 2010 0.2797 -
1.7864 2020 0.3162 -
1.7952 2030 0.2937 -
1.8041 2040 0.385 -
1.8129 2050 0.3424 -
1.8218 2060 0.3946 -
1.8306 2070 0.3037 -
1.8395 2080 0.2947 -
1.8483 2090 0.3514 -
1.8571 2100 0.3068 -
1.8660 2110 0.3146 -
1.8748 2120 0.347 -
1.8837 2130 0.2636 -
1.8925 2140 0.3446 -
1.9014 2150 0.2878 -
1.9102 2160 0.3289 -
1.9191 2170 0.3331 -
1.9279 2180 0.2465 -
1.9368 2190 0.3153 -
1.9456 2200 0.288 -
1.9544 2210 0.3376 -
1.9633 2220 0.3161 -
1.9721 2230 0.3392 -
1.9810 2240 0.369 -
1.9898 2250 0.3523 -
1.9987 2260 0.3278 -
2.0 2262 - 0.2584
2.0071 2270 0.2417 -
2.0159 2280 0.2456 -
2.0248 2290 0.2598 -
2.0336 2300 0.2601 -
2.0425 2310 0.2264 -
2.0513 2320 0.2535 -
2.0602 2330 0.2115 -
2.0690 2340 0.2711 -
2.0778 2350 0.2276 -
2.0867 2360 0.2686 -
2.0955 2370 0.2395 -
2.1044 2380 0.2729 -
2.1132 2390 0.2992 -
2.1221 2400 0.2424 -
2.1309 2410 0.2666 -
2.1398 2420 0.2342 -
2.1486 2430 0.2476 -
2.1575 2440 0.2902 -
2.1663 2450 0.2151 -
2.1751 2460 0.2207 -
2.1840 2470 0.2382 -
2.1928 2480 0.2389 -
2.2017 2490 0.2233 -
2.2105 2500 0.251 -
2.2194 2510 0.2016 -
2.2282 2520 0.2424 -
2.2371 2530 0.282 -
2.2459 2540 0.2559 -
2.2548 2550 0.2756 -
2.2636 2560 0.2355 -
2.2724 2570 0.2513 -
2.2813 2580 0.2527 -
2.2901 2590 0.2063 -
2.2990 2600 0.2197 -
2.3078 2610 0.2401 -
2.3167 2620 0.2773 -
2.3255 2630 0.2237 -
2.3344 2640 0.2128 -
2.3432 2650 0.2226 -
2.3521 2660 0.2638 -
2.3609 2670 0.2707 -
2.3697 2680 0.2553 -
2.3786 2690 0.2217 -
2.3874 2700 0.2469 -
2.3963 2710 0.2152 -
2.4051 2720 0.2151 -
2.4140 2730 0.2327 -
2.4228 2740 0.2947 -
2.4317 2750 0.1757 -
2.4405 2760 0.2609 -
2.4494 2770 0.2221 -
2.4582 2780 0.2089 -
2.4670 2790 0.2426 -
2.4759 2800 0.2414 -
2.4847 2810 0.1975 -
2.4936 2820 0.2701 -
2.5024 2830 0.2581 -
2.5113 2840 0.2544 -
2.5201 2850 0.2889 -
2.5290 2860 0.2458 -
2.5378 2870 0.2306 -
2.5467 2880 0.2588 -
2.5555 2890 0.2373 -
2.5644 2900 0.2202 -
2.5732 2910 0.2209 -
2.5820 2920 0.2358 -
2.5909 2930 0.1734 -
2.5997 2940 0.252 -
2.6086 2950 0.2345 -
2.6174 2960 0.266 -
2.6263 2970 0.2557 -
2.6351 2980 0.205 -
2.6440 2990 0.2916 -
2.6528 3000 0.2462 -
2.6617 3010 0.2953 -
2.6705 3020 0.2263 -
2.6793 3030 0.2357 -
2.6882 3040 0.243 -
2.6970 3050 0.2269 -
2.7059 3060 0.2431 -
2.7147 3070 0.239 -
2.7236 3080 0.1974 -
2.7324 3090 0.2343 -
2.7413 3100 0.253 -
2.7501 3110 0.2201 -
2.7590 3120 0.1923 -
2.7678 3130 0.2184 -
2.7766 3140 0.2426 -
2.7855 3150 0.207 -
2.7943 3160 0.2164 -
2.8032 3170 0.2062 -
2.8120 3180 0.2367 -
2.8209 3190 0.2759 -
2.8297 3200 0.2488 -
2.8386 3210 0.2222 -
2.8474 3220 0.2385 -
2.8563 3230 0.2378 -
2.8651 3240 0.2552 -
2.8739 3250 0.2267 -
2.8828 3260 0.2856 -
2.8916 3270 0.2385 -
2.9005 3280 0.2444 -
2.9093 3290 0.2225 -
2.9182 3300 0.3305 -
2.9270 3310 0.2349 -
2.9359 3320 0.266 -
2.9447 3330 0.2506 -
2.9536 3340 0.2426 -
2.9624 3350 0.2204 -
2.9713 3360 0.2202 -
2.9801 3370 0.2577 -
2.9889 3380 0.2664 -
2.9978 3390 0.2185 -
3.0 3393 - 0.2458
  • The saved checkpoint corresponds to the final row (epoch 3.0, step 3393, cosine_ndcg@10 = 0.2458).

Framework Versions

  • Python: 3.10.11
  • Sentence Transformers: 5.1.0
  • Transformers: 4.55.2
  • PyTorch: 2.8.0+cu129
  • Accelerate: 1.10.0
  • Datasets: 3.6.0
  • Tokenizers: 0.21.4
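
To approximate this environment, the listed versions can be pinned at install time (the CUDA 12.9 build of PyTorch 2.8.0 must be installed separately from the matching PyTorch wheel index, which is omitted here):

pip install sentence-transformers==5.1.0 transformers==4.55.2 accelerate==1.10.0 datasets==3.6.0 tokenizers==0.21.4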

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}