A static embedding model using the dbmdz/bert-base-german-uncased tokenizer, built mainly on DE/EN datasets as a base for further experiments.

This is a sentence-transformers model trained on 74 datasets (full list at the bottom). It maps sentences & paragraphs to a 2048-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Further explanation of how to build such a model can be found in the Static Embeddings blogpost by Tom Aarsen from January 2025. It took me until the end of May to start this tiny spare-time experiment.

After some tests with different tokenizers, I decided to pick one of the oldest, bert-base-german-uncased by the dbmdz team, as it performed best while also yielding the smallest model size (~240 MB).

  • 99% performance: Unexpectedly, this model scored nearly 99% compared to e5-base-sts-en-de on the GermanGovServiceRetrieval task in MTEB while taking only about an 80th of the time (0.49 seconds vs. 40.3 seconds).
  • Matryoshka: This model was trained with a Matryoshka loss, allowing you to truncate the embeddings for faster retrieval at minimal performance cost (see the truncation sketch after this list).
  • Evaluations: See the Evaluation section for details on performance on the German MTEB, the GermanGovServiceRetrieval task, embedding speed, and Matryoshka dimensionality truncation.
  • Training Script: See base_train.py for the training script used to train this model from scratch (be warned - it is wildly commented).
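
As a quick illustration of the Matryoshka truncation mentioned above, here is a minimal sketch; truncate_dim is a standard SentenceTransformer argument, and 256 is just an arbitrary example dimension:

from sentence_transformers import SentenceTransformer

# Load the model with truncated (Matryoshka) embeddings - 256 is an arbitrary example
model = SentenceTransformer("MarcGrumpyOlejak/sts-mrl-en-de-base-v1", truncate_dim=256)

embeddings = model.encode([
    "Wie beantrage ich einen Reisepass?",
    "How do I apply for a passport?",
])
print(embeddings.shape)  # (2, 256)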

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Maximum Sequence Length: inf tokens
  • Output Dimensionality: 2048 dimensions
  • Similarity Function: Cosine Similarity
  • Training Datasets: 74 DE/EN datasets (see details_datasets.md and the Training Datasets section below)
  • Languages: de, en
  • License: eupl-1.2

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): StaticEmbedding(
    (embedding): EmbeddingBag(31102, 2048, mode='mean')
  )
)
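
The whole model is just an EmbeddingBag that averages token vectors, which is why the sequence length is effectively unlimited and inference runs fine on a CPU. A conceptual sketch (the token ids below are made up, not real dbmdz vocabulary ids):

import torch

# The sentence embedding is simply the mean of the per-token embedding vectors
embedding = torch.nn.EmbeddingBag(31102, 2048, mode="mean")

token_ids = torch.tensor([[101, 2023, 2003, 102]])  # made-up token ids
sentence_embedding = embedding(token_ids)
print(sentence_embedding.shape)  # torch.Size([1, 2048])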

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("MarcGrumpyOlejak/sts-mrl-en-de-base-v1")
# Run inference
queries = [
    "Im April 1928 beschrieb er in seinem Artikel On the Construction of Tables by Interpolation die Verwendung von Lochkartenger\u00e4ten zum Interpolieren von Datentabellen und verglich dies mit den weniger effizienten und fehleranf\u00e4lligeren Methoden mit mechanischen Ger\u00e4ten wie den Windradrechnern unter dem Markennamen Brunsviga.",
]
documents = [
    'Im April 1928 beschrieb er in seinem Artikel „On the Construction of Tables by Interpolation“ („Über die Erstellung von Tabellen durch Interpolation“) die Interpolation von Daten in Tabellen mit Hilfe von Lochkarten und verglich diese Methode mit dem uneffizienteren und fehleranfälligeren Verfahren, das mechanische Rechner verwendet.',
    'POLES liefert nicht die direkten makro-ökonomischen Auswirkungen der Minderungsmaßnahmen wie im Stern-Report vorgesehen, erlaubt jedoch eine detaillierte Abschätzung der Kosten im Zusammenhang mit Techniken mit wenig Energieverbrauch oder Nullenergietechniken.',
    'Im Lehrbuch Maschinenelemente – Funktion, Gestaltung und Berechnung von Decker (bisher 19 Auflagen) wird anhand praktischer Anwendungen mit Z88 die Berechnung von Maschinenelementen mit der Finiten-Elemente-Analyse gelehrt.',
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# [1, 2048] [3, 2048]

# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[0.7737, 0.1275, 0.1184]])
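
The model can also be used for the other tasks mentioned above, e.g. paraphrase mining via sentence_transformers.util; a minimal sketch with a tiny made-up corpus:

from sentence_transformers import SentenceTransformer
from sentence_transformers.util import paraphrase_mining

model = SentenceTransformer("MarcGrumpyOlejak/sts-mrl-en-de-base-v1")
sentences = [
    "Der Antrag kann online gestellt werden.",
    "Sie können den Antrag über das Internet einreichen.",
    "The application can be submitted online.",
    "Das Wetter wird morgen regnerisch.",
]
# Returns [score, i, j] triples, sorted by descending similarity
for score, i, j in paraphrase_mining(model, sentences)[:3]:
    print(f"{score:.3f}  {sentences[i]}  <->  {sentences[j]}")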

Out-of-Scope Use

After several tests, it turned out the model is not really good at reranking. Everything news-related also scores quite low, because there is no openly licensed, commercially usable news dataset available. If you know of an officially free and openly licensed news-based dataset, feel free to contact me.

Evaluation

All steps and evaluations were run locally on my very small hardware, an Nvidia RTX 2070 SUPER (8 GB) - no joke.

This model has been benchmarked mainly with the GermanGovServiceRetrieval task, developed by the Munich city administration. It associates questions with a textual context containing the answer. The idea is to later fine-tune the model on German administrative classification datasets. After the first results, the full German MTEB(deu, v1) was also run, as the GermanGovServiceRetrieval test is not part of the German MTEB benchmark. Testing with NanoBEIR turned out to be somewhat insufficient for bilingual German/English evaluation - but I accidentally outscored static-similarity-mrl-multilingual-v1 by 0.03 points ;)
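
If you want to reproduce these numbers, here is a minimal sketch of how the GermanGovServiceRetrieval task can be run with the mteb library (assuming a recent mteb version; the output folder name is just an example):

import mteb
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("MarcGrumpyOlejak/sts-mrl-en-de-base-v1")
tasks = mteb.get_tasks(tasks=["GermanGovServiceRetrieval"])
evaluation = mteb.MTEB(tasks=tasks)
# Results (including ndcg@10 and timing) are written as JSON to the output folder
results = evaluation.run(model, output_folder="results/sts-mrl-en-de-base-v1")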

As for static embeddings built using Model2Vec, I picked the largest one I could find, alikia2x/jina-embedding-v3-m2v-1024 (~1 GB in size).

This model is compared against the excellent e5-base-sts-en-de model made by Daniel Heinz back in 2024 (ca. 1.1 GB). The second model for dense-embedding comparisons is the speed-optimized granite-embedding-107m-multilingual model made by the IBM Granite team (ca. 770 MB).

Benchmark details

Oops - I forgot to run NanoBEIR for granite-embedding-107m-multilingual - that's for the weekend.

Model                                    NanoBEIR (mean_cosine_ndcg@10)   MTEB GermanGovServiceRetrieval   MTEB(deu, v1) – avg (naive sum/num)

Dense Embeddings
e5-base-sts-en-de                        0.5320                           0.7931                           0.5194
granite-embedding-107m-multilingual      –                                0.7880                           0.4992

Static Embeddings
static-retrieval-mrl-en-v1 (*)           0.5035                           0.6630                           0.3716
jina-embedding-v3-m2v-1024               0.3480                           0.7260                           0.4081
static-similarity-mrl-multilingual-v1    0.4350                           0.7281                           0.4259
sts-mrl-en-de-base-v1                    0.4680                           0.7841                           0.4566

((*) static-retrieval-mrl-en-v1 is listed only for comparison on the mainly English-based NanoBEIR.)

MTEB - GermanGovServiceRetrieval Evaluation

With e5-base-sts-en-de scoring 0.7931 on the GermanGovServiceRetrieval task, sts-mrl-en-de-base-v1 at 0.7841 reaches 98.865% of that score on the same task while using only ~230 MB of RAM and a CPU.

That also puts it only 0.4949% behind granite-embedding-107m-multilingual.
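
For transparency, the percentages above are just the naive score ratios:

# Relative quality on GermanGovServiceRetrieval
print(f"{0.7841 / 0.7931:.3%}")      # vs. e5-base-sts-en-de                      -> 98.865%
print(f"{1 - 0.7841 / 0.7880:.4%}")  # gap to granite-embedding-107m-multilingual -> 0.4949%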

Time in seconds vs. MTEB GermanGovServiceRetrieval scores
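
How such a wall-clock measurement might look on a CPU (a rough sketch; the numbers will differ per machine, and the sentence list is just a placeholder):

import time
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("MarcGrumpyOlejak/sts-mrl-en-de-base-v1", device="cpu")
sentences = ["Wie beantrage ich eine Geburtsurkunde?"] * 10_000  # placeholder corpus

start = time.perf_counter()
model.encode(sentences, batch_size=256)
print(f"{time.perf_counter() - start:.2f} s for {len(sentences)} sentences on CPU")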

MTEB(deu, v1) – avg

For the German version of the MTEB benchmark, MTEB(deu, v1), the results are not as striking as for the GermanGovServiceRetrieval task - but at 87.909% of the quality of e5-base-sts-en-de, you can use sts-mrl-en-de-base-v1, for example, to mine hard negatives in a really short time instead of burning money on a whole bunch of GPUs.
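
A minimal sketch of such hard-negative mining with sentence_transformers.util.mine_hard_negatives (available in recent Sentence Transformers releases; the dataset below is just an example with query/answer columns, and the mining parameters are illustrative):

from datasets import load_dataset
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import mine_hard_negatives

model = SentenceTransformer("MarcGrumpyOlejak/sts-mrl-en-de-base-v1")
# Example (anchor, positive) pairs - any dataset with two text columns works
dataset = load_dataset("sentence-transformers/natural-questions", split="train[:10000]")

# Mine negatives that are similar to the query but ranked below the very top hits
hard_dataset = mine_hard_negatives(
    dataset,
    model,
    num_negatives=3,
    range_min=10,
    range_max=50,
    sampling_strategy="top",
    batch_size=1024,
)
print(hard_dataset)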

Even though the very well speed-optimised granite-embedding-107m-multilingual is almost as fast as the static embeddings, it still needs a GPU.

Time in seconds vs. MTEB(deu, v1) - average scores

Matryoshka Evaluation

(These results have to be checked twice - it looks like almost every model has a glitch: the results are better after a first reduction from 2048 down to 1024 dimensions? That's the second thing for the weekend.)
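
One way to double-check this is to re-encode the same pair at several truncation dimensions and compare the cosine similarities; a minimal sketch (the sentence pair is made up, and reloading the model per dimension is just for simplicity):

from sentence_transformers import SentenceTransformer

pair = [
    "Wie verlängere ich meinen Personalausweis?",
    "Informationen zur Verlängerung des Personalausweises.",
]
for dim in (2048, 1024, 512, 256, 128, 64):
    model = SentenceTransformer("MarcGrumpyOlejak/sts-mrl-en-de-base-v1", truncate_dim=dim)
    emb = model.encode(pair)
    score = model.similarity(emb[:1], emb[1:])
    print(dim, round(float(score[0, 0]), 4))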

Training Datasets

Sadly, all dataset details had to be moved to a separate file, details_datasets.md, as this README.md has a size limit.

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 4096
  • per_device_eval_batch_size: 4096
  • learning_rate: 0.2
  • num_train_epochs: 1
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.1
  • fp16: True
  • batch_sampler: no_duplicates
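
For orientation, a heavily simplified sketch of how such a static model can be trained with these hyperparameters (this is not base_train.py; the real run used 74 datasets, the example dataset below is only a placeholder, and the Matryoshka dimensions are assumed for illustration):

from datasets import load_dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss
from sentence_transformers.models import StaticEmbedding
from sentence_transformers.training_args import BatchSamplers
from transformers import AutoTokenizer

# Static EmbeddingBag initialised from the dbmdz tokenizer vocabulary (31102 tokens)
tokenizer = AutoTokenizer.from_pretrained("dbmdz/bert-base-german-uncased")
model = SentenceTransformer(modules=[StaticEmbedding(tokenizer, embedding_dim=2048)])

# Placeholder (anchor, positive) dataset - the real training used 74 DE/EN datasets
train_dataset = load_dataset("sentence-transformers/natural-questions", split="train")

loss = MatryoshkaLoss(
    model,
    MultipleNegativesRankingLoss(model),
    matryoshka_dims=[2048, 1024, 512, 256, 128, 64],  # assumed dims, for illustration
)

args = SentenceTransformerTrainingArguments(
    output_dir="models/sts-mrl-en-de-base-v1-demo",
    num_train_epochs=1,
    per_device_train_batch_size=4096,
    per_device_eval_batch_size=4096,
    learning_rate=0.2,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    fp16=True,
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)

SentenceTransformerTrainer(model=model, args=args, train_dataset=train_dataset, loss=loss).train()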

All Hyperparameters

Click to expand
  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 4096
  • per_device_eval_batch_size: 4096
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 0.2
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss mmarco 3hn loss mmarco 2hn loss mmarco 1hn loss mmarco 0hn loss wp-22-12-de loss swim ir de loss swim ir de 3hn loss swim ir de title 3hn loss swim ir de title loss avemio triples loss avemio pairs 3hn loss avemio pairs 0hn loss nq german en de a 3hn loss nq german en de 3hn loss nq german 3hn loss nq german 1hn loss german oasst1 hn loss germanrag short loss slimorca dedup 3hn loss slimorca dedup 2hn loss slimorca dedup 1hn loss slimorca dedup 0hn loss german gpt4 3hn loss german orca dpo loss alpaca gpt4 3hn loss alpaca gpt4 0hn loss dolly context de 3hn loss dolly context ende 3hn loss dolly instructions de 3hn loss dolly instructions de 0hn loss dolly instructions ende 3hn loss dolly responses de 3hn loss dolly responses de 0hn loss dolly responses ende 3hn loss saf legal de loss gls 3hn loss gls 2hn loss gls 1hn loss gls 0hn loss europarl 3hn loss europarl 0hn loss tatoeba 3hn loss tatoeba 0hn loss wikimatrix 3hn loss wikipedia abstract 3hn loss wikipedia abstract 0hn loss wiktionary gdg de 3hn loss wiktionary gdg de short loss wmt24pp loss synthia de loss gbp 3hn loss gbp ende 3hn loss stbs de 3hn loss stbs en 3hn loss pawsx de loss pawsx en loss nli anli entail 3hn loss nli fever entail 3hn loss nli ling entail 3hn loss nli mnli entail 3hn loss nli wanli entail 3hn loss nli anli transl 3hn loss nli fever transl 3hn loss nli ling transl 3hn loss nli mnli transl 3hn loss nli wanli transl 3hn loss jina ai 3en loss jina ai ende loss jina ai dede loss polyglot de loss polyglot en loss tilde EESC loss miracl de 3hn loss miracl de 0hn loss
0.0002 1 32.2328 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
0.1211 500 17.4935 5.9441 18.6286 15.4380 21.7452 11.5899 15.7739 2.0470 6.4545 28.7021 3.4327 3.0953 21.4473 0.6579 2.3081 8.3028 17.1118 8.6341 0.7353 2.4550 10.1110 20.2165 15.1944 11.2822 14.0772 9.3205 12.3671 3.6399 0.2185 2.5653 27.0853 2.1334 4.3423 2.3262 0.5350 20.1312 6.6543 11.5668 11.2751 15.0010 2.5165 46.8575 6.4837 17.8191 0.9617 7.7542 3.3035 17.7944 4.9850 0.5039 6.9794 0.4971 1.7211 7.5595 5.8076 2.1527 0.4983 9.9586 7.6724 4.5647 4.4193 4.3135 0.8089 2.2057 0.8494 1.5787 2.4122 9.0588 1.6716 5.7378 17.4829 17.4252 2.7128 2.3019 4.9855
0.2421 1000 9.8434 5.9548 16.1939 13.6828 19.8400 10.3624 13.5662 1.7398 4.7552 26.9780 2.7763 2.6297 19.2160 0.6367 2.2657 8.1566 15.8885 7.0793 0.7799 1.6238 9.2113 18.4966 14.8541 11.6090 17.8812 7.3860 7.7746 3.2721 0.1734 2.2635 27.3627 1.7248 4.0169 3.3867 0.4930 19.1067 7.0229 13.2283 13.9238 17.1221 2.0835 47.1417 6.2599 15.3082 0.7972 6.9853 2.7917 15.1196 4.4008 0.1748 6.5392 0.4433 1.3500 7.5248 5.8447 2.1663 0.4949 9.3473 6.2105 4.2394 4.1746 4.2383 0.6806 2.1903 0.6338 1.3037 2.0331 5.0726 1.0650 5.8712 17.3595 16.0869 1.9498 2.1635 4.1986
0.3632 1500 9.4195 6.0462 15.3733 13.4579 19.1822 10.1358 13.7938 2.0818 4.3716 26.0843 2.7380 2.6063 18.9278 0.6317 2.1179 8.5954 15.0949 6.2069 0.8866 1.5936 9.0869 18.6605 14.5752 12.3640 15.1111 7.5786 8.6830 2.9134 0.1539 2.3901 24.0635 1.5851 3.0859 2.8681 0.4823 20.1934 6.9440 11.9040 11.6429 13.5179 1.9956 46.0385 6.0581 15.7130 0.7430 6.2928 2.9993 14.2742 4.1868 0.1639 5.8340 0.4744 1.3372 7.7122 5.6745 2.1703 0.4930 9.6020 6.0473 3.5016 3.7158 4.2441 0.5784 2.1883 0.5912 1.2164 1.9767 7.0197 1.0216 4.4556 14.8992 15.8563 1.8581 2.1515 4.4043
0.4843 2000 8.2114 5.8039 14.9131 12.9781 18.3934 9.9055 13.5402 2.0944 4.4961 26.2583 2.6002 2.5542 18.3124 0.5504 1.7278 8.4266 12.8837 5.5970 0.7967 1.5002 8.8843 18.2636 15.5366 12.1376 13.7508 6.1530 6.6779 2.2906 0.1435 1.8996 21.9520 1.5331 2.7177 3.0663 0.4214 19.7372 6.1346 10.9578 10.5089 13.6577 1.8838 46.2217 4.1247 12.9807 0.6397 6.3777 2.5970 13.7871 4.1784 0.1893 4.4490 0.4018 1.1374 7.1980 5.6566 2.1517 0.4921 9.2049 6.0599 3.4091 3.6662 4.0776 0.4841 2.0716 0.4860 0.9970 1.7709 7.5693 0.6321 4.9397 14.5334 15.4385 1.7821 1.9614 4.2582
0.6053 2500 8.038 5.5500 14.8000 12.8634 18.2342 9.7964 13.2195 1.9088 4.2172 25.7571 2.4768 2.4510 17.9053 0.4689 1.8237 8.1981 12.5957 6.0768 0.6939 1.5240 9.6936 18.5641 16.5833 12.5368 13.6839 6.6175 7.2916 2.3097 0.1377 1.9064 22.0331 1.5278 2.5185 4.8549 0.3997 20.1505 6.0001 10.3536 9.9127 12.7608 1.7728 46.1264 3.4876 13.2839 0.6246 6.0571 2.5264 13.6899 4.1796 0.1133 5.5862 0.3973 1.1315 7.0625 5.7281 2.1597 0.4939 9.3306 5.8505 3.0920 3.6364 4.2557 0.4513 1.9419 0.4341 0.7909 1.6440 7.5517 0.6997 4.9564 14.5145 15.7047 1.6838 1.9027 4.2791
0.7264 3000 8.4735 5.4690 14.0184 12.4418 17.2256 9.5584 12.8587 1.8026 4.2292 25.0699 2.4180 2.3386 17.5121 0.4924 1.7512 8.6264 12.9932 5.7242 0.7519 1.4209 8.7996 17.9024 15.0738 10.3888 12.8886 6.9268 7.5737 2.4082 0.1446 1.9202 22.0949 1.4499 2.7943 3.8219 0.4096 20.1391 5.9977 10.2577 9.9893 12.8969 1.8217 45.9583 3.6835 14.0661 0.6401 5.8992 2.4225 13.6148 4.0275 0.1058 4.2324 0.4046 1.1448 7.2012 5.7275 2.1669 0.4947 8.9883 5.8919 3.4086 3.5578 3.8109 0.4713 2.0382 0.4806 0.9071 1.7479 7.4633 0.6957 5.1938 14.2104 15.6664 1.7301 1.9228 4.1841
0.8475 3500 7.7352 5.3754 14.0426 12.5198 17.3227 9.4857 12.9446 1.8784 4.2447 25.1068 2.3991 2.3495 17.5300 0.4642 1.6235 8.4671 12.8252 5.3035 0.7126 1.4499 8.4552 16.9827 14.6279 10.8074 12.8392 6.5745 7.2679 2.4318 0.1319 1.8556 22.2088 1.3227 2.6365 4.3796 0.3783 20.1810 5.9464 10.2856 9.9382 12.6812 1.6933 46.2977 3.6286 13.8749 0.5844 5.8990 2.4661 13.3314 4.0382 0.1148 4.3655 0.4017 1.0360 7.1329 5.7121 2.1640 0.4945 8.9242 5.6470 3.2758 3.5739 4.0207 0.4303 1.9566 0.4515 0.8112 1.6914 7.4063 0.6659 5.2429 13.9946 15.6856 1.5650 1.8613 4.3350
0.9685 4000 7.4739 5.3820 13.9713 12.4551 17.2949 9.4687 12.9339 1.9303 4.2006 25.0763 2.3880 2.3362 17.4705 0.4638 1.6235 8.3594 12.6393 5.3609 0.7168 1.4452 8.3913 16.8145 14.9649 10.7862 12.5774 6.6076 7.1481 2.3770 0.1320 1.8618 22.2842 1.3191 2.6045 4.6015 0.3718 14.6598 5.9303 10.1947 9.8502 12.5003 1.6814 46.1385 3.6696 13.8947 0.5799 5.8546 2.4445 13.3022 4.0359 0.1090 4.4493 0.3932 1.0395 7.1369 5.6920 2.1641 0.4943 8.9089 5.6356 3.2438 3.5664 4.0016 0.4297 1.9810 0.4511 0.8123 1.6705 7.4795 0.6834 5.2668 13.9481 15.6508 1.5442 1.8556 4.3036

Framework Versions

  • Python: 3.10.15
  • Sentence Transformers: 5.0.0
  • Transformers: 4.51.3
  • PyTorch: 2.1.0+cu121
  • Accelerate: 1.3.0
  • Datasets: 2.21.0
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

GermanGovServiceRetrieval

@software{lhm-dienstleistungen-qa,
  author = {Schröder, Leon Marius and Gutknecht, Clemens and Alkiddeh, Oubada and Weiß, Susanne and Lukas, Leon},
  month = nov,
  publisher = {it@M},
  title = {LHM-Dienstleistungen-QA - german public domain question-answering dataset},
  url = {https://huggingface.co/datasets/it-at-m/LHM-Dienstleistungen-QA},
  year = {2022},
}

MMTEB

@article{enevoldsen2025mmtebmassivemultilingualtext,
  title={MMTEB: Massive Multilingual Text Embedding Benchmark},
  author={Kenneth Enevoldsen and Isaac Chung and Imene Kerboua and Márton Kardos and Ashwin Mathur and David Stap and Jay Gala and Wissam Siblini and Dominik Krzemiński and Genta Indra Winata and Saba Sturua and Saiteja Utpala and Mathieu Ciancone and Marion Schaeffer and Gabriel Sequeira and Diganta Misra and Shreeya Dhakal and Jonathan Rystrøm and Roman Solomatin and Ömer Çağatan and Akash Kundu and Martin Bernstorff and Shitao Xiao and Akshita Sukhlecha and Bhavish Pahwa and Rafał Poświata and Kranthi Kiran GV and Shawon Ashraf and Daniel Auras and Björn Plüster and Jan Philipp Harries and Loïc Magne and Isabelle Mohr and Mariya Hendriksen and Dawei Zhu and Hippolyte Gisserot-Boukhlef and Tom Aarsen and Jan Kostkan and Konrad Wojtasik and Taemin Lee and Marek Šuppa and Crystina Zhang and Roberta Rocca and Mohammed Hamdy and Andrianos Michail and John Yang and Manuel Faysse and Aleksei Vatolin and Nandan Thakur and Manan Dey and Dipam Vasani and Pranjal Chitale and Simone Tedeschi and Nguyen Tai and Artem Snegirev and Michael Günther and Mengzhou Xia and Weijia Shi and Xing Han Lù and Jordan Clive and Gayatri Krishnakumar and Anna Maksimova and Silvan Wehrli and Maria Tikhonova and Henil Panchal and Aleksandr Abramov and Malte Ostendorff and Zheng Liu and Simon Clematide and Lester James Miranda and Alena Fenogenova and Guangyu Song and Ruqiya Bin Safi and Wen-Ding Li and Alessia Borghini and Federico Cassano and Hongjin Su and Jimmy Lin and Howard Yen and Lasse Hansen and Sara Hooker and Chenghao Xiao and Vaibhav Adlakha and Orion Weller and Siva Reddy and Niklas Muennighoff},
  publisher = {arXiv},
  journal={arXiv preprint arXiv:2502.13595},
  year={2025},
  url={https://arxiv.org/abs/2502.13595},
  doi = {10.48550/arXiv.2502.13595},
}

MTEB

@article{muennighoff2022mteb,
  author = {Muennighoff, Niklas and Tazi, Nouamane and Magne, Lo{\"\i}c and Reimers, Nils},
  title = {MTEB: Massive Text Embedding Benchmark},
  publisher = {arXiv},
  journal={arXiv preprint arXiv:2210.07316},
  year = {2022},
  url = {https://arxiv.org/abs/2210.07316},
  doi = {10.48550/ARXIV.2210.07316},
}