MarcGrumpyOlejak/sts-mrl-en-de-base-v1

Sentence Similarity

sentence-transformers

feature-extraction

Generated from Trainer

dataset_size:16753490

loss:MatryoshkaLoss

loss:MultipleNegativesRankingLoss

MarcGrumpyOlejak committed 22 days ago

Commit befc207 · verified · 1 Parent(s): 3599ed0

Update README.md

Fixed a typo in the link that kept it from leading to the training script.

Files changed (1)

README.md +1 -1

@@ -38,7 +38,7 @@ After some tests with different tokenizers I decided to pick one of the oldest a
 * **99% performance:** Unexpectedly this model scored nearly 99% in comparison to [e5-base-sts-en-de](https://huggingface.co/danielheinz/e5-base-sts-en-de) during the [GermanGovServiceRetrieval](https://huggingface.co/datasets/mteb/GermanGovServiceRetrieval)-Task in MTEB by taking only a 80th of the time (40.3 seconds vs. 0.49).
 * **Matryoshka:** This model was trained with a [Matryoshka loss](https://huggingface.co/blog/matryoshka), allowing you to truncate the embeddings for faster retrieval at minimal performance costs.
 * **Evaluations:** See [Evaluations](#evaluation) for details on performance on German MTEB, special [GermanGovService retrieval](https://huggingface.co/datasets/mteb/GermanGovServiceRetrieval), embedding speed, and Matryoshka dimensionality truncation.
-* **Training Script:** See [base_train.py](base_train.py) for the training script used to train this model from scratch (be warned - it is wildly commented).
+* **Training Script:** See [train_base.py](train_base.py) for the training script used to train this model from scratch (be warned - it is wildly commented).
 ## Model Details
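
The Matryoshka bullet in the README context above says the embeddings can be truncated. As a rough illustration of what that means in practice, here is a minimal pure-Python sketch: keep the first `dim` components of a vector and rescale it to unit length before comparing. The helper names `truncate_embedding` and `cosine` and the toy 8-dimensional vectors are made up for this example; they are not taken from the model, its training script, or the sentence-transformers API.

```python
import math


def truncate_embedding(vec, dim):
    """Keep the first `dim` components of `vec` and rescale to unit length.

    Hypothetical helper for illustration only.
    """
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        # An all-zero prefix cannot be normalized; return it unchanged.
        return list(head)
    return [x / norm for x in head]


def cosine(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real model outputs (values are invented).
query = [0.9, 0.1, 0.3, 0.0, 0.05, 0.02, 0.01, 0.03]
doc = [0.8, 0.2, 0.25, 0.1, 0.01, 0.04, 0.02, 0.0]

full = cosine(query, doc)
short = cosine(truncate_embedding(query, 4), truncate_embedding(doc, 4))
print(f"full: {full:.3f}  truncated to 4 dims: {short:.3f}")
```

With a real Matryoshka-trained model the leading dimensions are trained to carry most of the signal, so the truncated score tends to stay close to the full one; the toy numbers here only demonstrate the mechanics, not that property.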