---
license: apache-2.0
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- feature-extraction
- sentence-similarity
- transformers
- generated_from_trainer
datasets:
- squad
- newsqa
- LLukas22/cqadupstack
- LLukas22/fiqa
- LLukas22/scidocs
- deepset/germanquad
- LLukas22/nq
language:
- en
- de
---
# paraphrase-multilingual-mpnet-base-v2-embedding-all
This model is a fine-tuned version of [paraphrase-multilingual-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-mpnet-base-v2) on the following datasets: [squad](https://huggingface.co/datasets/squad), [newsqa](https://huggingface.co/datasets/newsqa), [LLukas22/cqadupstack](https://huggingface.co/datasets/LLukas22/cqadupstack), [LLukas22/fiqa](https://huggingface.co/datasets/LLukas22/fiqa), [LLukas22/scidocs](https://huggingface.co/datasets/LLukas22/scidocs), [deepset/germanquad](https://huggingface.co/datasets/deepset/germanquad), [LLukas22/nq](https://huggingface.co/datasets/LLukas22/nq).
## Usage (Sentence-Transformers)
The simplest way to use this model is through [sentence-transformers](https://www.SBERT.net), which you can install with:
```
pip install -U sentence-transformers
```
Then you can use the model like this:
```python
from sentence_transformers import SentenceTransformer
sentences = ["This is an example sentence", "Each sentence is converted"]
model = SentenceTransformer('LLukas22/paraphrase-multilingual-mpnet-base-v2-embedding-all')
embeddings = model.encode(sentences)
print(embeddings)
```
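Because the model was fine-tuned on question answering datasets, a typical use is ranking candidate passages against a question. The following is a minimal sketch of that idea, not part of the original training or evaluation code. The helper names `cosine_similarity` and `rank_passages` are hypothetical, and the sketch assumes cosine similarity is a suitable score for these embeddings. It only relies on the `encode` method shown above.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(float(x) * float(y) for x, y in zip(a, b))
    norm_a = math.sqrt(sum(float(x) ** 2 for x in a))
    norm_b = math.sqrt(sum(float(y) ** 2 for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_passages(model, question: str, passages: List[str]) -> List[Tuple[str, float]]:
    """Return (passage, score) pairs sorted by similarity to the question, best first.

    `model` is any object with an `encode(list_of_str)` method that returns one
    vector per input, such as a SentenceTransformer instance.
    """
    # Encode the question and all passages in a single call.
    vectors = model.encode([question] + passages)
    query_vec, passage_vecs = vectors[0], vectors[1:]
    scored = [
        (passage, cosine_similarity(query_vec, vec))
        for passage, vec in zip(passages, passage_vecs)
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Example usage (downloads the model on first run):
# from sentence_transformers import SentenceTransformer
# model = SentenceTransformer('LLukas22/paraphrase-multilingual-mpnet-base-v2-embedding-all')
# for passage, score in rank_passages(model, "Wo liegt Berlin?", ["Berlin is the capital of Germany.", "Paris is in France."]):
#     print(f"{score:.3f}  {passage}")
```

For larger collections, computing the scores with vectorized operations or a vector index is usually faster than this pure-Python loop.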
## Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1E+00
- per device batch size: 40
- effective batch size: 120
- seed: 42
- optimizer: AdamW with betas (0.9,0.999) and eps 1E-08
- weight decay: 2E-02
- D-Adaptation: True
- Warmup: True
- number of epochs: 15
- mixed_precision_training: bf16
## Training results
| Epoch | Train Loss | Validation Loss |
| ----- | ---------- | --------------- |
| 0 | 0.085 | 0.0625 |
| 1 | 0.0598 | 0.0554 |
| 2 | 0.0484 | 0.0518 |
| 3 | 0.0405 | 0.0485 |
| 4 | 0.0341 | 0.0463 |
| 5 | 0.0287 | 0.0454 |
| 6 | 0.0243 | 0.0445 |
| 7 | 0.0207 | 0.0426 |
| 8 | 0.0177 | 0.0424 |
| 9 | 0.0153 | 0.0421 |
| 10 | 0.0134 | 0.0417 |
| 11 | 0.012 | 0.0411 |
| 12 | 0.011 | 0.0414 |
## Evaluation results
| Epoch | top_1 | top_3 | top_5 | top_10 | top_25 |
| ----- | ----- | ----- | ----- | ----- | ----- |
| 0 | 0.261 | 0.351 | 0.384 | 0.422 | 0.459 |
| 1 | 0.272 | 0.365 | 0.4 | 0.439 | 0.477 |
| 2 | 0.276 | 0.37 | 0.404 | 0.443 | 0.481 |
| 3 | 0.292 | 0.391 | 0.426 | 0.465 | 0.503 |
| 4 | 0.295 | 0.395 | 0.431 | 0.47 | 0.51 |
| 5 | 0.299 | 0.4 | 0.437 | 0.476 | 0.514 |
| 6 | 0.306 | 0.404 | 0.44 | 0.478 | 0.515 |
| 7 | 0.309 | 0.41 | 0.445 | 0.485 | 0.521 |
| 8 | 0.31 | 0.411 | 0.448 | 0.487 | 0.524 |
| 9 | 0.315 | 0.417 | 0.454 | 0.493 | 0.529 |
| 10 | 0.319 | 0.42 | 0.457 | 0.495 | 0.53 |
| 11 | 0.323 | 0.424 | 0.46 | 0.497 | 0.531 |
| 12 | 0.324 | 0.427 | 0.464 | 0.501 | 0.536 |
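This card does not define the `top_k` columns. Assuming they report the fraction of evaluation questions whose correct passage appears among the `k` highest-ranked results, the metric could be computed roughly as follows. The function name `top_k_accuracy` and the integer passage ids are illustrative assumptions, not taken from the training code.

```python
from typing import Dict, Sequence


def top_k_accuracy(
    ranked_ids: Sequence[Sequence[int]],
    gold_ids: Sequence[int],
    ks: Sequence[int] = (1, 3, 5, 10, 25),
) -> Dict[int, float]:
    """Fraction of queries whose gold passage id is within the first k ranked ids.

    ranked_ids[i] lists passage ids for query i, best match first.
    gold_ids[i] is the id of the correct passage for query i.
    """
    if len(ranked_ids) != len(gold_ids):
        raise ValueError("ranked_ids and gold_ids must have the same length")
    if len(gold_ids) == 0:
        raise ValueError("at least one query is required")

    hits = {k: 0 for k in ks}
    for ranking, gold in zip(ranked_ids, gold_ids):
        for k in ks:
            # A hit at k means the gold passage is among the k best results.
            if gold in ranking[:k]:
                hits[k] += 1
    return {k: hits[k] / len(gold_ids) for k in ks}
```

Under this definition the values can only grow or stay equal as `k` increases, which matches the pattern in every row of the table above.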
## Framework versions
- Transformers: 4.25.1
- PyTorch: 2.0.0.dev20230210+cu118
- PyTorch Lightning: 1.8.6
- Datasets: 2.7.1
- Tokenizers: 0.13.1
- Sentence Transformers: 2.2.2
## Additional Information
This model was trained as part of my Master's Thesis **'Evaluation of transformer based language models for use in service information systems'**. The source code is available on [GitHub](https://github.com/LLukas22/Retrieval-Augmented-QA).