---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:523982
- loss:MSELoss
base_model: FacebookAI/xlm-roberta-base
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- negative_mse
- pearson_cosine
- spearman_cosine
model-index:
- name: SentenceTransformer based on FacebookAI/xlm-roberta-base
  results:
  - task:
      type: knowledge-distillation
      name: Knowledge Distillation
    dataset:
      name: mse en ua
      type: mse-en-ua
    metrics:
    - type: negative_mse
      value: -1.1089269071817398
      name: Negative Mse
  - task:
      type: semantic-similarity
      name: Semantic Similarity
    dataset:
      name: sts17 en en
      type: sts17-en-en
    metrics:
    - type: pearson_cosine
      value: 0.6784819487397877
      name: Pearson Cosine
    - type: spearman_cosine
      value: 0.7308493185913256
      name: Spearman Cosine
  - task:
      type: semantic-similarity
      name: Semantic Similarity
    dataset:
      name: sts17 en ua
      type: sts17-en-ua
    metrics:
    - type: pearson_cosine
      value: 0.592555339963418
      name: Pearson Cosine
    - type: spearman_cosine
      value: 0.6197606373137193
      name: Spearman Cosine
  - task:
      type: semantic-similarity
      name: Semantic Similarity
    dataset:
      name: sts17 ua ua
      type: sts17-ua-ua
    metrics:
    - type: pearson_cosine
      value: 0.6158998595292998
      name: Pearson Cosine
    - type: spearman_cosine
      value: 0.6445750755380512
      name: Spearman Cosine
license: mit
datasets:
- sentence-transformers/parallel-sentences-talks
- sentence-transformers/parallel-sentences-tatoeba
- sentence-transformers/parallel-sentences-wikimatrix
language:
- uk
- en
---
# SentenceTransformer based on FacebookAI/xlm-roberta-base
This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [FacebookAI/xlm-roberta-base](https://huggingface.co/FacebookAI/xlm-roberta-base). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
👉 Check out the model on [GitHub](https://github.com/panalexeu/xlm-roberta-ua-distilled).
## Model Details
### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [FacebookAI/xlm-roberta-base](https://huggingface.co/FacebookAI/xlm-roberta-base)
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
- **Training Dataset:** [parallel-sentences-talks](https://huggingface.co/datasets/sentence-transformers/parallel-sentences-talks), [parallel-sentences-wikimatrix](https://huggingface.co/datasets/sentence-transformers/parallel-sentences-wikimatrix), [parallel-sentences-tatoeba](https://huggingface.co/datasets/sentence-transformers/parallel-sentences-tatoeba)
- **Language:** Ukrainian, English
- **License:** MIT
### Model Sources
- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
### Full Model Architecture
```
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: XLMRobertaModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```
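For reference, the same two-module architecture can be assembled by hand from the base checkpoint. This is a minimal sketch of the configuration (loading `FacebookAI/xlm-roberta-base` directly gives the untrained base weights; the distilled weights are only obtained by loading the published checkpoint as shown in the Usage section below):
```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.models import Pooling, Transformer

# Token embeddings from XLM-RoBERTa, mean-pooled into a single 768-dim sentence vector
word_embedding_model = Transformer("FacebookAI/xlm-roberta-base", max_seq_length=512)
pooling_model = Pooling(
    word_embedding_model.get_word_embedding_dimension(),  # 768
    pooling_mode_mean_tokens=True,
)
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])
```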
## Usage
### Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("panalexeu/xlm-roberta-ua-distilled")
# Run inference
sentences = [
    "You'd better consult the doctor.",
    'Краще проконсультуйся у лікаря.',
    'Їх позначають як Aufklärungsfahrzeug 93 та Aufklärungsfahrzeug 97 відповідно.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
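Because the English and Ukrainian embeddings live in the same vector space, the model can also be used for cross-lingual semantic search. A minimal sketch (the query is illustrative; the corpus sentences are reused from the training data samples below):
```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("panalexeu/xlm-roberta-ua-distilled")

# Small Ukrainian corpus, searched with an English query
corpus = [
    "Краще проконсультуйся у лікаря.",  # "You'd better consult the doctor."
    "Це фармацевтичний продукт.",       # "It's a pharmaceutical product."
    "Я загубив гаманець.",              # "I have lost my wallet."
]
corpus_embeddings = model.encode(corpus, convert_to_tensor=True)
query_embedding = model.encode("Where did I put my wallet?", convert_to_tensor=True)

hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)[0]
for hit in hits:
    print(corpus[hit["corpus_id"]], round(hit["score"], 3))
```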
## Evaluation
### Metrics
#### Knowledge Distillation
* Dataset: `mse-en-ua`
* Evaluated with [MSEEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.MSEEvaluator)
| Metric | Value |
|:-----------------|:------------|
| **negative_mse** | **-1.1089** |
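The sketch below shows how `MSEEvaluator` is typically wired up: the teacher embeds the English source sentences, the student embeds the Ukrainian translations, and the (negated) mean squared error between the two is reported. The teacher checkpoint is a placeholder, since this card does not name the teacher actually used during distillation, and the sentence pairs are illustrative:
```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import MSEEvaluator

student = SentenceTransformer("panalexeu/xlm-roberta-ua-distilled")
# Placeholder teacher (must produce 768-dim embeddings); not necessarily the one used here
teacher = SentenceTransformer("sentence-transformers/paraphrase-multilingual-mpnet-base-v2")

en_sentences = ["I have lost my wallet.", "It's a pharmaceutical product."]
ua_sentences = ["Я загубив гаманець.", "Це фармацевтичний продукт."]

mse_evaluator = MSEEvaluator(
    source_sentences=en_sentences,  # embedded by the teacher
    target_sentences=ua_sentences,  # embedded by the student
    teacher_model=teacher,
    name="mse-en-ua",
)
results = mse_evaluator(student)  # reports the negative MSE between teacher and student embeddings
print(results)
```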
#### Semantic Similarity
* Datasets: `sts17-en-en`, `sts17-en-ua` and `sts17-ua-ua`
* Evaluated with [EmbeddingSimilarityEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
| Metric | sts17-en-en | sts17-en-ua | sts17-ua-ua |
|:--------------------|:------------|:------------|:------------|
| pearson_cosine | 0.6785 | 0.5926 | 0.6159 |
| **spearman_cosine** | **0.7308** | **0.6198** | **0.6446** |
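For each STS17 split, `EmbeddingSimilarityEvaluator` computes cosine similarities between sentence pairs and correlates them with the gold similarity scores (Pearson and Spearman). A minimal sketch with hypothetical pairs and scores, not the actual STS17 data:
```python
from sentence_transformers import SentenceTransformer, SimilarityFunction
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

model = SentenceTransformer("panalexeu/xlm-roberta-ua-distilled")

# Hypothetical cross-lingual pairs with gold similarity scores in [0, 1]
sentences1 = ["A man is playing a guitar.", "A woman is cooking."]
sentences2 = ["Чоловік грає на гітарі.", "Кіт спить на дивані."]
gold_scores = [0.95, 0.05]

evaluator = EmbeddingSimilarityEvaluator(
    sentences1=sentences1,
    sentences2=sentences2,
    scores=gold_scores,
    main_similarity=SimilarityFunction.COSINE,
    name="sts17-en-ua",
)
print(evaluator(model))  # Pearson/Spearman correlation of cosine similarity vs. gold scores
```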
## Training Details
### Training Dataset
* Dataset: [parallel-sentences-talks](https://huggingface.co/datasets/sentence-transformers/parallel-sentences-talks), [parallel-sentences-wikimatrix](https://huggingface.co/datasets/sentence-transformers/parallel-sentences-wikimatrix), [parallel-sentences-tatoeba](https://huggingface.co/datasets/sentence-transformers/parallel-sentences-tatoeba)
* Size: 523,982 training samples
* Columns: `english`, `non_english`, and `label`
* Approximate statistics based on the first 1000 samples:
|         | english | non_english | label |
|:--------|:--------|:------------|:------|
| type    | string  | string      | list  |
* Samples:
| english | non_english | label |
|:--------|:------------|:------|
| Her real name is Lydia (リディア, Ridia), but she was mistaken for a boy and called Ricard. | Справжнє ім'я — Лідія, але її помилково сприйняли за хлопчика і назвали Рікард. | [0.15217968821525574, -0.17830222845077515, -0.12677159905433655, 0.22082313895225525, 0.40085524320602417, ...] |
| (Applause) So he didn't just learn water. | (Аплодисменти) Він не тільки вивчив слово "вода". | [-0.1058148592710495, -0.08846072107553482, -0.2684604823589325, -0.105219267308712, 0.3050258755683899, ...] |
| It is tightly integrated with SAM, the Storage and Archive Manager, and hence is often referred to as SAM-QFS. | Вона тісно інтегрована з SAM (Storage and Archive Manager), тому часто називається SAM-QFS. | [0.03270340710878372, -0.45798248052597046, -0.20090211927890778, 0.006579531356692314, -0.03178019821643829, ...] |
* Loss: [MSELoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#mseloss)
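The `label` column holds the teacher model's embedding of the English sentence; `MSELoss` then trains the student so that its embeddings of both the English and the Ukrainian sentence regress onto that target vector. A rough sketch of how such labels could be produced (the teacher checkpoint and the `en-uk` subset name are assumptions, not details confirmed by this card):
```python
from datasets import load_dataset
from sentence_transformers import SentenceTransformer

# Placeholder teacher; the card does not name the teacher checkpoint actually used
teacher = SentenceTransformer("sentence-transformers/paraphrase-multilingual-mpnet-base-v2")

# One of the parallel corpora listed above (subset name assumed)
pairs = load_dataset("sentence-transformers/parallel-sentences-talks", "en-uk", split="train")

def add_teacher_label(batch):
    # The teacher embedding of the English sentence becomes the regression target
    return {"label": teacher.encode(batch["english"]).tolist()}

pairs = pairs.map(add_teacher_label, batched=True, batch_size=64)
# Resulting columns: english, non_english, label
```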
### Evaluation Dataset
* Dataset: [parallel-sentences-talks](https://huggingface.co/datasets/sentence-transformers/parallel-sentences-talks), [parallel-sentences-wikimatrix](https://huggingface.co/datasets/sentence-transformers/parallel-sentences-wikimatrix), [parallel-sentences-tatoeba](https://huggingface.co/datasets/sentence-transformers/parallel-sentences-tatoeba)
* Size: 3,838 evaluation samples
* Columns: `english`, `non_english`, and `label`
* Approximate statistics based on the first 1000 samples:
|         | english | non_english | label |
|:--------|:--------|:------------|:------|
| type    | string  | string      | list  |
* Samples:
| english | non_english | label |
|:--------|:------------|:------|
| I have lost my wallet. | Я загубив гаманець. | [-0.11186987161636353, -0.03419225662946701, -0.31304317712783813, 0.0838347002863884, 0.108644500374794, ...] |
| It's a pharmaceutical product. | Це фармацевтичний продукт. | [0.04133488982915878, -0.4182000756263733, -0.30786487460136414, -0.09351564198732376, -0.023946482688188553, ...] |
| We've all heard of the Casual Friday thing. | Всі ми чули про «джинсову п’ятницю» (вільна форма одягу). | [-0.10697802156209946, 0.21002227067947388, -0.2513434886932373, -0.3718843460083008, 0.06871984899044037, ...] |
* Loss: [MSELoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#mseloss)
### Training Hyperparameters
#### Non-Default Hyperparameters
- `eval_strategy`: steps
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `gradient_accumulation_steps`: 3
- `num_train_epochs`: 4
- `warmup_ratio`: 0.1
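A sketch of how these non-default values map onto `SentenceTransformerTrainingArguments`; the output directory is illustrative and all other arguments keep their defaults:
```python
from sentence_transformers import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="xlm-roberta-ua-distilled",  # illustrative path
    eval_strategy="steps",
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=3,
    num_train_epochs=4,
    warmup_ratio=0.1,
)
```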
#### All Hyperparameters