opus-mt-cs-en-Prefix-Finetuned
This model is a fine-tuned version of Helsinki-NLP/opus-mt-cs-en, trained on a dataset of Czech-to-English pairs of sentence prefixes (unfinished sentences). It is meant to improve text-to-text simultaneous translation from Czech to English.
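A minimal usage sketch with the Transformers library (the model id comes from this card; the example prefix and generation settings are illustrative):

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the fine-tuned prefix-translation model (model id taken from this card).
model_id = "davidruda/opus-mt-cs-en-Prefix-Finetuned"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Translate an unfinished Czech sentence (a prefix) into English.
prefix = "Všechny oběti"
inputs = tokenizer(prefix, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```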
Before fine-tuning, it achieves the following results on the evaluation set:
- Loss: 1.2841
- Model Preparation Time: 0.0019
- Bleu: 55.8042
After fine-tuning, the best checkpoint at epoch 3 (saved here) achieves the following results on the evaluation set:
- Loss: 0.6869
- Model Preparation Time: 0.0019
- Bleu: 64.4592
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
Each original sentence pair contributes two random prefixes to the dataset. The training and evaluation splits contain prefixes from distinct original sentences.
Examples of the data:
{"pref_source": "Respektoval jsem ho.", "pref_target": "I respected that man."}
{"pref_source": "Respektoval", "pref_target": "I respected"}
{"pref_source": "Společnost", "pref_target": "FxPro Global"}
{"pref_source": "Společnost FxPro Global Markets MENA Limited je autorizována a regulována Dubai Financial Services Authority (referenční č.", "pref_target": "FxPro Global Markets MENA Limited is authorised and regulated by the Dubai Financial Services Authority (reference"}
{"pref_source": "Jsi si jistá, že se tady cítíš", "pref_target": "Mm-hmm. Yeah."}
{"pref_source": "Jsi si jistá, že se tady", "pref_target": "Mm-hmm."}
{"pref_source": "Jsme v", "pref_target": "We're fine,"}
{"pref_source": "Jsme v pořádku Margaret .", "pref_target": "We're fine, Margaret."}
{"pref_source": "Svobodná", "pref_target": "Free"}
{"pref_source": "Svobodná vůle.", "pref_target": "Free will, and all."}
{"pref_source": "Všechny oběti", "pref_target": "All the victims"}
{"pref_source": "Všechny", "pref_target": "All the"}
- Training data: ~1.734M prefixes
- Evaluation data: 5k prefixes
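The preprocessing script is not published here; the sketch below shows one plausible way to derive two random prefixes per sentence pair. Cutting at word boundaries and choosing the target prefix length proportionally are assumptions; the examples above suggest the real data may use a different (possibly alignment-based) truncation of the target side.

```python
import random

def make_prefix_pairs(src, tgt, n_prefixes=2, seed=None):
    """Cut n_prefixes random word-boundary prefixes from one sentence pair.

    The target prefix length is chosen proportionally to the source prefix
    length; the actual dataset may have used a different truncation strategy.
    """
    rng = random.Random(seed)
    src_words, tgt_words = src.split(), tgt.split()
    pairs = []
    for _ in range(n_prefixes):
        k = rng.randint(1, len(src_words))                       # source prefix length in words
        m = max(1, round(k / len(src_words) * len(tgt_words)))   # proportional target length
        pairs.append({
            "pref_source": " ".join(src_words[:k]),
            "pref_target": " ".join(tgt_words[:m]),
        })
    return pairs

print(make_prefix_pairs("Jsme v pořádku Margaret .", "We're fine, Margaret.", seed=42))
```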
Training procedure
Trained on an NVIDIA H100 NVL (94 GB).
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 220
- eval_batch_size: 700
- seed: 42
- optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 10
- mixed_precision_training: Native AMP
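A hedged sketch of how the hyperparameters above map onto Seq2SeqTrainingArguments; only the values listed above come from the actual run, while the output directory, evaluation strategy, and generation flag are assumptions:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="opus-mt-cs-en-Prefix-Finetuned",  # assumed output path
    learning_rate=2e-5,
    per_device_train_batch_size=220,
    per_device_eval_batch_size=700,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=10,
    fp16=True,                    # Native AMP mixed precision
    eval_strategy="epoch",        # assumption: matches the per-epoch results table
    predict_with_generate=True,   # assumption: needed to compute BLEU during evaluation
)
```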
Training results
| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Bleu |
|---|---|---|---|---|---|
| 0.749 | 1.0 | 7881 | 0.7074 | 0.0019 | 63.3123 |
| 0.6925 | 2.0 | 15762 | 0.6927 | 0.0019 | 63.9972 |
| 0.6529 | 3.0 | 23643 | 0.6869 | 0.0019 | 64.4592 |
| 0.626 | 4.0 | 31524 | 0.6817 | 0.0019 | 63.8378 |
| 0.5989 | 5.0 | 39405 | 0.6820 | 0.0019 | 64.2718 |
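The per-epoch BLEU values above were presumably computed from generated predictions during evaluation; the sketch below shows a typical compute_metrics function using the sacrebleu metric from the evaluate library (the metric implementation and post-processing are assumptions, not taken from the actual training script):

```python
import numpy as np
import evaluate

sacrebleu = evaluate.load("sacrebleu")

def build_compute_metrics(tokenizer):
    def compute_metrics(eval_preds):
        preds, labels = eval_preds
        # Replace label padding (-100) so the tokenizer can decode the references.
        labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
        decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
        decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
        result = sacrebleu.compute(
            predictions=[p.strip() for p in decoded_preds],
            references=[[l.strip()] for l in decoded_labels],
        )
        return {"bleu": result["score"]}
    return compute_metrics
```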
Framework versions
- Transformers 4.51.3
- Pytorch 2.7.0+cu126
- Datasets 3.6.0
- Tokenizers 0.21.1