opus-mt-cs-en-Prefix-Finetuned

This model is a fine-tuned version of Helsinki-NLP/opus-mt-cs-en on a dataset of Czech-English pairs of sentence prefixes (unfinished sentences). It is meant to improve text-to-text simultaneous translation from Czech to English.
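
A minimal inference sketch (assuming the checkpoint is published as davidruda/opus-mt-cs-en-Prefix-Finetuned; the example prefix is taken from the data sample further below):

from transformers import MarianMTModel, MarianTokenizer

model_name = "davidruda/opus-mt-cs-en-Prefix-Finetuned"
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

# Translate an unfinished Czech prefix, as it would arrive in a simultaneous setting
prefix = "Jsi si jistá, že se tady"
inputs = tokenizer(prefix, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))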

Before fine-tuning, it achieves the following results on the evaluation set:

  • Loss: 1.2841
  • Model Preparation Time: 0.0019
  • Bleu: 55.8042

After fine-tuning, the best checkpoint at epoch 3 (saved here) achieves the following results on the evaluation set:

  • Loss: 0.6869
  • Model Preparation Time: 0.0019
  • Bleu: 64.4592
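
The exact BLEU setup is not documented here; as a sketch, a corpus-level score in this style can be computed with the evaluate library's sacrebleu metric (the toy predictions/references below are placeholders, and tokenization details may differ from the scores reported above):

import evaluate

bleu = evaluate.load("sacrebleu")
predictions = ["I respected"]        # model outputs for the eval prefixes
references = [["I respected"]]       # one reference prefix per prediction
print(bleu.compute(predictions=predictions, references=references)["score"])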

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

Each original sentence pair contributes two random prefixes to the data. The train and eval splits contain prefixes from distinct original sentences.

Example of the data:

{"pref_source": "Respektoval jsem ho.", "pref_target": "I respected that man."}
{"pref_source": "Respektoval", "pref_target": "I respected"}
{"pref_source": "Společnost", "pref_target": "FxPro Global"}
{"pref_source": "Společnost FxPro Global Markets MENA Limited je autorizována a regulována Dubai Financial Services Authority (referenční č.", "pref_target": "FxPro Global Markets MENA Limited is authorised and regulated by the Dubai Financial Services Authority (reference"}
{"pref_source": "Jsi si jistá, že se tady cítíš", "pref_target": "Mm-hmm. Yeah."}
{"pref_source": "Jsi si jistá, že se tady", "pref_target": "Mm-hmm."}
{"pref_source": "Jsme v", "pref_target": "We're fine,"}
{"pref_source": "Jsme v pořádku Margaret .", "pref_target": "We're fine, Margaret."}
{"pref_source": "Svobodná", "pref_target": "Free"}
{"pref_source": "Svobodná vůle.", "pref_target": "Free will, and all."}
{"pref_source": "Všechny oběti", "pref_target": "All the victims"}
{"pref_source": "Všechny", "pref_target": "All the"}

Training data: ~1.734M prefixes
Evaluation data: 5k prefixes
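
How the prefix pairs were actually cut and aligned is not documented here; the sketch below only illustrates the described shape of the data (two random prefixes per original sentence pair), using a naive proportional word-level truncation that is an assumption for illustration and will not always match the alignment seen in the examples above:

import json
import random

def random_prefix_pair(source, target, rng):
    # Cut both sides at a random relative position (word level).
    # NOTE: this proportional cut is an illustrative assumption; the card
    # does not describe how the actual target prefixes were aligned.
    src_words, tgt_words = source.split(), target.split()
    frac = rng.uniform(0.2, 1.0)
    src_cut = max(1, round(len(src_words) * frac))
    tgt_cut = max(1, round(len(tgt_words) * frac))
    return {
        "pref_source": " ".join(src_words[:src_cut]),
        "pref_target": " ".join(tgt_words[:tgt_cut]),
    }

rng = random.Random(42)
pair = ("Respektoval jsem ho.", "I respected that man.")
for _ in range(2):  # two random prefixes per original sentence pair
    print(json.dumps(random_prefix_pair(*pair, rng), ensure_ascii=False))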

Training procedure

Trained on an NVIDIA H100 NVL (94 GB).

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 220
  • eval_batch_size: 700
  • seed: 42
  • optimizer: adamw_torch (AdamW) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 10
  • mixed_precision_training: Native AMP
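
A hedged sketch of how these hyperparameters map onto a Hugging Face Seq2SeqTrainingArguments configuration (argument names follow Transformers 4.51; anything not listed above, such as output_dir, is a placeholder):

from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="opus-mt-cs-en-Prefix-Finetuned",  # placeholder path
    learning_rate=2e-5,
    per_device_train_batch_size=220,
    per_device_eval_batch_size=700,
    seed=42,
    optim="adamw_torch",          # AdamW (torch) with default betas/epsilon
    lr_scheduler_type="linear",
    num_train_epochs=10,
    fp16=True,                    # native AMP mixed-precision training
    predict_with_generate=True,   # generate translations to compute BLEU during evaluation
)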

Training results

Training Loss   Epoch   Step    Validation Loss   Model Preparation Time   Bleu
0.749           1.0     7881    0.7074            0.0019                   63.3123
0.6925          2.0     15762   0.6927            0.0019                   63.9972
0.6529          3.0     23643   0.6869            0.0019                   64.4592
0.626           4.0     31524   0.6817            0.0019                   63.8378
0.5989          5.0     39405   0.6820            0.0019                   64.2718

Framework versions

  • Transformers 4.51.3
  • Pytorch 2.7.0+cu126
  • Datasets 3.6.0
  • Tokenizers 0.21.1