helsinki_new_ver5.0

This model is a fine-tuned version of Helsinki-NLP/opus-mt-en-zh on the sarahwei/Taiwanese-Minnan-Example-Sentences dataset. It achieves the following results on the evaluation set (final checkpoint, step 15000):

  • Loss: 0.1331
  • Bleu: 24.8564
  • Ter: 51.5767

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 8e-06
  • train_batch_size: 8
  • eval_batch_size: 16
  • seed: 42
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 1000
  • training_steps: 15000
  • mixed_precision_training: Native AMP
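
The learning-rate settings above can be illustrated with a short sketch. It assumes the "linear" scheduler follows the usual pattern of linear warmup from zero followed by linear decay to zero, as in Transformers' `get_linear_schedule_with_warmup`. The function name `linear_schedule_lr` is hypothetical and not part of the original training code.

```python
def linear_schedule_lr(step, base_lr=8e-06, warmup_steps=1000, total_steps=15000):
    """Learning rate after `step` optimizer updates.

    Assumed behaviour: linear warmup from 0 to base_lr over warmup_steps,
    then linear decay from base_lr to 0 at total_steps.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Print the learning rate at a few points in the 15000-step run.
for step in (0, 500, 1000, 8000, 15000):
    print(f"step {step:>5}: lr = {linear_schedule_lr(step):.2e}")
```

Under these assumptions the rate peaks at 8e-06 at step 1000 and has fallen back to half of that by step 8000.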

Training results

Training Loss   Epoch    Step    Validation Loss   Bleu      Ter
0.3275          0.5656   1000    0.2091            1.0420    80.6027
0.2661          1.1312   2000    0.1750            8.8029    63.5599
0.2414          1.6968   3000    0.1592            14.6322   59.3833
0.2165          2.2624   4000    0.1512            17.7486   55.9776
0.2079          2.8281   5000    0.1457            19.9557   54.1135
0.1953          3.3937   6000    0.1433            21.4108   53.4688
0.188           3.9593   7000    0.1410            21.9998   52.9643
0.1804          4.5249   8000    0.1379            22.9829   52.6279
0.1797          5.0905   9000    0.1367            23.1706   52.3196
0.1745          5.6561   10000   0.1355            23.3668   51.8290
0.1662          6.2217   11000   0.1343            24.1466   51.7169
0.1691          6.7873   12000   0.1342            24.4332   51.9972
0.1631          7.3529   13000   0.1335            24.4473   51.6608
0.1646          7.9186   14000   0.1330            24.6473   51.6748
0.1599          8.4842   15000   0.1331            24.8564   51.5767
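
The Epoch and Step columns together give a rough estimate of the training-set size, which the card does not state. The sketch below assumes a single device and no gradient accumulation, so that one step consumes exactly train_batch_size examples. Neither assumption is confirmed by the card, and the variable names are illustrative.

```python
# (step, epoch) pair copied from the last row of the training results table.
last_step, last_epoch = 15000, 8.4842
train_batch_size = 8  # from the hyperparameter list

# Optimizer steps needed to complete one pass over the training data.
steps_per_epoch = last_step / last_epoch

# Assumption: one device, no gradient accumulation, so each step uses
# train_batch_size examples.
approx_examples = steps_per_epoch * train_batch_size

print(f"steps per epoch: about {round(steps_per_epoch)}")
print(f"training examples: about {round(approx_examples)}")
```

If the run used several devices or gradient accumulation, the example count scales up by the corresponding factor.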

Framework versions

  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.1
  • Tokenizers 0.21.1

Model size: 77.5M params (Safetensors, tensor type F32)

Dataset used to train Curiousfox/helsinki_new_ver5.0: sarahwei/Taiwanese-Minnan-Example-Sentences