abdoutony207/m2m100_418M-evaluated-en-to-ar-2000instancesUNMULTI-leaningRate2e-05-batchSize8-regu2

m2m100_418M-evaluated-en-to-ar-2000instancesUNMULTI-leaningRate2e-05-batchSize8-regu2

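This model is a fine-tuned version of facebook/m2m100_418M on the un_multi dataset for English-to-Arabic translation (per the model name). It achieves the following results on the evaluation set, which match the final logged checkpoint (step 1300):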

Loss: 0.3642
Bleu: 40.8245
Meteor: 0.4272
Gen Len: 41.8075
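
The card gives no usage instructions, so the following is only a minimal sketch of how a fine-tuned M2M100 checkpoint is typically loaded for English-to-Arabic translation with the Transformers library. It assumes the repository ships tokenizer files alongside the weights (otherwise the tokenizer from facebook/m2m100_418M would be the natural substitute) and that `transformers`, `torch`, and `sentencepiece` are installed. The helper name `translate_en_to_ar` and the `max_new_tokens` value are illustrative, not taken from the card.

```python
# Hypothetical usage sketch; not taken from the model card.
MODEL_ID = (
    "abdoutony207/m2m100_418M-evaluated-en-to-ar-2000instancesUNMULTI-"
    "leaningRate2e-05-batchSize8-regu2"
)


def translate_en_to_ar(text, max_new_tokens=128):
    """Translate one English string to Arabic with the fine-tuned checkpoint."""
    # Imported lazily so the module can be inspected without the heavy dependencies.
    from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

    # Assumption: tokenizer files are present in the same repository.
    tokenizer = M2M100Tokenizer.from_pretrained(MODEL_ID)
    model = M2M100ForConditionalGeneration.from_pretrained(MODEL_ID)

    tokenizer.src_lang = "en"  # source language code
    encoded = tokenizer(text, return_tensors="pt")

    # M2M100 selects the target language through the forced first decoder token.
    generated = model.generate(
        **encoded,
        forced_bos_token_id=tokenizer.get_lang_id("ar"),
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]


if __name__ == "__main__":
    print(translate_en_to_ar("The committee adopted the resolution without a vote."))
```

Since the training data is un_multi, formal, UN-style sentences are probably the closest match to what the model saw during fine-tuning.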

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 11
mixed_precision_training: Native AMP
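
As a rough sketch, the hyperparameters listed above map onto `Seq2SeqTrainingArguments` roughly as follows. The mapping assumes a single training device (so the card's batch sizes equal the per-device values) and that "Native AMP" corresponds to `fp16=True`. The `output_dir` value and `predict_with_generate=True` are assumptions added for illustration; the latter is plausible because the card reports generation metrics (Bleu, Gen Len), but the card does not state it. Any regularization implied by "regu2" in the model name is not documented and is left out.

```python
# Hyperparameters copied from the card, keyed by their TrainingArguments names.
HYPERPARAMETERS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 8,  # card: train_batch_size (single device assumed)
    "per_device_eval_batch_size": 8,   # card: eval_batch_size
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 11,
    "fp16": True,                      # card: mixed_precision_training: Native AMP
}


def build_training_args(output_dir="m2m100-en-ar-sketch"):
    """Build training arguments; output_dir is a hypothetical placeholder."""
    # Imported lazily so the dictionary above can be reused without transformers.
    from transformers import Seq2SeqTrainingArguments

    # Note: fp16=True generally requires a CUDA device.
    return Seq2SeqTrainingArguments(
        output_dir=output_dir,
        predict_with_generate=True,  # assumption, not listed in the card
        **HYPERPARAMETERS,
    )
```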

Training results

Training Loss	Epoch	Step	Validation Loss	Bleu	Meteor	Gen Len
5.1584	0.5	100	3.2518	30.3723	0.3633	41.5
2.1351	1.0	200	0.9929	32.9915	0.3833	41.8225
0.568	1.5	300	0.4312	33.705	0.3896	42.6225
0.3749	2.0	400	0.3697	36.9316	0.4084	40.57
0.2376	2.5	500	0.3587	37.6782	0.4124	41.99
0.2435	3.0	600	0.3529	37.9931	0.4128	42.02
0.1706	3.5	700	0.3531	39.9972	0.4252	41.8025
0.165	4.0	800	0.3514	39.3155	0.42	41.0275
0.1273	4.5	900	0.3606	40.0765	0.4234	41.6175
0.1307	5.0	1000	0.3550	40.4468	0.428	41.72
0.0926	5.5	1100	0.3603	40.5454	0.4307	41.765
0.1096	6.0	1200	0.3613	40.5691	0.4298	42.31
0.0826	6.5	1300	0.3642	40.8245	0.4272	41.8075
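
The table can be read programmatically to see which checkpoint is best under each metric and to estimate the training set size. The sketch below copies the epoch, step, validation loss, Bleu, and Meteor columns from the table; the size estimate assumes a single device and no gradient accumulation, neither of which the card contradicts but neither of which it states outright.

```python
# (epoch, step, validation_loss, bleu, meteor) copied from the results table.
ROWS = [
    (0.5, 100, 3.2518, 30.3723, 0.3633),
    (1.0, 200, 0.9929, 32.9915, 0.3833),
    (1.5, 300, 0.4312, 33.7050, 0.3896),
    (2.0, 400, 0.3697, 36.9316, 0.4084),
    (2.5, 500, 0.3587, 37.6782, 0.4124),
    (3.0, 600, 0.3529, 37.9931, 0.4128),
    (3.5, 700, 0.3531, 39.9972, 0.4252),
    (4.0, 800, 0.3514, 39.3155, 0.4200),
    (4.5, 900, 0.3606, 40.0765, 0.4234),
    (5.0, 1000, 0.3550, 40.4468, 0.4280),
    (5.5, 1100, 0.3603, 40.5454, 0.4307),
    (6.0, 1200, 0.3613, 40.5691, 0.4298),
    (6.5, 1300, 0.3642, 40.8245, 0.4272),
]
TRAIN_BATCH_SIZE = 8  # from the hyperparameter list


def steps_per_epoch(rows):
    """Return the optimizer steps per epoch implied by the epoch/step columns."""
    ratios = {round(step / epoch) for epoch, step, *_ in rows}
    if len(ratios) != 1:
        raise ValueError(f"inconsistent steps per epoch: {sorted(ratios)}")
    return ratios.pop()


best_loss = min(ROWS, key=lambda r: r[2])    # lowest validation loss
best_bleu = max(ROWS, key=lambda r: r[3])    # highest Bleu
best_meteor = max(ROWS, key=lambda r: r[4])  # highest Meteor

spe = steps_per_epoch(ROWS)
# Upper bound: the last batch of an epoch may be partially filled.
max_train_examples = spe * TRAIN_BATCH_SIZE

print(f"lowest validation loss: {best_loss[2]} at step {best_loss[1]}")
print(f"highest Bleu: {best_bleu[3]} at step {best_bleu[1]}")
print(f"highest Meteor: {best_meteor[4]} at step {best_meteor[1]}")
print(f"steps per epoch: {spe}, at most {max_train_examples} training examples")
```

Validation loss is lowest at step 800, while Bleu keeps rising through step 1300 and Meteor peaks at step 1100, so the choice of "best" checkpoint depends on which metric matters for the application. The log also ends at epoch 6.5 although num_epochs is 11; the card does not say why.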

Framework versions

Transformers 4.20.1
Pytorch 1.11.0
Datasets 2.1.0
Tokenizers 0.12.1

Evaluation results

Bleu on un_multi (self-reported): 40.825
