gsvann/mt5-small-finetuned-amazon-en-de

Text2Text Generation

Transformers

PyTorch

mt5

Generated from Trainer

Inference Endpoints


gsvann committed on Oct 29, 2023

Commit 70aa57d

1 Parent(s): b4815db

gsvann/mt5-small-finetuned-amazon-toys-en-de

Files changed (3)

README.md +14 -14
pytorch_model.bin +1 -1
training_args.bin +1 -1

README.md CHANGED

@@ -17,11 +17,11 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.6132
-- Rouge1: 18.5886
-- Rouge2: 10.4761
-- Rougel: 17.9705
-- Rougelsum: 18.2467
+- Loss: 2.5965
+- Rouge1: 19.1764
+- Rouge2: 10.6855
+- Rougel: 18.7602
+- Rougelsum: 18.8956
 ## Model description
@@ -40,9 +40,9 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 5.6e-05
-- train_batch_size: 16
-- eval_batch_size: 16
+- learning_rate: 5.5e-05
+- train_batch_size: 8
+- eval_batch_size: 8
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
@@ -52,12 +52,12 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum |
 |:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|
-| 3.0822        | 1.0   | 651  | 2.6232          | 17.8834 | 11.2967 | 17.4489 | 17.73     |
-| 2.9301        | 2.0   | 1302 | 2.6265          | 18.3784 | 11.1587 | 18.0152 | 18.1896   |
-| 2.8552        | 3.0   | 1953 | 2.6223          | 18.7889 | 11.1004 | 18.2222 | 18.4583   |
-| 2.8253        | 4.0   | 2604 | 2.6080          | 18.2031 | 10.2714 | 17.748  | 17.8395   |
-| 2.8044        | 5.0   | 3255 | 2.6154          | 18.5675 | 10.5794 | 17.967  | 18.2698   |
-| 2.7928        | 6.0   | 3906 | 2.6132          | 18.5886 | 10.4761 | 17.9705 | 18.2467   |
+| 2.9041        | 1.0   | 1301 | 2.6000          | 17.3749 | 10.0728 | 16.9903 | 17.0336   |
+| 2.744         | 2.0   | 2602 | 2.5874          | 17.7266 | 9.2481  | 17.2785 | 17.3827   |
+| 2.6641        | 3.0   | 3903 | 2.6001          | 19.0052 | 10.6312 | 18.7604 | 18.754    |
+| 2.6189        | 4.0   | 5204 | 2.6012          | 18.834  | 10.1299 | 18.4209 | 18.5351   |
+| 2.6029        | 5.0   | 6505 | 2.5944          | 19.3375 | 10.537  | 18.8614 | 19.0826   |
+| 2.6086        | 6.0   | 7806 | 2.5965          | 19.1764 | 10.6855 | 18.7602 | 18.8956   |
 ### Framework versions
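
The Step column roughly doubles between the two runs (651 to 1301 steps at epoch 1.0) because train_batch_size was halved from 16 to 8. As a rough sanity check, the sketch below works out which training-set sizes are consistent with both step counts. It assumes a single device, no gradient accumulation, a kept final partial batch, and the same training set in both runs; none of these are stated in the commit, and the helper name `sizes_consistent_with` is made up for illustration.

```python
def sizes_consistent_with(steps_per_epoch: int, batch_size: int) -> range:
    """Dataset sizes N for which ceil(N / batch_size) == steps_per_epoch.

    Assumes one optimizer step per batch and that the last,
    possibly smaller, batch is not dropped.
    """
    lowest = (steps_per_epoch - 1) * batch_size + 1
    highest = steps_per_epoch * batch_size
    return range(lowest, highest + 1)


# Step counts at epoch 1.0, taken from the two versions of the results table.
old_run = sizes_consistent_with(steps_per_epoch=651, batch_size=16)
new_run = sizes_consistent_with(steps_per_epoch=1301, batch_size=8)

# Sizes that explain both runs at once.
common = sorted(set(old_run) & set(new_run))
print(common[0], common[-1])  # 10401 10408
```

Under those assumptions the two tables agree with each other: both are consistent with a training set of about 10,400 examples, so the change in step counts is explained by the batch size alone.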

pytorch_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:dc66756f726ca2bb65ee3f851abac731eb68f8e31657faf0c0ce42470aa45354
+oid sha256:b50776badb8287353e3cd15c0fbdd307c70b97eba1df4f71840d2dd841f49a04
 size 1200773058

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:0c3bf9535f955a89e82e4cc1478fe509fc4302a1aa4ff898439dda5668decf5e
+oid sha256:49de751304fd243924c782ec9c3d405a206b4d21a73e7a1c31f29b5e28d60210
 size 4664
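
The two .bin entries are Git LFS pointer files: each records a SHA-256 digest (oid) and a byte size for the real binary, which is why the size is unchanged while the oid differs. A minimal sketch of checking a downloaded file against such a pointer follows. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers, not part of any Hugging Face or Git LFS API, and the local file path in the usage note is an assumption.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form '<algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict) -> bool:
    """Return True if the file has the size and SHA-256 digest in the pointer."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in 1 MiB chunks so large checkpoints are not loaded into memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# New pointer for training_args.bin, copied from the diff above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:49de751304fd243924c782ec9c3d405a206b4d21a73e7a1c31f29b5e28d60210
size 4664
"""
pointer = parse_lfs_pointer(POINTER_TEXT)
```

With a local copy of the file, `matches_pointer("training_args.bin", pointer)` would report whether it corresponds to this commit rather than its parent.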