floflodebilbao committed
Commit 0a4f9c8 · verified · 1 parent: e16b534

End of training
README.md CHANGED
@@ -22,21 +22,21 @@ should probably proofread and complete it, then remove this comment. -->
 
  This model is a fine-tuned version of [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Loss: 2.3630
- - Rouge1: 0.3861
- - Rouge2: 0.2049
- - Rougel: 0.3339
- - Rougelsum: 0.3363
- - Gen Len: 29.74
- - Bleu: 0.1229
- - Precisions: 0.1726
- - Brevity Penalty: 0.8619
- - Length Ratio: 0.8706
- - Translation Length: 1063.0
+ - Loss: 0.8610
+ - Rouge1: 0.4804
+ - Rouge2: 0.2605
+ - Rougel: 0.4126
+ - Rougelsum: 0.4141
+ - Gen Len: 28.18
+ - Bleu: 0.1536
+ - Precisions: 0.2428
+ - Brevity Penalty: 0.772
+ - Length Ratio: 0.7944
+ - Translation Length: 970.0
  - Reference Length: 1221.0
- - Precision: 0.8872
- - Recall: 0.8836
- - F1: 0.8853
+ - Precision: 0.914
+ - Recall: 0.9024
+ - F1: 0.9081
  - Hashcode: roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
 
  ## Model description
@@ -56,7 +56,7 @@ More information needed
  ### Training hyperparameters
 
  The following hyperparameters were used during training:
- - learning_rate: 0.001
+ - learning_rate: 0.002
  - train_batch_size: 1
  - eval_batch_size: 1
  - seed: 42
@@ -70,16 +70,16 @@ The following hyperparameters were used during training:
 
  | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Precision | Recall | F1 | Hashcode |
  |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|:------:|:----------:|:---------------:|:------------:|:------------------:|:----------------:|:---------:|:------:|:------:|:---------------------------------------------------------:|
- | 23.0724 | 1.0 | 7 | 18.7271 | 0.2503 | 0.076 | 0.1943 | 0.1945 | 30.9 | 0.0343 | 0.0682 | 0.934 | 0.9361 | 1143.0 | 1221.0 | 0.8565 | 0.8632 | 0.8598 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | 13.246 | 2.0 | 14 | 4.8705 | 0.0198 | 0.0029 | 0.0184 | 0.0181 | 31.0 | 0.0 | 0.0104 | 0.2822 | 0.4414 | 539.0 | 1221.0 | 0.7167 | 0.8073 | 0.7589 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | 4.9849 | 3.0 | 21 | 3.7922 | 0.2687 | 0.1089 | 0.2193 | 0.2207 | 30.8 | 0.062 | 0.0949 | 0.8785 | 0.8853 | 1081.0 | 1221.0 | 0.8534 | 0.8615 | 0.8574 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | 3.8357 | 4.0 | 28 | 3.2411 | 0.3386 | 0.1481 | 0.2722 | 0.2741 | 30.82 | 0.092 | 0.1339 | 0.861 | 0.8698 | 1062.0 | 1221.0 | 0.8629 | 0.8688 | 0.8658 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | 3.3584 | 5.0 | 35 | 3.0254 | 0.2811 | 0.1078 | 0.2243 | 0.2241 | 30.18 | 0.0604 | 0.1 | 0.8582 | 0.8673 | 1059.0 | 1221.0 | 0.8626 | 0.8664 | 0.8644 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | 3.1037 | 6.0 | 42 | 2.9042 | 0.2603 | 0.0972 | 0.2127 | 0.214 | 30.46 | 0.0578 | 0.0951 | 0.8563 | 0.8657 | 1057.0 | 1221.0 | 0.8596 | 0.8645 | 0.8619 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | 2.8939 | 7.0 | 49 | 2.7615 | 0.2886 | 0.1113 | 0.2305 | 0.2323 | 30.46 | 0.066 | 0.1068 | 0.8591 | 0.8681 | 1060.0 | 1221.0 | 0.8644 | 0.869 | 0.8666 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | 2.7372 | 8.0 | 56 | 2.5912 | 0.3566 | 0.1691 | 0.2984 | 0.2992 | 30.22 | 0.0939 | 0.139 | 0.8976 | 0.9025 | 1102.0 | 1221.0 | 0.8775 | 0.8789 | 0.8781 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | 2.5953 | 9.0 | 63 | 2.4377 | 0.3799 | 0.1999 | 0.3263 | 0.3303 | 30.04 | 0.117 | 0.1637 | 0.8785 | 0.8853 | 1081.0 | 1221.0 | 0.8834 | 0.8819 | 0.8826 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | 2.5022 | 10.0 | 70 | 2.3630 | 0.3861 | 0.2049 | 0.3339 | 0.3363 | 29.74 | 0.1229 | 0.1726 | 0.8619 | 0.8706 | 1063.0 | 1221.0 | 0.8872 | 0.8836 | 0.8853 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 20.3106 | 1.0 | 7 | 4.7803 | 0.0666 | 0.0125 | 0.0566 | 0.057 | 31.0 | 0.0062 | 0.0248 | 0.5139 | 0.6003 | 733.0 | 1221.0 | 0.7656 | 0.817 | 0.79 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 6.3299 | 2.0 | 14 | 4.0381 | 0.3252 | 0.1232 | 0.2298 | 0.2295 | 30.3 | 0.0656 | 0.1136 | 0.8066 | 0.8231 | 1005.0 | 1221.0 | 0.8607 | 0.8665 | 0.8635 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 3.881 | 3.0 | 21 | 3.2332 | 0.3357 | 0.141 | 0.2638 | 0.2643 | 28.78 | 0.0835 | 0.1391 | 0.8086 | 0.8247 | 1007.0 | 1221.0 | 0.8722 | 0.8719 | 0.872 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 3.138 | 4.0 | 28 | 2.8019 | 0.3883 | 0.1806 | 0.3285 | 0.3283 | 29.14 | 0.0964 | 0.1631 | 0.7978 | 0.8157 | 996.0 | 1221.0 | 0.8856 | 0.8835 | 0.8845 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 2.6873 | 5.0 | 35 | 2.2161 | 0.452 | 0.2271 | 0.3854 | 0.3859 | 27.96 | 0.1276 | 0.2114 | 0.781 | 0.8018 | 979.0 | 1221.0 | 0.9067 | 0.8967 | 0.9016 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 2.0184 | 6.0 | 42 | 1.3080 | 0.463 | 0.2487 | 0.4009 | 0.4028 | 27.62 | 0.1481 | 0.239 | 0.764 | 0.7879 | 962.0 | 1221.0 | 0.9111 | 0.8991 | 0.9049 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 1.3413 | 7.0 | 49 | 0.9692 | 0.4678 | 0.2529 | 0.401 | 0.4025 | 28.06 | 0.1473 | 0.2354 | 0.773 | 0.7952 | 971.0 | 1221.0 | 0.9109 | 0.8996 | 0.9051 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 1.0888 | 8.0 | 56 | 0.8996 | 0.4784 | 0.259 | 0.4102 | 0.4118 | 28.2 | 0.1468 | 0.2363 | 0.775 | 0.7969 | 973.0 | 1221.0 | 0.9126 | 0.9013 | 0.9068 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 0.9722 | 9.0 | 63 | 0.8690 | 0.4824 | 0.262 | 0.4112 | 0.4129 | 28.22 | 0.1523 | 0.2416 | 0.776 | 0.7977 | 974.0 | 1221.0 | 0.9131 | 0.9019 | 0.9074 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 0.948 | 10.0 | 70 | 0.8610 | 0.4804 | 0.2605 | 0.4126 | 0.4141 | 28.18 | 0.1536 | 0.2428 | 0.772 | 0.7944 | 970.0 | 1221.0 | 0.914 | 0.9024 | 0.9081 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
 
 
  ### Framework versions
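The BLEU brevity penalty reported in the new model card follows directly from the translation and reference lengths in the final evaluation row. A minimal sketch of that arithmetic (pure Python; the lengths 970 and 1221 are taken from the diff above):

```python
import math

def brevity_penalty(candidate_len: int, reference_len: int) -> float:
    # BLEU brevity penalty: 1.0 when the candidate is at least as long
    # as the reference, otherwise exp(1 - r/c).
    if candidate_len >= reference_len:
        return 1.0
    return math.exp(1 - reference_len / candidate_len)

# Final-epoch evaluation row: translation length 970, reference length 1221.
bp = brevity_penalty(970, 1221)
length_ratio = 970 / 1221
print(round(bp, 3), round(length_ratio, 4))  # 0.772 0.7944, as reported
```

This also shows why the new checkpoint's BLEU precision can rise (0.1726 → 0.2428) while BLEU itself moves less: shorter outputs (length ratio 0.87 → 0.79) are discounted by a harsher brevity penalty.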
adapter_config.json CHANGED
@@ -24,10 +24,10 @@
  "rank_pattern": {},
  "revision": null,
  "target_modules": [
+ "q",
  "k",
- "v",
  "o",
- "q"
+ "v"
  ],
  "task_type": "SEQ_2_SEQ_LM",
  "trainable_token_indices": null,
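The `adapter_config.json` change only reorders `target_modules`; PEFT-style selection matches modules by name, so the set {q, k, o, v}, not its order, determines which projections receive LoRA adapters. A rough illustration of suffix-based matching, with hypothetical LongT5-style module names (this is not PEFT's actual matching code):

```python
# Illustrative only: a target_modules entry like "q" selects every module
# whose dotted name ends in that component. Module names are hypothetical.
TARGET_MODULES = ["q", "k", "o", "v"]

def matches(module_name: str, targets=TARGET_MODULES) -> bool:
    # Compare the last component of the dotted module path to each target.
    return module_name.rsplit(".", 1)[-1] in targets

names = [
    "encoder.block.0.layer.0.TransientGlobalSelfAttention.q",
    "encoder.block.0.layer.0.TransientGlobalSelfAttention.k",
    "encoder.block.0.layer.1.DenseReluDense.wi",   # feed-forward: not targeted
    "decoder.block.0.layer.0.SelfAttention.v",
    "decoder.block.0.layer.0.SelfAttention.o",
]
selected = [n for n in names if matches(n)]
print(len(selected))  # 4 attention projections; the feed-forward layer is skipped
```

Because matching is by membership, the reordering in this diff ("k, v, o, q" → "q, k, o, v") does not change which modules are adapted, which is consistent with the unchanged 7,119,264-byte adapter size below.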
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:db781698ae527c0bb35c4600d8510586f92f697509247ae20cd4ff78e1d903a4
+ oid sha256:38ce111c0f210f12c0ccb66428ae651415769d497d0b50bc5aeb23ab93931e64
  size 7119264
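The binary files in this commit are stored as Git LFS pointer stubs: three `key value` lines (`version`, `oid`, `size`) stand in for the real artifact, so the diff shows only a changed hash. A small stdlib sketch that parses such a pointer (the pointer text is copied from the diff above):

```python
def parse_lfs_pointer(text: str) -> dict:
    # A Git LFS pointer file is a short set of "key value" lines;
    # oid carries the content hash as "sha256:<hex>".
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:38ce111c0f210f12c0ccb66428ae651415769d497d0b50bc5aeb23ab93931e64
size 7119264
"""
info = parse_lfs_pointer(pointer)
print(info["size"])  # "7119264" bytes: the adapter's true size; the repo stores only this stub
```

The unchanged `size` alongside a changed `oid` is the expected signature of retraining: same tensor shapes, new weights.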
runs/Jul29_12-28-33_tardis/events.out.tfevents.1753784915.tardis.18953.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:853c4a02140b9e1ed3362e3354acaa2a26ca9f37b8fa59f999892a7cfe1a9cc5
+ size 19102
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:9948e0e261830d558feb0bf6812d01dd8ea0ebcfa09767a67f99d5fa988af5a7
+ oid sha256:b08c96b95d560acd5fd159cc0068dd4ae4af01bf3e57c9f69a8618fc6e6433dc
  size 5905