floflodebilbao committed
Commit 07692f7 · verified · 1 Parent(s): 785fcc2

End of training

README.md CHANGED
@@ -22,21 +22,21 @@ should probably proofread and complete it, then remove this comment. -->

This model is a fine-tuned version of [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base) on an unknown dataset.
It achieves the following results on the evaluation set:
- - Loss: 2.6289
- - Rouge1: 0.2499
- - Rouge2: 0.0711
- - Rougel: 0.1874
- - Rougelsum: 0.1847
- - Gen Len: 30.0
- - Bleu: 0.0421
- - Precisions: 0.0818
- - Brevity Penalty: 0.8856
- - Length Ratio: 0.8916
- - Translation Length: 1045.0
+ - Loss: 1.1051
+ - Rouge1: 0.3789
+ - Rouge2: 0.1817
+ - Rougel: 0.3238
+ - Rougelsum: 0.3256
+ - Gen Len: 27.8
+ - Bleu: 0.0865
+ - Precisions: 0.1534
+ - Brevity Penalty: 0.8221
+ - Length Ratio: 0.8362
+ - Translation Length: 980.0
- Reference Length: 1172.0
- - Precision: 0.8573
- - Recall: 0.8636
- - F1: 0.8603
+ - Precision: 0.8937
+ - Recall: 0.8862
+ - F1: 0.8898
- Hashcode: roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)

## Model description
@@ -56,7 +56,7 @@ More information needed
### Training hyperparameters

The following hyperparameters were used during training:
- - learning_rate: 0.001
+ - learning_rate: 0.002
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
@@ -70,16 +70,16 @@ The following hyperparameters were used during training:

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Precision | Recall | F1 | Hashcode |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|:------:|:----------:|:---------------:|:------------:|:------------------:|:----------------:|:---------:|:------:|:------:|:---------------------------------------------------------:|
- | 25.0427 | 1.0 | 7 | 20.2839 | 0.206 | 0.0438 | 0.1455 | 0.1459 | 31.0 | 0.0209 | 0.0515 | 0.9688 | 0.9693 | 1136.0 | 1172.0 | 0.8414 | 0.8536 | 0.8474 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | 14.8433 | 2.0 | 14 | 5.7742 | 0.0096 | 0.0 | 0.0091 | 0.0088 | 31.0 | 0.0 | 0.0038 | 0.2151 | 0.3942 | 462.0 | 1172.0 | 0.6945 | 0.8131 | 0.7489 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | 6.195 | 3.0 | 21 | 4.7370 | 0.1708 | 0.0475 | 0.1371 | 0.1366 | 30.98 | 0.03 | 0.0575 | 0.8261 | 0.8396 | 984.0 | 1172.0 | 0.8124 | 0.8408 | 0.826 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | 4.5548 | 4.0 | 28 | 3.8942 | 0.1499 | 0.024 | 0.1088 | 0.1096 | 30.7 | 0.0 | 0.0359 | 0.8865 | 0.8925 | 1046.0 | 1172.0 | 0.8277 | 0.8465 | 0.8369 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | 3.8704 | 5.0 | 35 | 3.3844 | 0.2031 | 0.06 | 0.1453 | 0.145 | 30.94 | 0.0377 | 0.0644 | 0.9035 | 0.9078 | 1064.0 | 1172.0 | 0.8362 | 0.8539 | 0.8447 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | 3.4394 | 6.0 | 42 | 3.2171 | 0.2063 | 0.0503 | 0.1522 | 0.1524 | 30.6 | 0.0251 | 0.061 | 0.8789 | 0.8857 | 1038.0 | 1172.0 | 0.8468 | 0.8567 | 0.8516 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | 3.2279 | 7.0 | 49 | 3.0629 | 0.2333 | 0.0654 | 0.1741 | 0.1724 | 30.72 | 0.0342 | 0.0719 | 0.9007 | 0.9053 | 1061.0 | 1172.0 | 0.8518 | 0.8612 | 0.8564 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | 3.0176 | 8.0 | 56 | 2.8782 | 0.2474 | 0.0709 | 0.1874 | 0.1856 | 30.04 | 0.0396 | 0.0784 | 0.8837 | 0.8899 | 1043.0 | 1172.0 | 0.857 | 0.8633 | 0.86 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | 2.867 | 9.0 | 63 | 2.7092 | 0.2416 | 0.0673 | 0.1809 | 0.1782 | 30.12 | 0.0401 | 0.078 | 0.8865 | 0.8925 | 1046.0 | 1172.0 | 0.8548 | 0.8621 | 0.8583 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | 2.7711 | 10.0 | 70 | 2.6289 | 0.2499 | 0.0711 | 0.1874 | 0.1847 | 30.0 | 0.0421 | 0.0818 | 0.8856 | 0.8916 | 1045.0 | 1172.0 | 0.8573 | 0.8636 | 0.8603 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 22.0917 | 1.0 | 7 | 5.3855 | 0.0468 | 0.0056 | 0.0416 | 0.0415 | 31.0 | 0.0 | 0.016 | 0.5803 | 0.6476 | 759.0 | 1172.0 | 0.7506 | 0.8197 | 0.7828 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 6.5733 | 2.0 | 14 | 4.6730 | 0.1909 | 0.0287 | 0.1473 | 0.1475 | 30.88 | 0.0179 | 0.0488 | 0.8856 | 0.8916 | 1045.0 | 1172.0 | 0.8418 | 0.8462 | 0.844 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 4.2163 | 3.0 | 21 | 3.6942 | 0.2295 | 0.0424 | 0.1634 | 0.1642 | 29.08 | 0.0264 | 0.0695 | 0.8469 | 0.8575 | 1005.0 | 1172.0 | 0.8546 | 0.8582 | 0.8563 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 3.5683 | 4.0 | 28 | 3.1688 | 0.2805 | 0.0846 | 0.2121 | 0.2134 | 28.98 | 0.0383 | 0.0906 | 0.8469 | 0.8575 | 1005.0 | 1172.0 | 0.8681 | 0.8666 | 0.8672 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 3.2672 | 5.0 | 35 | 2.8633 | 0.325 | 0.1351 | 0.2652 | 0.2669 | 28.4 | 0.0652 | 0.1242 | 0.8341 | 0.8464 | 992.0 | 1172.0 | 0.8823 | 0.8776 | 0.8799 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 3.0146 | 6.0 | 42 | 2.4207 | 0.3326 | 0.1431 | 0.2839 | 0.2856 | 28.08 | 0.0788 | 0.1344 | 0.839 | 0.8507 | 997.0 | 1172.0 | 0.8839 | 0.879 | 0.8813 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 2.4539 | 7.0 | 49 | 1.7916 | 0.3471 | 0.1565 | 0.2932 | 0.2931 | 28.26 | 0.0882 | 0.1431 | 0.839 | 0.8507 | 997.0 | 1172.0 | 0.8863 | 0.882 | 0.884 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 1.965 | 8.0 | 56 | 1.3215 | 0.3607 | 0.1749 | 0.3113 | 0.3125 | 28.18 | 0.0925 | 0.1498 | 0.8331 | 0.8456 | 991.0 | 1172.0 | 0.889 | 0.8839 | 0.8863 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 1.7658 | 9.0 | 63 | 1.1630 | 0.3772 | 0.1782 | 0.3211 | 0.3228 | 27.8 | 0.0838 | 0.1518 | 0.813 | 0.8285 | 971.0 | 1172.0 | 0.8937 | 0.8859 | 0.8897 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 1.5019 | 10.0 | 70 | 1.1051 | 0.3789 | 0.1817 | 0.3238 | 0.3256 | 27.8 | 0.0865 | 0.1534 | 0.8221 | 0.8362 | 980.0 | 1172.0 | 0.8937 | 0.8862 | 0.8898 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |


### Framework versions
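The README reports ROUGE, BLEU (with precisions, brevity penalty, and length statistics), and BERTScore-style Precision/Recall/F1 side by side. As a hedged sketch, not taken from this repo's training script, the same families of numbers can be produced with the Hugging Face `evaluate` library; the prediction and reference strings below are placeholders:

```python
import evaluate

# Placeholder data; the real evaluation set is not part of this commit.
preds = ["a model generated summary"]
refs = ["a human written reference summary"]

rouge = evaluate.load("rouge")          # -> rouge1, rouge2, rougeL, rougeLsum
bleu = evaluate.load("bleu")            # -> bleu, precisions, brevity_penalty,
                                        #    length_ratio, translation_length,
                                        #    reference_length
bertscore = evaluate.load("bertscore")  # -> precision, recall, f1, hashcode

print(rouge.compute(predictions=preds, references=refs))
print(bleu.compute(predictions=preds, references=[[r] for r in refs]))
# lang="en" selects roberta-large by default, which matches the reported
# hashcode roberta-large_L17_no-idf_version=0.3.12(...)
print(bertscore.compute(predictions=preds, references=refs, lang="en"))
```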
adapter_config.json CHANGED
@@ -25,9 +25,9 @@
"revision": null,
"target_modules": [
"q",
- "k",
+ "o",
"v",
- "o"
+ "k"
],
"task_type": "SEQ_2_SEQ_LM",
"trainable_token_indices": null,
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
version https://git-lfs.github.com/spec/v1
- oid sha256:6b969bf2e3a4dec98ae154f5c19f18662bf2dd9f0ad6ef9ef4ca0cb104cdaf22
+ oid sha256:1e07eb41c018e5a6cbf405a58ee6a44f58dc54da56044f2e12c2b0445a1991cb
size 7119264
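What the diff shows for `adapter_model.safetensors` is a Git LFS pointer, not the weights themselves: `oid sha256:...` names the stored blob and `size` gives its byte length. A small standard-library sketch (the helper name is hypothetical) that checks a downloaded blob against such a pointer:

```python
import hashlib

def verify_lfs_pointer(pointer_path: str, blob_path: str) -> bool:
    # Parse the three pointer lines: "version ...", "oid sha256:<hex>", "size <bytes>"
    fields = {}
    with open(pointer_path) as f:
        for line in f:
            key, _, value = line.strip().partition(" ")
            fields[key] = value
    expected_oid = fields["oid"].split(":", 1)[1]
    expected_size = int(fields["size"])

    # Hash the blob in chunks so large files don't need to fit in memory.
    digest = hashlib.sha256()
    size = 0
    with open(blob_path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
            size += len(chunk)
    return digest.hexdigest() == expected_oid and size == expected_size
```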
runs/Jul29_12-44-51_tardis/events.out.tfevents.1753785892.tardis.19164.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:84e90b47bd9ece533cc352435c70fa025133c1a4285e6938482fadb8459f6206
+ size 19099
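The new `runs/` entry is a TensorBoard event file for this training run (stored, again, as an LFS pointer). A hedged sketch for pulling scalar curves out of such a file with the `tensorboard` package; the `train/loss` tag is an assumption, so check `Tags()` first:

```python
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

acc = EventAccumulator("runs/Jul29_12-44-51_tardis")  # directory containing the tfevents file
acc.Reload()
print(acc.Tags()["scalars"])             # actual tag names vary by Trainer version
for event in acc.Scalars("train/loss"):  # assumed tag name
    print(event.step, event.value)
```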
tokenizer.json CHANGED
@@ -1,21 +1,7 @@
{
"version": "1.0",
- "truncation": {
- "direction": "Right",
- "max_length": 64,
- "strategy": "LongestFirst",
- "stride": 0
- },
- "padding": {
- "strategy": {
- "Fixed": 64
- },
- "direction": "Right",
- "pad_to_multiple_of": null,
- "pad_id": 0,
- "pad_type_id": 0,
- "pad_token": "<pad>"
- },
+ "truncation": null,
+ "padding": null,
"added_tokens": [
{
"id": 0,
training_args.bin CHANGED
@@ -1,3 +1,3 @@
version https://git-lfs.github.com/spec/v1
- oid sha256:6cad30856bcc8c3655eecccb7fd43ad4e3e6468899034aa434ada8723551ee63
+ oid sha256:decce348be836db1bdce1f73e6b404c70ec3171d91e1afd44b9700dcad3c689d
size 5905
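`training_args.bin` is the pickled training-arguments object the `Trainer` saves alongside a run, which is consistent with the hash changing here (the learning rate moved from 0.001 to 0.002) while the 5905-byte size stays the same. A hedged sketch for inspecting it:

```python
import torch

# Unpickling needs transformers importable; recent torch versions also
# require weights_only=False for non-tensor pickles like this one.
args = torch.load("training_args.bin", weights_only=False)
print(args.learning_rate)                # 0.002 after this commit
print(args.per_device_train_batch_size)  # 1
print(args.seed)                         # 42
```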