floflodebilbao committed · Commit e573340 (verified) · 1 Parent(s): 676d2f9

End of training

README.md CHANGED
@@ -22,21 +22,21 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base) on an unknown dataset.
 It achieves the following results on the evaluation set:
- - Loss: 8.0950
- - Rouge1: 0.1575
- - Rouge2: 0.0292
- - Rougel: 0.1237
- - Rougelsum: 0.1245
- - Gen Len: 20.0
- - Bleu: 0.0
- - Precisions: 0.0471
- - Brevity Penalty: 0.541
- - Length Ratio: 0.6195
- - Translation Length: 726.0
+ - Loss: 16.6670
+ - Rouge1: 0.179
+ - Rouge2: 0.0376
+ - Rougel: 0.1311
+ - Rougelsum: 0.1303
+ - Gen Len: 31.0
+ - Bleu: 0.0183
+ - Precisions: 0.0456
+ - Brevity Penalty: 0.952
+ - Length Ratio: 0.9531
+ - Translation Length: 1117.0
 - Reference Length: 1172.0
- - Precision: 0.844
- - Recall: 0.8447
- - F1: 0.8442
+ - Precision: 0.839
+ - Recall: 0.8508
+ - F1: 0.8447
 - Hashcode: roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
 
 ## Model description
@@ -64,24 +64,22 @@ The following hyperparameters were used during training:
 - total_train_batch_size: 16
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
- - num_epochs: 12
+ - num_epochs: 10
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Precision | Recall | F1 | Hashcode |
 |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|:------:|:----------:|:---------------:|:------------:|:------------------:|:----------------:|:---------:|:------:|:------:|:---------------------------------------------------------:|
- | No log | 1.0 | 7 | 27.8202 | 0.1616 | 0.0383 | 0.1312 | 0.1311 | 20.0 | 0.0092 | 0.05 | 0.5732 | 0.6425 | 753.0 | 1172.0 | 0.8481 | 0.8477 | 0.8478 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 2.0 | 14 | 25.1113 | 0.1577 | 0.0373 | 0.1267 | 0.1275 | 20.0 | 0.0089 | 0.0487 | 0.5614 | 0.634 | 743.0 | 1172.0 | 0.8471 | 0.8468 | 0.8469 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 3.0 | 21 | 22.9015 | 0.1672 | 0.0424 | 0.1308 | 0.1308 | 20.0 | 0.0 | 0.0511 | 0.5756 | 0.6442 | 755.0 | 1172.0 | 0.8508 | 0.8483 | 0.8495 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 4.0 | 28 | 21.1727 | 0.1656 | 0.0414 | 0.128 | 0.1285 | 20.0 | 0.0 | 0.0499 | 0.5673 | 0.6382 | 748.0 | 1172.0 | 0.8509 | 0.8479 | 0.8493 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 5.0 | 35 | 19.5717 | 0.1659 | 0.0389 | 0.1267 | 0.1276 | 20.0 | 0.0 | 0.0494 | 0.5697 | 0.6399 | 750.0 | 1172.0 | 0.8505 | 0.8475 | 0.8489 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 6.0 | 42 | 18.0112 | 0.167 | 0.0389 | 0.1267 | 0.1275 | 20.0 | 0.0 | 0.0497 | 0.5697 | 0.6399 | 750.0 | 1172.0 | 0.8506 | 0.8473 | 0.8489 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 7.0 | 49 | 16.3808 | 0.1567 | 0.0318 | 0.1214 | 0.1214 | 20.0 | 0.0 | 0.0447 | 0.5661 | 0.6374 | 747.0 | 1172.0 | 0.8483 | 0.8457 | 0.8469 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 8.0 | 56 | 14.5603 | 0.1563 | 0.0338 | 0.1247 | 0.1251 | 20.0 | 0.0 | 0.0467 | 0.5626 | 0.6348 | 744.0 | 1172.0 | 0.8455 | 0.8441 | 0.8447 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 9.0 | 63 | 12.5254 | 0.1626 | 0.0344 | 0.1303 | 0.1301 | 20.0 | 0.0 | 0.0471 | 0.5661 | 0.6374 | 747.0 | 1172.0 | 0.8479 | 0.8453 | 0.8465 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 10.0 | 70 | 10.4265 | 0.1641 | 0.0326 | 0.1319 | 0.1323 | 20.0 | 0.0 | 0.0445 | 0.5697 | 0.6399 | 750.0 | 1172.0 | 0.8463 | 0.8455 | 0.8458 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 11.0 | 77 | 8.7598 | 0.1677 | 0.0349 | 0.1332 | 0.1336 | 20.0 | 0.0 | 0.0487 | 0.5626 | 0.6348 | 744.0 | 1172.0 | 0.8465 | 0.8461 | 0.8462 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 12.0 | 84 | 8.0950 | 0.1575 | 0.0292 | 0.1237 | 0.1245 | 20.0 | 0.0 | 0.0471 | 0.541 | 0.6195 | 726.0 | 1172.0 | 0.844 | 0.8447 | 0.8442 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 26.8047 | 1.0 | 7 | 28.1961 | 0.1991 | 0.0452 | 0.1451 | 0.1451 | 31.0 | 0.0226 | 0.0516 | 0.9697 | 0.9701 | 1137.0 | 1172.0 | 0.8417 | 0.854 | 0.8477 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 24.4577 | 2.0 | 14 | 25.5517 | 0.1938 | 0.0452 | 0.1398 | 0.14 | 31.0 | 0.0225 | 0.0511 | 0.9617 | 0.9625 | 1128.0 | 1172.0 | 0.8404 | 0.8526 | 0.8464 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 22.7574 | 3.0 | 21 | 23.4583 | 0.1872 | 0.0412 | 0.1373 | 0.1368 | 31.0 | 0.0218 | 0.0488 | 0.9582 | 0.959 | 1124.0 | 1172.0 | 0.8393 | 0.8517 | 0.8454 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 21.0685 | 4.0 | 28 | 21.7844 | 0.1858 | 0.0425 | 0.1358 | 0.1353 | 31.0 | 0.0196 | 0.0485 | 0.9635 | 0.9642 | 1130.0 | 1172.0 | 0.8407 | 0.8525 | 0.8465 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 20.037 | 5.0 | 35 | 20.3738 | 0.1858 | 0.0425 | 0.1358 | 0.1353 | 31.0 | 0.0196 | 0.0485 | 0.9635 | 0.9642 | 1130.0 | 1172.0 | 0.8407 | 0.8525 | 0.8465 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 18.9602 | 6.0 | 42 | 19.1832 | 0.1862 | 0.0397 | 0.1322 | 0.1313 | 31.0 | 0.0194 | 0.0484 | 0.9546 | 0.9556 | 1120.0 | 1172.0 | 0.8396 | 0.8512 | 0.8453 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 18.0035 | 7.0 | 49 | 18.1774 | 0.1853 | 0.0397 | 0.1321 | 0.1313 | 31.0 | 0.0194 | 0.0482 | 0.9555 | 0.9565 | 1121.0 | 1172.0 | 0.8394 | 0.8511 | 0.8451 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 17.2636 | 8.0 | 56 | 17.4050 | 0.1862 | 0.0396 | 0.1328 | 0.132 | 31.0 | 0.0195 | 0.0478 | 0.9493 | 0.9505 | 1114.0 | 1172.0 | 0.8389 | 0.8507 | 0.8447 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 16.5914 | 9.0 | 63 | 16.8775 | 0.179 | 0.0376 | 0.1302 | 0.1293 | 31.0 | 0.019 | 0.046 | 0.9502 | 0.9514 | 1115.0 | 1172.0 | 0.8384 | 0.8504 | 0.8443 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 16.2789 | 10.0 | 70 | 16.6670 | 0.179 | 0.0376 | 0.1311 | 0.1303 | 31.0 | 0.0183 | 0.0456 | 0.952 | 0.9531 | 1117.0 | 1172.0 | 0.839 | 0.8508 | 0.8447 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
 
 
 ### Framework versions
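A quick way to sanity-check the BLEU length statistics in the card above: the brevity penalty follows the standard BLEU definition, BP = exp(1 - ref_len/hyp_len) when the hypothesis is shorter than the reference, else 1. A minimal check in plain Python, using only the numbers from the diff:

```python
import math

def brevity_penalty(hyp_len: int, ref_len: int) -> float:
    """Standard BLEU brevity penalty: 1.0 for hypotheses at least as long
    as the reference, exp(1 - ref/hyp) otherwise."""
    if hyp_len >= ref_len:
        return 1.0
    return math.exp(1.0 - ref_len / hyp_len)

# Old card: Translation Length 726, Reference Length 1172
print(round(726 / 1172, 4))                   # 0.6195 -> Length Ratio
print(round(brevity_penalty(726, 1172), 3))   # 0.541  -> Brevity Penalty

# New card: Translation Length 1117, Reference Length 1172
print(round(1117 / 1172, 4))                  # 0.9531
print(round(brevity_penalty(1117, 1172), 3))  # 0.952
```

Both pairs reproduce the card's Length Ratio and Brevity Penalty exactly, so the longer generations (Gen Len 20.0 → 31.0) account for much of the BLEU movement from 0.0 to 0.0183.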
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:a1985050b7eb85158d5672a943fc833f8f5fe590909d4a3d6f39ec9596252a85
+ oid sha256:407badce2d5d5a795a27cfcab355a116b225f7a2d47632efe0ac8123f4e28e05
 size 1187780840
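The `oid` line is a git-lfs v1 pointer field: the SHA-256 of the real weights blob, alongside its byte size (unchanged at 1187780840, as expected when only parameter values change). A minimal local check, assuming the LFS object has been pulled so `model.safetensors` is the actual file rather than the pointer:

```python
import hashlib
import os

def lfs_oid(path: str) -> str:
    """SHA-256 digest of a file, streamed in 1 MiB chunks; this is the
    value a git-lfs v1 pointer records after `oid sha256:`."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

print(lfs_oid("model.safetensors"))          # expect 407badce2d5d5a79...
print(os.path.getsize("model.safetensors"))  # expect 1187780840
```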
runs/Jul30_11-16-40_tardis/events.out.tfevents.1753867002.tardis.40732.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1a930f29132b5f2bb23cc86884f482dba64e7abff03e8dca5e98d16c8fe3367a
+ size 19084
tokenizer.json CHANGED
@@ -1,21 +1,7 @@
 {
   "version": "1.0",
-   "truncation": {
-     "direction": "Right",
-     "max_length": 64,
-     "strategy": "LongestFirst",
-     "stride": 0
-   },
-   "padding": {
-     "strategy": {
-       "Fixed": 64
-     },
-     "direction": "Right",
-     "pad_to_multiple_of": null,
-     "pad_id": 0,
-     "pad_type_id": 0,
-     "pad_token": "<pad>"
-   },
+   "truncation": null,
+   "padding": null,
   "added_tokens": [
     {
       "id": 0,
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:4b3428c37eef80c576fee815ce8a12fb83c37eef768ed41a981af3086b6283ee
+ oid sha256:990710b9302c0c8b77c901eccfa5bc774039bdf414139adde64d79d3abef9639
 size 5905
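training_args.bin also changes only in hash, not size (5905 bytes): it is the pickled TrainingArguments object the Trainer saves alongside a run, and here it plausibly differs just in the epoch count (`num_epochs: 12` → `10` in the card, i.e. 84 → 70 optimizer steps at 7 steps per epoch). A sketch for inspecting it locally; this assumes `transformers` is installed so the pickle can resolve, and `weights_only=False` is needed on recent PyTorch because it is an arbitrary Python object, so only load files you trust:

```python
import torch

# Unpickle the saved TrainingArguments and spot-check a few fields
args = torch.load("training_args.bin", weights_only=False)
print(args.num_train_epochs)   # expect 10 after this commit
print(args.lr_scheduler_type)  # expect linear, per the card
print(args.per_device_train_batch_size, args.gradient_accumulation_steps)
```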