End of training

- README.md +25 -27
- model.safetensors +1 -1
- runs/Jul30_11-16-40_tardis/events.out.tfevents.1753867002.tardis.40732.0 +3 -0
- tokenizer.json +2 -16
- training_args.bin +1 -1
README.md
CHANGED
@@ -22,21 +22,21 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 8.0950
-- Rouge1: 0.1575
-- Rouge2: 0.0292
-- Rougel: 0.1237
-- Rougelsum: 0.1245
-- Gen Len: 20.0
-- Bleu: 0.0
-- Precisions: 0.0471
-- Brevity Penalty: 0.541
-- Length Ratio: 0.6195
-- Translation Length: 726.0
+- Loss: 16.6670
+- Rouge1: 0.179
+- Rouge2: 0.0376
+- Rougel: 0.1311
+- Rougelsum: 0.1303
+- Gen Len: 31.0
+- Bleu: 0.0183
+- Precisions: 0.0456
+- Brevity Penalty: 0.952
+- Length Ratio: 0.9531
+- Translation Length: 1117.0
 - Reference Length: 1172.0
-- Precision: 0.844
-- Recall: 0.8447
-- F1: 0.8442
+- Precision: 0.839
+- Recall: 0.8508
+- F1: 0.8447
 - Hashcode: roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
 
 ## Model description
@@ -64,24 +64,22 @@ The following hyperparameters were used during training:
 - total_train_batch_size: 16
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
-- num_epochs: 12
+- num_epochs: 10
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Precision | Recall | F1 | Hashcode |
 |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|:------:|:----------:|:---------------:|:------------:|:------------------:|:----------------:|:---------:|:------:|:------:|:---------------------------------------------------------:|
-[… ten earlier rows (epochs 1.0–10.0) of the previous run; values not captured …]
-| No log | 11.0 | 77 | 8.7598 | 0.1677 | 0.0349 | 0.1332 | 0.1336 | 20.0 | 0.0 | 0.0487 | 0.5626 | 0.6348 | 744.0 | 1172.0 | 0.8465 | 0.8461 | 0.8462 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
-| No log | 12.0 | 84 | 8.0950 | 0.1575 | 0.0292 | 0.1237 | 0.1245 | 20.0 | 0.0 | 0.0471 | 0.541 | 0.6195 | 726.0 | 1172.0 | 0.844 | 0.8447 | 0.8442 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 26.8047 | 1.0 | 7 | 28.1961 | 0.1991 | 0.0452 | 0.1451 | 0.1451 | 31.0 | 0.0226 | 0.0516 | 0.9697 | 0.9701 | 1137.0 | 1172.0 | 0.8417 | 0.854 | 0.8477 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 24.4577 | 2.0 | 14 | 25.5517 | 0.1938 | 0.0452 | 0.1398 | 0.14 | 31.0 | 0.0225 | 0.0511 | 0.9617 | 0.9625 | 1128.0 | 1172.0 | 0.8404 | 0.8526 | 0.8464 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 22.7574 | 3.0 | 21 | 23.4583 | 0.1872 | 0.0412 | 0.1373 | 0.1368 | 31.0 | 0.0218 | 0.0488 | 0.9582 | 0.959 | 1124.0 | 1172.0 | 0.8393 | 0.8517 | 0.8454 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 21.0685 | 4.0 | 28 | 21.7844 | 0.1858 | 0.0425 | 0.1358 | 0.1353 | 31.0 | 0.0196 | 0.0485 | 0.9635 | 0.9642 | 1130.0 | 1172.0 | 0.8407 | 0.8525 | 0.8465 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 20.037 | 5.0 | 35 | 20.3738 | 0.1858 | 0.0425 | 0.1358 | 0.1353 | 31.0 | 0.0196 | 0.0485 | 0.9635 | 0.9642 | 1130.0 | 1172.0 | 0.8407 | 0.8525 | 0.8465 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 18.9602 | 6.0 | 42 | 19.1832 | 0.1862 | 0.0397 | 0.1322 | 0.1313 | 31.0 | 0.0194 | 0.0484 | 0.9546 | 0.9556 | 1120.0 | 1172.0 | 0.8396 | 0.8512 | 0.8453 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 18.0035 | 7.0 | 49 | 18.1774 | 0.1853 | 0.0397 | 0.1321 | 0.1313 | 31.0 | 0.0194 | 0.0482 | 0.9555 | 0.9565 | 1121.0 | 1172.0 | 0.8394 | 0.8511 | 0.8451 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 17.2636 | 8.0 | 56 | 17.4050 | 0.1862 | 0.0396 | 0.1328 | 0.132 | 31.0 | 0.0195 | 0.0478 | 0.9493 | 0.9505 | 1114.0 | 1172.0 | 0.8389 | 0.8507 | 0.8447 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 16.5914 | 9.0 | 63 | 16.8775 | 0.179 | 0.0376 | 0.1302 | 0.1293 | 31.0 | 0.019 | 0.046 | 0.9502 | 0.9514 | 1115.0 | 1172.0 | 0.8384 | 0.8504 | 0.8443 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 16.2789 | 10.0 | 70 | 16.6670 | 0.179 | 0.0376 | 0.1311 | 0.1303 | 31.0 | 0.0183 | 0.0456 | 0.952 | 0.9531 | 1117.0 | 1172.0 | 0.839 | 0.8508 | 0.8447 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
 
 
 ### Framework versions
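A note on the metric fields above: Rouge1/Rouge2/Rougel/Rougelsum come from ROUGE; Bleu, Precisions, Brevity Penalty, Length Ratio, Translation Length, and Reference Length come from BLEU; and Precision/Recall/F1 plus the Hashcode come from BERTScore (the hashcode records roberta-large, layer 17, no IDF weighting, bert_score 0.3.12 under transformers 4.53.1). Gen Len is typically the mean generated-sequence length computed by the training script itself. A minimal sketch of computing the same families of scores with the `evaluate` library; the predictions and references below are placeholder strings, not the model's actual evaluation set:

```python
import evaluate

# Placeholder data; the commit does not include the evaluation set.
predictions = ["a short generated summary"]
references = ["the reference summary for the same document"]

rouge = evaluate.load("rouge")          # -> rouge1, rouge2, rougeL, rougeLsum
bleu = evaluate.load("bleu")            # -> bleu, precisions, brevity_penalty,
                                        #    length_ratio, translation_length, ...
bertscore = evaluate.load("bertscore")  # -> precision, recall, f1, hashcode

print(rouge.compute(predictions=predictions, references=references))
print(bleu.compute(predictions=predictions, references=references))

scores = bertscore.compute(
    predictions=predictions,
    references=references,
    model_type="roberta-large",  # bert_score defaults to layer 17 for this model
)
# The card's Precision/Recall/F1 are aggregates of these per-example lists.
print(sum(scores["f1"]) / len(scores["f1"]), scores["hashcode"])
```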
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:407badce2d5d5a795a27cfcab355a116b225f7a2d47632efe0ac8123f4e28e05
 size 1187780840
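The weights file is stored via Git LFS, so the repository itself only holds a small pointer (version, oid, size); the diff above swaps the oid for the newly trained weights. A minimal sketch, assuming the actual model.safetensors blob has been downloaded locally, of verifying it against the pointer; `verify_lfs_object` is a hypothetical helper, not part of any library:

```python
import hashlib

def verify_lfs_object(pointer_text: str, blob_path: str) -> bool:
    """Check a downloaded blob against the oid/size fields of its LFS pointer."""
    fields = dict(line.split(" ", 1) for line in pointer_text.strip().splitlines())
    expected_oid = fields["oid"].removeprefix("sha256:")
    expected_size = int(fields["size"])

    digest, size = hashlib.sha256(), 0
    with open(blob_path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):  # hash in 1 MiB chunks
            digest.update(chunk)
            size += len(chunk)
    return size == expected_size and digest.hexdigest() == expected_oid

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:407badce2d5d5a795a27cfcab355a116b225f7a2d47632efe0ac8123f4e28e05
size 1187780840"""
print(verify_lfs_object(pointer, "model.safetensors"))
```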
runs/Jul30_11-16-40_tardis/events.out.tfevents.1753867002.tardis.40732.0
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1a930f29132b5f2bb23cc86884f482dba64e7abff03e8dca5e98d16c8fe3367a
+size 19084
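The added events file is the TensorBoard log for this run. A minimal sketch, assuming the `tensorboard` package is installed and the repo (with the LFS blob) is checked out locally, of reading the logged scalars back out of it:

```python
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

# Path as committed in this repo; adjust to the local checkout location.
acc = EventAccumulator(
    "runs/Jul30_11-16-40_tardis/events.out.tfevents.1753867002.tardis.40732.0"
)
acc.Reload()  # parse the event file

# Print the last logged value of every scalar series (e.g. eval/loss, train/epoch).
for tag in acc.Tags()["scalars"]:
    last = acc.Scalars(tag)[-1]
    print(tag, last.step, last.value)
```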
tokenizer.json
CHANGED
@@ -1,21 +1,7 @@
 {
   "version": "1.0",
-  "truncation": {
-    "direction": "Right",
-    "max_length": 64,
-    "strategy": "LongestFirst",
-    "stride": 0
-  },
-  "padding": {
-    "strategy": {
-      "Fixed": 64
-    },
-    "direction": "Right",
-    "pad_to_multiple_of": null,
-    "pad_id": 0,
-    "pad_type_id": 0,
-    "pad_token": "<pad>"
-  },
+  "truncation": null,
+  "padding": null,
   "added_tokens": [
     {
       "id": 0,
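The tokenizer.json change drops a persisted truncation block (LongestFirst to max_length 64) and a fixed-to-64 padding block in favor of null for both. In the `tokenizers` library these JSON fields simply mirror the runtime truncation/padding state of the underlying Tokenizer; a hedged sketch (assuming the fast tokenizer, with a hypothetical output directory) of how that state maps onto the file:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("google/long-t5-tglobal-base")
backend = tok.backend_tokenizer  # the underlying tokenizers.Tokenizer

# Enabling these is what produces the "truncation": {...} and
# "padding": {"strategy": {"Fixed": 64}, ...} blocks the old file carried.
backend.enable_truncation(max_length=64, strategy="longest_first")
backend.enable_padding(length=64, pad_id=0, pad_token="<pad>")

# Clearing them serializes as "truncation": null, "padding": null,
# the state after this commit.
backend.no_truncation()
backend.no_padding()
tok.save_pretrained("local-checkpoint")  # hypothetical output directory
```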
training_args.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:990710b9302c0c8b77c901eccfa5bc774039bdf414139adde64d79d3abef9639
 size 5905
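training_args.bin is the pickled TrainingArguments object (presumably Seq2SeqTrainingArguments for this model) that the Trainer saves with each run; the hyperparameters listed in the README, such as the adamw_torch optimizer, the linear scheduler, and the epoch count, can be read back from it. A minimal sketch; since the file is a pickle, `weights_only=False` is required and it should only be loaded from a trusted source:

```python
import torch

# Unpickles a TrainingArguments object; transformers must be importable.
args = torch.load("training_args.bin", weights_only=False)

print(args.optim)              # e.g. OptimizerNames.ADAMW_TORCH
print(args.lr_scheduler_type)  # e.g. linear
print(args.num_train_epochs)   # the card reports 10 after this commit
print(args.per_device_train_batch_size, args.gradient_accumulation_steps)
```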