End of training
- README.md +26 -31
- adapter_config.json +3 -3
- adapter_model.safetensors +1 -1
- runs/Jul29_12-11-53_tardis/events.out.tfevents.1753783914.tardis.18756.0 +3 -0
- training_args.bin +1 -1
README.md
CHANGED
@@ -22,21 +22,21 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.1993
-- Rouge1: 0.3322
-- Rouge2: 0.125
-- Rougel: 0.2565
-- Rougelsum: 0.2574
-- Gen Len: 28.14
-- Bleu: 0.0621
-- Precisions: 0.1225
-- Brevity Penalty: 0.8355
-- Length Ratio: 0.8477
-- Translation Length: 1024.0
+- Loss: 1.2142
+- Rouge1: 0.2852
+- Rouge2: 0.0966
+- Rougel: 0.2231
+- Rougelsum: 0.2243
+- Gen Len: 28.38
+- Bleu: 0.0405
+- Precisions: 0.0919
+- Brevity Penalty: 0.8771
+- Length Ratio: 0.8841
+- Translation Length: 1068.0
 - Reference Length: 1208.0
-- Precision: 0.8829
-- Recall: 0.878
-- F1: 0.8804
+- Precision: 0.8739
+- Recall: 0.8718
+- F1: 0.8728
 - Hashcode: roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
 
 ## Model description
@@ -56,7 +56,7 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate: 0.
+- learning_rate: 0.002
 - train_batch_size: 1
 - eval_batch_size: 1
 - seed: 42
@@ -64,27 +64,22 @@ The following hyperparameters were used during training:
 - total_train_batch_size: 16
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
-- num_epochs: 15
+- num_epochs: 10
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Precision | Recall | F1 | Hashcode |
 |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|:------:|:----------:|:---------------:|:------------:|:------------------:|:----------------:|:---------:|:------:|:------:|:---------------------------------------------------------:|
-| 0.9097 | 11.0 | 77 | 1.1998 | 0.3319 | 0.1244 | 0.2529 | 0.2531 | 28.32 | 0.0605 | 0.1192 | 0.8537 | 0.8634 | 1043.0 | 1208.0 | 0.8818 | 0.8773 | 0.8795 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
-| 0.8865 | 12.0 | 84 | 1.1992 | 0.3194 | 0.1101 | 0.2499 | 0.2505 | 28.52 | 0.0597 | 0.1145 | 0.8584 | 0.8675 | 1048.0 | 1208.0 | 0.8789 | 0.8754 | 0.8771 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
-| 0.8648 | 13.0 | 91 | 1.1958 | 0.3326 | 0.122 | 0.2536 | 0.253 | 28.5 | 0.065 | 0.1213 | 0.865 | 0.8733 | 1055.0 | 1208.0 | 0.8826 | 0.8774 | 0.8799 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
-| 0.85 | 14.0 | 98 | 1.1967 | 0.3343 | 0.1233 | 0.2544 | 0.254 | 28.36 | 0.0627 | 0.1211 | 0.8432 | 0.8543 | 1032.0 | 1208.0 | 0.8814 | 0.8773 | 0.8793 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
-| 0.8362 | 15.0 | 105 | 1.1993 | 0.3322 | 0.125 | 0.2565 | 0.2574 | 28.14 | 0.0621 | 0.1225 | 0.8355 | 0.8477 | 1024.0 | 1208.0 | 0.8829 | 0.878 | 0.8804 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 22.2581 | 1.0 | 7 | 6.5353 | 0.084 | 0.0147 | 0.0714 | 0.0714 | 31.0 | 0.0047 | 0.0247 | 0.5558 | 0.63 | 761.0 | 1208.0 | 0.7817 | 0.8234 | 0.8014 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 6.7792 | 2.0 | 14 | 5.1759 | 0.1642 | 0.0129 | 0.13 | 0.1296 | 30.46 | 0.0 | 0.044 | 0.755 | 0.7806 | 943.0 | 1208.0 | 0.8343 | 0.8356 | 0.8349 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 4.5124 | 3.0 | 21 | 3.7445 | 0.2094 | 0.0517 | 0.1669 | 0.1666 | 28.9 | 0.021 | 0.0606 | 0.8336 | 0.846 | 1022.0 | 1208.0 | 0.8516 | 0.8529 | 0.8521 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 3.5042 | 4.0 | 28 | 3.1497 | 0.2314 | 0.0579 | 0.1774 | 0.1772 | 29.1 | 0.0317 | 0.0716 | 0.8537 | 0.8634 | 1043.0 | 1208.0 | 0.855 | 0.8584 | 0.8567 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 2.8574 | 5.0 | 35 | 2.0950 | 0.2342 | 0.0664 | 0.1895 | 0.1897 | 28.34 | 0.0325 | 0.0756 | 0.8584 | 0.8675 | 1048.0 | 1208.0 | 0.8581 | 0.8605 | 0.8593 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 2.0046 | 6.0 | 42 | 1.4599 | 0.2643 | 0.0843 | 0.2074 | 0.2081 | 28.18 | 0.036 | 0.0853 | 0.8678 | 0.8758 | 1058.0 | 1208.0 | 0.8665 | 0.8652 | 0.8658 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 1.4948 | 7.0 | 49 | 1.2786 | 0.2831 | 0.0921 | 0.2203 | 0.2208 | 28.3 | 0.0413 | 0.0893 | 0.8855 | 0.8916 | 1077.0 | 1208.0 | 0.8703 | 0.8681 | 0.8691 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 1.2731 | 8.0 | 56 | 1.2338 | 0.2802 | 0.096 | 0.2204 | 0.2221 | 28.26 | 0.0406 | 0.0893 | 0.8753 | 0.8825 | 1066.0 | 1208.0 | 0.8729 | 0.8705 | 0.8717 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 1.1977 | 9.0 | 63 | 1.2179 | 0.2834 | 0.0991 | 0.2233 | 0.2244 | 28.42 | 0.0409 | 0.0919 | 0.8725 | 0.88 | 1063.0 | 1208.0 | 0.8745 | 0.8722 | 0.8733 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 1.1717 | 10.0 | 70 | 1.2142 | 0.2852 | 0.0966 | 0.2231 | 0.2243 | 28.38 | 0.0405 | 0.0919 | 0.8771 | 0.8841 | 1068.0 | 1208.0 | 0.8739 | 0.8718 | 0.8728 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
 
 
 ### Framework versions
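The card itself carries no usage code, so here is a minimal sketch of loading the adapter this commit updates, assuming the standard transformers + PEFT workflow; the adapter repo id below is a hypothetical placeholder, not this repository's actual id.

```python
# Minimal usage sketch (assumption: standard PEFT/transformers APIs;
# "your-username/long-t5-summarizer-lora" is a placeholder repo id).
from peft import PeftModel
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

base = AutoModelForSeq2SeqLM.from_pretrained("google/long-t5-tglobal-base")
tokenizer = AutoTokenizer.from_pretrained("google/long-t5-tglobal-base")
model = PeftModel.from_pretrained(base, "your-username/long-t5-summarizer-lora")

inputs = tokenizer("summarize: " + "your long document here", return_tensors="pt")
# Gen Len in the card hovers around 28 tokens, so a small generation cap is plenty.
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```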
adapter_config.json
CHANGED
@@ -24,10 +24,10 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "v",
-    "q",
     "k",
-    "o"
+    "o",
+    "v",
+    "q"
   ],
   "task_type": "SEQ_2_SEQ_LM",
   "trainable_token_indices": null,
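This hunk only reorders the `target_modules` list (likely because PEFT serializes the module set in arbitrary order); the same four LongT5 attention projections are targeted before and after. A sketch of a PEFT `LoraConfig` that would produce such a config follows; `r`, `lora_alpha`, and `lora_dropout` are illustrative assumptions, since this hunk does not show them.

```python
# Sketch of a LoraConfig consistent with the target_modules above;
# r, lora_alpha, and lora_dropout are assumed values, not from this diff.
from peft import LoraConfig, TaskType

lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    target_modules=["q", "k", "v", "o"],  # LongT5 attention projections
    r=8,                # assumed LoRA rank
    lora_alpha=16,      # assumed scaling factor
    lora_dropout=0.05,  # assumed dropout
)
```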
adapter_model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:ce2b38ef30b963d66e9ef85b5767a7582dd6d73c56fe4d698cc823db9c557e96
 size 7119264
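The file is stored as a git-lfs pointer rather than raw bytes, so only the sha256 oid changed here; the size (7,119,264 bytes) is identical, consistent with retraining an adapter of the same shape. A sketch for checking a downloaded copy against the pointer's oid:

```python
# Sketch: verify a downloaded adapter_model.safetensors against the
# sha256 oid recorded in the git-lfs pointer above.
import hashlib

def sha256_of(path: str) -> str:
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):  # 1 MiB chunks
            digest.update(chunk)
    return digest.hexdigest()

expected = "ce2b38ef30b963d66e9ef85b5767a7582dd6d73c56fe4d698cc823db9c557e96"
assert sha256_of("adapter_model.safetensors") == expected
```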
runs/Jul29_12-11-53_tardis/events.out.tfevents.1753783914.tardis.18756.0
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:81b1cf0457fb01b02808edbef452230b35362d1be9eee6b61813f820191eb47d
+size 19105
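This new tfevents file holds the TensorBoard scalars behind the training-results table. A sketch of reading it back, assuming the file has been fetched from LFS (the pointer alone is not parseable) into the run directory:

```python
# Sketch: list and dump the scalar series logged in the new run directory
# (assumes the real tfevents file is present, not just the LFS pointer).
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

acc = EventAccumulator("runs/Jul29_12-11-53_tardis")
acc.Reload()

tags = acc.Tags()["scalars"]
print(tags)  # e.g. train/loss, eval/loss, eval/rouge1 with the HF Trainer
for event in acc.Scalars(tags[0]):
    print(event.step, event.value)
```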
training_args.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:5272855a855b3568b6450f7b43bc4ab84b1cf3c36b67fb9ef04d9a9690734150
 size 5905
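training_args.bin is the pickled training-arguments object the Trainer saves alongside checkpoints; inspecting it recovers settings the card omits, such as the gradient accumulation implied by train_batch_size 1 vs. total_train_batch_size 16. A sketch, noting that unpickling requires trusting the file:

```python
# Sketch: inspect the saved training arguments. This is a pickle, not a
# weights file -- load only from a trusted source (hence weights_only=False).
import torch

args = torch.load("training_args.bin", weights_only=False)
print(args.learning_rate)                # 0.002 per the card
print(args.num_train_epochs)             # 10
print(args.gradient_accumulation_steps)  # presumably 16 (16 total / 1 per device)
```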