End of training

- README.md +25 -27
- model.safetensors +1 -1
- runs/Jul30_11-16-40_tardis/events.out.tfevents.1753867002.tardis.40732.0 +3 -0
- tokenizer.json +2 -16
- training_args.bin +1 -1
README.md
CHANGED
@@ -22,21 +22,21 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 8.0950
-- Rouge1: 0.1575
-- Rouge2: 0.0292
-- Rougel: 0.1237
-- Rougelsum: 0.1245
-- Gen Len: 20.0
-- Bleu: 0.0
-- Precisions: 0.0471
-- Brevity Penalty: 0.541
-- Length Ratio: 0.6195
-- Translation Length: 726.0
+- Loss: 16.6670
+- Rouge1: 0.179
+- Rouge2: 0.0376
+- Rougel: 0.1311
+- Rougelsum: 0.1303
+- Gen Len: 31.0
+- Bleu: 0.0183
+- Precisions: 0.0456
+- Brevity Penalty: 0.952
+- Length Ratio: 0.9531
+- Translation Length: 1117.0
 - Reference Length: 1172.0
-- Precision: 0.844
-- Recall: 0.8447
-- F1: 0.8442
+- Precision: 0.839
+- Recall: 0.8508
+- F1: 0.8447
 - Hashcode: roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
 
 ## Model description
@@ -64,24 +64,22 @@ The following hyperparameters were used during training:
 - total_train_batch_size: 16
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
-- num_epochs: 12
+- num_epochs: 10
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Precision | Recall | F1 | Hashcode |
 |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|:------:|:----------:|:---------------:|:------------:|:------------------:|:----------------:|:---------:|:------:|:------:|:---------------------------------------------------------:|
-[… ten earlier rows (epochs 1.0–10.0) of the previous run; values not captured …]
-| No log | 11.0 | 77 | 8.7598 | 0.1677 | 0.0349 | 0.1332 | 0.1336 | 20.0 | 0.0 | 0.0487 | 0.5626 | 0.6348 | 744.0 | 1172.0 | 0.8465 | 0.8461 | 0.8462 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
-| No log | 12.0 | 84 | 8.0950 | 0.1575 | 0.0292 | 0.1237 | 0.1245 | 20.0 | 0.0 | 0.0471 | 0.541 | 0.6195 | 726.0 | 1172.0 | 0.844 | 0.8447 | 0.8442 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 26.8047 | 1.0 | 7 | 28.1961 | 0.1991 | 0.0452 | 0.1451 | 0.1451 | 31.0 | 0.0226 | 0.0516 | 0.9697 | 0.9701 | 1137.0 | 1172.0 | 0.8417 | 0.854 | 0.8477 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 24.4577 | 2.0 | 14 | 25.5517 | 0.1938 | 0.0452 | 0.1398 | 0.14 | 31.0 | 0.0225 | 0.0511 | 0.9617 | 0.9625 | 1128.0 | 1172.0 | 0.8404 | 0.8526 | 0.8464 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 22.7574 | 3.0 | 21 | 23.4583 | 0.1872 | 0.0412 | 0.1373 | 0.1368 | 31.0 | 0.0218 | 0.0488 | 0.9582 | 0.959 | 1124.0 | 1172.0 | 0.8393 | 0.8517 | 0.8454 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 21.0685 | 4.0 | 28 | 21.7844 | 0.1858 | 0.0425 | 0.1358 | 0.1353 | 31.0 | 0.0196 | 0.0485 | 0.9635 | 0.9642 | 1130.0 | 1172.0 | 0.8407 | 0.8525 | 0.8465 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 20.037 | 5.0 | 35 | 20.3738 | 0.1858 | 0.0425 | 0.1358 | 0.1353 | 31.0 | 0.0196 | 0.0485 | 0.9635 | 0.9642 | 1130.0 | 1172.0 | 0.8407 | 0.8525 | 0.8465 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 18.9602 | 6.0 | 42 | 19.1832 | 0.1862 | 0.0397 | 0.1322 | 0.1313 | 31.0 | 0.0194 | 0.0484 | 0.9546 | 0.9556 | 1120.0 | 1172.0 | 0.8396 | 0.8512 | 0.8453 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 18.0035 | 7.0 | 49 | 18.1774 | 0.1853 | 0.0397 | 0.1321 | 0.1313 | 31.0 | 0.0194 | 0.0482 | 0.9555 | 0.9565 | 1121.0 | 1172.0 | 0.8394 | 0.8511 | 0.8451 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 17.2636 | 8.0 | 56 | 17.4050 | 0.1862 | 0.0396 | 0.1328 | 0.132 | 31.0 | 0.0195 | 0.0478 | 0.9493 | 0.9505 | 1114.0 | 1172.0 | 0.8389 | 0.8507 | 0.8447 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 16.5914 | 9.0 | 63 | 16.8775 | 0.179 | 0.0376 | 0.1302 | 0.1293 | 31.0 | 0.019 | 0.046 | 0.9502 | 0.9514 | 1115.0 | 1172.0 | 0.8384 | 0.8504 | 0.8443 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 16.2789 | 10.0 | 70 | 16.6670 | 0.179 | 0.0376 | 0.1311 | 0.1303 | 31.0 | 0.0183 | 0.0456 | 0.952 | 0.9531 | 1117.0 | 1172.0 | 0.839 | 0.8508 | 0.8447 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
 
 
 ### Framework versions
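A note on the metric fields above: Rouge1/Rouge2/Rougel/Rougelsum come from ROUGE; Bleu, Precisions, Brevity Penalty, Length Ratio, Translation Length, and Reference Length come from BLEU; and Precision/Recall/F1 plus the Hashcode come from BERTScore (the hashcode records roberta-large, layer 17, no IDF weighting, bert_score 0.3.12 under transformers 4.53.1). Gen Len is typically the mean generated-sequence length computed by the training script itself. A minimal sketch of computing the same families of scores with the `evaluate` library; the predictions and references below are placeholder strings, not the model's actual evaluation set:

```python
import evaluate

# Placeholder data; the commit does not include the evaluation set.
predictions = ["a short generated summary"]
references = ["the reference summary for the same document"]

rouge = evaluate.load("rouge")          # -> rouge1, rouge2, rougeL, rougeLsum
bleu = evaluate.load("bleu")            # -> bleu, precisions, brevity_penalty,
                                        #    length_ratio, translation_length, ...
bertscore = evaluate.load("bertscore")  # -> precision, recall, f1, hashcode

print(rouge.compute(predictions=predictions, references=references))
print(bleu.compute(predictions=predictions, references=references))

scores = bertscore.compute(
    predictions=predictions,
    references=references,
    model_type="roberta-large",  # bert_score defaults to layer 17 for this model
)
# The card's Precision/Recall/F1 are aggregates of these per-example lists.
print(sum(scores["f1"]) / len(scores["f1"]), scores["hashcode"])
```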
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:407badce2d5d5a795a27cfcab355a116b225f7a2d47632efe0ac8123f4e28e05
 size 1187780840
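The weights file is stored via Git LFS, so the repository itself only holds a small pointer (version, oid, size); the diff above swaps the oid for the newly trained weights. A minimal sketch, assuming the actual model.safetensors blob has been downloaded locally, of verifying it against the pointer; `verify_lfs_object` is a hypothetical helper, not part of any library:

```python
import hashlib

def verify_lfs_object(pointer_text: str, blob_path: str) -> bool:
    """Check a downloaded blob against the oid/size fields of its LFS pointer."""
    fields = dict(line.split(" ", 1) for line in pointer_text.strip().splitlines())
    expected_oid = fields["oid"].removeprefix("sha256:")
    expected_size = int(fields["size"])

    digest, size = hashlib.sha256(), 0
    with open(blob_path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):  # hash in 1 MiB chunks
            digest.update(chunk)
            size += len(chunk)
    return size == expected_size and digest.hexdigest() == expected_oid

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:407badce2d5d5a795a27cfcab355a116b225f7a2d47632efe0ac8123f4e28e05
size 1187780840"""
print(verify_lfs_object(pointer, "model.safetensors"))
```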
runs/Jul30_11-16-40_tardis/events.out.tfevents.1753867002.tardis.40732.0
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1a930f29132b5f2bb23cc86884f482dba64e7abff03e8dca5e98d16c8fe3367a
+size 19084
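The added events file is the TensorBoard log for this run. A minimal sketch, assuming the `tensorboard` package is installed and the repo (with the LFS blob) is checked out locally, of reading the logged scalars back out of it:

```python
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

# Path as committed in this repo; adjust to the local checkout location.
acc = EventAccumulator(
    "runs/Jul30_11-16-40_tardis/events.out.tfevents.1753867002.tardis.40732.0"
)
acc.Reload()  # parse the event file

# Print the last logged value of every scalar series (e.g. eval/loss, train/epoch).
for tag in acc.Tags()["scalars"]:
    last = acc.Scalars(tag)[-1]
    print(tag, last.step, last.value)
```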
tokenizer.json
CHANGED
@@ -1,21 +1,7 @@
 {
   "version": "1.0",
-  "truncation": {
-    "direction": "Right",
-    "max_length": 64,
-    "strategy": "LongestFirst",
-    "stride": 0
-  },
-  "padding": {
-    "strategy": {
-      "Fixed": 64
-    },
-    "direction": "Right",
-    "pad_to_multiple_of": null,
-    "pad_id": 0,
-    "pad_type_id": 0,
-    "pad_token": "<pad>"
-  },
+  "truncation": null,
+  "padding": null,
   "added_tokens": [
     {
       "id": 0,
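The tokenizer.json change drops a persisted truncation block (LongestFirst to max_length 64) and a fixed-to-64 padding block in favor of null for both. In the `tokenizers` library these JSON fields simply mirror the runtime truncation/padding state of the underlying Tokenizer; a hedged sketch (assuming the fast tokenizer, with a hypothetical output directory) of how that state maps onto the file:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("google/long-t5-tglobal-base")
backend = tok.backend_tokenizer  # the underlying tokenizers.Tokenizer

# Enabling these is what produces the "truncation": {...} and
# "padding": {"strategy": {"Fixed": 64}, ...} blocks the old file carried.
backend.enable_truncation(max_length=64, strategy="longest_first")
backend.enable_padding(length=64, pad_id=0, pad_token="<pad>")

# Clearing them serializes as "truncation": null, "padding": null,
# the state after this commit.
backend.no_truncation()
backend.no_padding()
tok.save_pretrained("local-checkpoint")  # hypothetical output directory
```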
training_args.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:990710b9302c0c8b77c901eccfa5bc774039bdf414139adde64d79d3abef9639
 size 5905
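training_args.bin is the pickled TrainingArguments object (presumably Seq2SeqTrainingArguments for this model) that the Trainer saves with each run; the hyperparameters listed in the README, such as the adamw_torch optimizer, the linear scheduler, and the epoch count, can be read back from it. A minimal sketch; since the file is a pickle, `weights_only=False` is required and it should only be loaded from a trusted source:

```python
import torch

# Unpickles a TrainingArguments object; transformers must be importable.
args = torch.load("training_args.bin", weights_only=False)

print(args.optim)              # e.g. OptimizerNames.ADAMW_TORCH
print(args.lr_scheduler_type)  # e.g. linear
print(args.num_train_epochs)   # the card reports 10 after this commit
print(args.per_device_train_batch_size, args.gradient_accumulation_steps)
```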