maximuspowers committed on
Commit b7f4ba2 · verified · 1 Parent(s): 8403722

Model save
README.md CHANGED
@@ -16,15 +16,15 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [maximuspowers/bert-philosophy-adapted](https://huggingface.co/maximuspowers/bert-philosophy-adapted) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.5651
-- Exact Match Accuracy: 0.3931
-- Macro Precision: 0.3152
-- Macro Recall: 0.2381
-- Macro F1: 0.2489
-- Micro Precision: 0.7917
-- Micro Recall: 0.4703
-- Micro F1: 0.5901
-- Hamming Loss: 0.0535
+- Loss: 0.4928
+- Exact Match Accuracy: 0.4069
+- Macro Precision: 0.5825
+- Macro Recall: 0.4180
+- Macro F1: 0.4727
+- Micro Precision: 0.7718
+- Micro Recall: 0.5721
+- Micro F1: 0.6571
+- Hamming Loss: 0.0487
 
 ## Model description
 
@@ -59,24 +59,31 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss | Exact Match Accuracy | Macro Precision | Macro Recall | Macro F1 | Micro Precision | Micro Recall | Micro F1 | Hamming Loss |
 |:-------------:|:-------:|:----:|:---------------:|:--------------------:|:---------------:|:------------:|:--------:|:---------------:|:------------:|:--------:|:------------:|
-| 1.8197 | 1.3724 | 100 | 0.8927 | 0.0207 | 0.0208 | 0.0118 | 0.0150 | 0.3333 | 0.0299 | 0.0548 | 0.0840 |
-| 1.2941 | 2.7448 | 200 | 0.7412 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0815 |
-| 1.1363 | 4.1103 | 300 | 0.7610 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0815 |
-| 0.9381 | 5.4828 | 400 | 0.6290 | 0.0690 | 0.0588 | 0.0200 | 0.0299 | 1.0 | 0.0796 | 0.1475 | 0.0751 |
-| 0.9246 | 6.8552 | 500 | 0.5927 | 0.1862 | 0.1176 | 0.0610 | 0.0803 | 1.0 | 0.1990 | 0.3320 | 0.0653 |
-| 0.8558 | 8.2207 | 600 | 0.6186 | 0.2621 | 0.1680 | 0.0969 | 0.1206 | 0.9298 | 0.2637 | 0.4109 | 0.0617 |
-| 0.7893 | 9.5931 | 700 | 0.6193 | 0.2483 | 0.1702 | 0.1259 | 0.1410 | 0.9483 | 0.2736 | 0.4247 | 0.0604 |
-| 0.6894 | 10.9655 | 800 | 0.5421 | 0.2759 | 0.2267 | 0.1361 | 0.1545 | 0.9403 | 0.3134 | 0.4701 | 0.0576 |
-| 0.6505 | 12.3310 | 900 | 0.5643 | 0.3034 | 0.2138 | 0.1478 | 0.1695 | 0.92 | 0.3433 | 0.5 | 0.0560 |
-| 0.6225 | 13.7034 | 1000 | 0.5802 | 0.3034 | 0.2000 | 0.1585 | 0.1745 | 0.8471 | 0.3582 | 0.5035 | 0.0576 |
-| 0.5806 | 15.0690 | 1100 | 0.6155 | 0.3448 | 0.1952 | 0.1541 | 0.1711 | 0.7917 | 0.3781 | 0.5118 | 0.0588 |
-| 0.526 | 16.4414 | 1200 | 0.5498 | 0.3655 | 0.3618 | 0.2035 | 0.2416 | 0.8333 | 0.4478 | 0.5825 | 0.0523 |
-| 0.4805 | 17.8138 | 1300 | 0.5925 | 0.3793 | 0.3585 | 0.1982 | 0.2365 | 0.8431 | 0.4279 | 0.5677 | 0.0531 |
-| 0.4522 | 19.1793 | 1400 | 0.5409 | 0.3862 | 0.2757 | 0.2045 | 0.2237 | 0.8070 | 0.4577 | 0.5841 | 0.0531 |
-| 0.4181 | 20.5517 | 1500 | 0.5604 | 0.4138 | 0.3574 | 0.2410 | 0.2738 | 0.8417 | 0.5025 | 0.6293 | 0.0483 |
-| 0.4105 | 21.9241 | 1600 | 0.5579 | 0.3586 | 0.3659 | 0.2188 | 0.2565 | 0.8641 | 0.4428 | 0.5855 | 0.0511 |
-| 0.3702 | 23.2897 | 1700 | 0.5602 | 0.3862 | 0.3278 | 0.2288 | 0.2616 | 0.8348 | 0.4776 | 0.6076 | 0.0503 |
-| 0.3586 | 24.6621 | 1800 | 0.5317 | 0.3793 | 0.3860 | 0.2534 | 0.2900 | 0.7953 | 0.5025 | 0.6159 | 0.0511 |
+| 1.8162 | 1.3724 | 100 | 0.9425 | 0.0069 | 0.0588 | 0.0013 | 0.0025 | 0.2 | 0.0050 | 0.0097 | 0.0828 |
+| 1.3347 | 2.7448 | 200 | 0.7476 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0815 |
+| 1.1131 | 4.1103 | 300 | 0.7062 | 0.0207 | 0.0588 | 0.0075 | 0.0133 | 1.0 | 0.0299 | 0.0580 | 0.0791 |
+| 0.9328 | 5.4828 | 400 | 0.6182 | 0.1034 | 0.0588 | 0.0263 | 0.0363 | 1.0 | 0.1045 | 0.1892 | 0.0730 |
+| 0.9142 | 6.8552 | 500 | 0.5719 | 0.2 | 0.1176 | 0.0661 | 0.0846 | 1.0 | 0.2189 | 0.3592 | 0.0637 |
+| 0.8329 | 8.2207 | 600 | 0.6049 | 0.2621 | 0.1718 | 0.1076 | 0.1318 | 0.9649 | 0.2736 | 0.4264 | 0.0600 |
+| 0.7638 | 9.5931 | 700 | 0.6275 | 0.2897 | 0.2302 | 0.1285 | 0.1500 | 0.9667 | 0.2886 | 0.4444 | 0.0588 |
+| 0.6805 | 10.9655 | 800 | 0.5676 | 0.2828 | 0.2206 | 0.1411 | 0.1529 | 0.8919 | 0.3284 | 0.48 | 0.0580 |
+| 0.6449 | 12.3310 | 900 | 0.5610 | 0.2966 | 0.1987 | 0.1484 | 0.1645 | 0.8519 | 0.3433 | 0.4894 | 0.0584 |
+| 0.6151 | 13.7034 | 1000 | 0.5707 | 0.3241 | 0.2613 | 0.1556 | 0.1790 | 0.8780 | 0.3582 | 0.5088 | 0.0564 |
+| 0.5713 | 15.0690 | 1100 | 0.5384 | 0.3586 | 0.2281 | 0.1705 | 0.1881 | 0.8316 | 0.3930 | 0.5338 | 0.0560 |
+| 0.5272 | 16.4414 | 1200 | 0.5652 | 0.3655 | 0.3467 | 0.1968 | 0.2230 | 0.7928 | 0.4378 | 0.5641 | 0.0552 |
+| 0.477 | 17.8138 | 1300 | 0.5931 | 0.3655 | 0.3196 | 0.2090 | 0.2326 | 0.7731 | 0.4577 | 0.575 | 0.0552 |
+| 0.4533 | 19.1793 | 1400 | 0.5249 | 0.3862 | 0.3520 | 0.2328 | 0.2694 | 0.8496 | 0.4776 | 0.6115 | 0.0495 |
+| 0.4087 | 20.5517 | 1500 | 0.4931 | 0.4138 | 0.3420 | 0.2410 | 0.2709 | 0.8279 | 0.5025 | 0.6254 | 0.0491 |
+| 0.4131 | 21.9241 | 1600 | 0.5264 | 0.3862 | 0.3397 | 0.2215 | 0.2527 | 0.8174 | 0.4677 | 0.5949 | 0.0519 |
+| 0.366 | 23.2897 | 1700 | 0.5911 | 0.4 | 0.3282 | 0.2348 | 0.2608 | 0.7953 | 0.5025 | 0.6159 | 0.0511 |
+| 0.3653 | 24.6621 | 1800 | 0.5318 | 0.4069 | 0.3592 | 0.2589 | 0.2894 | 0.7910 | 0.5274 | 0.6328 | 0.0499 |
+| 0.345 | 26.0276 | 1900 | 0.5098 | 0.4069 | 0.4302 | 0.2733 | 0.3123 | 0.8281 | 0.5274 | 0.6444 | 0.0475 |
+| 0.3157 | 27.4 | 2000 | 0.5230 | 0.4483 | 0.5469 | 0.3472 | 0.3957 | 0.7947 | 0.5970 | 0.6818 | 0.0454 |
+| 0.2823 | 28.7724 | 2100 | 0.5017 | 0.4069 | 0.4945 | 0.3191 | 0.3652 | 0.8438 | 0.5373 | 0.6565 | 0.0458 |
+| 0.2878 | 30.1379 | 2200 | 0.5023 | 0.3931 | 0.5167 | 0.3218 | 0.3773 | 0.7786 | 0.5423 | 0.6393 | 0.0499 |
+| 0.2838 | 31.5103 | 2300 | 0.5143 | 0.3931 | 0.4767 | 0.3381 | 0.3828 | 0.7714 | 0.5373 | 0.6334 | 0.0507 |
+| 0.2613 | 32.8828 | 2400 | 0.5150 | 0.3931 | 0.5549 | 0.3581 | 0.4093 | 0.7838 | 0.5771 | 0.6648 | 0.0475 |
+| 0.2392 | 34.2483 | 2500 | 0.4928 | 0.4069 | 0.5825 | 0.4180 | 0.4727 | 0.7718 | 0.5721 | 0.6571 | 0.0487 |
 
 
 ### Framework versions
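The README diff above reports multi-label classification metrics (exact match accuracy, macro/micro precision/recall/F1, Hamming loss). The evaluation code is not part of this commit; as a hedged sketch, these metrics can be reproduced from binary label matrices with scikit-learn — the function name and inputs below are illustrative, not the repo's actual eval script.

```python
# Sketch (assumption: not the repo's actual eval code) of the multi-label
# metrics reported in the model card, computed with scikit-learn.
import numpy as np
from sklearn.metrics import hamming_loss, precision_recall_fscore_support

def multilabel_metrics(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    """y_true, y_pred: binary indicator arrays of shape (n_samples, n_labels)."""
    macro_p, macro_r, macro_f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="macro", zero_division=0
    )
    micro_p, micro_r, micro_f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="micro", zero_division=0
    )
    return {
        # Exact match: every label of a sample must be predicted correctly.
        "exact_match_accuracy": float((y_true == y_pred).all(axis=1).mean()),
        "macro_precision": float(macro_p),
        "macro_recall": float(macro_r),
        "macro_f1": float(macro_f1),
        "micro_precision": float(micro_p),
        "micro_recall": float(micro_r),
        "micro_f1": float(micro_f1),
        # Fraction of individual label predictions that are wrong.
        "hamming_loss": float(hamming_loss(y_true, y_pred)),
    }
```

The gap between micro F1 (0.6571) and macro F1 (0.4727) in the final row suggests rare labels are predicted much worse than frequent ones, since macro averaging weights every label equally.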
all_results.json CHANGED
@@ -1,20 +1,8 @@
 {
-    "epoch": 24.662068965517243,
-    "eval_exact_match_accuracy": 0.3931034482758621,
-    "eval_hamming_loss": 0.05354969574036511,
-    "eval_loss": 0.565125048160553,
-    "eval_macro_f1": 0.24889861157994672,
-    "eval_macro_precision": 0.3151783530460001,
-    "eval_macro_recall": 0.2381291916416587,
-    "eval_micro_f1": 0.5900621118012422,
-    "eval_micro_precision": 0.7916666666666666,
-    "eval_micro_recall": 0.47029702970297027,
-    "eval_runtime": 0.7539,
-    "eval_samples_per_second": 192.34,
-    "eval_steps_per_second": 25.203,
+    "epoch": 34.248275862068965,
     "total_flos": 0.0,
-    "train_loss": 0.811522379981147,
-    "train_runtime": 483.752,
-    "train_samples_per_second": 1197.928,
-    "train_steps_per_second": 75.452
+    "train_loss": 0.657050135421753,
+    "train_runtime": 659.4996,
+    "train_samples_per_second": 878.697,
+    "train_steps_per_second": 55.345
 }
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:609df78d6a7b3db0c1613d36a98872212a6c64505a260fc26787dda4771a05c7
+oid sha256:77b6654fbfa8810541dd215204bdf1cd55a754088412abe65f4a27a6d029dd2b
 size 441154988
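The binary files in this commit (weights, TensorBoard event files, training args) are tracked with Git LFS, so the diff shows pointer files rather than the blobs themselves: the `oid sha256:` field is the SHA-256 digest of the real file and `size` is its byte length. As a sketch (the helper name and call are illustrative, not part of the repo), a downloaded copy can be checked against its pointer like this:

```python
# Verify a local file against a Git LFS pointer's "oid sha256" and "size"
# fields. Illustrative helper; not part of this repository.
import hashlib
import os

def verify_lfs_pointer(path: str, oid_sha256: str, size: int) -> bool:
    """Return True if the file at `path` matches the pointer's oid and size."""
    if os.path.getsize(path) != size:
        return False  # cheap check first: sizes must match exactly
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Hash in 1 MiB chunks so large weight files don't need to fit in RAM.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest() == oid_sha256
```

Note that `model.safetensors` keeps the same `size` before and after this commit; only the oid changed, which is expected when retraining produces different weights in an identically shaped checkpoint.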
runs/Jun16_16-53-57_509eee86a659/events.out.tfevents.1750092838.509eee86a659.1793.0 CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:ff828cd0468171fdb9d133f11b3096743daa9b63edec494ddf2129bf310c1eec
-size 1533641
+oid sha256:7296efd0b090519d4ff79119e0f32a3479ee66056d1c891c2e0a0595c35fb43d
+size 1559119
runs/Jun16_17-49-10_509eee86a659/events.out.tfevents.1750096151.509eee86a659.15800.0 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cb6d1c5a13b710f31bf81ab85490c4fe0690df1960abaa68ba31121ef29aa006
+size 324294
train_results.json CHANGED
@@ -1,8 +1,8 @@
 {
-    "epoch": 24.662068965517243,
+    "epoch": 34.248275862068965,
     "total_flos": 0.0,
-    "train_loss": 0.811522379981147,
-    "train_runtime": 483.752,
-    "train_samples_per_second": 1197.928,
-    "train_steps_per_second": 75.452
+    "train_loss": 0.657050135421753,
+    "train_runtime": 659.4996,
+    "train_samples_per_second": 878.697,
+    "train_steps_per_second": 55.345
 }
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:ac11559a74276b49475aa5ed7b5c976e9b4e910499e502bc55cf102ea75aa6dc
+oid sha256:1b0bb7c81b8c45e3d10121df9345ada1dfdf18d52bfcda26b52c200076c03431
 size 5368