maximuspowers committed
Commit 7ba442a · verified · 1 Parent(s): 17a5371

Model save

README.md CHANGED
@@ -16,15 +16,15 @@ should probably proofread and complete it, then remove this comment. -->

 This model is a fine-tuned version of [maximuspowers/bert-philosophy-adapted](https://huggingface.co/maximuspowers/bert-philosophy-adapted) on the None dataset.
 It achieves the following results on the evaluation set:
- - Loss: 0.5794
- - Exact Match Accuracy: 0.3644
- - Macro Precision: 0.3426
- - Macro Recall: 0.2254
- - Macro F1: 0.2522
- - Micro Precision: 0.7475
- - Micro Recall: 0.4654
- - Micro F1: 0.5736
- - Hamming Loss: 0.0548
+ - Loss: 0.5317
+ - Exact Match Accuracy: 0.3793
+ - Macro Precision: 0.3860
+ - Macro Recall: 0.2534
+ - Macro F1: 0.2900
+ - Micro Precision: 0.7953
+ - Micro Recall: 0.5025
+ - Micro F1: 0.6159
+ - Hamming Loss: 0.0511

 ## Model description

@@ -52,28 +52,31 @@ The following hyperparameters were used during training:
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 100
- - num_epochs: 5000
+ - num_epochs: 500
 - mixed_precision_training: Native AMP

 ### Training results

 | Training Loss | Epoch | Step | Validation Loss | Exact Match Accuracy | Macro Precision | Macro Recall | Macro F1 | Micro Precision | Micro Recall | Micro F1 | Hamming Loss |
 |:-------------:|:-------:|:----:|:---------------:|:--------------------:|:---------------:|:------------:|:--------:|:---------------:|:------------:|:--------:|:------------:|
- | 1.8234 | 1.6949 | 100 | 0.9048 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0804 |
- | 1.3183 | 3.3898 | 200 | 0.6790 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0804 |
- | 1.0722 | 5.0847 | 300 | 0.6234 | 0.0171 | 0.0588 | 0.0061 | 0.0110 | 1.0 | 0.0187 | 0.0368 | 0.0789 |
- | 0.9301 | 6.7797 | 400 | 0.5448 | 0.1795 | 0.1103 | 0.0599 | 0.0775 | 0.9375 | 0.1875 | 0.3125 | 0.0664 |
- | 0.9091 | 8.4746 | 500 | 0.5309 | 0.2222 | 0.1090 | 0.0757 | 0.0894 | 0.9268 | 0.2375 | 0.3781 | 0.0628 |
- | 0.7804 | 10.1695 | 600 | 0.4750 | 0.2650 | 0.1621 | 0.1112 | 0.1284 | 0.8889 | 0.3 | 0.4486 | 0.0593 |
- | 0.6878 | 11.8644 | 700 | 0.4607 | 0.2991 | 0.1669 | 0.1327 | 0.1473 | 0.9298 | 0.3312 | 0.4885 | 0.0558 |
- | 0.6323 | 13.5593 | 800 | 0.4214 | 0.3590 | 0.2152 | 0.1563 | 0.1792 | 0.8955 | 0.375 | 0.5286 | 0.0538 |
- | 0.6086 | 15.2542 | 900 | 0.4319 | 0.3419 | 0.2231 | 0.1500 | 0.1723 | 0.9206 | 0.3625 | 0.5202 | 0.0538 |
- | 0.5586 | 16.9492 | 1000 | 0.4255 | 0.3761 | 0.2018 | 0.1725 | 0.1859 | 0.8375 | 0.4188 | 0.5583 | 0.0533 |
- | 0.4776 | 18.6441 | 1100 | 0.3726 | 0.3675 | 0.2614 | 0.1772 | 0.2012 | 0.8553 | 0.4062 | 0.5508 | 0.0533 |
- | 0.4646 | 20.3390 | 1200 | 0.4319 | 0.4103 | 0.3810 | 0.1928 | 0.2273 | 0.8642 | 0.4375 | 0.5809 | 0.0508 |
- | 0.4375 | 22.0339 | 1300 | 0.4517 | 0.3932 | 0.3493 | 0.1904 | 0.2234 | 0.8608 | 0.425 | 0.5690 | 0.0518 |
- | 0.406 | 23.7288 | 1400 | 0.3794 | 0.4103 | 0.3329 | 0.1989 | 0.2330 | 0.8140 | 0.4375 | 0.5691 | 0.0533 |
- | 0.3545 | 25.4237 | 1500 | 0.3680 | 0.4274 | 0.2731 | 0.1999 | 0.2261 | 0.8 | 0.45 | 0.576 | 0.0533 |
+ | 1.8197 | 1.3724 | 100 | 0.8927 | 0.0207 | 0.0208 | 0.0118 | 0.0150 | 0.3333 | 0.0299 | 0.0548 | 0.0840 |
+ | 1.2941 | 2.7448 | 200 | 0.7412 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0815 |
+ | 1.1363 | 4.1103 | 300 | 0.7610 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0815 |
+ | 0.9381 | 5.4828 | 400 | 0.6290 | 0.0690 | 0.0588 | 0.0200 | 0.0299 | 1.0 | 0.0796 | 0.1475 | 0.0751 |
+ | 0.9246 | 6.8552 | 500 | 0.5927 | 0.1862 | 0.1176 | 0.0610 | 0.0803 | 1.0 | 0.1990 | 0.3320 | 0.0653 |
+ | 0.8558 | 8.2207 | 600 | 0.6186 | 0.2621 | 0.1680 | 0.0969 | 0.1206 | 0.9298 | 0.2637 | 0.4109 | 0.0617 |
+ | 0.7893 | 9.5931 | 700 | 0.6193 | 0.2483 | 0.1702 | 0.1259 | 0.1410 | 0.9483 | 0.2736 | 0.4247 | 0.0604 |
+ | 0.6894 | 10.9655 | 800 | 0.5421 | 0.2759 | 0.2267 | 0.1361 | 0.1545 | 0.9403 | 0.3134 | 0.4701 | 0.0576 |
+ | 0.6505 | 12.3310 | 900 | 0.5643 | 0.3034 | 0.2138 | 0.1478 | 0.1695 | 0.92 | 0.3433 | 0.5 | 0.0560 |
+ | 0.6225 | 13.7034 | 1000 | 0.5802 | 0.3034 | 0.2000 | 0.1585 | 0.1745 | 0.8471 | 0.3582 | 0.5035 | 0.0576 |
+ | 0.5806 | 15.0690 | 1100 | 0.6155 | 0.3448 | 0.1952 | 0.1541 | 0.1711 | 0.7917 | 0.3781 | 0.5118 | 0.0588 |
+ | 0.526 | 16.4414 | 1200 | 0.5498 | 0.3655 | 0.3618 | 0.2035 | 0.2416 | 0.8333 | 0.4478 | 0.5825 | 0.0523 |
+ | 0.4805 | 17.8138 | 1300 | 0.5925 | 0.3793 | 0.3585 | 0.1982 | 0.2365 | 0.8431 | 0.4279 | 0.5677 | 0.0531 |
+ | 0.4522 | 19.1793 | 1400 | 0.5409 | 0.3862 | 0.2757 | 0.2045 | 0.2237 | 0.8070 | 0.4577 | 0.5841 | 0.0531 |
+ | 0.4181 | 20.5517 | 1500 | 0.5604 | 0.4138 | 0.3574 | 0.2410 | 0.2738 | 0.8417 | 0.5025 | 0.6293 | 0.0483 |
+ | 0.4105 | 21.9241 | 1600 | 0.5579 | 0.3586 | 0.3659 | 0.2188 | 0.2565 | 0.8641 | 0.4428 | 0.5855 | 0.0511 |
+ | 0.3702 | 23.2897 | 1700 | 0.5602 | 0.3862 | 0.3278 | 0.2288 | 0.2616 | 0.8348 | 0.4776 | 0.6076 | 0.0503 |
+ | 0.3586 | 24.6621 | 1800 | 0.5317 | 0.3793 | 0.3860 | 0.2534 | 0.2900 | 0.7953 | 0.5025 | 0.6159 | 0.0511 |


 ### Framework versions
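The README diff above reports standard multi-label classification metrics. For reference, here is a minimal sketch of how they can be reproduced with scikit-learn; the sigmoid-plus-0.5-threshold step and the `multilabel_metrics` helper name are illustrative assumptions, not something this repo confirms.

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    hamming_loss,
    precision_score,
    recall_score,
)

def multilabel_metrics(logits: np.ndarray, labels: np.ndarray) -> dict:
    """Hypothetical helper: recompute the card's metrics from raw logits.

    `logits` and `labels` are arrays of shape (n_samples, n_labels);
    `labels` is a 0/1 indicator matrix.
    """
    probs = 1.0 / (1.0 + np.exp(-logits))  # sigmoid
    preds = (probs > 0.5).astype(int)      # assumed 0.5 decision threshold
    kw = dict(zero_division=0)
    return {
        # On multi-label indicator input, accuracy_score is subset accuracy,
        # i.e. the "Exact Match Accuracy" in the card.
        "exact_match_accuracy": accuracy_score(labels, preds),
        # Fraction of individual label bits that are wrong.
        "hamming_loss": hamming_loss(labels, preds),
        # Macro averaging weights every label equally; micro pools all
        # label decisions across samples and labels.
        "macro_precision": precision_score(labels, preds, average="macro", **kw),
        "macro_recall": recall_score(labels, preds, average="macro", **kw),
        "macro_f1": f1_score(labels, preds, average="macro", **kw),
        "micro_precision": precision_score(labels, preds, average="micro", **kw),
        "micro_recall": recall_score(labels, preds, average="micro", **kw),
        "micro_f1": f1_score(labels, preds, average="micro", **kw),
    }
```

The wide gap between micro F1 (0.6159) and macro F1 (0.2900) in the new results is the usual signature of rare labels going unpredicted, which the near-zero macro rows early in the training table also suggest.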
 
all_results.json CHANGED
@@ -1,20 +1,8 @@
 {
- "epoch": 25.423728813559322,
- "eval_exact_match_accuracy": 0.3644067796610169,
- "eval_hamming_loss": 0.054835493519441676,
- "eval_loss": 0.5794392228126526,
- "eval_macro_f1": 0.25224180581323435,
- "eval_macro_precision": 0.3426470588235294,
- "eval_macro_recall": 0.2253973559120618,
- "eval_micro_f1": 0.5736434108527132,
- "eval_micro_precision": 0.7474747474747475,
- "eval_micro_recall": 0.46540880503144655,
- "eval_runtime": 0.5885,
- "eval_samples_per_second": 200.516,
- "eval_steps_per_second": 25.489,
+ "epoch": 24.662068965517243,
  "total_flos": 0.0,
- "train_loss": 0.8584918467203776,
- "train_runtime": 398.2207,
- "train_samples_per_second": 11777.389,
- "train_steps_per_second": 740.795
+ "train_loss": 0.811522379981147,
+ "train_runtime": 483.752,
+ "train_samples_per_second": 1197.928,
+ "train_steps_per_second": 75.452
 }
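For context: all_results.json and train_results.json (further down) are the files the standard Hugging Face `Trainer` workflow writes via `save_metrics`, and the removal of the `eval_*` keys above is the shape you get when a run saves only its train metrics. A hedged sketch of that workflow, assuming a configured `Trainer` rather than anything confirmed by this repo:

```python
from transformers import Trainer

def train_and_save(trainer: Trainer) -> None:
    """Assumes `trainer` is already configured with model, args, datasets."""
    result = trainer.train()
    metrics = result.metrics               # epoch, train_loss, train_runtime, ...
    trainer.log_metrics("train", metrics)
    # save_metrics writes train_results.json and, by default, also merges
    # the same keys into all_results.json, matching the diffs on this page.
    trainer.save_metrics("train", metrics)
    trainer.save_model()                   # writes model.safetensors
    trainer.save_state()                   # trainer_state.json bookkeeping
```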
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:f80cf8b86234b32d6da69750975e3848ebee918a639287b5196f041c9a1a5d10
+ oid sha256:ac436ea42be3a0f79e978204ed481681e1f26f7ed9f262da8c7c3f62e54ecf1b
 size 441154988
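The three-line blocks above and below are Git LFS pointer files: a version line, then `oid` (the SHA-256 of the real blob) and `size` (its byte count). A minimal sketch for checking a downloaded file against the pointer values; the local path is an assumption.

```python
import hashlib

def verify_lfs_blob(path: str, expected_oid: str, expected_size: int) -> bool:
    """Compare a downloaded file against the oid/size of its LFS pointer."""
    digest = hashlib.sha256()
    size = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):  # stream in 1 MiB chunks
            digest.update(chunk)
            size += len(chunk)
    return size == expected_size and digest.hexdigest() == expected_oid

# Expected values copied from the model.safetensors pointer in this commit:
print(verify_lfs_blob(
    "model.safetensors",  # assumed local download path
    "ac436ea42be3a0f79e978204ed481681e1f26f7ed9f262da8c7c3f62e54ecf1b",
    441154988,
))
```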
runs/Jun16_16-31-08_3eb3419cb417/events.out.tfevents.1750091469.3eb3419cb417.1588.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a504c49d0da0d583f723facd3d03f367f893ce93c52be45bc27776ecf41b7097
+ size 235121
train_results.json CHANGED
@@ -1,8 +1,8 @@
 {
- "epoch": 25.423728813559322,
+ "epoch": 24.662068965517243,
  "total_flos": 0.0,
- "train_loss": 0.8584918467203776,
- "train_runtime": 398.2207,
- "train_samples_per_second": 11777.389,
- "train_steps_per_second": 740.795
+ "train_loss": 0.811522379981147,
+ "train_runtime": 483.752,
+ "train_samples_per_second": 1197.928,
+ "train_steps_per_second": 75.452
 }
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:aa637e108c515c2173c456a542b1be498ddc2128077a19d7d9fa37a84043406b
+ oid sha256:7ad470715ab72b871d3a65bdcdea0b6b269785151fb87625772b65f49119f35d
 size 5368
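training_args.bin is the pickled `TrainingArguments` object that `Trainer` saves next to the model, which is why its hash changes along with the num_epochs edit in the README diff. A sketch of inspecting it, assuming a recent PyTorch (`weights_only=False` is needed there because the file is not a plain tensor checkpoint); the printed expectations mirror the hyperparameters listed in the README.

```python
import torch
from transformers import TrainingArguments

# Unpickle the saved TrainingArguments (requires transformers installed,
# since the pickle references its class).
args: TrainingArguments = torch.load("training_args.bin", weights_only=False)

print(args.optim)                                            # adamw_torch
print(args.adam_beta1, args.adam_beta2, args.adam_epsilon)   # 0.9 0.999 1e-08
print(args.lr_scheduler_type)                                # linear
print(args.warmup_steps)                                     # 100
print(args.num_train_epochs)                                 # 500 after this commit
print(args.fp16)                                             # likely True, per "Native AMP"
```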