Model save
README.md
CHANGED
@@ -16,15 +16,15 @@ should probably proofread and complete it, then remove this comment. -->

 This model is a fine-tuned version of [maximuspowers/bert-philosophy-adapted](https://huggingface.co/maximuspowers/bert-philosophy-adapted) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.
-- Exact Match Accuracy: 0.
-- Macro Precision: 0.
-- Macro Recall: 0.
-- Macro F1: 0.
-- Micro Precision: 0.
-- Micro Recall: 0.
-- Micro F1: 0.
-- Hamming Loss: 0.
+- Loss: 0.4928
+- Exact Match Accuracy: 0.4069
+- Macro Precision: 0.5825
+- Macro Recall: 0.4180
+- Macro F1: 0.4727
+- Micro Precision: 0.7718
+- Micro Recall: 0.5721
+- Micro F1: 0.6571
+- Hamming Loss: 0.0487

 ## Model description

@@ -59,24 +59,31 @@ The following hyperparameters were used during training:

 | Training Loss | Epoch   | Step | Validation Loss | Exact Match Accuracy | Macro Precision | Macro Recall | Macro F1 | Micro Precision | Micro Recall | Micro F1 | Hamming Loss |
 |:-------------:|:-------:|:----:|:---------------:|:--------------------:|:---------------:|:------------:|:--------:|:---------------:|:------------:|:--------:|:------------:|
-| 1.
-| 1.
-| 1.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
+| 1.8162 | 1.3724 | 100 | 0.9425 | 0.0069 | 0.0588 | 0.0013 | 0.0025 | 0.2 | 0.0050 | 0.0097 | 0.0828 |
+| 1.3347 | 2.7448 | 200 | 0.7476 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0815 |
+| 1.1131 | 4.1103 | 300 | 0.7062 | 0.0207 | 0.0588 | 0.0075 | 0.0133 | 1.0 | 0.0299 | 0.0580 | 0.0791 |
+| 0.9328 | 5.4828 | 400 | 0.6182 | 0.1034 | 0.0588 | 0.0263 | 0.0363 | 1.0 | 0.1045 | 0.1892 | 0.0730 |
+| 0.9142 | 6.8552 | 500 | 0.5719 | 0.2 | 0.1176 | 0.0661 | 0.0846 | 1.0 | 0.2189 | 0.3592 | 0.0637 |
+| 0.8329 | 8.2207 | 600 | 0.6049 | 0.2621 | 0.1718 | 0.1076 | 0.1318 | 0.9649 | 0.2736 | 0.4264 | 0.0600 |
+| 0.7638 | 9.5931 | 700 | 0.6275 | 0.2897 | 0.2302 | 0.1285 | 0.1500 | 0.9667 | 0.2886 | 0.4444 | 0.0588 |
+| 0.6805 | 10.9655 | 800 | 0.5676 | 0.2828 | 0.2206 | 0.1411 | 0.1529 | 0.8919 | 0.3284 | 0.48 | 0.0580 |
+| 0.6449 | 12.3310 | 900 | 0.5610 | 0.2966 | 0.1987 | 0.1484 | 0.1645 | 0.8519 | 0.3433 | 0.4894 | 0.0584 |
+| 0.6151 | 13.7034 | 1000 | 0.5707 | 0.3241 | 0.2613 | 0.1556 | 0.1790 | 0.8780 | 0.3582 | 0.5088 | 0.0564 |
+| 0.5713 | 15.0690 | 1100 | 0.5384 | 0.3586 | 0.2281 | 0.1705 | 0.1881 | 0.8316 | 0.3930 | 0.5338 | 0.0560 |
+| 0.5272 | 16.4414 | 1200 | 0.5652 | 0.3655 | 0.3467 | 0.1968 | 0.2230 | 0.7928 | 0.4378 | 0.5641 | 0.0552 |
+| 0.477 | 17.8138 | 1300 | 0.5931 | 0.3655 | 0.3196 | 0.2090 | 0.2326 | 0.7731 | 0.4577 | 0.575 | 0.0552 |
+| 0.4533 | 19.1793 | 1400 | 0.5249 | 0.3862 | 0.3520 | 0.2328 | 0.2694 | 0.8496 | 0.4776 | 0.6115 | 0.0495 |
+| 0.4087 | 20.5517 | 1500 | 0.4931 | 0.4138 | 0.3420 | 0.2410 | 0.2709 | 0.8279 | 0.5025 | 0.6254 | 0.0491 |
+| 0.4131 | 21.9241 | 1600 | 0.5264 | 0.3862 | 0.3397 | 0.2215 | 0.2527 | 0.8174 | 0.4677 | 0.5949 | 0.0519 |
+| 0.366 | 23.2897 | 1700 | 0.5911 | 0.4 | 0.3282 | 0.2348 | 0.2608 | 0.7953 | 0.5025 | 0.6159 | 0.0511 |
+| 0.3653 | 24.6621 | 1800 | 0.5318 | 0.4069 | 0.3592 | 0.2589 | 0.2894 | 0.7910 | 0.5274 | 0.6328 | 0.0499 |
+| 0.345 | 26.0276 | 1900 | 0.5098 | 0.4069 | 0.4302 | 0.2733 | 0.3123 | 0.8281 | 0.5274 | 0.6444 | 0.0475 |
+| 0.3157 | 27.4 | 2000 | 0.5230 | 0.4483 | 0.5469 | 0.3472 | 0.3957 | 0.7947 | 0.5970 | 0.6818 | 0.0454 |
+| 0.2823 | 28.7724 | 2100 | 0.5017 | 0.4069 | 0.4945 | 0.3191 | 0.3652 | 0.8438 | 0.5373 | 0.6565 | 0.0458 |
+| 0.2878 | 30.1379 | 2200 | 0.5023 | 0.3931 | 0.5167 | 0.3218 | 0.3773 | 0.7786 | 0.5423 | 0.6393 | 0.0499 |
+| 0.2838 | 31.5103 | 2300 | 0.5143 | 0.3931 | 0.4767 | 0.3381 | 0.3828 | 0.7714 | 0.5373 | 0.6334 | 0.0507 |
+| 0.2613 | 32.8828 | 2400 | 0.5150 | 0.3931 | 0.5549 | 0.3581 | 0.4093 | 0.7838 | 0.5771 | 0.6648 | 0.0475 |
+| 0.2392 | 34.2483 | 2500 | 0.4928 | 0.4069 | 0.5825 | 0.4180 | 0.4727 | 0.7718 | 0.5721 | 0.6571 | 0.0487 |

 ### Framework versions
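The metrics in the card above follow the usual multi-label definitions; a minimal sketch of how they can be reproduced with scikit-learn (the `y_true`/`y_pred` arrays below are illustrative toy data, not this model's predictions, and a 0.5 sigmoid threshold is assumed since the card does not state one):

```python
import numpy as np
from sklearn.metrics import f1_score, hamming_loss, precision_score, recall_score

# Toy multi-label matrices (rows = examples, columns = labels);
# illustrative values only, not this model's outputs.
y_true = np.array([[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]])
y_pred = np.array([[1, 0, 0], [0, 1, 0], [1, 0, 0], [0, 1, 1]])

metrics = {
    # Exact match: every label of an example must be predicted correctly.
    "exact_match_accuracy": (y_true == y_pred).all(axis=1).mean(),
    "macro_f1": f1_score(y_true, y_pred, average="macro", zero_division=0),
    "micro_precision": precision_score(y_true, y_pred, average="micro", zero_division=0),
    "micro_recall": recall_score(y_true, y_pred, average="micro", zero_division=0),
    "micro_f1": f1_score(y_true, y_pred, average="micro", zero_division=0),
    # Fraction of individual label predictions that are wrong.
    "hamming_loss": hamming_loss(y_true, y_pred),
}
print(metrics)
```

Macro averages treat every label equally, while micro averages pool all label decisions, which is why the card's micro scores sit well above its macro scores on what is presumably an imbalanced label set.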
all_results.json
CHANGED
@@ -1,20 +1,8 @@
 {
-    "epoch":
-    "eval_exact_match_accuracy": 0.3931034482758621,
-    "eval_hamming_loss": 0.05354969574036511,
-    "eval_loss": 0.565125048160553,
-    "eval_macro_f1": 0.24889861157994672,
-    "eval_macro_precision": 0.3151783530460001,
-    "eval_macro_recall": 0.2381291916416587,
-    "eval_micro_f1": 0.5900621118012422,
-    "eval_micro_precision": 0.7916666666666666,
-    "eval_micro_recall": 0.47029702970297027,
-    "eval_runtime": 0.7539,
-    "eval_samples_per_second": 192.34,
-    "eval_steps_per_second": 25.203,
+    "epoch": 34.248275862068965,
     "total_flos": 0.0,
-    "train_loss": 0.
-    "train_runtime":
-    "train_samples_per_second":
-    "train_steps_per_second":
+    "train_loss": 0.657050135421753,
+    "train_runtime": 659.4996,
+    "train_samples_per_second": 878.697,
+    "train_steps_per_second": 55.345
 }
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:77b6654fbfa8810541dd215204bdf1cd55a754088412abe65f4a27a6d029dd2b
 size 441154988
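The two versions above are Git LFS pointer files rather than the weights themselves; per the LFS spec a pointer is just `key value` lines, so it can be inspected with a few lines of Python (a sketch, using the oid and size recorded in this commit):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into a dict of its key/value lines."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

# Pointer content from the + side of this commit.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:77b6654fbfa8810541dd215204bdf1cd55a754088412abe65f4a27a6d029dd2b
size 441154988
"""
info = parse_lfs_pointer(pointer)
print(info["oid"], info["size"])
```

The unchanged `size` with a changed `oid` is what you would expect here: same model architecture, new weight values after further training.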
runs/Jun16_16-53-57_509eee86a659/events.out.tfevents.1750092838.509eee86a659.1793.0
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:7296efd0b090519d4ff79119e0f32a3479ee66056d1c891c2e0a0595c35fb43d
-size
+size 1559119
runs/Jun16_17-49-10_509eee86a659/events.out.tfevents.1750096151.509eee86a659.15800.0
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cb6d1c5a13b710f31bf81ab85490c4fe0690df1960abaa68ba31121ef29aa006
+size 324294
train_results.json
CHANGED
@@ -1,8 +1,8 @@
 {
-    "epoch":
+    "epoch": 34.248275862068965,
     "total_flos": 0.0,
-    "train_loss": 0.
-    "train_runtime":
-    "train_samples_per_second":
-    "train_steps_per_second":
+    "train_loss": 0.657050135421753,
+    "train_runtime": 659.4996,
+    "train_samples_per_second": 878.697,
+    "train_steps_per_second": 55.345
 }
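The train stats written by this commit can be read back and sanity-checked programmatically; a small sketch (the JSON literal is copied from the + side above, and the samples-processed estimate assumes the throughput figure is averaged over the entire run, spanning all ~34 epochs):

```python
import json

# train_results.json as written by this commit (values copied from the diff).
train_results = json.loads("""
{
    "epoch": 34.248275862068965,
    "total_flos": 0.0,
    "train_loss": 0.657050135421753,
    "train_runtime": 659.4996,
    "train_samples_per_second": 878.697,
    "train_steps_per_second": 55.345
}
""")

# Rough cross-check: runtime * samples/sec approximates the total number
# of training examples processed across all epochs combined.
approx_total_samples = (
    train_results["train_runtime"] * train_results["train_samples_per_second"]
)
print(f"~{approx_total_samples:.0f} samples processed over "
      f"{train_results['epoch']:.1f} epochs")
```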
training_args.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:1b0bb7c81b8c45e3d10121df9345ada1dfdf18d52bfcda26b52c200076c03431
 size 5368