End of training
README.md CHANGED

@@ -19,9 +19,9 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss:
-- Accuracy: 0.
-- F1: 0.
+- Loss: 1.1409
+- Accuracy: 0.7115
+- F1: 0.7184
 
 ## Model description
 
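The Accuracy and F1 values above come out of the Trainer's evaluation loop. As a reference point, the sketch below shows one common way such metrics are produced via a `compute_metrics` callback; the use of scikit-learn and the weighted F1 averaging are assumptions, since the training script is not part of this commit.

```python
# Minimal sketch of a compute_metrics callback that yields accuracy and F1
# in the format reported above. Weighted averaging is an assumption; the
# actual script behind this model card is not included in the commit.
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1": f1_score(labels, preds, average="weighted"),
    }
```

Passed to the `Trainer` as `compute_metrics=compute_metrics`, this yields the accuracy and F1 columns logged at each evaluation step.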
@@ -40,38 +40,34 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate:
-- train_batch_size:
-- eval_batch_size:
+- learning_rate: 5e-05
+- train_batch_size: 256
+- eval_batch_size: 256
 - seed: 42
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
-- lr_scheduler_warmup_ratio: 0.
+- lr_scheduler_warmup_ratio: 0.1
 - num_epochs: 5
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 |
 |:-------------:|:------:|:----:|:---------------:|:--------:|:------:|
-
-
-
-
-
-
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.0268 | 4.1451 | 1600 | 0.5958 | 0.9399 | 0.9396 |
-| 0.007 | 4.4041 | 1700 | 0.5955 | 0.9379 | 0.9377 |
-| 0.0052 | 4.6632 | 1800 | 0.6330 | 0.9400 | 0.9397 |
-| 0.0049 | 4.9223 | 1900 | 0.6234 | 0.9414 | 0.9411 |
+| 5.0513 | 0.3333 | 226 | 4.6666 | 0.0150 | 0.0139 |
+| 2.9839 | 0.6667 | 452 | 2.4637 | 0.2933 | 0.3601 |
+| 2.0766 | 1.0 | 678 | 1.8938 | 0.4410 | 0.5005 |
+| 1.5464 | 1.3333 | 904 | 1.6542 | 0.4547 | 0.5265 |
+| 1.4301 | 1.6667 | 1130 | 1.4822 | 0.4976 | 0.5625 |
+| 1.2864 | 2.0 | 1356 | 1.3587 | 0.4388 | 0.5155 |
+| 0.7659 | 2.3333 | 1582 | 1.2553 | 0.5637 | 0.6038 |
+| 0.7489 | 2.6667 | 1808 | 1.1776 | 0.5639 | 0.6072 |
+| 0.658 | 3.0 | 2034 | 1.1178 | 0.5851 | 0.6249 |
+| 0.3545 | 3.3333 | 2260 | 1.0968 | 0.6086 | 0.6372 |
+| 0.3468 | 3.6667 | 2486 | 1.1013 | 0.6502 | 0.6693 |
+| 0.3072 | 4.0 | 2712 | 1.0774 | 0.6637 | 0.6816 |
+| 0.1741 | 4.3333 | 2938 | 1.1204 | 0.6946 | 0.7043 |
+| 0.1531 | 4.6667 | 3164 | 1.1361 | 0.7065 | 0.7134 |
+| 0.1556 | 5.0 | 3390 | 1.1409 | 0.7115 | 0.7184 |
 
 
 ### Framework versions
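For reference, the hyperparameter list in the diff maps onto a `transformers.TrainingArguments` configuration roughly as sketched below. This is a hedged reconstruction, not the author's actual script: the output directory name is invented, and whether the batch size of 256 is per device or global is not stated in the card.

```python
# Hypothetical reconstruction of the training configuration implied by the
# hyperparameter list above; names not present in the card are assumptions.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="modernbert-base-finetuned",  # assumed, not named in the card
    learning_rate=5e-05,
    per_device_train_batch_size=256,   # card: train_batch_size: 256
    per_device_eval_batch_size=256,    # card: eval_batch_size: 256
    num_train_epochs=5,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    optim="adamw_torch",               # OptimizerNames.ADAMW_TORCH
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    seed=42,
)
```

These values are also consistent with the new results table: 3390 total steps over 5 epochs gives 678 optimizer steps per epoch, with an evaluation every 226 steps (one third of an epoch).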