2023-10-17 09:03:10,253 ----------------------------------------------------------------------------------------------------
2023-10-17 09:03:10,254 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 09:03:10,254 ----------------------------------------------------------------------------------------------------
2023-10-17 09:03:10,254 MultiCorpus: 1100 train + 206 dev + 240 test sentences
- NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-17 09:03:10,254 ----------------------------------------------------------------------------------------------------
2023-10-17 09:03:10,254 Train: 1100 sentences
2023-10-17 09:03:10,255 (train_with_dev=False, train_with_test=False)
2023-10-17 09:03:10,255 ----------------------------------------------------------------------------------------------------
2023-10-17 09:03:10,255 Training Params:
2023-10-17 09:03:10,255 - learning_rate: "5e-05"
2023-10-17 09:03:10,255 - mini_batch_size: "8"
2023-10-17 09:03:10,255 - max_epochs: "10"
2023-10-17 09:03:10,255 - shuffle: "True"
2023-10-17 09:03:10,255 ----------------------------------------------------------------------------------------------------
2023-10-17 09:03:10,255 Plugins:
2023-10-17 09:03:10,255 - TensorboardLogger
2023-10-17 09:03:10,255 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 09:03:10,255 ----------------------------------------------------------------------------------------------------
2023-10-17 09:03:10,255 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 09:03:10,255 - metric: "('micro avg', 'f1-score')"
2023-10-17 09:03:10,255 ----------------------------------------------------------------------------------------------------
2023-10-17 09:03:10,255 Computation:
2023-10-17 09:03:10,255 - compute on device: cuda:0
2023-10-17 09:03:10,255 - embedding storage: none
2023-10-17 09:03:10,255 ----------------------------------------------------------------------------------------------------
2023-10-17 09:03:10,255 Model training base path: "hmbench-ajmc/de-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-17 09:03:10,255 ----------------------------------------------------------------------------------------------------
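The LinearScheduler plugin listed above (warmup_fraction 0.1) ramps the learning rate up over the first 10% of steps and then decays it linearly to zero, which is what the lr column in the per-iteration lines below shows: rising toward 5e-05 during epoch 1, then falling to 0 by the end of epoch 10. A minimal sketch of that schedule in plain Python (not Flair's actual implementation), assuming 138 iterations per epoch × 10 epochs = 1380 total steps:

```python
def linear_schedule(step, total_steps=1380, peak_lr=5e-5, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to 0 (illustrative sketch)."""
    warmup_steps = int(total_steps * warmup_fraction)  # 138 steps for this run
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)
```

For example, `linear_schedule(130)` gives roughly 4.7e-05, consistent with the lr logged at epoch 1, iter 130/138.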
2023-10-17 09:03:10,255 ----------------------------------------------------------------------------------------------------
2023-10-17 09:03:10,255 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 09:03:10,990 epoch 1 - iter 13/138 - loss 4.34449325 - time (sec): 0.73 - samples/sec: 3126.64 - lr: 0.000004 - momentum: 0.000000
2023-10-17 09:03:11,759 epoch 1 - iter 26/138 - loss 3.83895096 - time (sec): 1.50 - samples/sec: 3031.85 - lr: 0.000009 - momentum: 0.000000
2023-10-17 09:03:12,464 epoch 1 - iter 39/138 - loss 3.09034858 - time (sec): 2.21 - samples/sec: 2996.99 - lr: 0.000014 - momentum: 0.000000
2023-10-17 09:03:13,157 epoch 1 - iter 52/138 - loss 2.61613044 - time (sec): 2.90 - samples/sec: 2949.59 - lr: 0.000018 - momentum: 0.000000
2023-10-17 09:03:13,918 epoch 1 - iter 65/138 - loss 2.22112321 - time (sec): 3.66 - samples/sec: 2982.26 - lr: 0.000023 - momentum: 0.000000
2023-10-17 09:03:14,650 epoch 1 - iter 78/138 - loss 1.94183633 - time (sec): 4.39 - samples/sec: 2977.37 - lr: 0.000028 - momentum: 0.000000
2023-10-17 09:03:15,432 epoch 1 - iter 91/138 - loss 1.72480160 - time (sec): 5.18 - samples/sec: 2978.25 - lr: 0.000033 - momentum: 0.000000
2023-10-17 09:03:16,197 epoch 1 - iter 104/138 - loss 1.55787388 - time (sec): 5.94 - samples/sec: 2996.16 - lr: 0.000037 - momentum: 0.000000
2023-10-17 09:03:16,905 epoch 1 - iter 117/138 - loss 1.43993761 - time (sec): 6.65 - samples/sec: 2966.71 - lr: 0.000042 - momentum: 0.000000
2023-10-17 09:03:17,610 epoch 1 - iter 130/138 - loss 1.34684571 - time (sec): 7.35 - samples/sec: 2912.05 - lr: 0.000047 - momentum: 0.000000
2023-10-17 09:03:18,066 ----------------------------------------------------------------------------------------------------
2023-10-17 09:03:18,066 EPOCH 1 done: loss 1.2826 - lr: 0.000047
2023-10-17 09:03:18,592 DEV : loss 0.20709875226020813 - f1-score (micro avg) 0.711
2023-10-17 09:03:18,597 saving best model
2023-10-17 09:03:18,928 ----------------------------------------------------------------------------------------------------
2023-10-17 09:03:19,627 epoch 2 - iter 13/138 - loss 0.21034731 - time (sec): 0.70 - samples/sec: 3012.15 - lr: 0.000050 - momentum: 0.000000
2023-10-17 09:03:20,325 epoch 2 - iter 26/138 - loss 0.24311320 - time (sec): 1.40 - samples/sec: 2932.03 - lr: 0.000049 - momentum: 0.000000
2023-10-17 09:03:21,069 epoch 2 - iter 39/138 - loss 0.21165363 - time (sec): 2.14 - samples/sec: 2924.06 - lr: 0.000048 - momentum: 0.000000
2023-10-17 09:03:21,783 epoch 2 - iter 52/138 - loss 0.20735016 - time (sec): 2.85 - samples/sec: 2909.81 - lr: 0.000048 - momentum: 0.000000
2023-10-17 09:03:22,527 epoch 2 - iter 65/138 - loss 0.20561587 - time (sec): 3.60 - samples/sec: 2859.69 - lr: 0.000047 - momentum: 0.000000
2023-10-17 09:03:23,248 epoch 2 - iter 78/138 - loss 0.20530059 - time (sec): 4.32 - samples/sec: 2882.50 - lr: 0.000047 - momentum: 0.000000
2023-10-17 09:03:23,990 epoch 2 - iter 91/138 - loss 0.19556341 - time (sec): 5.06 - samples/sec: 2910.66 - lr: 0.000046 - momentum: 0.000000
2023-10-17 09:03:24,708 epoch 2 - iter 104/138 - loss 0.18887887 - time (sec): 5.78 - samples/sec: 2914.79 - lr: 0.000046 - momentum: 0.000000
2023-10-17 09:03:25,493 epoch 2 - iter 117/138 - loss 0.18611780 - time (sec): 6.56 - samples/sec: 2921.95 - lr: 0.000045 - momentum: 0.000000
2023-10-17 09:03:26,271 epoch 2 - iter 130/138 - loss 0.18634067 - time (sec): 7.34 - samples/sec: 2920.06 - lr: 0.000045 - momentum: 0.000000
2023-10-17 09:03:26,701 ----------------------------------------------------------------------------------------------------
2023-10-17 09:03:26,702 EPOCH 2 done: loss 0.1829 - lr: 0.000045
2023-10-17 09:03:27,332 DEV : loss 0.14379316568374634 - f1-score (micro avg) 0.8042
2023-10-17 09:03:27,336 saving best model
2023-10-17 09:03:27,767 ----------------------------------------------------------------------------------------------------
2023-10-17 09:03:28,510 epoch 3 - iter 13/138 - loss 0.12890463 - time (sec): 0.74 - samples/sec: 3213.75 - lr: 0.000044 - momentum: 0.000000
2023-10-17 09:03:29,238 epoch 3 - iter 26/138 - loss 0.11806899 - time (sec): 1.47 - samples/sec: 3016.31 - lr: 0.000043 - momentum: 0.000000
2023-10-17 09:03:30,014 epoch 3 - iter 39/138 - loss 0.10922521 - time (sec): 2.24 - samples/sec: 2969.87 - lr: 0.000043 - momentum: 0.000000
2023-10-17 09:03:30,763 epoch 3 - iter 52/138 - loss 0.10655768 - time (sec): 2.99 - samples/sec: 2917.52 - lr: 0.000042 - momentum: 0.000000
2023-10-17 09:03:31,495 epoch 3 - iter 65/138 - loss 0.09842078 - time (sec): 3.72 - samples/sec: 2956.63 - lr: 0.000042 - momentum: 0.000000
2023-10-17 09:03:32,209 epoch 3 - iter 78/138 - loss 0.09600035 - time (sec): 4.44 - samples/sec: 2939.18 - lr: 0.000041 - momentum: 0.000000
2023-10-17 09:03:32,908 epoch 3 - iter 91/138 - loss 0.10207986 - time (sec): 5.14 - samples/sec: 2935.96 - lr: 0.000041 - momentum: 0.000000
2023-10-17 09:03:33,651 epoch 3 - iter 104/138 - loss 0.10217768 - time (sec): 5.88 - samples/sec: 2932.07 - lr: 0.000040 - momentum: 0.000000
2023-10-17 09:03:34,426 epoch 3 - iter 117/138 - loss 0.10814331 - time (sec): 6.66 - samples/sec: 2947.89 - lr: 0.000040 - momentum: 0.000000
2023-10-17 09:03:35,118 epoch 3 - iter 130/138 - loss 0.10909505 - time (sec): 7.35 - samples/sec: 2926.12 - lr: 0.000039 - momentum: 0.000000
2023-10-17 09:03:35,580 ----------------------------------------------------------------------------------------------------
2023-10-17 09:03:35,580 EPOCH 3 done: loss 0.1061 - lr: 0.000039
2023-10-17 09:03:36,403 DEV : loss 0.14102572202682495 - f1-score (micro avg) 0.8478
2023-10-17 09:03:36,407 saving best model
2023-10-17 09:03:36,842 ----------------------------------------------------------------------------------------------------
2023-10-17 09:03:37,528 epoch 4 - iter 13/138 - loss 0.07443020 - time (sec): 0.68 - samples/sec: 2740.98 - lr: 0.000038 - momentum: 0.000000
2023-10-17 09:03:38,265 epoch 4 - iter 26/138 - loss 0.06219799 - time (sec): 1.42 - samples/sec: 2754.75 - lr: 0.000038 - momentum: 0.000000
2023-10-17 09:03:39,012 epoch 4 - iter 39/138 - loss 0.06022158 - time (sec): 2.16 - samples/sec: 2877.82 - lr: 0.000037 - momentum: 0.000000
2023-10-17 09:03:39,785 epoch 4 - iter 52/138 - loss 0.07090445 - time (sec): 2.94 - samples/sec: 2904.20 - lr: 0.000037 - momentum: 0.000000
2023-10-17 09:03:40,568 epoch 4 - iter 65/138 - loss 0.07015438 - time (sec): 3.72 - samples/sec: 2932.61 - lr: 0.000036 - momentum: 0.000000
2023-10-17 09:03:41,297 epoch 4 - iter 78/138 - loss 0.07353590 - time (sec): 4.45 - samples/sec: 2908.83 - lr: 0.000036 - momentum: 0.000000
2023-10-17 09:03:42,053 epoch 4 - iter 91/138 - loss 0.07475477 - time (sec): 5.20 - samples/sec: 2919.31 - lr: 0.000035 - momentum: 0.000000
2023-10-17 09:03:42,784 epoch 4 - iter 104/138 - loss 0.07841478 - time (sec): 5.94 - samples/sec: 2920.66 - lr: 0.000035 - momentum: 0.000000
2023-10-17 09:03:43,571 epoch 4 - iter 117/138 - loss 0.07206102 - time (sec): 6.72 - samples/sec: 2956.43 - lr: 0.000034 - momentum: 0.000000
2023-10-17 09:03:44,266 epoch 4 - iter 130/138 - loss 0.07223569 - time (sec): 7.42 - samples/sec: 2939.55 - lr: 0.000034 - momentum: 0.000000
2023-10-17 09:03:44,685 ----------------------------------------------------------------------------------------------------
2023-10-17 09:03:44,685 EPOCH 4 done: loss 0.0738 - lr: 0.000034
2023-10-17 09:03:45,320 DEV : loss 0.15711645781993866 - f1-score (micro avg) 0.8446
2023-10-17 09:03:45,324 ----------------------------------------------------------------------------------------------------
2023-10-17 09:03:46,095 epoch 5 - iter 13/138 - loss 0.03438179 - time (sec): 0.77 - samples/sec: 2790.72 - lr: 0.000033 - momentum: 0.000000
2023-10-17 09:03:46,823 epoch 5 - iter 26/138 - loss 0.06241037 - time (sec): 1.50 - samples/sec: 2957.80 - lr: 0.000032 - momentum: 0.000000
2023-10-17 09:03:47,586 epoch 5 - iter 39/138 - loss 0.05133553 - time (sec): 2.26 - samples/sec: 3061.93 - lr: 0.000032 - momentum: 0.000000
2023-10-17 09:03:48,312 epoch 5 - iter 52/138 - loss 0.05594041 - time (sec): 2.99 - samples/sec: 3020.65 - lr: 0.000031 - momentum: 0.000000
2023-10-17 09:03:49,045 epoch 5 - iter 65/138 - loss 0.05316076 - time (sec): 3.72 - samples/sec: 2995.00 - lr: 0.000031 - momentum: 0.000000
2023-10-17 09:03:49,804 epoch 5 - iter 78/138 - loss 0.05399155 - time (sec): 4.48 - samples/sec: 2990.60 - lr: 0.000030 - momentum: 0.000000
2023-10-17 09:03:50,520 epoch 5 - iter 91/138 - loss 0.05246937 - time (sec): 5.19 - samples/sec: 2974.13 - lr: 0.000030 - momentum: 0.000000
2023-10-17 09:03:51,274 epoch 5 - iter 104/138 - loss 0.05553051 - time (sec): 5.95 - samples/sec: 2937.01 - lr: 0.000029 - momentum: 0.000000
2023-10-17 09:03:51,989 epoch 5 - iter 117/138 - loss 0.05441623 - time (sec): 6.66 - samples/sec: 2920.34 - lr: 0.000029 - momentum: 0.000000
2023-10-17 09:03:52,703 epoch 5 - iter 130/138 - loss 0.05160270 - time (sec): 7.38 - samples/sec: 2916.30 - lr: 0.000028 - momentum: 0.000000
2023-10-17 09:03:53,151 ----------------------------------------------------------------------------------------------------
2023-10-17 09:03:53,152 EPOCH 5 done: loss 0.0497 - lr: 0.000028
2023-10-17 09:03:53,792 DEV : loss 0.14954015612602234 - f1-score (micro avg) 0.8791
2023-10-17 09:03:53,797 saving best model
2023-10-17 09:03:54,232 ----------------------------------------------------------------------------------------------------
2023-10-17 09:03:54,942 epoch 6 - iter 13/138 - loss 0.03362713 - time (sec): 0.71 - samples/sec: 3038.29 - lr: 0.000027 - momentum: 0.000000
2023-10-17 09:03:55,661 epoch 6 - iter 26/138 - loss 0.02705287 - time (sec): 1.43 - samples/sec: 3123.13 - lr: 0.000027 - momentum: 0.000000
2023-10-17 09:03:56,402 epoch 6 - iter 39/138 - loss 0.03086305 - time (sec): 2.17 - samples/sec: 3099.08 - lr: 0.000026 - momentum: 0.000000
2023-10-17 09:03:57,104 epoch 6 - iter 52/138 - loss 0.05033083 - time (sec): 2.87 - samples/sec: 3042.48 - lr: 0.000026 - momentum: 0.000000
2023-10-17 09:03:57,897 epoch 6 - iter 65/138 - loss 0.05051840 - time (sec): 3.66 - samples/sec: 2939.33 - lr: 0.000025 - momentum: 0.000000
2023-10-17 09:03:58,668 epoch 6 - iter 78/138 - loss 0.04905528 - time (sec): 4.43 - samples/sec: 2938.88 - lr: 0.000025 - momentum: 0.000000
2023-10-17 09:03:59,475 epoch 6 - iter 91/138 - loss 0.04919640 - time (sec): 5.24 - samples/sec: 2918.76 - lr: 0.000024 - momentum: 0.000000
2023-10-17 09:04:00,211 epoch 6 - iter 104/138 - loss 0.04772371 - time (sec): 5.98 - samples/sec: 2942.25 - lr: 0.000024 - momentum: 0.000000
2023-10-17 09:04:00,951 epoch 6 - iter 117/138 - loss 0.04568019 - time (sec): 6.72 - samples/sec: 2894.92 - lr: 0.000023 - momentum: 0.000000
2023-10-17 09:04:01,748 epoch 6 - iter 130/138 - loss 0.04357845 - time (sec): 7.51 - samples/sec: 2877.56 - lr: 0.000023 - momentum: 0.000000
2023-10-17 09:04:02,186 ----------------------------------------------------------------------------------------------------
2023-10-17 09:04:02,186 EPOCH 6 done: loss 0.0424 - lr: 0.000023
2023-10-17 09:04:02,828 DEV : loss 0.1787448674440384 - f1-score (micro avg) 0.8634
2023-10-17 09:04:02,832 ----------------------------------------------------------------------------------------------------
2023-10-17 09:04:03,603 epoch 7 - iter 13/138 - loss 0.01345038 - time (sec): 0.77 - samples/sec: 2784.96 - lr: 0.000022 - momentum: 0.000000
2023-10-17 09:04:04,352 epoch 7 - iter 26/138 - loss 0.03959321 - time (sec): 1.52 - samples/sec: 2830.58 - lr: 0.000021 - momentum: 0.000000
2023-10-17 09:04:05,150 epoch 7 - iter 39/138 - loss 0.03210217 - time (sec): 2.32 - samples/sec: 2785.96 - lr: 0.000021 - momentum: 0.000000
2023-10-17 09:04:05,928 epoch 7 - iter 52/138 - loss 0.03120905 - time (sec): 3.09 - samples/sec: 2771.80 - lr: 0.000020 - momentum: 0.000000
2023-10-17 09:04:06,731 epoch 7 - iter 65/138 - loss 0.03554875 - time (sec): 3.90 - samples/sec: 2758.78 - lr: 0.000020 - momentum: 0.000000
2023-10-17 09:04:07,522 epoch 7 - iter 78/138 - loss 0.03251758 - time (sec): 4.69 - samples/sec: 2730.50 - lr: 0.000019 - momentum: 0.000000
2023-10-17 09:04:08,268 epoch 7 - iter 91/138 - loss 0.03389760 - time (sec): 5.43 - samples/sec: 2745.50 - lr: 0.000019 - momentum: 0.000000
2023-10-17 09:04:09,059 epoch 7 - iter 104/138 - loss 0.03654826 - time (sec): 6.23 - samples/sec: 2761.83 - lr: 0.000018 - momentum: 0.000000
2023-10-17 09:04:09,848 epoch 7 - iter 117/138 - loss 0.03670270 - time (sec): 7.01 - samples/sec: 2772.04 - lr: 0.000018 - momentum: 0.000000
2023-10-17 09:04:10,648 epoch 7 - iter 130/138 - loss 0.03440902 - time (sec): 7.81 - samples/sec: 2766.88 - lr: 0.000017 - momentum: 0.000000
2023-10-17 09:04:11,122 ----------------------------------------------------------------------------------------------------
2023-10-17 09:04:11,122 EPOCH 7 done: loss 0.0333 - lr: 0.000017
2023-10-17 09:04:11,754 DEV : loss 0.19050779938697815 - f1-score (micro avg) 0.8612
2023-10-17 09:04:11,759 ----------------------------------------------------------------------------------------------------
2023-10-17 09:04:12,517 epoch 8 - iter 13/138 - loss 0.02015145 - time (sec): 0.76 - samples/sec: 3017.27 - lr: 0.000016 - momentum: 0.000000
2023-10-17 09:04:13,301 epoch 8 - iter 26/138 - loss 0.01308457 - time (sec): 1.54 - samples/sec: 2921.55 - lr: 0.000016 - momentum: 0.000000
2023-10-17 09:04:14,009 epoch 8 - iter 39/138 - loss 0.01692323 - time (sec): 2.25 - samples/sec: 2939.06 - lr: 0.000015 - momentum: 0.000000
2023-10-17 09:04:14,756 epoch 8 - iter 52/138 - loss 0.02947952 - time (sec): 3.00 - samples/sec: 2898.09 - lr: 0.000015 - momentum: 0.000000
2023-10-17 09:04:15,463 epoch 8 - iter 65/138 - loss 0.03160918 - time (sec): 3.70 - samples/sec: 2867.44 - lr: 0.000014 - momentum: 0.000000
2023-10-17 09:04:16,247 epoch 8 - iter 78/138 - loss 0.02725271 - time (sec): 4.49 - samples/sec: 2872.47 - lr: 0.000014 - momentum: 0.000000
2023-10-17 09:04:16,977 epoch 8 - iter 91/138 - loss 0.02765796 - time (sec): 5.22 - samples/sec: 2888.80 - lr: 0.000013 - momentum: 0.000000
2023-10-17 09:04:17,766 epoch 8 - iter 104/138 - loss 0.02763216 - time (sec): 6.01 - samples/sec: 2867.17 - lr: 0.000013 - momentum: 0.000000
2023-10-17 09:04:18,500 epoch 8 - iter 117/138 - loss 0.02611747 - time (sec): 6.74 - samples/sec: 2881.75 - lr: 0.000012 - momentum: 0.000000
2023-10-17 09:04:19,271 epoch 8 - iter 130/138 - loss 0.02490687 - time (sec): 7.51 - samples/sec: 2887.73 - lr: 0.000012 - momentum: 0.000000
2023-10-17 09:04:19,726 ----------------------------------------------------------------------------------------------------
2023-10-17 09:04:19,726 EPOCH 8 done: loss 0.0239 - lr: 0.000012
2023-10-17 09:04:20,376 DEV : loss 0.19482243061065674 - f1-score (micro avg) 0.8705
2023-10-17 09:04:20,381 ----------------------------------------------------------------------------------------------------
2023-10-17 09:04:21,102 epoch 9 - iter 13/138 - loss 0.02070592 - time (sec): 0.72 - samples/sec: 3060.82 - lr: 0.000011 - momentum: 0.000000
2023-10-17 09:04:21,845 epoch 9 - iter 26/138 - loss 0.01756733 - time (sec): 1.46 - samples/sec: 2870.82 - lr: 0.000010 - momentum: 0.000000
2023-10-17 09:04:22,610 epoch 9 - iter 39/138 - loss 0.01386961 - time (sec): 2.23 - samples/sec: 2860.04 - lr: 0.000010 - momentum: 0.000000
2023-10-17 09:04:23,375 epoch 9 - iter 52/138 - loss 0.01733262 - time (sec): 2.99 - samples/sec: 2821.82 - lr: 0.000009 - momentum: 0.000000
2023-10-17 09:04:24,190 epoch 9 - iter 65/138 - loss 0.01574975 - time (sec): 3.81 - samples/sec: 2807.04 - lr: 0.000009 - momentum: 0.000000
2023-10-17 09:04:24,914 epoch 9 - iter 78/138 - loss 0.01757397 - time (sec): 4.53 - samples/sec: 2796.47 - lr: 0.000008 - momentum: 0.000000
2023-10-17 09:04:25,711 epoch 9 - iter 91/138 - loss 0.01639540 - time (sec): 5.33 - samples/sec: 2798.21 - lr: 0.000008 - momentum: 0.000000
2023-10-17 09:04:26,548 epoch 9 - iter 104/138 - loss 0.01521930 - time (sec): 6.17 - samples/sec: 2843.22 - lr: 0.000007 - momentum: 0.000000
2023-10-17 09:04:27,303 epoch 9 - iter 117/138 - loss 0.01625791 - time (sec): 6.92 - samples/sec: 2843.39 - lr: 0.000007 - momentum: 0.000000
2023-10-17 09:04:28,016 epoch 9 - iter 130/138 - loss 0.01675729 - time (sec): 7.63 - samples/sec: 2831.88 - lr: 0.000006 - momentum: 0.000000
2023-10-17 09:04:28,443 ----------------------------------------------------------------------------------------------------
2023-10-17 09:04:28,443 EPOCH 9 done: loss 0.0159 - lr: 0.000006
2023-10-17 09:04:29,100 DEV : loss 0.19875562191009521 - f1-score (micro avg) 0.8768
2023-10-17 09:04:29,104 ----------------------------------------------------------------------------------------------------
2023-10-17 09:04:29,798 epoch 10 - iter 13/138 - loss 0.01526788 - time (sec): 0.69 - samples/sec: 2775.72 - lr: 0.000005 - momentum: 0.000000
2023-10-17 09:04:30,535 epoch 10 - iter 26/138 - loss 0.01801766 - time (sec): 1.43 - samples/sec: 2765.16 - lr: 0.000005 - momentum: 0.000000
2023-10-17 09:04:31,279 epoch 10 - iter 39/138 - loss 0.01657569 - time (sec): 2.17 - samples/sec: 2768.90 - lr: 0.000004 - momentum: 0.000000
2023-10-17 09:04:32,038 epoch 10 - iter 52/138 - loss 0.01319114 - time (sec): 2.93 - samples/sec: 2899.01 - lr: 0.000004 - momentum: 0.000000
2023-10-17 09:04:32,757 epoch 10 - iter 65/138 - loss 0.01757314 - time (sec): 3.65 - samples/sec: 2934.77 - lr: 0.000003 - momentum: 0.000000
2023-10-17 09:04:33,531 epoch 10 - iter 78/138 - loss 0.02160599 - time (sec): 4.43 - samples/sec: 2946.04 - lr: 0.000003 - momentum: 0.000000
2023-10-17 09:04:34,267 epoch 10 - iter 91/138 - loss 0.01964171 - time (sec): 5.16 - samples/sec: 2933.30 - lr: 0.000002 - momentum: 0.000000
2023-10-17 09:04:34,980 epoch 10 - iter 104/138 - loss 0.01723832 - time (sec): 5.87 - samples/sec: 2944.09 - lr: 0.000002 - momentum: 0.000000
2023-10-17 09:04:35,710 epoch 10 - iter 117/138 - loss 0.01535274 - time (sec): 6.60 - samples/sec: 2943.25 - lr: 0.000001 - momentum: 0.000000
2023-10-17 09:04:36,483 epoch 10 - iter 130/138 - loss 0.01649454 - time (sec): 7.38 - samples/sec: 2928.83 - lr: 0.000000 - momentum: 0.000000
2023-10-17 09:04:36,938 ----------------------------------------------------------------------------------------------------
2023-10-17 09:04:36,938 EPOCH 10 done: loss 0.0158 - lr: 0.000000
2023-10-17 09:04:37,613 DEV : loss 0.20341137051582336 - f1-score (micro avg) 0.8723
2023-10-17 09:04:37,967 ----------------------------------------------------------------------------------------------------
2023-10-17 09:04:37,968 Loading model from best epoch ...
2023-10-17 09:04:39,326 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
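The 25 tags above follow the BIOES scheme (S- for single-token entities, B-/I-/E- for multi-token spans, O for outside) over the six entity types of this corpus. A small illustrative decoder for such tag sequences, written from scratch here rather than taken from Flair:

```python
def decode_bioes(tags):
    """Decode a BIOES tag sequence into (label, start, end) spans, end inclusive."""
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        prefix, _, name = tag.partition("-")
        if prefix == "S":                      # single-token entity
            spans.append((name, i, i))
            start = label = None
        elif prefix == "B":                    # span begins
            start, label = i, name
        elif prefix == "E" and label == name:  # span ends: emit it
            spans.append((label, start, i))
            start = label = None
        elif prefix == "I" and label == name:  # span continues
            continue
        else:                                  # "O" or inconsistent tag: drop open span
            start = label = None
    return spans
```

For instance, `decode_bioes(["O", "S-pers", "B-loc", "I-loc", "E-loc", "O"])` yields `[("pers", 1, 1), ("loc", 2, 4)]`.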
2023-10-17 09:04:40,131
Results:
- F-score (micro) 0.9216
- F-score (macro) 0.6862
- Accuracy 0.8589
By class:
              precision    recall  f1-score   support

       scope     0.9302    0.9091    0.9195       176
        pers     0.9758    0.9453    0.9603       128
        work     0.8904    0.8784    0.8844        74
      object     0.0000    0.0000    0.0000         2
         loc     1.0000    0.5000    0.6667         2

   micro avg     0.9353    0.9084    0.9216       382
   macro avg     0.7593    0.6466    0.6862       382
weighted avg     0.9333    0.9084    0.9202       382
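The macro and weighted averages follow directly from the per-class rows and their supports, so they can be recomputed as a quick sanity check (plain Python, values copied from the evaluation table above):

```python
# (precision, recall, f1, support) per class, copied from the final evaluation
per_class = {
    "scope":  (0.9302, 0.9091, 0.9195, 176),
    "pers":   (0.9758, 0.9453, 0.9603, 128),
    "work":   (0.8904, 0.8784, 0.8844, 74),
    "object": (0.0000, 0.0000, 0.0000, 2),
    "loc":    (1.0000, 0.5000, 0.6667, 2),
}

total = sum(s for *_, s in per_class.values())  # 382 entities
# macro avg: unweighted mean of per-class F1
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)
# weighted avg: per-class F1 weighted by support
weighted_f1 = sum(f1 * s for _, _, f1, s in per_class.values()) / total
# macro_f1 ≈ 0.6862, weighted_f1 ≈ 0.9202, matching the log
```

Note how the two rare classes (object and loc, with 2 entities each) drag the macro average far below the micro and weighted averages.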
2023-10-17 09:04:40,131 ----------------------------------------------------------------------------------------------------