2023-10-17 08:33:05,815 ----------------------------------------------------------------------------------------------------
2023-10-17 08:33:05,816 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): ElectraModel(
(embeddings): ElectraEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): ElectraEncoder(
(layer): ModuleList(
(0-11): 12 x ElectraLayer(
(attention): ElectraAttention(
(self): ElectraSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): ElectraSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): ElectraIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): ElectraOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=25, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-17 08:33:05,816 ----------------------------------------------------------------------------------------------------
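The model dump above ends in a 25-way linear head on top of 768-dimensional ELECTRA embeddings. That output size follows from the BIOES tagging scheme over the six entity types this model predicts (scope, pers, work, loc, object, date, as printed in the tag dictionary after training). A minimal sketch of the arithmetic, assuming the standard single-O-tag BIOES layout:

```python
# Why the SequenceTagger head is Linear(in_features=768, out_features=25):
# each entity type gets four BIOES tags (S-ingle, B-egin, E-nd, I-nside),
# plus one shared "O" (outside) tag.
entity_types = ["scope", "pers", "work", "loc", "object", "date"]
bioes_prefixes = ["S", "B", "E", "I"]

tags = ["O"] + [f"{p}-{t}" for t in entity_types for p in bioes_prefixes]

assert len(tags) == len(entity_types) * len(bioes_prefixes) + 1 == 25
```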
2023-10-17 08:33:05,817 MultiCorpus: 1100 train + 206 dev + 240 test sentences
- NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-17 08:33:05,817 ----------------------------------------------------------------------------------------------------
2023-10-17 08:33:05,817 Train: 1100 sentences
2023-10-17 08:33:05,817 (train_with_dev=False, train_with_test=False)
2023-10-17 08:33:05,817 ----------------------------------------------------------------------------------------------------
2023-10-17 08:33:05,817 Training Params:
2023-10-17 08:33:05,817 - learning_rate: "3e-05"
2023-10-17 08:33:05,817 - mini_batch_size: "8"
2023-10-17 08:33:05,817 - max_epochs: "10"
2023-10-17 08:33:05,817 - shuffle: "True"
2023-10-17 08:33:05,817 ----------------------------------------------------------------------------------------------------
2023-10-17 08:33:05,817 Plugins:
2023-10-17 08:33:05,817 - TensorboardLogger
2023-10-17 08:33:05,817 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 08:33:05,817 ----------------------------------------------------------------------------------------------------
2023-10-17 08:33:05,817 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 08:33:05,817 - metric: "('micro avg', 'f1-score')"
2023-10-17 08:33:05,817 ----------------------------------------------------------------------------------------------------
2023-10-17 08:33:05,817 Computation:
2023-10-17 08:33:05,817 - compute on device: cuda:0
2023-10-17 08:33:05,817 - embedding storage: none
2023-10-17 08:33:05,817 ----------------------------------------------------------------------------------------------------
2023-10-17 08:33:05,817 Model training base path: "hmbench-ajmc/de-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-17 08:33:05,817 ----------------------------------------------------------------------------------------------------
2023-10-17 08:33:05,817 ----------------------------------------------------------------------------------------------------
2023-10-17 08:33:05,817 Logging anything other than scalars to TensorBoard is currently not supported.
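The lr values in the per-iteration lines below follow the LinearScheduler plugin configured above: linear warmup over the first warmup_fraction = 0.1 of all steps (10 epochs x 138 batches = 1380 steps, so 138 warmup steps) up to the peak learning_rate = 3e-05, then linear decay to zero. A minimal sketch of that schedule (the function name `linear_schedule` is illustrative, not Flair's API, and the exact step-offset convention is an assumption):

```python
def linear_schedule(step: int, peak_lr: float = 3e-05,
                    total_steps: int = 10 * 138,
                    warmup_fraction: float = 0.1) -> float:
    """Linear warmup to peak_lr, then linear decay to 0."""
    warmup_steps = int(total_steps * warmup_fraction)  # 138 steps here
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# Matches the logged values: lr ~= 0.000028 at epoch 1, iter 130 (step 130),
# lr ~= 0.000030 just after warmup (epoch 2, iter 13 = step 151),
# and lr -> 0.000000 at the end of epoch 10.
```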
2023-10-17 08:33:06,552 epoch 1 - iter 13/138 - loss 4.28753939 - time (sec): 0.73 - samples/sec: 2901.96 - lr: 0.000003 - momentum: 0.000000
2023-10-17 08:33:07,279 epoch 1 - iter 26/138 - loss 3.98009475 - time (sec): 1.46 - samples/sec: 2766.78 - lr: 0.000005 - momentum: 0.000000
2023-10-17 08:33:07,999 epoch 1 - iter 39/138 - loss 3.42450071 - time (sec): 2.18 - samples/sec: 2765.31 - lr: 0.000008 - momentum: 0.000000
2023-10-17 08:33:08,759 epoch 1 - iter 52/138 - loss 2.85925833 - time (sec): 2.94 - samples/sec: 2734.01 - lr: 0.000011 - momentum: 0.000000
2023-10-17 08:33:09,554 epoch 1 - iter 65/138 - loss 2.38101125 - time (sec): 3.74 - samples/sec: 2773.71 - lr: 0.000014 - momentum: 0.000000
2023-10-17 08:33:10,309 epoch 1 - iter 78/138 - loss 2.08894183 - time (sec): 4.49 - samples/sec: 2797.37 - lr: 0.000017 - momentum: 0.000000
2023-10-17 08:33:11,027 epoch 1 - iter 91/138 - loss 1.84706738 - time (sec): 5.21 - samples/sec: 2834.51 - lr: 0.000020 - momentum: 0.000000
2023-10-17 08:33:11,776 epoch 1 - iter 104/138 - loss 1.66261830 - time (sec): 5.96 - samples/sec: 2849.79 - lr: 0.000022 - momentum: 0.000000
2023-10-17 08:33:12,503 epoch 1 - iter 117/138 - loss 1.51524851 - time (sec): 6.68 - samples/sec: 2880.93 - lr: 0.000025 - momentum: 0.000000
2023-10-17 08:33:13,252 epoch 1 - iter 130/138 - loss 1.40164455 - time (sec): 7.43 - samples/sec: 2886.35 - lr: 0.000028 - momentum: 0.000000
2023-10-17 08:33:13,732 ----------------------------------------------------------------------------------------------------
2023-10-17 08:33:13,732 EPOCH 1 done: loss 1.3423 - lr: 0.000028
2023-10-17 08:33:14,239 DEV : loss 0.24341529607772827 - f1-score (micro avg) 0.6894
2023-10-17 08:33:14,244 saving best model
2023-10-17 08:33:14,586 ----------------------------------------------------------------------------------------------------
2023-10-17 08:33:15,337 epoch 2 - iter 13/138 - loss 0.24064622 - time (sec): 0.75 - samples/sec: 2961.63 - lr: 0.000030 - momentum: 0.000000
2023-10-17 08:33:16,103 epoch 2 - iter 26/138 - loss 0.27935961 - time (sec): 1.52 - samples/sec: 3073.75 - lr: 0.000029 - momentum: 0.000000
2023-10-17 08:33:16,828 epoch 2 - iter 39/138 - loss 0.26559896 - time (sec): 2.24 - samples/sec: 3058.71 - lr: 0.000029 - momentum: 0.000000
2023-10-17 08:33:17,612 epoch 2 - iter 52/138 - loss 0.25213542 - time (sec): 3.02 - samples/sec: 3016.41 - lr: 0.000029 - momentum: 0.000000
2023-10-17 08:33:18,306 epoch 2 - iter 65/138 - loss 0.24071861 - time (sec): 3.72 - samples/sec: 2981.01 - lr: 0.000028 - momentum: 0.000000
2023-10-17 08:33:19,044 epoch 2 - iter 78/138 - loss 0.22995942 - time (sec): 4.46 - samples/sec: 2926.56 - lr: 0.000028 - momentum: 0.000000
2023-10-17 08:33:19,734 epoch 2 - iter 91/138 - loss 0.22101536 - time (sec): 5.15 - samples/sec: 2891.96 - lr: 0.000028 - momentum: 0.000000
2023-10-17 08:33:20,484 epoch 2 - iter 104/138 - loss 0.21519331 - time (sec): 5.90 - samples/sec: 2930.83 - lr: 0.000028 - momentum: 0.000000
2023-10-17 08:33:21,233 epoch 2 - iter 117/138 - loss 0.20822496 - time (sec): 6.65 - samples/sec: 2915.01 - lr: 0.000027 - momentum: 0.000000
2023-10-17 08:33:21,972 epoch 2 - iter 130/138 - loss 0.20635956 - time (sec): 7.38 - samples/sec: 2921.49 - lr: 0.000027 - momentum: 0.000000
2023-10-17 08:33:22,405 ----------------------------------------------------------------------------------------------------
2023-10-17 08:33:22,405 EPOCH 2 done: loss 0.2002 - lr: 0.000027
2023-10-17 08:33:23,030 DEV : loss 0.13450075685977936 - f1-score (micro avg) 0.8252
2023-10-17 08:33:23,034 saving best model
2023-10-17 08:33:23,482 ----------------------------------------------------------------------------------------------------
2023-10-17 08:33:24,264 epoch 3 - iter 13/138 - loss 0.12654998 - time (sec): 0.78 - samples/sec: 2867.55 - lr: 0.000026 - momentum: 0.000000
2023-10-17 08:33:25,000 epoch 3 - iter 26/138 - loss 0.11543664 - time (sec): 1.51 - samples/sec: 2866.92 - lr: 0.000026 - momentum: 0.000000
2023-10-17 08:33:25,724 epoch 3 - iter 39/138 - loss 0.12814653 - time (sec): 2.24 - samples/sec: 2929.11 - lr: 0.000026 - momentum: 0.000000
2023-10-17 08:33:26,434 epoch 3 - iter 52/138 - loss 0.13041800 - time (sec): 2.95 - samples/sec: 2925.34 - lr: 0.000025 - momentum: 0.000000
2023-10-17 08:33:27,160 epoch 3 - iter 65/138 - loss 0.11879921 - time (sec): 3.67 - samples/sec: 2945.18 - lr: 0.000025 - momentum: 0.000000
2023-10-17 08:33:27,853 epoch 3 - iter 78/138 - loss 0.11359335 - time (sec): 4.37 - samples/sec: 2912.66 - lr: 0.000025 - momentum: 0.000000
2023-10-17 08:33:28,602 epoch 3 - iter 91/138 - loss 0.11860361 - time (sec): 5.12 - samples/sec: 2967.10 - lr: 0.000025 - momentum: 0.000000
2023-10-17 08:33:29,309 epoch 3 - iter 104/138 - loss 0.11703615 - time (sec): 5.82 - samples/sec: 2920.50 - lr: 0.000024 - momentum: 0.000000
2023-10-17 08:33:30,063 epoch 3 - iter 117/138 - loss 0.11013480 - time (sec): 6.58 - samples/sec: 2939.18 - lr: 0.000024 - momentum: 0.000000
2023-10-17 08:33:30,800 epoch 3 - iter 130/138 - loss 0.11106985 - time (sec): 7.31 - samples/sec: 2928.86 - lr: 0.000024 - momentum: 0.000000
2023-10-17 08:33:31,262 ----------------------------------------------------------------------------------------------------
2023-10-17 08:33:31,263 EPOCH 3 done: loss 0.1087 - lr: 0.000024
2023-10-17 08:33:31,906 DEV : loss 0.15571120381355286 - f1-score (micro avg) 0.8469
2023-10-17 08:33:31,911 saving best model
2023-10-17 08:33:32,501 ----------------------------------------------------------------------------------------------------
2023-10-17 08:33:33,245 epoch 4 - iter 13/138 - loss 0.10562254 - time (sec): 0.74 - samples/sec: 2882.61 - lr: 0.000023 - momentum: 0.000000
2023-10-17 08:33:33,973 epoch 4 - iter 26/138 - loss 0.09434256 - time (sec): 1.47 - samples/sec: 2888.41 - lr: 0.000023 - momentum: 0.000000
2023-10-17 08:33:34,751 epoch 4 - iter 39/138 - loss 0.07785609 - time (sec): 2.25 - samples/sec: 2978.90 - lr: 0.000022 - momentum: 0.000000
2023-10-17 08:33:35,469 epoch 4 - iter 52/138 - loss 0.07270629 - time (sec): 2.97 - samples/sec: 2939.80 - lr: 0.000022 - momentum: 0.000000
2023-10-17 08:33:36,225 epoch 4 - iter 65/138 - loss 0.07277574 - time (sec): 3.72 - samples/sec: 2951.67 - lr: 0.000022 - momentum: 0.000000
2023-10-17 08:33:36,965 epoch 4 - iter 78/138 - loss 0.07000169 - time (sec): 4.46 - samples/sec: 2960.53 - lr: 0.000021 - momentum: 0.000000
2023-10-17 08:33:37,714 epoch 4 - iter 91/138 - loss 0.07538533 - time (sec): 5.21 - samples/sec: 2949.06 - lr: 0.000021 - momentum: 0.000000
2023-10-17 08:33:38,419 epoch 4 - iter 104/138 - loss 0.07254139 - time (sec): 5.92 - samples/sec: 2913.58 - lr: 0.000021 - momentum: 0.000000
2023-10-17 08:33:39,216 epoch 4 - iter 117/138 - loss 0.07310457 - time (sec): 6.71 - samples/sec: 2936.46 - lr: 0.000021 - momentum: 0.000000
2023-10-17 08:33:39,940 epoch 4 - iter 130/138 - loss 0.07133222 - time (sec): 7.44 - samples/sec: 2904.71 - lr: 0.000020 - momentum: 0.000000
2023-10-17 08:33:40,359 ----------------------------------------------------------------------------------------------------
2023-10-17 08:33:40,359 EPOCH 4 done: loss 0.0759 - lr: 0.000020
2023-10-17 08:33:40,996 DEV : loss 0.14280927181243896 - f1-score (micro avg) 0.8421
2023-10-17 08:33:41,001 ----------------------------------------------------------------------------------------------------
2023-10-17 08:33:41,836 epoch 5 - iter 13/138 - loss 0.03628837 - time (sec): 0.83 - samples/sec: 2936.15 - lr: 0.000020 - momentum: 0.000000
2023-10-17 08:33:42,547 epoch 5 - iter 26/138 - loss 0.03955584 - time (sec): 1.55 - samples/sec: 2817.96 - lr: 0.000019 - momentum: 0.000000
2023-10-17 08:33:43,255 epoch 5 - iter 39/138 - loss 0.05213873 - time (sec): 2.25 - samples/sec: 2897.43 - lr: 0.000019 - momentum: 0.000000
2023-10-17 08:33:44,023 epoch 5 - iter 52/138 - loss 0.04749352 - time (sec): 3.02 - samples/sec: 2835.83 - lr: 0.000019 - momentum: 0.000000
2023-10-17 08:33:44,761 epoch 5 - iter 65/138 - loss 0.04350353 - time (sec): 3.76 - samples/sec: 2841.21 - lr: 0.000018 - momentum: 0.000000
2023-10-17 08:33:45,512 epoch 5 - iter 78/138 - loss 0.04751334 - time (sec): 4.51 - samples/sec: 2854.47 - lr: 0.000018 - momentum: 0.000000
2023-10-17 08:33:46,231 epoch 5 - iter 91/138 - loss 0.05181517 - time (sec): 5.23 - samples/sec: 2892.82 - lr: 0.000018 - momentum: 0.000000
2023-10-17 08:33:46,989 epoch 5 - iter 104/138 - loss 0.05247513 - time (sec): 5.99 - samples/sec: 2895.33 - lr: 0.000018 - momentum: 0.000000
2023-10-17 08:33:47,724 epoch 5 - iter 117/138 - loss 0.05168931 - time (sec): 6.72 - samples/sec: 2900.32 - lr: 0.000017 - momentum: 0.000000
2023-10-17 08:33:48,439 epoch 5 - iter 130/138 - loss 0.05509507 - time (sec): 7.44 - samples/sec: 2893.57 - lr: 0.000017 - momentum: 0.000000
2023-10-17 08:33:48,873 ----------------------------------------------------------------------------------------------------
2023-10-17 08:33:48,873 EPOCH 5 done: loss 0.0581 - lr: 0.000017
2023-10-17 08:33:49,515 DEV : loss 0.13866549730300903 - f1-score (micro avg) 0.8647
2023-10-17 08:33:49,519 saving best model
2023-10-17 08:33:49,944 ----------------------------------------------------------------------------------------------------
2023-10-17 08:33:50,705 epoch 6 - iter 13/138 - loss 0.05108964 - time (sec): 0.76 - samples/sec: 2927.02 - lr: 0.000016 - momentum: 0.000000
2023-10-17 08:33:51,395 epoch 6 - iter 26/138 - loss 0.03442384 - time (sec): 1.45 - samples/sec: 2848.95 - lr: 0.000016 - momentum: 0.000000
2023-10-17 08:33:52,139 epoch 6 - iter 39/138 - loss 0.03497665 - time (sec): 2.19 - samples/sec: 2883.69 - lr: 0.000016 - momentum: 0.000000
2023-10-17 08:33:52,879 epoch 6 - iter 52/138 - loss 0.03726733 - time (sec): 2.93 - samples/sec: 2897.23 - lr: 0.000015 - momentum: 0.000000
2023-10-17 08:33:53,605 epoch 6 - iter 65/138 - loss 0.04078750 - time (sec): 3.66 - samples/sec: 2897.78 - lr: 0.000015 - momentum: 0.000000
2023-10-17 08:33:54,399 epoch 6 - iter 78/138 - loss 0.04584713 - time (sec): 4.45 - samples/sec: 2883.66 - lr: 0.000015 - momentum: 0.000000
2023-10-17 08:33:55,255 epoch 6 - iter 91/138 - loss 0.05028594 - time (sec): 5.31 - samples/sec: 2892.99 - lr: 0.000015 - momentum: 0.000000
2023-10-17 08:33:56,023 epoch 6 - iter 104/138 - loss 0.04900960 - time (sec): 6.07 - samples/sec: 2877.61 - lr: 0.000014 - momentum: 0.000000
2023-10-17 08:33:56,761 epoch 6 - iter 117/138 - loss 0.04874107 - time (sec): 6.81 - samples/sec: 2857.52 - lr: 0.000014 - momentum: 0.000000
2023-10-17 08:33:57,537 epoch 6 - iter 130/138 - loss 0.04725876 - time (sec): 7.59 - samples/sec: 2846.62 - lr: 0.000014 - momentum: 0.000000
2023-10-17 08:33:58,001 ----------------------------------------------------------------------------------------------------
2023-10-17 08:33:58,001 EPOCH 6 done: loss 0.0458 - lr: 0.000014
2023-10-17 08:33:58,643 DEV : loss 0.18001507222652435 - f1-score (micro avg) 0.8534
2023-10-17 08:33:58,648 ----------------------------------------------------------------------------------------------------
2023-10-17 08:33:59,397 epoch 7 - iter 13/138 - loss 0.05159695 - time (sec): 0.75 - samples/sec: 3073.57 - lr: 0.000013 - momentum: 0.000000
2023-10-17 08:34:00,167 epoch 7 - iter 26/138 - loss 0.03561773 - time (sec): 1.52 - samples/sec: 2995.94 - lr: 0.000013 - momentum: 0.000000
2023-10-17 08:34:00,889 epoch 7 - iter 39/138 - loss 0.03730540 - time (sec): 2.24 - samples/sec: 2948.01 - lr: 0.000012 - momentum: 0.000000
2023-10-17 08:34:01,642 epoch 7 - iter 52/138 - loss 0.05310563 - time (sec): 2.99 - samples/sec: 2917.41 - lr: 0.000012 - momentum: 0.000000
2023-10-17 08:34:02,371 epoch 7 - iter 65/138 - loss 0.04962399 - time (sec): 3.72 - samples/sec: 2956.32 - lr: 0.000012 - momentum: 0.000000
2023-10-17 08:34:03,099 epoch 7 - iter 78/138 - loss 0.04930191 - time (sec): 4.45 - samples/sec: 2922.57 - lr: 0.000012 - momentum: 0.000000
2023-10-17 08:34:03,875 epoch 7 - iter 91/138 - loss 0.04630191 - time (sec): 5.23 - samples/sec: 2889.42 - lr: 0.000011 - momentum: 0.000000
2023-10-17 08:34:04,716 epoch 7 - iter 104/138 - loss 0.04640907 - time (sec): 6.07 - samples/sec: 2846.09 - lr: 0.000011 - momentum: 0.000000
2023-10-17 08:34:05,432 epoch 7 - iter 117/138 - loss 0.04455253 - time (sec): 6.78 - samples/sec: 2833.03 - lr: 0.000011 - momentum: 0.000000
2023-10-17 08:34:06,191 epoch 7 - iter 130/138 - loss 0.04186703 - time (sec): 7.54 - samples/sec: 2852.68 - lr: 0.000010 - momentum: 0.000000
2023-10-17 08:34:06,659 ----------------------------------------------------------------------------------------------------
2023-10-17 08:34:06,660 EPOCH 7 done: loss 0.0407 - lr: 0.000010
2023-10-17 08:34:07,303 DEV : loss 0.16692401468753815 - f1-score (micro avg) 0.8851
2023-10-17 08:34:07,307 saving best model
2023-10-17 08:34:07,752 ----------------------------------------------------------------------------------------------------
2023-10-17 08:34:08,506 epoch 8 - iter 13/138 - loss 0.02291377 - time (sec): 0.75 - samples/sec: 2838.18 - lr: 0.000010 - momentum: 0.000000
2023-10-17 08:34:09,268 epoch 8 - iter 26/138 - loss 0.01201702 - time (sec): 1.51 - samples/sec: 2772.61 - lr: 0.000009 - momentum: 0.000000
2023-10-17 08:34:10,011 epoch 8 - iter 39/138 - loss 0.01923593 - time (sec): 2.26 - samples/sec: 2787.04 - lr: 0.000009 - momentum: 0.000000
2023-10-17 08:34:10,830 epoch 8 - iter 52/138 - loss 0.02320055 - time (sec): 3.08 - samples/sec: 2812.63 - lr: 0.000009 - momentum: 0.000000
2023-10-17 08:34:11,579 epoch 8 - iter 65/138 - loss 0.02221101 - time (sec): 3.82 - samples/sec: 2838.72 - lr: 0.000009 - momentum: 0.000000
2023-10-17 08:34:12,387 epoch 8 - iter 78/138 - loss 0.02326176 - time (sec): 4.63 - samples/sec: 2822.85 - lr: 0.000008 - momentum: 0.000000
2023-10-17 08:34:13,159 epoch 8 - iter 91/138 - loss 0.02456120 - time (sec): 5.40 - samples/sec: 2792.64 - lr: 0.000008 - momentum: 0.000000
2023-10-17 08:34:13,920 epoch 8 - iter 104/138 - loss 0.02607922 - time (sec): 6.17 - samples/sec: 2800.30 - lr: 0.000008 - momentum: 0.000000
2023-10-17 08:34:14,697 epoch 8 - iter 117/138 - loss 0.02884882 - time (sec): 6.94 - samples/sec: 2791.39 - lr: 0.000007 - momentum: 0.000000
2023-10-17 08:34:15,429 epoch 8 - iter 130/138 - loss 0.03309474 - time (sec): 7.67 - samples/sec: 2793.53 - lr: 0.000007 - momentum: 0.000000
2023-10-17 08:34:15,921 ----------------------------------------------------------------------------------------------------
2023-10-17 08:34:15,922 EPOCH 8 done: loss 0.0335 - lr: 0.000007
2023-10-17 08:34:16,564 DEV : loss 0.1874910145998001 - f1-score (micro avg) 0.8647
2023-10-17 08:34:16,569 ----------------------------------------------------------------------------------------------------
2023-10-17 08:34:17,346 epoch 9 - iter 13/138 - loss 0.01605552 - time (sec): 0.78 - samples/sec: 2675.02 - lr: 0.000006 - momentum: 0.000000
2023-10-17 08:34:18,071 epoch 9 - iter 26/138 - loss 0.01340910 - time (sec): 1.50 - samples/sec: 2768.24 - lr: 0.000006 - momentum: 0.000000
2023-10-17 08:34:18,816 epoch 9 - iter 39/138 - loss 0.01023730 - time (sec): 2.25 - samples/sec: 2747.22 - lr: 0.000006 - momentum: 0.000000
2023-10-17 08:34:19,587 epoch 9 - iter 52/138 - loss 0.02708611 - time (sec): 3.02 - samples/sec: 2840.69 - lr: 0.000005 - momentum: 0.000000
2023-10-17 08:34:20,304 epoch 9 - iter 65/138 - loss 0.02744843 - time (sec): 3.73 - samples/sec: 2837.85 - lr: 0.000005 - momentum: 0.000000
2023-10-17 08:34:21,086 epoch 9 - iter 78/138 - loss 0.03233667 - time (sec): 4.52 - samples/sec: 2849.74 - lr: 0.000005 - momentum: 0.000000
2023-10-17 08:34:21,835 epoch 9 - iter 91/138 - loss 0.02978154 - time (sec): 5.26 - samples/sec: 2856.83 - lr: 0.000005 - momentum: 0.000000
2023-10-17 08:34:22,572 epoch 9 - iter 104/138 - loss 0.03022690 - time (sec): 6.00 - samples/sec: 2843.88 - lr: 0.000004 - momentum: 0.000000
2023-10-17 08:34:23,406 epoch 9 - iter 117/138 - loss 0.02980306 - time (sec): 6.84 - samples/sec: 2824.96 - lr: 0.000004 - momentum: 0.000000
2023-10-17 08:34:24,147 epoch 9 - iter 130/138 - loss 0.03181463 - time (sec): 7.58 - samples/sec: 2842.79 - lr: 0.000004 - momentum: 0.000000
2023-10-17 08:34:24,565 ----------------------------------------------------------------------------------------------------
2023-10-17 08:34:24,566 EPOCH 9 done: loss 0.0301 - lr: 0.000004
2023-10-17 08:34:25,201 DEV : loss 0.19798633456230164 - f1-score (micro avg) 0.8654
2023-10-17 08:34:25,206 ----------------------------------------------------------------------------------------------------
2023-10-17 08:34:25,953 epoch 10 - iter 13/138 - loss 0.00177827 - time (sec): 0.75 - samples/sec: 2780.52 - lr: 0.000003 - momentum: 0.000000
2023-10-17 08:34:26,732 epoch 10 - iter 26/138 - loss 0.00970498 - time (sec): 1.52 - samples/sec: 2841.59 - lr: 0.000003 - momentum: 0.000000
2023-10-17 08:34:27,491 epoch 10 - iter 39/138 - loss 0.01134777 - time (sec): 2.28 - samples/sec: 2911.31 - lr: 0.000002 - momentum: 0.000000
2023-10-17 08:34:28,298 epoch 10 - iter 52/138 - loss 0.02330497 - time (sec): 3.09 - samples/sec: 2862.32 - lr: 0.000002 - momentum: 0.000000
2023-10-17 08:34:29,047 epoch 10 - iter 65/138 - loss 0.02650525 - time (sec): 3.84 - samples/sec: 2814.03 - lr: 0.000002 - momentum: 0.000000
2023-10-17 08:34:29,813 epoch 10 - iter 78/138 - loss 0.02411444 - time (sec): 4.61 - samples/sec: 2825.98 - lr: 0.000002 - momentum: 0.000000
2023-10-17 08:34:30,578 epoch 10 - iter 91/138 - loss 0.02513531 - time (sec): 5.37 - samples/sec: 2836.94 - lr: 0.000001 - momentum: 0.000000
2023-10-17 08:34:31,340 epoch 10 - iter 104/138 - loss 0.02436794 - time (sec): 6.13 - samples/sec: 2851.54 - lr: 0.000001 - momentum: 0.000000
2023-10-17 08:34:32,070 epoch 10 - iter 117/138 - loss 0.02583156 - time (sec): 6.86 - samples/sec: 2828.60 - lr: 0.000001 - momentum: 0.000000
2023-10-17 08:34:32,783 epoch 10 - iter 130/138 - loss 0.02747905 - time (sec): 7.58 - samples/sec: 2829.19 - lr: 0.000000 - momentum: 0.000000
2023-10-17 08:34:33,225 ----------------------------------------------------------------------------------------------------
2023-10-17 08:34:33,225 EPOCH 10 done: loss 0.0265 - lr: 0.000000
2023-10-17 08:34:33,910 DEV : loss 0.19013234972953796 - f1-score (micro avg) 0.8699
2023-10-17 08:34:34,278 ----------------------------------------------------------------------------------------------------
2023-10-17 08:34:34,279 Loading model from best epoch ...
2023-10-17 08:34:35,660 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
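The 25 tags above use the BIOES scheme: S marks a single-token entity, B/I/E mark the begin, inside, and end of a multi-token entity, and O marks non-entity tokens. A minimal sketch of how such a tag sequence decodes into entity spans (a hypothetical helper, not Flair's internal decoder):

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (start, end, label) spans, end exclusive."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        prefix, _, label = tag.partition("-")
        if prefix == "S":                          # single-token entity
            spans.append((i, i + 1, label))
        elif prefix == "B":                        # multi-token entity opens
            start = i
        elif prefix == "E" and start is not None:  # multi-token entity closes
            spans.append((start, i + 1, label))
            start = None
    return spans

# e.g. ["B-pers", "I-pers", "E-pers", "O", "S-loc"]
# -> [(0, 3, "pers"), (4, 5, "loc")]
```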
2023-10-17 08:34:36,484
Results:
- F-score (micro) 0.8997
- F-score (macro) 0.6734
- Accuracy 0.8237
By class:
              precision    recall  f1-score   support

       scope     0.9006    0.8750    0.8876       176
        pers     0.9389    0.9609    0.9498       128
        work     0.8750    0.8514    0.8630        74
      object     0.0000    0.0000    0.0000         2
         loc     1.0000    0.5000    0.6667         2

   micro avg     0.9069    0.8927    0.8997       382
   macro avg     0.7429    0.6375    0.6734       382
weighted avg     0.9043    0.8927    0.8979       382
2023-10-17 08:34:36,485 ----------------------------------------------------------------------------------------------------
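The gap between micro F1 (0.8997) and macro F1 (0.6734) in the final table is driven by the two tiny classes (object and loc, support 2 each): micro averaging pools all 382 gold spans, while macro averaging weights every class equally regardless of support. A minimal sketch recomputing the three averages from the per-class rows above:

```python
# (precision, recall, f1, support) rows from the final evaluation table.
per_class = {
    "scope":  (0.9006, 0.8750, 0.8876, 176),
    "pers":   (0.9389, 0.9609, 0.9498, 128),
    "work":   (0.8750, 0.8514, 0.8630,  74),
    "object": (0.0000, 0.0000, 0.0000,   2),
    "loc":    (1.0000, 0.5000, 0.6667,   2),
}

total = sum(s for *_, s in per_class.values())  # 382 gold spans

# Macro: unweighted mean of per-class F1 -- tiny classes count fully.
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)

# Weighted: support-weighted mean of per-class F1.
weighted_f1 = sum(f1 * s for _, _, f1, s in per_class.values()) / total

# Micro: harmonic mean of the pooled precision/recall from the table.
p, r = 0.9069, 0.8927
micro_f1 = 2 * p * r / (p + r)

assert round(macro_f1, 4) == 0.6734
assert round(weighted_f1, 4) == 0.8979
assert round(micro_f1, 4) == 0.8997
```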