2023-10-18 01:43:53,237 ----------------------------------------------------------------------------------------------------
2023-10-18 01:43:53,239 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): ElectraModel(
(embeddings): ElectraEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): ElectraEncoder(
(layer): ModuleList(
(0-11): 12 x ElectraLayer(
(attention): ElectraAttention(
(self): ElectraSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): ElectraSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): ElectraIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): ElectraOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=17, bias=True)
(loss_function): CrossEntropyLoss()
)"
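Purely from the module shapes in the printout above, the trainable parameter count of this tagger can be tallied by hand. A minimal sketch (counts only the modules shown, weights plus biases; any modules not printed are excluded):

```python
# Tally trainable parameters from the SequenceTagger printout.
# Linear(in, out) contributes in*out weights + out biases;
# Embedding(n, d) contributes n*d; LayerNorm((d,)) contributes 2*d.

def linear(n_in, n_out):
    return n_in * n_out + n_out

def electra_layer(hidden=768, intermediate=3072):
    # Q, K, V and attention-output projections, plus one LayerNorm
    attention = 4 * linear(hidden, hidden) + 2 * hidden
    # intermediate + output dense layers, plus one LayerNorm
    ffn = linear(hidden, intermediate) + linear(intermediate, hidden) + 2 * hidden
    return attention + ffn

def tagger_params(vocab=32001, max_pos=512, types=2, hidden=768, layers=12, tags=17):
    embeddings = vocab * hidden + max_pos * hidden + types * hidden + 2 * hidden
    encoder = layers * electra_layer(hidden, 4 * hidden)
    head = linear(hidden, tags)  # final tag projection (768 -> 17)
    return embeddings + encoder + head

print(tagger_params())  # roughly 110M, typical for a base-size ELECTRA encoder
```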
2023-10-18 01:43:53,239 ----------------------------------------------------------------------------------------------------
2023-10-18 01:43:53,240 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
- NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-18 01:43:53,240 ----------------------------------------------------------------------------------------------------
2023-10-18 01:43:53,240 Train: 20847 sentences
2023-10-18 01:43:53,240 (train_with_dev=False, train_with_test=False)
2023-10-18 01:43:53,240 ----------------------------------------------------------------------------------------------------
2023-10-18 01:43:53,240 Training Params:
2023-10-18 01:43:53,240 - learning_rate: "3e-05"
2023-10-18 01:43:53,240 - mini_batch_size: "8"
2023-10-18 01:43:53,240 - max_epochs: "10"
2023-10-18 01:43:53,240 - shuffle: "True"
2023-10-18 01:43:53,240 ----------------------------------------------------------------------------------------------------
2023-10-18 01:43:53,240 Plugins:
2023-10-18 01:43:53,240 - TensorboardLogger
2023-10-18 01:43:53,240 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 01:43:53,240 ----------------------------------------------------------------------------------------------------
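The `LinearScheduler` plugin with `warmup_fraction: 0.1` produces the learning-rate trajectory visible in the per-iteration lines below: lr ramps linearly from 0 to the 3e-05 peak over the first 10% of all steps (epoch 1, since 2606 of 26060 total steps), then decays linearly back to 0. A minimal sketch of that shape (the exact Flair implementation may differ in off-by-one details):

```python
# Linear warmup followed by linear decay, as implied by the logged lr values.

def linear_lr(step, total_steps, peak_lr=3e-5, warmup_fraction=0.1):
    warmup_steps = int(total_steps * warmup_fraction)
    if step <= warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 10 * 2606  # max_epochs * iterations per epoch
print(linear_lr(260, total))    # ~3e-06, matching the logged 0.000003 at epoch 1, iter 260
print(linear_lr(2606, total))   # 3e-05 at the end of warmup (end of epoch 1)
print(linear_lr(total, total))  # 0.0 at the final step
```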
2023-10-18 01:43:53,241 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 01:43:53,241 - metric: "('micro avg', 'f1-score')"
2023-10-18 01:43:53,241 ----------------------------------------------------------------------------------------------------
2023-10-18 01:43:53,241 Computation:
2023-10-18 01:43:53,241 - compute on device: cuda:0
2023-10-18 01:43:53,241 - embedding storage: none
2023-10-18 01:43:53,241 ----------------------------------------------------------------------------------------------------
2023-10-18 01:43:53,241 Model training base path: "hmbench-newseye/de-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-18 01:43:53,241 ----------------------------------------------------------------------------------------------------
2023-10-18 01:43:53,241 ----------------------------------------------------------------------------------------------------
2023-10-18 01:43:53,241 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 01:44:18,435 epoch 1 - iter 260/2606 - loss 2.36689502 - time (sec): 25.19 - samples/sec: 1402.75 - lr: 0.000003 - momentum: 0.000000
2023-10-18 01:44:43,340 epoch 1 - iter 520/2606 - loss 1.41431550 - time (sec): 50.10 - samples/sec: 1412.63 - lr: 0.000006 - momentum: 0.000000
2023-10-18 01:45:09,326 epoch 1 - iter 780/2606 - loss 1.03458455 - time (sec): 76.08 - samples/sec: 1437.93 - lr: 0.000009 - momentum: 0.000000
2023-10-18 01:45:34,659 epoch 1 - iter 1040/2606 - loss 0.84351166 - time (sec): 101.42 - samples/sec: 1445.24 - lr: 0.000012 - momentum: 0.000000
2023-10-18 01:46:00,527 epoch 1 - iter 1300/2606 - loss 0.71609535 - time (sec): 127.28 - samples/sec: 1448.74 - lr: 0.000015 - momentum: 0.000000
2023-10-18 01:46:26,321 epoch 1 - iter 1560/2606 - loss 0.63458616 - time (sec): 153.08 - samples/sec: 1455.02 - lr: 0.000018 - momentum: 0.000000
2023-10-18 01:46:51,083 epoch 1 - iter 1820/2606 - loss 0.57884585 - time (sec): 177.84 - samples/sec: 1450.11 - lr: 0.000021 - momentum: 0.000000
2023-10-18 01:47:17,387 epoch 1 - iter 2080/2606 - loss 0.53389468 - time (sec): 204.14 - samples/sec: 1436.46 - lr: 0.000024 - momentum: 0.000000
2023-10-18 01:47:45,475 epoch 1 - iter 2340/2606 - loss 0.49351436 - time (sec): 232.23 - samples/sec: 1427.48 - lr: 0.000027 - momentum: 0.000000
2023-10-18 01:48:10,801 epoch 1 - iter 2600/2606 - loss 0.46523756 - time (sec): 257.56 - samples/sec: 1422.36 - lr: 0.000030 - momentum: 0.000000
2023-10-18 01:48:11,516 ----------------------------------------------------------------------------------------------------
2023-10-18 01:48:11,516 EPOCH 1 done: loss 0.4640 - lr: 0.000030
2023-10-18 01:48:18,837 DEV : loss 0.1177271232008934 - f1-score (micro avg) 0.3024
2023-10-18 01:48:18,910 saving best model
2023-10-18 01:48:19,533 ----------------------------------------------------------------------------------------------------
2023-10-18 01:48:46,567 epoch 2 - iter 260/2606 - loss 0.17757391 - time (sec): 27.03 - samples/sec: 1372.47 - lr: 0.000030 - momentum: 0.000000
2023-10-18 01:49:12,975 epoch 2 - iter 520/2606 - loss 0.17005093 - time (sec): 53.44 - samples/sec: 1336.99 - lr: 0.000029 - momentum: 0.000000
2023-10-18 01:49:40,277 epoch 2 - iter 780/2606 - loss 0.17240349 - time (sec): 80.74 - samples/sec: 1329.40 - lr: 0.000029 - momentum: 0.000000
2023-10-18 01:50:07,183 epoch 2 - iter 1040/2606 - loss 0.17209944 - time (sec): 107.65 - samples/sec: 1346.71 - lr: 0.000029 - momentum: 0.000000
2023-10-18 01:50:32,998 epoch 2 - iter 1300/2606 - loss 0.16898663 - time (sec): 133.46 - samples/sec: 1341.41 - lr: 0.000028 - momentum: 0.000000
2023-10-18 01:50:59,584 epoch 2 - iter 1560/2606 - loss 0.16808612 - time (sec): 160.05 - samples/sec: 1347.14 - lr: 0.000028 - momentum: 0.000000
2023-10-18 01:51:26,109 epoch 2 - iter 1820/2606 - loss 0.16649873 - time (sec): 186.57 - samples/sec: 1359.49 - lr: 0.000028 - momentum: 0.000000
2023-10-18 01:51:53,004 epoch 2 - iter 2080/2606 - loss 0.16394020 - time (sec): 213.47 - samples/sec: 1364.07 - lr: 0.000027 - momentum: 0.000000
2023-10-18 01:52:19,150 epoch 2 - iter 2340/2606 - loss 0.16137184 - time (sec): 239.62 - samples/sec: 1364.38 - lr: 0.000027 - momentum: 0.000000
2023-10-18 01:52:46,963 epoch 2 - iter 2600/2606 - loss 0.15760333 - time (sec): 267.43 - samples/sec: 1368.84 - lr: 0.000027 - momentum: 0.000000
2023-10-18 01:52:47,781 ----------------------------------------------------------------------------------------------------
2023-10-18 01:52:47,781 EPOCH 2 done: loss 0.1574 - lr: 0.000027
2023-10-18 01:52:59,427 DEV : loss 0.11865425109863281 - f1-score (micro avg) 0.3154
2023-10-18 01:52:59,488 saving best model
2023-10-18 01:53:00,955 ----------------------------------------------------------------------------------------------------
2023-10-18 01:53:27,979 epoch 3 - iter 260/2606 - loss 0.11415869 - time (sec): 27.02 - samples/sec: 1380.83 - lr: 0.000026 - momentum: 0.000000
2023-10-18 01:53:55,798 epoch 3 - iter 520/2606 - loss 0.10995719 - time (sec): 54.84 - samples/sec: 1391.02 - lr: 0.000026 - momentum: 0.000000
2023-10-18 01:54:22,508 epoch 3 - iter 780/2606 - loss 0.10931204 - time (sec): 81.55 - samples/sec: 1393.25 - lr: 0.000026 - momentum: 0.000000
2023-10-18 01:54:49,765 epoch 3 - iter 1040/2606 - loss 0.10761017 - time (sec): 108.81 - samples/sec: 1394.64 - lr: 0.000025 - momentum: 0.000000
2023-10-18 01:55:16,047 epoch 3 - iter 1300/2606 - loss 0.10662181 - time (sec): 135.09 - samples/sec: 1384.16 - lr: 0.000025 - momentum: 0.000000
2023-10-18 01:55:43,603 epoch 3 - iter 1560/2606 - loss 0.10877184 - time (sec): 162.64 - samples/sec: 1378.52 - lr: 0.000025 - momentum: 0.000000
2023-10-18 01:56:10,151 epoch 3 - iter 1820/2606 - loss 0.10600747 - time (sec): 189.19 - samples/sec: 1373.57 - lr: 0.000024 - momentum: 0.000000
2023-10-18 01:56:38,052 epoch 3 - iter 2080/2606 - loss 0.10590194 - time (sec): 217.09 - samples/sec: 1364.73 - lr: 0.000024 - momentum: 0.000000
2023-10-18 01:57:04,290 epoch 3 - iter 2340/2606 - loss 0.10934178 - time (sec): 243.33 - samples/sec: 1362.27 - lr: 0.000024 - momentum: 0.000000
2023-10-18 01:57:31,738 epoch 3 - iter 2600/2606 - loss 0.11013152 - time (sec): 270.78 - samples/sec: 1353.21 - lr: 0.000023 - momentum: 0.000000
2023-10-18 01:57:32,443 ----------------------------------------------------------------------------------------------------
2023-10-18 01:57:32,444 EPOCH 3 done: loss 0.1102 - lr: 0.000023
2023-10-18 01:57:43,934 DEV : loss 0.1810266077518463 - f1-score (micro avg) 0.3569
2023-10-18 01:57:43,985 saving best model
2023-10-18 01:57:45,351 ----------------------------------------------------------------------------------------------------
2023-10-18 01:58:12,270 epoch 4 - iter 260/2606 - loss 0.06286773 - time (sec): 26.91 - samples/sec: 1409.03 - lr: 0.000023 - momentum: 0.000000
2023-10-18 01:58:38,471 epoch 4 - iter 520/2606 - loss 0.07418588 - time (sec): 53.12 - samples/sec: 1402.34 - lr: 0.000023 - momentum: 0.000000
2023-10-18 01:59:05,109 epoch 4 - iter 780/2606 - loss 0.07626348 - time (sec): 79.75 - samples/sec: 1387.59 - lr: 0.000022 - momentum: 0.000000
2023-10-18 01:59:31,143 epoch 4 - iter 1040/2606 - loss 0.07797779 - time (sec): 105.79 - samples/sec: 1363.81 - lr: 0.000022 - momentum: 0.000000
2023-10-18 01:59:57,737 epoch 4 - iter 1300/2606 - loss 0.07611310 - time (sec): 132.38 - samples/sec: 1360.63 - lr: 0.000022 - momentum: 0.000000
2023-10-18 02:00:25,509 epoch 4 - iter 1560/2606 - loss 0.07528087 - time (sec): 160.15 - samples/sec: 1369.44 - lr: 0.000021 - momentum: 0.000000
2023-10-18 02:00:51,422 epoch 4 - iter 1820/2606 - loss 0.07718542 - time (sec): 186.07 - samples/sec: 1363.86 - lr: 0.000021 - momentum: 0.000000
2023-10-18 02:01:19,814 epoch 4 - iter 2080/2606 - loss 0.07906334 - time (sec): 214.46 - samples/sec: 1369.68 - lr: 0.000021 - momentum: 0.000000
2023-10-18 02:01:46,681 epoch 4 - iter 2340/2606 - loss 0.07836038 - time (sec): 241.33 - samples/sec: 1362.67 - lr: 0.000020 - momentum: 0.000000
2023-10-18 02:02:14,630 epoch 4 - iter 2600/2606 - loss 0.08153752 - time (sec): 269.27 - samples/sec: 1361.88 - lr: 0.000020 - momentum: 0.000000
2023-10-18 02:02:15,210 ----------------------------------------------------------------------------------------------------
2023-10-18 02:02:15,211 EPOCH 4 done: loss 0.0816 - lr: 0.000020
2023-10-18 02:02:26,482 DEV : loss 0.23630264401435852 - f1-score (micro avg) 0.3444
2023-10-18 02:02:26,531 ----------------------------------------------------------------------------------------------------
2023-10-18 02:02:54,043 epoch 5 - iter 260/2606 - loss 0.04197353 - time (sec): 27.51 - samples/sec: 1369.63 - lr: 0.000020 - momentum: 0.000000
2023-10-18 02:03:22,392 epoch 5 - iter 520/2606 - loss 0.05052783 - time (sec): 55.86 - samples/sec: 1344.58 - lr: 0.000019 - momentum: 0.000000
2023-10-18 02:03:51,375 epoch 5 - iter 780/2606 - loss 0.05148158 - time (sec): 84.84 - samples/sec: 1357.95 - lr: 0.000019 - momentum: 0.000000
2023-10-18 02:04:18,638 epoch 5 - iter 1040/2606 - loss 0.05272903 - time (sec): 112.10 - samples/sec: 1367.58 - lr: 0.000019 - momentum: 0.000000
2023-10-18 02:04:44,954 epoch 5 - iter 1300/2606 - loss 0.05388151 - time (sec): 138.42 - samples/sec: 1372.47 - lr: 0.000018 - momentum: 0.000000
2023-10-18 02:05:11,302 epoch 5 - iter 1560/2606 - loss 0.05517232 - time (sec): 164.77 - samples/sec: 1355.79 - lr: 0.000018 - momentum: 0.000000
2023-10-18 02:05:36,871 epoch 5 - iter 1820/2606 - loss 0.05500689 - time (sec): 190.34 - samples/sec: 1347.23 - lr: 0.000018 - momentum: 0.000000
2023-10-18 02:06:03,512 epoch 5 - iter 2080/2606 - loss 0.05753746 - time (sec): 216.98 - samples/sec: 1355.14 - lr: 0.000017 - momentum: 0.000000
2023-10-18 02:06:29,577 epoch 5 - iter 2340/2606 - loss 0.05760071 - time (sec): 243.04 - samples/sec: 1360.89 - lr: 0.000017 - momentum: 0.000000
2023-10-18 02:06:56,947 epoch 5 - iter 2600/2606 - loss 0.05694245 - time (sec): 270.41 - samples/sec: 1356.56 - lr: 0.000017 - momentum: 0.000000
2023-10-18 02:06:57,501 ----------------------------------------------------------------------------------------------------
2023-10-18 02:06:57,501 EPOCH 5 done: loss 0.0569 - lr: 0.000017
2023-10-18 02:07:08,168 DEV : loss 0.3867754638195038 - f1-score (micro avg) 0.3416
2023-10-18 02:07:08,226 ----------------------------------------------------------------------------------------------------
2023-10-18 02:07:36,593 epoch 6 - iter 260/2606 - loss 0.03662114 - time (sec): 28.36 - samples/sec: 1194.39 - lr: 0.000016 - momentum: 0.000000
2023-10-18 02:08:02,372 epoch 6 - iter 520/2606 - loss 0.04435054 - time (sec): 54.14 - samples/sec: 1270.33 - lr: 0.000016 - momentum: 0.000000
2023-10-18 02:08:29,213 epoch 6 - iter 780/2606 - loss 0.04244809 - time (sec): 80.98 - samples/sec: 1305.98 - lr: 0.000016 - momentum: 0.000000
2023-10-18 02:08:56,728 epoch 6 - iter 1040/2606 - loss 0.04067286 - time (sec): 108.50 - samples/sec: 1345.19 - lr: 0.000015 - momentum: 0.000000
2023-10-18 02:09:22,409 epoch 6 - iter 1300/2606 - loss 0.04266422 - time (sec): 134.18 - samples/sec: 1355.88 - lr: 0.000015 - momentum: 0.000000
2023-10-18 02:09:48,520 epoch 6 - iter 1560/2606 - loss 0.04213331 - time (sec): 160.29 - samples/sec: 1351.85 - lr: 0.000015 - momentum: 0.000000
2023-10-18 02:10:15,112 epoch 6 - iter 1820/2606 - loss 0.04413458 - time (sec): 186.88 - samples/sec: 1357.74 - lr: 0.000014 - momentum: 0.000000
2023-10-18 02:10:42,144 epoch 6 - iter 2080/2606 - loss 0.04322785 - time (sec): 213.92 - samples/sec: 1361.22 - lr: 0.000014 - momentum: 0.000000
2023-10-18 02:11:09,190 epoch 6 - iter 2340/2606 - loss 0.04282381 - time (sec): 240.96 - samples/sec: 1370.22 - lr: 0.000014 - momentum: 0.000000
2023-10-18 02:11:36,211 epoch 6 - iter 2600/2606 - loss 0.04215471 - time (sec): 267.98 - samples/sec: 1367.40 - lr: 0.000013 - momentum: 0.000000
2023-10-18 02:11:36,800 ----------------------------------------------------------------------------------------------------
2023-10-18 02:11:36,801 EPOCH 6 done: loss 0.0422 - lr: 0.000013
2023-10-18 02:11:47,848 DEV : loss 0.5211431384086609 - f1-score (micro avg) 0.344
2023-10-18 02:11:47,905 ----------------------------------------------------------------------------------------------------
2023-10-18 02:12:14,952 epoch 7 - iter 260/2606 - loss 0.03083105 - time (sec): 27.05 - samples/sec: 1292.84 - lr: 0.000013 - momentum: 0.000000
2023-10-18 02:12:43,024 epoch 7 - iter 520/2606 - loss 0.02863032 - time (sec): 55.12 - samples/sec: 1335.23 - lr: 0.000013 - momentum: 0.000000
2023-10-18 02:13:09,861 epoch 7 - iter 780/2606 - loss 0.02678146 - time (sec): 81.95 - samples/sec: 1334.01 - lr: 0.000012 - momentum: 0.000000
2023-10-18 02:13:38,977 epoch 7 - iter 1040/2606 - loss 0.02706620 - time (sec): 111.07 - samples/sec: 1329.10 - lr: 0.000012 - momentum: 0.000000
2023-10-18 02:14:07,551 epoch 7 - iter 1300/2606 - loss 0.02681222 - time (sec): 139.64 - samples/sec: 1328.93 - lr: 0.000012 - momentum: 0.000000
2023-10-18 02:14:34,037 epoch 7 - iter 1560/2606 - loss 0.02757018 - time (sec): 166.13 - samples/sec: 1335.74 - lr: 0.000011 - momentum: 0.000000
2023-10-18 02:15:01,107 epoch 7 - iter 1820/2606 - loss 0.02788944 - time (sec): 193.20 - samples/sec: 1339.27 - lr: 0.000011 - momentum: 0.000000
2023-10-18 02:15:27,430 epoch 7 - iter 2080/2606 - loss 0.02802776 - time (sec): 219.52 - samples/sec: 1330.54 - lr: 0.000011 - momentum: 0.000000
2023-10-18 02:15:54,014 epoch 7 - iter 2340/2606 - loss 0.02810319 - time (sec): 246.11 - samples/sec: 1338.49 - lr: 0.000010 - momentum: 0.000000
2023-10-18 02:16:20,782 epoch 7 - iter 2600/2606 - loss 0.02820370 - time (sec): 272.88 - samples/sec: 1342.64 - lr: 0.000010 - momentum: 0.000000
2023-10-18 02:16:21,376 ----------------------------------------------------------------------------------------------------
2023-10-18 02:16:21,377 EPOCH 7 done: loss 0.0281 - lr: 0.000010
2023-10-18 02:16:32,117 DEV : loss 0.4535387456417084 - f1-score (micro avg) 0.3722
2023-10-18 02:16:32,179 saving best model
2023-10-18 02:16:33,534 ----------------------------------------------------------------------------------------------------
2023-10-18 02:16:59,035 epoch 8 - iter 260/2606 - loss 0.01351293 - time (sec): 25.50 - samples/sec: 1354.64 - lr: 0.000010 - momentum: 0.000000
2023-10-18 02:17:25,338 epoch 8 - iter 520/2606 - loss 0.01440588 - time (sec): 51.80 - samples/sec: 1367.77 - lr: 0.000009 - momentum: 0.000000
2023-10-18 02:17:51,800 epoch 8 - iter 780/2606 - loss 0.01923460 - time (sec): 78.26 - samples/sec: 1356.63 - lr: 0.000009 - momentum: 0.000000
2023-10-18 02:18:19,236 epoch 8 - iter 1040/2606 - loss 0.01963510 - time (sec): 105.70 - samples/sec: 1357.69 - lr: 0.000009 - momentum: 0.000000
2023-10-18 02:18:45,434 epoch 8 - iter 1300/2606 - loss 0.02140645 - time (sec): 131.89 - samples/sec: 1361.14 - lr: 0.000008 - momentum: 0.000000
2023-10-18 02:19:12,901 epoch 8 - iter 1560/2606 - loss 0.02057017 - time (sec): 159.36 - samples/sec: 1361.37 - lr: 0.000008 - momentum: 0.000000
2023-10-18 02:19:40,242 epoch 8 - iter 1820/2606 - loss 0.02054834 - time (sec): 186.70 - samples/sec: 1354.93 - lr: 0.000008 - momentum: 0.000000
2023-10-18 02:20:09,396 epoch 8 - iter 2080/2606 - loss 0.02124126 - time (sec): 215.86 - samples/sec: 1362.95 - lr: 0.000007 - momentum: 0.000000
2023-10-18 02:20:36,131 epoch 8 - iter 2340/2606 - loss 0.02104593 - time (sec): 242.59 - samples/sec: 1363.92 - lr: 0.000007 - momentum: 0.000000
2023-10-18 02:21:02,359 epoch 8 - iter 2600/2606 - loss 0.02107554 - time (sec): 268.82 - samples/sec: 1364.79 - lr: 0.000007 - momentum: 0.000000
2023-10-18 02:21:02,887 ----------------------------------------------------------------------------------------------------
2023-10-18 02:21:02,888 EPOCH 8 done: loss 0.0211 - lr: 0.000007
2023-10-18 02:21:13,586 DEV : loss 0.41924044489860535 - f1-score (micro avg) 0.3891
2023-10-18 02:21:13,647 saving best model
2023-10-18 02:21:15,009 ----------------------------------------------------------------------------------------------------
2023-10-18 02:21:42,056 epoch 9 - iter 260/2606 - loss 0.01266909 - time (sec): 27.04 - samples/sec: 1378.88 - lr: 0.000006 - momentum: 0.000000
2023-10-18 02:22:08,542 epoch 9 - iter 520/2606 - loss 0.01470607 - time (sec): 53.53 - samples/sec: 1377.46 - lr: 0.000006 - momentum: 0.000000
2023-10-18 02:22:36,336 epoch 9 - iter 780/2606 - loss 0.01543376 - time (sec): 81.32 - samples/sec: 1362.59 - lr: 0.000006 - momentum: 0.000000
2023-10-18 02:23:06,222 epoch 9 - iter 1040/2606 - loss 0.01493246 - time (sec): 111.21 - samples/sec: 1348.27 - lr: 0.000005 - momentum: 0.000000
2023-10-18 02:23:32,899 epoch 9 - iter 1300/2606 - loss 0.01541657 - time (sec): 137.89 - samples/sec: 1353.01 - lr: 0.000005 - momentum: 0.000000
2023-10-18 02:23:59,861 epoch 9 - iter 1560/2606 - loss 0.01531374 - time (sec): 164.85 - samples/sec: 1334.77 - lr: 0.000005 - momentum: 0.000000
2023-10-18 02:24:28,273 epoch 9 - iter 1820/2606 - loss 0.01476571 - time (sec): 193.26 - samples/sec: 1335.13 - lr: 0.000004 - momentum: 0.000000
2023-10-18 02:24:56,649 epoch 9 - iter 2080/2606 - loss 0.01475419 - time (sec): 221.64 - samples/sec: 1336.90 - lr: 0.000004 - momentum: 0.000000
2023-10-18 02:25:23,490 epoch 9 - iter 2340/2606 - loss 0.01470457 - time (sec): 248.48 - samples/sec: 1327.57 - lr: 0.000004 - momentum: 0.000000
2023-10-18 02:25:51,726 epoch 9 - iter 2600/2606 - loss 0.01431937 - time (sec): 276.71 - samples/sec: 1325.29 - lr: 0.000003 - momentum: 0.000000
2023-10-18 02:25:52,244 ----------------------------------------------------------------------------------------------------
2023-10-18 02:25:52,245 EPOCH 9 done: loss 0.0143 - lr: 0.000003
2023-10-18 02:26:03,159 DEV : loss 0.49826958775520325 - f1-score (micro avg) 0.3768
2023-10-18 02:26:03,231 ----------------------------------------------------------------------------------------------------
2023-10-18 02:26:30,129 epoch 10 - iter 260/2606 - loss 0.01142377 - time (sec): 26.90 - samples/sec: 1394.17 - lr: 0.000003 - momentum: 0.000000
2023-10-18 02:26:56,566 epoch 10 - iter 520/2606 - loss 0.01110790 - time (sec): 53.33 - samples/sec: 1405.82 - lr: 0.000003 - momentum: 0.000000
2023-10-18 02:27:25,269 epoch 10 - iter 780/2606 - loss 0.01085325 - time (sec): 82.04 - samples/sec: 1384.10 - lr: 0.000002 - momentum: 0.000000
2023-10-18 02:27:52,684 epoch 10 - iter 1040/2606 - loss 0.01147904 - time (sec): 109.45 - samples/sec: 1371.49 - lr: 0.000002 - momentum: 0.000000
2023-10-18 02:28:20,802 epoch 10 - iter 1300/2606 - loss 0.01133036 - time (sec): 137.57 - samples/sec: 1363.61 - lr: 0.000002 - momentum: 0.000000
2023-10-18 02:28:47,046 epoch 10 - iter 1560/2606 - loss 0.01108696 - time (sec): 163.81 - samples/sec: 1353.21 - lr: 0.000001 - momentum: 0.000000
2023-10-18 02:29:13,623 epoch 10 - iter 1820/2606 - loss 0.01070499 - time (sec): 190.39 - samples/sec: 1351.82 - lr: 0.000001 - momentum: 0.000000
2023-10-18 02:29:40,144 epoch 10 - iter 2080/2606 - loss 0.01026599 - time (sec): 216.91 - samples/sec: 1350.89 - lr: 0.000001 - momentum: 0.000000
2023-10-18 02:30:08,976 epoch 10 - iter 2340/2606 - loss 0.01060714 - time (sec): 245.74 - samples/sec: 1346.13 - lr: 0.000000 - momentum: 0.000000
2023-10-18 02:30:35,528 epoch 10 - iter 2600/2606 - loss 0.01038499 - time (sec): 272.29 - samples/sec: 1346.48 - lr: 0.000000 - momentum: 0.000000
2023-10-18 02:30:36,138 ----------------------------------------------------------------------------------------------------
2023-10-18 02:30:36,139 EPOCH 10 done: loss 0.0104 - lr: 0.000000
2023-10-18 02:30:47,830 DEV : loss 0.5188506245613098 - f1-score (micro avg) 0.3779
2023-10-18 02:30:48,408 ----------------------------------------------------------------------------------------------------
2023-10-18 02:30:48,410 Loading model from best epoch ...
2023-10-18 02:30:50,730 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
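The 17-entry tag dictionary above follows the BIOES scheme: a single "O" plus S/B/E/I-prefixed variants of the four entity types, which also explains `out_features=17` on the final linear layer. A quick sketch reproducing the dictionary:

```python
# Rebuild the BIOES tag set from the four entity types in the dictionary.
entity_types = ["LOC", "PER", "ORG", "HumanProd"]
tags = ["O"] + [f"{prefix}-{t}" for t in entity_types for prefix in ("S", "B", "E", "I")]
print(len(tags))  # 17
```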
2023-10-18 02:31:10,198
Results:
- F-score (micro) 0.4525
- F-score (macro) 0.3217
- Accuracy 0.2967
By class:
              precision    recall  f1-score   support

         LOC     0.4714    0.5099    0.4899      1214
         PER     0.4279    0.4740    0.4498       808
         ORG     0.3220    0.3768    0.3473       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4321    0.4749    0.4525      2390
   macro avg     0.3054    0.3402    0.3217      2390
weighted avg     0.4317    0.4749    0.4522      2390
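As a consistency check on the final scores: micro F1 is the harmonic mean of the micro-averaged precision and recall, and macro F1 is the unweighted mean of the per-class F1 scores in the table above.

```python
# Verify the reported micro and macro F1 from the per-class table.
per_class_f1 = [0.4899, 0.4498, 0.3473, 0.0000]  # LOC, PER, ORG, HumanProd
micro_p, micro_r = 0.4321, 0.4749

micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)
macro_f1 = sum(per_class_f1) / len(per_class_f1)

print(round(micro_f1, 4))  # 0.4525, as reported
print(macro_f1)            # ~0.3217, as reported
```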
2023-10-18 02:31:10,198 ----------------------------------------------------------------------------------------------------