2023-10-17 21:41:01,738 ----------------------------------------------------------------------------------------------------
2023-10-17 21:41:01,740 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 21:41:01,740 ----------------------------------------------------------------------------------------------------
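[Editor's note, not part of the original log: the layer shapes printed above determine the backbone size. A minimal pure-Python tally, assuming the usual weight-plus-bias counts for each Linear, weight-only for each Embedding, and weight-plus-bias (2 x dim) for each LayerNorm shown in the repr:]

```python
# Parameter tally for the ElectraModel repr printed above.

def linear(in_f, out_f):
    return in_f * out_f + out_f  # weight + bias

def layer_norm(dim):
    return 2 * dim               # weight + bias

hidden, ffn = 768, 3072

embeddings = (
    32001 * hidden       # word_embeddings
    + 512 * hidden       # position_embeddings
    + 2 * hidden         # token_type_embeddings
    + layer_norm(hidden)
)

per_layer = (
    3 * linear(hidden, hidden)   # query, key, value
    + linear(hidden, hidden)     # attention output dense
    + layer_norm(hidden)
    + linear(hidden, ffn)        # intermediate
    + linear(ffn, hidden)        # output
    + layer_norm(hidden)
)

backbone = embeddings + 12 * per_layer
head = linear(hidden, 17)        # the (linear) tagging head

print(backbone)  # 110027520, i.e. ~110M parameters
print(head)      # 13073
```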
2023-10-17 21:41:01,741 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
- NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-17 21:41:01,741 ----------------------------------------------------------------------------------------------------
2023-10-17 21:41:01,741 Train: 20847 sentences
2023-10-17 21:41:01,741 (train_with_dev=False, train_with_test=False)
2023-10-17 21:41:01,741 ----------------------------------------------------------------------------------------------------
2023-10-17 21:41:01,741 Training Params:
2023-10-17 21:41:01,741 - learning_rate: "3e-05"
2023-10-17 21:41:01,741 - mini_batch_size: "8"
2023-10-17 21:41:01,741 - max_epochs: "10"
2023-10-17 21:41:01,741 - shuffle: "True"
2023-10-17 21:41:01,741 ----------------------------------------------------------------------------------------------------
2023-10-17 21:41:01,741 Plugins:
2023-10-17 21:41:01,741 - TensorboardLogger
2023-10-17 21:41:01,742 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 21:41:01,742 ----------------------------------------------------------------------------------------------------
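[Editor's note, not part of the original log: the LinearScheduler plugin warms the learning rate up from 0 to the peak over the first 10% of all steps (warmup_fraction 0.1), then decays it linearly back to 0. A minimal sketch of that schedule, assuming 2606 iterations per epoch times 10 epochs as in this run; this reproduces the lr column in the per-iteration lines below, e.g. lr 0.000003 at iteration 260 of epoch 1:]

```python
# Linear warmup followed by linear decay, as configured by
# the LinearScheduler plugin with warmup_fraction = 0.1.

def linear_schedule(step, total_steps, peak_lr, warmup_fraction=0.1):
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        # ramp 0 -> peak_lr over the warmup phase
        return peak_lr * step / warmup_steps
    # decay peak_lr -> 0 over the remaining steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 2606 * 10   # iterations per epoch x max_epochs
peak = 3e-05        # the configured learning_rate

print(f"{linear_schedule(260, total, peak):.6f}")    # 0.000003
print(f"{linear_schedule(2606, total, peak):.6f}")   # 0.000030 (end of warmup)
print(f"{linear_schedule(total, total, peak):.6f}")  # 0.000000 (end of training)
```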
2023-10-17 21:41:01,742 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 21:41:01,742 - metric: "('micro avg', 'f1-score')"
2023-10-17 21:41:01,742 ----------------------------------------------------------------------------------------------------
2023-10-17 21:41:01,742 Computation:
2023-10-17 21:41:01,742 - compute on device: cuda:0
2023-10-17 21:41:01,742 - embedding storage: none
2023-10-17 21:41:01,742 ----------------------------------------------------------------------------------------------------
2023-10-17 21:41:01,742 Model training base path: "hmbench-newseye/de-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-17 21:41:01,742 ----------------------------------------------------------------------------------------------------
2023-10-17 21:41:01,742 ----------------------------------------------------------------------------------------------------
2023-10-17 21:41:01,742 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 21:41:30,337 epoch 1 - iter 260/2606 - loss 2.41881760 - time (sec): 28.59 - samples/sec: 1339.66 - lr: 0.000003 - momentum: 0.000000
2023-10-17 21:41:58,087 epoch 1 - iter 520/2606 - loss 1.45243831 - time (sec): 56.34 - samples/sec: 1322.89 - lr: 0.000006 - momentum: 0.000000
2023-10-17 21:42:26,584 epoch 1 - iter 780/2606 - loss 1.09141288 - time (sec): 84.84 - samples/sec: 1291.54 - lr: 0.000009 - momentum: 0.000000
2023-10-17 21:42:54,874 epoch 1 - iter 1040/2606 - loss 0.88178923 - time (sec): 113.13 - samples/sec: 1301.26 - lr: 0.000012 - momentum: 0.000000
2023-10-17 21:43:21,592 epoch 1 - iter 1300/2606 - loss 0.76182440 - time (sec): 139.85 - samples/sec: 1304.18 - lr: 0.000015 - momentum: 0.000000
2023-10-17 21:43:49,305 epoch 1 - iter 1560/2606 - loss 0.67181695 - time (sec): 167.56 - samples/sec: 1302.49 - lr: 0.000018 - momentum: 0.000000
2023-10-17 21:44:17,308 epoch 1 - iter 1820/2606 - loss 0.60318561 - time (sec): 195.56 - samples/sec: 1315.48 - lr: 0.000021 - momentum: 0.000000
2023-10-17 21:44:45,117 epoch 1 - iter 2080/2606 - loss 0.55727278 - time (sec): 223.37 - samples/sec: 1321.33 - lr: 0.000024 - momentum: 0.000000
2023-10-17 21:45:13,289 epoch 1 - iter 2340/2606 - loss 0.51688287 - time (sec): 251.54 - samples/sec: 1318.03 - lr: 0.000027 - momentum: 0.000000
2023-10-17 21:45:40,267 epoch 1 - iter 2600/2606 - loss 0.48599856 - time (sec): 278.52 - samples/sec: 1315.17 - lr: 0.000030 - momentum: 0.000000
2023-10-17 21:45:40,928 ----------------------------------------------------------------------------------------------------
2023-10-17 21:45:40,928 EPOCH 1 done: loss 0.4850 - lr: 0.000030
2023-10-17 21:45:48,508 DEV : loss 0.14276579022407532 - f1-score (micro avg) 0.3183
2023-10-17 21:45:48,562 saving best model
2023-10-17 21:45:49,113 ----------------------------------------------------------------------------------------------------
2023-10-17 21:46:16,813 epoch 2 - iter 260/2606 - loss 0.17577471 - time (sec): 27.70 - samples/sec: 1306.48 - lr: 0.000030 - momentum: 0.000000
2023-10-17 21:46:43,370 epoch 2 - iter 520/2606 - loss 0.16721643 - time (sec): 54.25 - samples/sec: 1340.06 - lr: 0.000029 - momentum: 0.000000
2023-10-17 21:47:08,929 epoch 2 - iter 780/2606 - loss 0.16285204 - time (sec): 79.81 - samples/sec: 1361.90 - lr: 0.000029 - momentum: 0.000000
2023-10-17 21:47:33,597 epoch 2 - iter 1040/2606 - loss 0.16796041 - time (sec): 104.48 - samples/sec: 1376.52 - lr: 0.000029 - momentum: 0.000000
2023-10-17 21:48:03,270 epoch 2 - iter 1300/2606 - loss 0.16172000 - time (sec): 134.15 - samples/sec: 1353.77 - lr: 0.000028 - momentum: 0.000000
2023-10-17 21:48:32,798 epoch 2 - iter 1560/2606 - loss 0.16200037 - time (sec): 163.68 - samples/sec: 1331.55 - lr: 0.000028 - momentum: 0.000000
2023-10-17 21:48:59,085 epoch 2 - iter 1820/2606 - loss 0.15970481 - time (sec): 189.97 - samples/sec: 1328.64 - lr: 0.000028 - momentum: 0.000000
2023-10-17 21:49:27,191 epoch 2 - iter 2080/2606 - loss 0.15777567 - time (sec): 218.08 - samples/sec: 1341.26 - lr: 0.000027 - momentum: 0.000000
2023-10-17 21:49:55,624 epoch 2 - iter 2340/2606 - loss 0.15570657 - time (sec): 246.51 - samples/sec: 1346.43 - lr: 0.000027 - momentum: 0.000000
2023-10-17 21:50:21,968 epoch 2 - iter 2600/2606 - loss 0.15259905 - time (sec): 272.85 - samples/sec: 1343.85 - lr: 0.000027 - momentum: 0.000000
2023-10-17 21:50:22,589 ----------------------------------------------------------------------------------------------------
2023-10-17 21:50:22,589 EPOCH 2 done: loss 0.1526 - lr: 0.000027
2023-10-17 21:50:34,223 DEV : loss 0.17080335319042206 - f1-score (micro avg) 0.3148
2023-10-17 21:50:34,279 ----------------------------------------------------------------------------------------------------
2023-10-17 21:51:00,649 epoch 3 - iter 260/2606 - loss 0.11935628 - time (sec): 26.37 - samples/sec: 1368.32 - lr: 0.000026 - momentum: 0.000000
2023-10-17 21:51:27,685 epoch 3 - iter 520/2606 - loss 0.11145962 - time (sec): 53.40 - samples/sec: 1374.95 - lr: 0.000026 - momentum: 0.000000
2023-10-17 21:51:53,191 epoch 3 - iter 780/2606 - loss 0.10854384 - time (sec): 78.91 - samples/sec: 1367.28 - lr: 0.000026 - momentum: 0.000000
2023-10-17 21:52:19,738 epoch 3 - iter 1040/2606 - loss 0.11018127 - time (sec): 105.46 - samples/sec: 1390.48 - lr: 0.000025 - momentum: 0.000000
2023-10-17 21:52:46,253 epoch 3 - iter 1300/2606 - loss 0.11016333 - time (sec): 131.97 - samples/sec: 1384.32 - lr: 0.000025 - momentum: 0.000000
2023-10-17 21:53:12,837 epoch 3 - iter 1560/2606 - loss 0.10902383 - time (sec): 158.56 - samples/sec: 1382.86 - lr: 0.000025 - momentum: 0.000000
2023-10-17 21:53:39,943 epoch 3 - iter 1820/2606 - loss 0.10842064 - time (sec): 185.66 - samples/sec: 1386.69 - lr: 0.000024 - momentum: 0.000000
2023-10-17 21:54:06,490 epoch 3 - iter 2080/2606 - loss 0.10894065 - time (sec): 212.21 - samples/sec: 1385.76 - lr: 0.000024 - momentum: 0.000000
2023-10-17 21:54:32,765 epoch 3 - iter 2340/2606 - loss 0.10851408 - time (sec): 238.48 - samples/sec: 1378.78 - lr: 0.000024 - momentum: 0.000000
2023-10-17 21:55:00,409 epoch 3 - iter 2600/2606 - loss 0.10879450 - time (sec): 266.13 - samples/sec: 1376.64 - lr: 0.000023 - momentum: 0.000000
2023-10-17 21:55:01,017 ----------------------------------------------------------------------------------------------------
2023-10-17 21:55:01,017 EPOCH 3 done: loss 0.1088 - lr: 0.000023
2023-10-17 21:55:13,477 DEV : loss 0.15336216986179352 - f1-score (micro avg) 0.3868
2023-10-17 21:55:13,545 saving best model
2023-10-17 21:55:14,189 ----------------------------------------------------------------------------------------------------
2023-10-17 21:55:42,008 epoch 4 - iter 260/2606 - loss 0.08171917 - time (sec): 27.82 - samples/sec: 1339.60 - lr: 0.000023 - momentum: 0.000000
2023-10-17 21:56:08,740 epoch 4 - iter 520/2606 - loss 0.07405963 - time (sec): 54.55 - samples/sec: 1348.05 - lr: 0.000023 - momentum: 0.000000
2023-10-17 21:56:36,186 epoch 4 - iter 780/2606 - loss 0.07507318 - time (sec): 81.99 - samples/sec: 1355.98 - lr: 0.000022 - momentum: 0.000000
2023-10-17 21:57:02,099 epoch 4 - iter 1040/2606 - loss 0.07869775 - time (sec): 107.91 - samples/sec: 1354.92 - lr: 0.000022 - momentum: 0.000000
2023-10-17 21:57:28,178 epoch 4 - iter 1300/2606 - loss 0.08300475 - time (sec): 133.99 - samples/sec: 1359.02 - lr: 0.000022 - momentum: 0.000000
2023-10-17 21:57:55,360 epoch 4 - iter 1560/2606 - loss 0.08367761 - time (sec): 161.17 - samples/sec: 1355.40 - lr: 0.000021 - momentum: 0.000000
2023-10-17 21:58:21,927 epoch 4 - iter 1820/2606 - loss 0.08344334 - time (sec): 187.74 - samples/sec: 1360.67 - lr: 0.000021 - momentum: 0.000000
2023-10-17 21:58:49,443 epoch 4 - iter 2080/2606 - loss 0.08237646 - time (sec): 215.25 - samples/sec: 1364.99 - lr: 0.000021 - momentum: 0.000000
2023-10-17 21:59:17,124 epoch 4 - iter 2340/2606 - loss 0.08147344 - time (sec): 242.93 - samples/sec: 1357.74 - lr: 0.000020 - momentum: 0.000000
2023-10-17 21:59:44,398 epoch 4 - iter 2600/2606 - loss 0.08031963 - time (sec): 270.21 - samples/sec: 1355.48 - lr: 0.000020 - momentum: 0.000000
2023-10-17 21:59:45,164 ----------------------------------------------------------------------------------------------------
2023-10-17 21:59:45,164 EPOCH 4 done: loss 0.0802 - lr: 0.000020
2023-10-17 21:59:57,012 DEV : loss 0.2895336449146271 - f1-score (micro avg) 0.382
2023-10-17 21:59:57,071 ----------------------------------------------------------------------------------------------------
2023-10-17 22:00:24,322 epoch 5 - iter 260/2606 - loss 0.03463138 - time (sec): 27.25 - samples/sec: 1320.83 - lr: 0.000020 - momentum: 0.000000
2023-10-17 22:00:51,439 epoch 5 - iter 520/2606 - loss 0.04332343 - time (sec): 54.37 - samples/sec: 1373.95 - lr: 0.000019 - momentum: 0.000000
2023-10-17 22:01:19,415 epoch 5 - iter 780/2606 - loss 0.04395028 - time (sec): 82.34 - samples/sec: 1382.89 - lr: 0.000019 - momentum: 0.000000
2023-10-17 22:01:46,086 epoch 5 - iter 1040/2606 - loss 0.04564615 - time (sec): 109.01 - samples/sec: 1367.83 - lr: 0.000019 - momentum: 0.000000
2023-10-17 22:02:14,157 epoch 5 - iter 1300/2606 - loss 0.05201596 - time (sec): 137.08 - samples/sec: 1355.49 - lr: 0.000018 - momentum: 0.000000
2023-10-17 22:02:42,089 epoch 5 - iter 1560/2606 - loss 0.05280423 - time (sec): 165.02 - samples/sec: 1356.56 - lr: 0.000018 - momentum: 0.000000
2023-10-17 22:03:09,017 epoch 5 - iter 1820/2606 - loss 0.05187602 - time (sec): 191.94 - samples/sec: 1356.32 - lr: 0.000018 - momentum: 0.000000
2023-10-17 22:03:37,004 epoch 5 - iter 2080/2606 - loss 0.05433704 - time (sec): 219.93 - samples/sec: 1351.69 - lr: 0.000017 - momentum: 0.000000
2023-10-17 22:04:04,460 epoch 5 - iter 2340/2606 - loss 0.05417178 - time (sec): 247.39 - samples/sec: 1345.10 - lr: 0.000017 - momentum: 0.000000
2023-10-17 22:04:30,839 epoch 5 - iter 2600/2606 - loss 0.05457916 - time (sec): 273.77 - samples/sec: 1339.43 - lr: 0.000017 - momentum: 0.000000
2023-10-17 22:04:31,375 ----------------------------------------------------------------------------------------------------
2023-10-17 22:04:31,376 EPOCH 5 done: loss 0.0549 - lr: 0.000017
2023-10-17 22:04:43,480 DEV : loss 0.2547551393508911 - f1-score (micro avg) 0.4078
2023-10-17 22:04:43,539 saving best model
2023-10-17 22:04:45,029 ----------------------------------------------------------------------------------------------------
2023-10-17 22:05:12,277 epoch 6 - iter 260/2606 - loss 0.04878224 - time (sec): 27.24 - samples/sec: 1405.82 - lr: 0.000016 - momentum: 0.000000
2023-10-17 22:05:39,489 epoch 6 - iter 520/2606 - loss 0.04226937 - time (sec): 54.46 - samples/sec: 1361.63 - lr: 0.000016 - momentum: 0.000000
2023-10-17 22:06:07,668 epoch 6 - iter 780/2606 - loss 0.04025396 - time (sec): 82.63 - samples/sec: 1361.16 - lr: 0.000016 - momentum: 0.000000
2023-10-17 22:06:36,370 epoch 6 - iter 1040/2606 - loss 0.04013309 - time (sec): 111.34 - samples/sec: 1351.86 - lr: 0.000015 - momentum: 0.000000
2023-10-17 22:07:03,268 epoch 6 - iter 1300/2606 - loss 0.04108149 - time (sec): 138.23 - samples/sec: 1347.18 - lr: 0.000015 - momentum: 0.000000
2023-10-17 22:07:30,138 epoch 6 - iter 1560/2606 - loss 0.04185494 - time (sec): 165.10 - samples/sec: 1328.03 - lr: 0.000015 - momentum: 0.000000
2023-10-17 22:07:58,967 epoch 6 - iter 1820/2606 - loss 0.04144739 - time (sec): 193.93 - samples/sec: 1311.99 - lr: 0.000014 - momentum: 0.000000
2023-10-17 22:08:29,097 epoch 6 - iter 2080/2606 - loss 0.04073086 - time (sec): 224.06 - samples/sec: 1297.46 - lr: 0.000014 - momentum: 0.000000
2023-10-17 22:08:58,487 epoch 6 - iter 2340/2606 - loss 0.03975708 - time (sec): 253.45 - samples/sec: 1294.51 - lr: 0.000014 - momentum: 0.000000
2023-10-17 22:09:27,224 epoch 6 - iter 2600/2606 - loss 0.03991100 - time (sec): 282.19 - samples/sec: 1298.71 - lr: 0.000013 - momentum: 0.000000
2023-10-17 22:09:27,970 ----------------------------------------------------------------------------------------------------
2023-10-17 22:09:27,970 EPOCH 6 done: loss 0.0398 - lr: 0.000013
2023-10-17 22:09:40,547 DEV : loss 0.4081049859523773 - f1-score (micro avg) 0.3431
2023-10-17 22:09:40,600 ----------------------------------------------------------------------------------------------------
2023-10-17 22:10:09,222 epoch 7 - iter 260/2606 - loss 0.02307160 - time (sec): 28.62 - samples/sec: 1321.09 - lr: 0.000013 - momentum: 0.000000
2023-10-17 22:10:36,227 epoch 7 - iter 520/2606 - loss 0.02420388 - time (sec): 55.62 - samples/sec: 1331.24 - lr: 0.000013 - momentum: 0.000000
2023-10-17 22:11:02,363 epoch 7 - iter 780/2606 - loss 0.02718461 - time (sec): 81.76 - samples/sec: 1326.55 - lr: 0.000012 - momentum: 0.000000
2023-10-17 22:11:30,000 epoch 7 - iter 1040/2606 - loss 0.02714358 - time (sec): 109.40 - samples/sec: 1318.94 - lr: 0.000012 - momentum: 0.000000
2023-10-17 22:11:57,405 epoch 7 - iter 1300/2606 - loss 0.02836831 - time (sec): 136.80 - samples/sec: 1323.99 - lr: 0.000012 - momentum: 0.000000
2023-10-17 22:12:24,676 epoch 7 - iter 1560/2606 - loss 0.02987251 - time (sec): 164.07 - samples/sec: 1318.13 - lr: 0.000011 - momentum: 0.000000
2023-10-17 22:12:52,857 epoch 7 - iter 1820/2606 - loss 0.03103589 - time (sec): 192.25 - samples/sec: 1317.49 - lr: 0.000011 - momentum: 0.000000
2023-10-17 22:13:21,970 epoch 7 - iter 2080/2606 - loss 0.03055732 - time (sec): 221.37 - samples/sec: 1335.62 - lr: 0.000011 - momentum: 0.000000
2023-10-17 22:13:48,595 epoch 7 - iter 2340/2606 - loss 0.03040091 - time (sec): 247.99 - samples/sec: 1331.18 - lr: 0.000010 - momentum: 0.000000
2023-10-17 22:14:17,409 epoch 7 - iter 2600/2606 - loss 0.02959375 - time (sec): 276.81 - samples/sec: 1325.42 - lr: 0.000010 - momentum: 0.000000
2023-10-17 22:14:17,997 ----------------------------------------------------------------------------------------------------
2023-10-17 22:14:17,998 EPOCH 7 done: loss 0.0296 - lr: 0.000010
2023-10-17 22:14:30,249 DEV : loss 0.3910466134548187 - f1-score (micro avg) 0.3948
2023-10-17 22:14:30,311 ----------------------------------------------------------------------------------------------------
2023-10-17 22:14:59,837 epoch 8 - iter 260/2606 - loss 0.01803369 - time (sec): 29.52 - samples/sec: 1226.36 - lr: 0.000010 - momentum: 0.000000
2023-10-17 22:15:29,377 epoch 8 - iter 520/2606 - loss 0.01820685 - time (sec): 59.06 - samples/sec: 1237.49 - lr: 0.000009 - momentum: 0.000000
2023-10-17 22:15:57,969 epoch 8 - iter 780/2606 - loss 0.02046211 - time (sec): 87.65 - samples/sec: 1221.52 - lr: 0.000009 - momentum: 0.000000
2023-10-17 22:16:26,861 epoch 8 - iter 1040/2606 - loss 0.02034776 - time (sec): 116.55 - samples/sec: 1226.47 - lr: 0.000009 - momentum: 0.000000
2023-10-17 22:16:54,705 epoch 8 - iter 1300/2606 - loss 0.02070951 - time (sec): 144.39 - samples/sec: 1239.18 - lr: 0.000008 - momentum: 0.000000
2023-10-17 22:17:24,823 epoch 8 - iter 1560/2606 - loss 0.02093510 - time (sec): 174.51 - samples/sec: 1242.35 - lr: 0.000008 - momentum: 0.000000
2023-10-17 22:17:53,158 epoch 8 - iter 1820/2606 - loss 0.02094885 - time (sec): 202.84 - samples/sec: 1268.83 - lr: 0.000008 - momentum: 0.000000
2023-10-17 22:18:19,676 epoch 8 - iter 2080/2606 - loss 0.02190230 - time (sec): 229.36 - samples/sec: 1285.32 - lr: 0.000007 - momentum: 0.000000
2023-10-17 22:18:45,912 epoch 8 - iter 2340/2606 - loss 0.02114848 - time (sec): 255.60 - samples/sec: 1289.55 - lr: 0.000007 - momentum: 0.000000
2023-10-17 22:19:13,924 epoch 8 - iter 2600/2606 - loss 0.02115871 - time (sec): 283.61 - samples/sec: 1292.71 - lr: 0.000007 - momentum: 0.000000
2023-10-17 22:19:14,479 ----------------------------------------------------------------------------------------------------
2023-10-17 22:19:14,479 EPOCH 8 done: loss 0.0211 - lr: 0.000007
2023-10-17 22:19:26,148 DEV : loss 0.47483888268470764 - f1-score (micro avg) 0.3777
2023-10-17 22:19:26,208 ----------------------------------------------------------------------------------------------------
2023-10-17 22:19:53,934 epoch 9 - iter 260/2606 - loss 0.01536404 - time (sec): 27.72 - samples/sec: 1404.80 - lr: 0.000006 - momentum: 0.000000
2023-10-17 22:20:21,780 epoch 9 - iter 520/2606 - loss 0.01572720 - time (sec): 55.57 - samples/sec: 1350.13 - lr: 0.000006 - momentum: 0.000000
2023-10-17 22:20:48,491 epoch 9 - iter 780/2606 - loss 0.01641070 - time (sec): 82.28 - samples/sec: 1334.73 - lr: 0.000006 - momentum: 0.000000
2023-10-17 22:21:15,768 epoch 9 - iter 1040/2606 - loss 0.01560086 - time (sec): 109.56 - samples/sec: 1317.73 - lr: 0.000005 - momentum: 0.000000
2023-10-17 22:21:43,567 epoch 9 - iter 1300/2606 - loss 0.01474722 - time (sec): 137.36 - samples/sec: 1306.20 - lr: 0.000005 - momentum: 0.000000
2023-10-17 22:22:11,099 epoch 9 - iter 1560/2606 - loss 0.01463560 - time (sec): 164.89 - samples/sec: 1315.07 - lr: 0.000005 - momentum: 0.000000
2023-10-17 22:22:37,886 epoch 9 - iter 1820/2606 - loss 0.01501095 - time (sec): 191.68 - samples/sec: 1325.15 - lr: 0.000004 - momentum: 0.000000
2023-10-17 22:23:06,405 epoch 9 - iter 2080/2606 - loss 0.01553135 - time (sec): 220.19 - samples/sec: 1329.33 - lr: 0.000004 - momentum: 0.000000
2023-10-17 22:23:35,222 epoch 9 - iter 2340/2606 - loss 0.01541101 - time (sec): 249.01 - samples/sec: 1325.15 - lr: 0.000004 - momentum: 0.000000
2023-10-17 22:24:03,048 epoch 9 - iter 2600/2606 - loss 0.01570046 - time (sec): 276.84 - samples/sec: 1324.46 - lr: 0.000003 - momentum: 0.000000
2023-10-17 22:24:03,604 ----------------------------------------------------------------------------------------------------
2023-10-17 22:24:03,604 EPOCH 9 done: loss 0.0157 - lr: 0.000003
2023-10-17 22:24:14,552 DEV : loss 0.47549542784690857 - f1-score (micro avg) 0.3929
2023-10-17 22:24:14,609 ----------------------------------------------------------------------------------------------------
2023-10-17 22:24:42,909 epoch 10 - iter 260/2606 - loss 0.00798516 - time (sec): 28.30 - samples/sec: 1324.71 - lr: 0.000003 - momentum: 0.000000
2023-10-17 22:25:11,407 epoch 10 - iter 520/2606 - loss 0.00933529 - time (sec): 56.79 - samples/sec: 1311.98 - lr: 0.000003 - momentum: 0.000000
2023-10-17 22:25:38,748 epoch 10 - iter 780/2606 - loss 0.01029313 - time (sec): 84.14 - samples/sec: 1299.42 - lr: 0.000002 - momentum: 0.000000
2023-10-17 22:26:05,195 epoch 10 - iter 1040/2606 - loss 0.01032349 - time (sec): 110.58 - samples/sec: 1300.06 - lr: 0.000002 - momentum: 0.000000
2023-10-17 22:26:32,397 epoch 10 - iter 1300/2606 - loss 0.01015852 - time (sec): 137.79 - samples/sec: 1330.73 - lr: 0.000002 - momentum: 0.000000
2023-10-17 22:26:59,813 epoch 10 - iter 1560/2606 - loss 0.01048293 - time (sec): 165.20 - samples/sec: 1335.02 - lr: 0.000001 - momentum: 0.000000
2023-10-17 22:27:27,229 epoch 10 - iter 1820/2606 - loss 0.01032360 - time (sec): 192.62 - samples/sec: 1347.71 - lr: 0.000001 - momentum: 0.000000
2023-10-17 22:27:55,194 epoch 10 - iter 2080/2606 - loss 0.00994780 - time (sec): 220.58 - samples/sec: 1339.62 - lr: 0.000001 - momentum: 0.000000
2023-10-17 22:28:21,684 epoch 10 - iter 2340/2606 - loss 0.01011531 - time (sec): 247.07 - samples/sec: 1335.36 - lr: 0.000000 - momentum: 0.000000
2023-10-17 22:28:49,738 epoch 10 - iter 2600/2606 - loss 0.00998687 - time (sec): 275.13 - samples/sec: 1332.80 - lr: 0.000000 - momentum: 0.000000
2023-10-17 22:28:50,269 ----------------------------------------------------------------------------------------------------
2023-10-17 22:28:50,270 EPOCH 10 done: loss 0.0100 - lr: 0.000000
2023-10-17 22:29:01,249 DEV : loss 0.4655146896839142 - f1-score (micro avg) 0.4053
2023-10-17 22:29:01,827 ----------------------------------------------------------------------------------------------------
2023-10-17 22:29:01,829 Loading model from best epoch ...
2023-10-17 22:29:04,090 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
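[Editor's note, not part of the original log: the 17 tags follow the BIOES scheme, where S marks a single-token entity, B/I/E mark the begin/inside/end of a multi-token entity, and O marks tokens outside any entity. A simplified decoding sketch (not Flair's actual decoder) turning a BIOES tag sequence into (label, start, end) spans:]

```python
# Decode a BIOES-tagged token sequence into (label, start, end) spans,
# with end exclusive. Simplified: malformed sequences (E without B,
# dangling B) are silently dropped.

def bioes_to_spans(tags):
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "S":
            spans.append((label, i, i + 1))
            start = None
        elif prefix == "B":
            start = i
        elif prefix == "E" and start is not None:
            spans.append((label, start, i + 1))
            start = None
    return spans

tags = ["O", "B-LOC", "E-LOC", "O", "S-PER", "B-ORG", "I-ORG", "E-ORG"]
print(bioes_to_spans(tags))  # [('LOC', 1, 3), ('PER', 4, 5), ('ORG', 5, 8)]
```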
2023-10-17 22:29:23,043
Results:
- F-score (micro) 0.42
- F-score (macro) 0.2892
- Accuracy 0.27
By class:
              precision    recall  f1-score   support

         LOC     0.5235    0.4316    0.4731      1214
         PER     0.4428    0.4022    0.4215       808
         ORG     0.2582    0.2663    0.2622       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4490    0.3946    0.4200      2390
   macro avg     0.3061    0.2750    0.2892      2390
weighted avg     0.4537    0.3946    0.4216      2390
2023-10-17 22:29:23,043 ----------------------------------------------------------------------------------------------------
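[Editor's note, not part of the original log: the summary F-scores follow directly from the per-class table. Micro-averaged F1 is the harmonic mean of the aggregate micro precision/recall, while macro-averaged F1 is the unweighted mean of the four per-class F1 scores. A small cross-check using the reported values:]

```python
# Recompute the summary F1 scores from the per-class table above.

def f1(precision, recall):
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# (precision, recall) per class, as reported in the final evaluation
per_class = {
    "LOC":       (0.5235, 0.4316),
    "PER":       (0.4428, 0.4022),
    "ORG":       (0.2582, 0.2663),
    "HumanProd": (0.0000, 0.0000),
}

macro_f1 = sum(f1(p, r) for p, r in per_class.values()) / len(per_class)
micro_f1 = f1(0.4490, 0.3946)   # micro avg precision / recall from the table

print(f"{micro_f1:.4f}")  # 0.4200
print(f"{macro_f1:.4f}")  # 0.2892
```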