2023-10-18 01:43:53,237 ----------------------------------------------------------------------------------------------------
2023-10-18 01:43:53,239 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 01:43:53,239 ----------------------------------------------------------------------------------------------------
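The module dump above fixes every layer shape, so the tagger's parameter count can be tallied directly from it. A back-of-the-envelope sketch (assuming bias terms on all Linear layers, as printed):

```python
# Shapes taken from the SequenceTagger module dump above.
V, H, P, T, FF, L = 32001, 768, 512, 2, 3072, 12  # vocab, hidden, positions, types, FFN, layers

embeddings = V * H + P * H + T * H + 2 * H  # word/position/type embeddings + LayerNorm
per_layer = (
    3 * (H * H + H)          # query, key, value projections
    + (H * H + H) + 2 * H    # self-output dense + LayerNorm
    + (H * FF + FF)          # intermediate dense
    + (FF * H + H) + 2 * H   # output dense + LayerNorm
)
head = H * 17 + 17           # final linear layer over the 17 tags
total = embeddings + L * per_layer + head
print(f"{total:,}")          # roughly 110M, consistent with an ELECTRA-base encoder
```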
2023-10-18 01:43:53,240 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
 - NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-18 01:43:53,240 ----------------------------------------------------------------------------------------------------
2023-10-18 01:43:53,240 Train:  20847 sentences
2023-10-18 01:43:53,240         (train_with_dev=False, train_with_test=False)
2023-10-18 01:43:53,240 ----------------------------------------------------------------------------------------------------
2023-10-18 01:43:53,240 Training Params:
2023-10-18 01:43:53,240  - learning_rate: "3e-05"
2023-10-18 01:43:53,240  - mini_batch_size: "8"
2023-10-18 01:43:53,240  - max_epochs: "10"
2023-10-18 01:43:53,240  - shuffle: "True"
2023-10-18 01:43:53,240 ----------------------------------------------------------------------------------------------------
2023-10-18 01:43:53,240 Plugins:
2023-10-18 01:43:53,240  - TensorboardLogger
2023-10-18 01:43:53,240  - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 01:43:53,240 ----------------------------------------------------------------------------------------------------
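The LinearScheduler plugin with warmup_fraction '0.1', together with the lr column in the epoch logs below (ramping from 3e-06 to the configured 3e-05 over epoch 1, then decaying to 0 by epoch 10), is consistent with a standard linear warmup/decay schedule. A minimal sketch, assuming 2606 batches × 10 epochs = 26,060 total steps (an illustrative reconstruction, not Flair's exact implementation):

```python
def linear_lr(step, base_lr=3e-05, total_steps=26060, warmup_fraction=0.1):
    """Linear warmup to base_lr, then linear decay to 0 (illustrative)."""
    warmup = int(total_steps * warmup_fraction)  # 2606 steps = one epoch here
    if step < warmup:
        return base_lr * step / warmup
    return base_lr * (total_steps - step) / (total_steps - warmup)

# Consistent with the log: lr is ~3e-06 at epoch 1, iter 260, peaks at the
# configured 3e-05 once warmup ends, and reaches 0 at the final step.
```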
2023-10-18 01:43:53,241 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 01:43:53,241  - metric: "('micro avg', 'f1-score')"
2023-10-18 01:43:53,241 ----------------------------------------------------------------------------------------------------
2023-10-18 01:43:53,241 Computation:
2023-10-18 01:43:53,241  - compute on device: cuda:0
2023-10-18 01:43:53,241  - embedding storage: none
2023-10-18 01:43:53,241 ----------------------------------------------------------------------------------------------------
2023-10-18 01:43:53,241 Model training base path: "hmbench-newseye/de-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-18 01:43:53,241 ----------------------------------------------------------------------------------------------------
2023-10-18 01:43:53,241 ----------------------------------------------------------------------------------------------------
2023-10-18 01:43:53,241 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 01:44:18,435 epoch 1 - iter 260/2606 - loss 2.36689502 - time (sec): 25.19 - samples/sec: 1402.75 - lr: 0.000003 - momentum: 0.000000
2023-10-18 01:44:43,340 epoch 1 - iter 520/2606 - loss 1.41431550 - time (sec): 50.10 - samples/sec: 1412.63 - lr: 0.000006 - momentum: 0.000000
2023-10-18 01:45:09,326 epoch 1 - iter 780/2606 - loss 1.03458455 - time (sec): 76.08 - samples/sec: 1437.93 - lr: 0.000009 - momentum: 0.000000
2023-10-18 01:45:34,659 epoch 1 - iter 1040/2606 - loss 0.84351166 - time (sec): 101.42 - samples/sec: 1445.24 - lr: 0.000012 - momentum: 0.000000
2023-10-18 01:46:00,527 epoch 1 - iter 1300/2606 - loss 0.71609535 - time (sec): 127.28 - samples/sec: 1448.74 - lr: 0.000015 - momentum: 0.000000
2023-10-18 01:46:26,321 epoch 1 - iter 1560/2606 - loss 0.63458616 - time (sec): 153.08 - samples/sec: 1455.02 - lr: 0.000018 - momentum: 0.000000
2023-10-18 01:46:51,083 epoch 1 - iter 1820/2606 - loss 0.57884585 - time (sec): 177.84 - samples/sec: 1450.11 - lr: 0.000021 - momentum: 0.000000
2023-10-18 01:47:17,387 epoch 1 - iter 2080/2606 - loss 0.53389468 - time (sec): 204.14 - samples/sec: 1436.46 - lr: 0.000024 - momentum: 0.000000
2023-10-18 01:47:45,475 epoch 1 - iter 2340/2606 - loss 0.49351436 - time (sec): 232.23 - samples/sec: 1427.48 - lr: 0.000027 - momentum: 0.000000
2023-10-18 01:48:10,801 epoch 1 - iter 2600/2606 - loss 0.46523756 - time (sec): 257.56 - samples/sec: 1422.36 - lr: 0.000030 - momentum: 0.000000
2023-10-18 01:48:11,516 ----------------------------------------------------------------------------------------------------
2023-10-18 01:48:11,516 EPOCH 1 done: loss 0.4640 - lr: 0.000030
2023-10-18 01:48:18,837 DEV : loss 0.1177271232008934 - f1-score (micro avg)  0.3024
2023-10-18 01:48:18,910 saving best model
2023-10-18 01:48:19,533 ----------------------------------------------------------------------------------------------------
2023-10-18 01:48:46,567 epoch 2 - iter 260/2606 - loss 0.17757391 - time (sec): 27.03 - samples/sec: 1372.47 - lr: 0.000030 - momentum: 0.000000
2023-10-18 01:49:12,975 epoch 2 - iter 520/2606 - loss 0.17005093 - time (sec): 53.44 - samples/sec: 1336.99 - lr: 0.000029 - momentum: 0.000000
2023-10-18 01:49:40,277 epoch 2 - iter 780/2606 - loss 0.17240349 - time (sec): 80.74 - samples/sec: 1329.40 - lr: 0.000029 - momentum: 0.000000
2023-10-18 01:50:07,183 epoch 2 - iter 1040/2606 - loss 0.17209944 - time (sec): 107.65 - samples/sec: 1346.71 - lr: 0.000029 - momentum: 0.000000
2023-10-18 01:50:32,998 epoch 2 - iter 1300/2606 - loss 0.16898663 - time (sec): 133.46 - samples/sec: 1341.41 - lr: 0.000028 - momentum: 0.000000
2023-10-18 01:50:59,584 epoch 2 - iter 1560/2606 - loss 0.16808612 - time (sec): 160.05 - samples/sec: 1347.14 - lr: 0.000028 - momentum: 0.000000
2023-10-18 01:51:26,109 epoch 2 - iter 1820/2606 - loss 0.16649873 - time (sec): 186.57 - samples/sec: 1359.49 - lr: 0.000028 - momentum: 0.000000
2023-10-18 01:51:53,004 epoch 2 - iter 2080/2606 - loss 0.16394020 - time (sec): 213.47 - samples/sec: 1364.07 - lr: 0.000027 - momentum: 0.000000
2023-10-18 01:52:19,150 epoch 2 - iter 2340/2606 - loss 0.16137184 - time (sec): 239.62 - samples/sec: 1364.38 - lr: 0.000027 - momentum: 0.000000
2023-10-18 01:52:46,963 epoch 2 - iter 2600/2606 - loss 0.15760333 - time (sec): 267.43 - samples/sec: 1368.84 - lr: 0.000027 - momentum: 0.000000
2023-10-18 01:52:47,781 ----------------------------------------------------------------------------------------------------
2023-10-18 01:52:47,781 EPOCH 2 done: loss 0.1574 - lr: 0.000027
2023-10-18 01:52:59,427 DEV : loss 0.11865425109863281 - f1-score (micro avg)  0.3154
2023-10-18 01:52:59,488 saving best model
2023-10-18 01:53:00,955 ----------------------------------------------------------------------------------------------------
2023-10-18 01:53:27,979 epoch 3 - iter 260/2606 - loss 0.11415869 - time (sec): 27.02 - samples/sec: 1380.83 - lr: 0.000026 - momentum: 0.000000
2023-10-18 01:53:55,798 epoch 3 - iter 520/2606 - loss 0.10995719 - time (sec): 54.84 - samples/sec: 1391.02 - lr: 0.000026 - momentum: 0.000000
2023-10-18 01:54:22,508 epoch 3 - iter 780/2606 - loss 0.10931204 - time (sec): 81.55 - samples/sec: 1393.25 - lr: 0.000026 - momentum: 0.000000
2023-10-18 01:54:49,765 epoch 3 - iter 1040/2606 - loss 0.10761017 - time (sec): 108.81 - samples/sec: 1394.64 - lr: 0.000025 - momentum: 0.000000
2023-10-18 01:55:16,047 epoch 3 - iter 1300/2606 - loss 0.10662181 - time (sec): 135.09 - samples/sec: 1384.16 - lr: 0.000025 - momentum: 0.000000
2023-10-18 01:55:43,603 epoch 3 - iter 1560/2606 - loss 0.10877184 - time (sec): 162.64 - samples/sec: 1378.52 - lr: 0.000025 - momentum: 0.000000
2023-10-18 01:56:10,151 epoch 3 - iter 1820/2606 - loss 0.10600747 - time (sec): 189.19 - samples/sec: 1373.57 - lr: 0.000024 - momentum: 0.000000
2023-10-18 01:56:38,052 epoch 3 - iter 2080/2606 - loss 0.10590194 - time (sec): 217.09 - samples/sec: 1364.73 - lr: 0.000024 - momentum: 0.000000
2023-10-18 01:57:04,290 epoch 3 - iter 2340/2606 - loss 0.10934178 - time (sec): 243.33 - samples/sec: 1362.27 - lr: 0.000024 - momentum: 0.000000
2023-10-18 01:57:31,738 epoch 3 - iter 2600/2606 - loss 0.11013152 - time (sec): 270.78 - samples/sec: 1353.21 - lr: 0.000023 - momentum: 0.000000
2023-10-18 01:57:32,443 ----------------------------------------------------------------------------------------------------
2023-10-18 01:57:32,444 EPOCH 3 done: loss 0.1102 - lr: 0.000023
2023-10-18 01:57:43,934 DEV : loss 0.1810266077518463 - f1-score (micro avg)  0.3569
2023-10-18 01:57:43,985 saving best model
2023-10-18 01:57:45,351 ----------------------------------------------------------------------------------------------------
2023-10-18 01:58:12,270 epoch 4 - iter 260/2606 - loss 0.06286773 - time (sec): 26.91 - samples/sec: 1409.03 - lr: 0.000023 - momentum: 0.000000
2023-10-18 01:58:38,471 epoch 4 - iter 520/2606 - loss 0.07418588 - time (sec): 53.12 - samples/sec: 1402.34 - lr: 0.000023 - momentum: 0.000000
2023-10-18 01:59:05,109 epoch 4 - iter 780/2606 - loss 0.07626348 - time (sec): 79.75 - samples/sec: 1387.59 - lr: 0.000022 - momentum: 0.000000
2023-10-18 01:59:31,143 epoch 4 - iter 1040/2606 - loss 0.07797779 - time (sec): 105.79 - samples/sec: 1363.81 - lr: 0.000022 - momentum: 0.000000
2023-10-18 01:59:57,737 epoch 4 - iter 1300/2606 - loss 0.07611310 - time (sec): 132.38 - samples/sec: 1360.63 - lr: 0.000022 - momentum: 0.000000
2023-10-18 02:00:25,509 epoch 4 - iter 1560/2606 - loss 0.07528087 - time (sec): 160.15 - samples/sec: 1369.44 - lr: 0.000021 - momentum: 0.000000
2023-10-18 02:00:51,422 epoch 4 - iter 1820/2606 - loss 0.07718542 - time (sec): 186.07 - samples/sec: 1363.86 - lr: 0.000021 - momentum: 0.000000
2023-10-18 02:01:19,814 epoch 4 - iter 2080/2606 - loss 0.07906334 - time (sec): 214.46 - samples/sec: 1369.68 - lr: 0.000021 - momentum: 0.000000
2023-10-18 02:01:46,681 epoch 4 - iter 2340/2606 - loss 0.07836038 - time (sec): 241.33 - samples/sec: 1362.67 - lr: 0.000020 - momentum: 0.000000
2023-10-18 02:02:14,630 epoch 4 - iter 2600/2606 - loss 0.08153752 - time (sec): 269.27 - samples/sec: 1361.88 - lr: 0.000020 - momentum: 0.000000
2023-10-18 02:02:15,210 ----------------------------------------------------------------------------------------------------
2023-10-18 02:02:15,211 EPOCH 4 done: loss 0.0816 - lr: 0.000020
2023-10-18 02:02:26,482 DEV : loss 0.23630264401435852 - f1-score (micro avg)  0.3444
2023-10-18 02:02:26,531 ----------------------------------------------------------------------------------------------------
2023-10-18 02:02:54,043 epoch 5 - iter 260/2606 - loss 0.04197353 - time (sec): 27.51 - samples/sec: 1369.63 - lr: 0.000020 - momentum: 0.000000
2023-10-18 02:03:22,392 epoch 5 - iter 520/2606 - loss 0.05052783 - time (sec): 55.86 - samples/sec: 1344.58 - lr: 0.000019 - momentum: 0.000000
2023-10-18 02:03:51,375 epoch 5 - iter 780/2606 - loss 0.05148158 - time (sec): 84.84 - samples/sec: 1357.95 - lr: 0.000019 - momentum: 0.000000
2023-10-18 02:04:18,638 epoch 5 - iter 1040/2606 - loss 0.05272903 - time (sec): 112.10 - samples/sec: 1367.58 - lr: 0.000019 - momentum: 0.000000
2023-10-18 02:04:44,954 epoch 5 - iter 1300/2606 - loss 0.05388151 - time (sec): 138.42 - samples/sec: 1372.47 - lr: 0.000018 - momentum: 0.000000
2023-10-18 02:05:11,302 epoch 5 - iter 1560/2606 - loss 0.05517232 - time (sec): 164.77 - samples/sec: 1355.79 - lr: 0.000018 - momentum: 0.000000
2023-10-18 02:05:36,871 epoch 5 - iter 1820/2606 - loss 0.05500689 - time (sec): 190.34 - samples/sec: 1347.23 - lr: 0.000018 - momentum: 0.000000
2023-10-18 02:06:03,512 epoch 5 - iter 2080/2606 - loss 0.05753746 - time (sec): 216.98 - samples/sec: 1355.14 - lr: 0.000017 - momentum: 0.000000
2023-10-18 02:06:29,577 epoch 5 - iter 2340/2606 - loss 0.05760071 - time (sec): 243.04 - samples/sec: 1360.89 - lr: 0.000017 - momentum: 0.000000
2023-10-18 02:06:56,947 epoch 5 - iter 2600/2606 - loss 0.05694245 - time (sec): 270.41 - samples/sec: 1356.56 - lr: 0.000017 - momentum: 0.000000
2023-10-18 02:06:57,501 ----------------------------------------------------------------------------------------------------
2023-10-18 02:06:57,501 EPOCH 5 done: loss 0.0569 - lr: 0.000017
2023-10-18 02:07:08,168 DEV : loss 0.3867754638195038 - f1-score (micro avg)  0.3416
2023-10-18 02:07:08,226 ----------------------------------------------------------------------------------------------------
2023-10-18 02:07:36,593 epoch 6 - iter 260/2606 - loss 0.03662114 - time (sec): 28.36 - samples/sec: 1194.39 - lr: 0.000016 - momentum: 0.000000
2023-10-18 02:08:02,372 epoch 6 - iter 520/2606 - loss 0.04435054 - time (sec): 54.14 - samples/sec: 1270.33 - lr: 0.000016 - momentum: 0.000000
2023-10-18 02:08:29,213 epoch 6 - iter 780/2606 - loss 0.04244809 - time (sec): 80.98 - samples/sec: 1305.98 - lr: 0.000016 - momentum: 0.000000
2023-10-18 02:08:56,728 epoch 6 - iter 1040/2606 - loss 0.04067286 - time (sec): 108.50 - samples/sec: 1345.19 - lr: 0.000015 - momentum: 0.000000
2023-10-18 02:09:22,409 epoch 6 - iter 1300/2606 - loss 0.04266422 - time (sec): 134.18 - samples/sec: 1355.88 - lr: 0.000015 - momentum: 0.000000
2023-10-18 02:09:48,520 epoch 6 - iter 1560/2606 - loss 0.04213331 - time (sec): 160.29 - samples/sec: 1351.85 - lr: 0.000015 - momentum: 0.000000
2023-10-18 02:10:15,112 epoch 6 - iter 1820/2606 - loss 0.04413458 - time (sec): 186.88 - samples/sec: 1357.74 - lr: 0.000014 - momentum: 0.000000
2023-10-18 02:10:42,144 epoch 6 - iter 2080/2606 - loss 0.04322785 - time (sec): 213.92 - samples/sec: 1361.22 - lr: 0.000014 - momentum: 0.000000
2023-10-18 02:11:09,190 epoch 6 - iter 2340/2606 - loss 0.04282381 - time (sec): 240.96 - samples/sec: 1370.22 - lr: 0.000014 - momentum: 0.000000
2023-10-18 02:11:36,211 epoch 6 - iter 2600/2606 - loss 0.04215471 - time (sec): 267.98 - samples/sec: 1367.40 - lr: 0.000013 - momentum: 0.000000
2023-10-18 02:11:36,800 ----------------------------------------------------------------------------------------------------
2023-10-18 02:11:36,801 EPOCH 6 done: loss 0.0422 - lr: 0.000013
2023-10-18 02:11:47,848 DEV : loss 0.5211431384086609 - f1-score (micro avg)  0.344
2023-10-18 02:11:47,905 ----------------------------------------------------------------------------------------------------
2023-10-18 02:12:14,952 epoch 7 - iter 260/2606 - loss 0.03083105 - time (sec): 27.05 - samples/sec: 1292.84 - lr: 0.000013 - momentum: 0.000000
2023-10-18 02:12:43,024 epoch 7 - iter 520/2606 - loss 0.02863032 - time (sec): 55.12 - samples/sec: 1335.23 - lr: 0.000013 - momentum: 0.000000
2023-10-18 02:13:09,861 epoch 7 - iter 780/2606 - loss 0.02678146 - time (sec): 81.95 - samples/sec: 1334.01 - lr: 0.000012 - momentum: 0.000000
2023-10-18 02:13:38,977 epoch 7 - iter 1040/2606 - loss 0.02706620 - time (sec): 111.07 - samples/sec: 1329.10 - lr: 0.000012 - momentum: 0.000000
2023-10-18 02:14:07,551 epoch 7 - iter 1300/2606 - loss 0.02681222 - time (sec): 139.64 - samples/sec: 1328.93 - lr: 0.000012 - momentum: 0.000000
2023-10-18 02:14:34,037 epoch 7 - iter 1560/2606 - loss 0.02757018 - time (sec): 166.13 - samples/sec: 1335.74 - lr: 0.000011 - momentum: 0.000000
2023-10-18 02:15:01,107 epoch 7 - iter 1820/2606 - loss 0.02788944 - time (sec): 193.20 - samples/sec: 1339.27 - lr: 0.000011 - momentum: 0.000000
2023-10-18 02:15:27,430 epoch 7 - iter 2080/2606 - loss 0.02802776 - time (sec): 219.52 - samples/sec: 1330.54 - lr: 0.000011 - momentum: 0.000000
2023-10-18 02:15:54,014 epoch 7 - iter 2340/2606 - loss 0.02810319 - time (sec): 246.11 - samples/sec: 1338.49 - lr: 0.000010 - momentum: 0.000000
2023-10-18 02:16:20,782 epoch 7 - iter 2600/2606 - loss 0.02820370 - time (sec): 272.88 - samples/sec: 1342.64 - lr: 0.000010 - momentum: 0.000000
2023-10-18 02:16:21,376 ----------------------------------------------------------------------------------------------------
2023-10-18 02:16:21,377 EPOCH 7 done: loss 0.0281 - lr: 0.000010
2023-10-18 02:16:32,117 DEV : loss 0.4535387456417084 - f1-score (micro avg)  0.3722
2023-10-18 02:16:32,179 saving best model
2023-10-18 02:16:33,534 ----------------------------------------------------------------------------------------------------
2023-10-18 02:16:59,035 epoch 8 - iter 260/2606 - loss 0.01351293 - time (sec): 25.50 - samples/sec: 1354.64 - lr: 0.000010 - momentum: 0.000000
2023-10-18 02:17:25,338 epoch 8 - iter 520/2606 - loss 0.01440588 - time (sec): 51.80 - samples/sec: 1367.77 - lr: 0.000009 - momentum: 0.000000
2023-10-18 02:17:51,800 epoch 8 - iter 780/2606 - loss 0.01923460 - time (sec): 78.26 - samples/sec: 1356.63 - lr: 0.000009 - momentum: 0.000000
2023-10-18 02:18:19,236 epoch 8 - iter 1040/2606 - loss 0.01963510 - time (sec): 105.70 - samples/sec: 1357.69 - lr: 0.000009 - momentum: 0.000000
2023-10-18 02:18:45,434 epoch 8 - iter 1300/2606 - loss 0.02140645 - time (sec): 131.89 - samples/sec: 1361.14 - lr: 0.000008 - momentum: 0.000000
2023-10-18 02:19:12,901 epoch 8 - iter 1560/2606 - loss 0.02057017 - time (sec): 159.36 - samples/sec: 1361.37 - lr: 0.000008 - momentum: 0.000000
2023-10-18 02:19:40,242 epoch 8 - iter 1820/2606 - loss 0.02054834 - time (sec): 186.70 - samples/sec: 1354.93 - lr: 0.000008 - momentum: 0.000000
2023-10-18 02:20:09,396 epoch 8 - iter 2080/2606 - loss 0.02124126 - time (sec): 215.86 - samples/sec: 1362.95 - lr: 0.000007 - momentum: 0.000000
2023-10-18 02:20:36,131 epoch 8 - iter 2340/2606 - loss 0.02104593 - time (sec): 242.59 - samples/sec: 1363.92 - lr: 0.000007 - momentum: 0.000000
2023-10-18 02:21:02,359 epoch 8 - iter 2600/2606 - loss 0.02107554 - time (sec): 268.82 - samples/sec: 1364.79 - lr: 0.000007 - momentum: 0.000000
2023-10-18 02:21:02,887 ----------------------------------------------------------------------------------------------------
2023-10-18 02:21:02,888 EPOCH 8 done: loss 0.0211 - lr: 0.000007
2023-10-18 02:21:13,586 DEV : loss 0.41924044489860535 - f1-score (micro avg)  0.3891
2023-10-18 02:21:13,647 saving best model
2023-10-18 02:21:15,009 ----------------------------------------------------------------------------------------------------
2023-10-18 02:21:42,056 epoch 9 - iter 260/2606 - loss 0.01266909 - time (sec): 27.04 - samples/sec: 1378.88 - lr: 0.000006 - momentum: 0.000000
2023-10-18 02:22:08,542 epoch 9 - iter 520/2606 - loss 0.01470607 - time (sec): 53.53 - samples/sec: 1377.46 - lr: 0.000006 - momentum: 0.000000
2023-10-18 02:22:36,336 epoch 9 - iter 780/2606 - loss 0.01543376 - time (sec): 81.32 - samples/sec: 1362.59 - lr: 0.000006 - momentum: 0.000000
2023-10-18 02:23:06,222 epoch 9 - iter 1040/2606 - loss 0.01493246 - time (sec): 111.21 - samples/sec: 1348.27 - lr: 0.000005 - momentum: 0.000000
2023-10-18 02:23:32,899 epoch 9 - iter 1300/2606 - loss 0.01541657 - time (sec): 137.89 - samples/sec: 1353.01 - lr: 0.000005 - momentum: 0.000000
2023-10-18 02:23:59,861 epoch 9 - iter 1560/2606 - loss 0.01531374 - time (sec): 164.85 - samples/sec: 1334.77 - lr: 0.000005 - momentum: 0.000000
2023-10-18 02:24:28,273 epoch 9 - iter 1820/2606 - loss 0.01476571 - time (sec): 193.26 - samples/sec: 1335.13 - lr: 0.000004 - momentum: 0.000000
2023-10-18 02:24:56,649 epoch 9 - iter 2080/2606 - loss 0.01475419 - time (sec): 221.64 - samples/sec: 1336.90 - lr: 0.000004 - momentum: 0.000000
2023-10-18 02:25:23,490 epoch 9 - iter 2340/2606 - loss 0.01470457 - time (sec): 248.48 - samples/sec: 1327.57 - lr: 0.000004 - momentum: 0.000000
2023-10-18 02:25:51,726 epoch 9 - iter 2600/2606 - loss 0.01431937 - time (sec): 276.71 - samples/sec: 1325.29 - lr: 0.000003 - momentum: 0.000000
2023-10-18 02:25:52,244 ----------------------------------------------------------------------------------------------------
2023-10-18 02:25:52,245 EPOCH 9 done: loss 0.0143 - lr: 0.000003
2023-10-18 02:26:03,159 DEV : loss 0.49826958775520325 - f1-score (micro avg)  0.3768
2023-10-18 02:26:03,231 ----------------------------------------------------------------------------------------------------
2023-10-18 02:26:30,129 epoch 10 - iter 260/2606 - loss 0.01142377 - time (sec): 26.90 - samples/sec: 1394.17 - lr: 0.000003 - momentum: 0.000000
2023-10-18 02:26:56,566 epoch 10 - iter 520/2606 - loss 0.01110790 - time (sec): 53.33 - samples/sec: 1405.82 - lr: 0.000003 - momentum: 0.000000
2023-10-18 02:27:25,269 epoch 10 - iter 780/2606 - loss 0.01085325 - time (sec): 82.04 - samples/sec: 1384.10 - lr: 0.000002 - momentum: 0.000000
2023-10-18 02:27:52,684 epoch 10 - iter 1040/2606 - loss 0.01147904 - time (sec): 109.45 - samples/sec: 1371.49 - lr: 0.000002 - momentum: 0.000000
2023-10-18 02:28:20,802 epoch 10 - iter 1300/2606 - loss 0.01133036 - time (sec): 137.57 - samples/sec: 1363.61 - lr: 0.000002 - momentum: 0.000000
2023-10-18 02:28:47,046 epoch 10 - iter 1560/2606 - loss 0.01108696 - time (sec): 163.81 - samples/sec: 1353.21 - lr: 0.000001 - momentum: 0.000000
2023-10-18 02:29:13,623 epoch 10 - iter 1820/2606 - loss 0.01070499 - time (sec): 190.39 - samples/sec: 1351.82 - lr: 0.000001 - momentum: 0.000000
2023-10-18 02:29:40,144 epoch 10 - iter 2080/2606 - loss 0.01026599 - time (sec): 216.91 - samples/sec: 1350.89 - lr: 0.000001 - momentum: 0.000000
2023-10-18 02:30:08,976 epoch 10 - iter 2340/2606 - loss 0.01060714 - time (sec): 245.74 - samples/sec: 1346.13 - lr: 0.000000 - momentum: 0.000000
2023-10-18 02:30:35,528 epoch 10 - iter 2600/2606 - loss 0.01038499 - time (sec): 272.29 - samples/sec: 1346.48 - lr: 0.000000 - momentum: 0.000000
2023-10-18 02:30:36,138 ----------------------------------------------------------------------------------------------------
2023-10-18 02:30:36,139 EPOCH 10 done: loss 0.0104 - lr: 0.000000
2023-10-18 02:30:47,830 DEV : loss 0.5188506245613098 - f1-score (micro avg)  0.3779
2023-10-18 02:30:48,408 ----------------------------------------------------------------------------------------------------
2023-10-18 02:30:48,410 Loading model from best epoch ...
2023-10-18 02:30:50,730 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
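The 17-tag dictionary above uses the BIOES scheme (S-ingle, B-egin, I-nside, E-nd for each of the four entity types, plus O). A minimal sketch of how such tag sequences map back to entity spans (an illustration of the scheme, not Flair's own decoder; malformed sequences are simply dropped):

```python
def bioes_to_spans(tags):
    """Convert a BIOES tag sequence into (label, start, end) spans, end exclusive."""
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        if tag == "O":
            start, label = None, None
            continue
        prefix, lab = tag.split("-", 1)
        if prefix == "S":                 # single-token entity
            spans.append((lab, i, i + 1))
            start, label = None, None
        elif prefix == "B":               # open a multi-token entity
            start, label = i, lab
        elif prefix == "E" and label == lab and start is not None:
            spans.append((lab, start, i + 1))
            start, label = None, None
        # "I" just continues an open span; inconsistent tags are ignored
    return spans

print(bioes_to_spans(["B-LOC", "I-LOC", "E-LOC", "O", "S-PER"]))
# → [('LOC', 0, 3), ('PER', 4, 5)]
```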
2023-10-18 02:31:10,198
Results:
- F-score (micro) 0.4525
- F-score (macro) 0.3217
- Accuracy 0.2967

By class:
              precision    recall  f1-score   support

         LOC     0.4714    0.5099    0.4899      1214
         PER     0.4279    0.4740    0.4498       808
         ORG     0.3220    0.3768    0.3473       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4321    0.4749    0.4525      2390
   macro avg     0.3054    0.3402    0.3217      2390
weighted avg     0.4317    0.4749    0.4522      2390
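The aggregate rows of the report can be cross-checked against the per-class rows by plain arithmetic on the numbers above (macro = unweighted mean of class F1, weighted = support-weighted mean, micro recall = support-weighted recall):

```python
# Per-class rows from the final evaluation: label -> (precision, recall, f1, support)
per_class = {
    "LOC":       (0.4714, 0.5099, 0.4899, 1214),
    "PER":       (0.4279, 0.4740, 0.4498, 808),
    "ORG":       (0.3220, 0.3768, 0.3473, 353),
    "HumanProd": (0.0000, 0.0000, 0.0000, 15),
}
total = sum(s for *_, s in per_class.values())                            # 2390
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)
weighted_f1 = sum(f1 * s for _, _, f1, s in per_class.values()) / total
micro_recall = sum(r * s for _, r, _, s in per_class.values()) / total

# All three agree with the reported 0.3217, 0.4522, and 0.4749 to rounding error.
print(macro_f1, weighted_f1, micro_recall)
```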
2023-10-18 02:31:10,198 ----------------------------------------------------------------------------------------------------