2023-10-17 15:45:56,057 ----------------------------------------------------------------------------------------------------
2023-10-17 15:45:56,058 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 15:45:56,058 ----------------------------------------------------------------------------------------------------
2023-10-17 15:45:56,058 MultiCorpus: 5777 train + 722 dev + 723 test sentences
- NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-17 15:45:56,058 ----------------------------------------------------------------------------------------------------
2023-10-17 15:45:56,058 Train: 5777 sentences
2023-10-17 15:45:56,058 (train_with_dev=False, train_with_test=False)
2023-10-17 15:45:56,058 ----------------------------------------------------------------------------------------------------
2023-10-17 15:45:56,058 Training Params:
2023-10-17 15:45:56,058 - learning_rate: "3e-05"
2023-10-17 15:45:56,058 - mini_batch_size: "8"
2023-10-17 15:45:56,058 - max_epochs: "10"
2023-10-17 15:45:56,059 - shuffle: "True"
2023-10-17 15:45:56,059 ----------------------------------------------------------------------------------------------------
2023-10-17 15:45:56,059 Plugins:
2023-10-17 15:45:56,059 - TensorboardLogger
2023-10-17 15:45:56,059 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 15:45:56,059 ----------------------------------------------------------------------------------------------------
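The per-iteration `lr` values logged in the epochs that follow are consistent with a linear warmup followed by a linear decay to zero. The sketch below reproduces that shape from the parameters listed above (5777 training sentences, mini-batch size 8, 10 epochs, peak rate 3e-05, warmup fraction 0.1). The function name `linear_schedule_lr` and the constants are hypothetical, and this is an illustration of the schedule's shape, not Flair's own LinearScheduler code.

```python
import math

# Values copied from the "Train" and "Training Params" sections of the log.
TRAIN_SENTENCES = 5777
MINI_BATCH_SIZE = 8
MAX_EPOCHS = 10
PEAK_LR = 3e-05
WARMUP_FRACTION = 0.1

# The last, partially filled batch still counts as one step.
STEPS_PER_EPOCH = math.ceil(TRAIN_SENTENCES / MINI_BATCH_SIZE)  # 723
TOTAL_STEPS = STEPS_PER_EPOCH * MAX_EPOCHS
WARMUP_STEPS = round(TOTAL_STEPS * WARMUP_FRACTION)


def linear_schedule_lr(step: int) -> float:
    """Learning rate after `step` optimizer steps (hypothetical helper)."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to the peak rate.
        return PEAK_LR * step / WARMUP_STEPS
    # Linear decay from the peak rate down to 0 at the final step.
    return PEAK_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)
```

Under these assumptions, step 72 gives roughly 0.000003 and step 867 (epoch 2, iteration 144) gives roughly 0.000029, in line with the logged values.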
2023-10-17 15:45:56,059 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 15:45:56,059 - metric: "('micro avg', 'f1-score')"
2023-10-17 15:45:56,059 ----------------------------------------------------------------------------------------------------
2023-10-17 15:45:56,059 Computation:
2023-10-17 15:45:56,059 - compute on device: cuda:0
2023-10-17 15:45:56,059 - embedding storage: none
2023-10-17 15:45:56,059 ----------------------------------------------------------------------------------------------------
2023-10-17 15:45:56,059 Model training base path: "hmbench-icdar/nl-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-17 15:45:56,059 ----------------------------------------------------------------------------------------------------
2023-10-17 15:45:56,059 ----------------------------------------------------------------------------------------------------
2023-10-17 15:45:56,059 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 15:46:01,673 epoch 1 - iter 72/723 - loss 2.68178164 - time (sec): 5.61 - samples/sec: 3306.04 - lr: 0.000003 - momentum: 0.000000
2023-10-17 15:46:06,739 epoch 1 - iter 144/723 - loss 1.73320094 - time (sec): 10.68 - samples/sec: 3225.17 - lr: 0.000006 - momentum: 0.000000
2023-10-17 15:46:11,826 epoch 1 - iter 216/723 - loss 1.21849321 - time (sec): 15.77 - samples/sec: 3304.87 - lr: 0.000009 - momentum: 0.000000
2023-10-17 15:46:16,933 epoch 1 - iter 288/723 - loss 0.96972971 - time (sec): 20.87 - samples/sec: 3271.28 - lr: 0.000012 - momentum: 0.000000
2023-10-17 15:46:22,048 epoch 1 - iter 360/723 - loss 0.80187891 - time (sec): 25.99 - samples/sec: 3328.59 - lr: 0.000015 - momentum: 0.000000
2023-10-17 15:46:27,686 epoch 1 - iter 432/723 - loss 0.68438615 - time (sec): 31.63 - samples/sec: 3336.27 - lr: 0.000018 - momentum: 0.000000
2023-10-17 15:46:32,804 epoch 1 - iter 504/723 - loss 0.60522849 - time (sec): 36.74 - samples/sec: 3351.30 - lr: 0.000021 - momentum: 0.000000
2023-10-17 15:46:38,353 epoch 1 - iter 576/723 - loss 0.54423736 - time (sec): 42.29 - samples/sec: 3336.85 - lr: 0.000024 - momentum: 0.000000
2023-10-17 15:46:43,466 epoch 1 - iter 648/723 - loss 0.49945669 - time (sec): 47.41 - samples/sec: 3343.86 - lr: 0.000027 - momentum: 0.000000
2023-10-17 15:46:48,680 epoch 1 - iter 720/723 - loss 0.46320595 - time (sec): 52.62 - samples/sec: 3338.49 - lr: 0.000030 - momentum: 0.000000
2023-10-17 15:46:48,865 ----------------------------------------------------------------------------------------------------
2023-10-17 15:46:48,866 EPOCH 1 done: loss 0.4621 - lr: 0.000030
2023-10-17 15:46:51,668 DEV : loss 0.083634153008461 - f1-score (micro avg) 0.7678
2023-10-17 15:46:51,699 saving best model
2023-10-17 15:46:52,038 ----------------------------------------------------------------------------------------------------
2023-10-17 15:46:56,788 epoch 2 - iter 72/723 - loss 0.10554327 - time (sec): 4.75 - samples/sec: 3504.40 - lr: 0.000030 - momentum: 0.000000
2023-10-17 15:47:01,877 epoch 2 - iter 144/723 - loss 0.10289536 - time (sec): 9.84 - samples/sec: 3449.63 - lr: 0.000029 - momentum: 0.000000
2023-10-17 15:47:07,229 epoch 2 - iter 216/723 - loss 0.09591655 - time (sec): 15.19 - samples/sec: 3379.09 - lr: 0.000029 - momentum: 0.000000
2023-10-17 15:47:12,264 epoch 2 - iter 288/723 - loss 0.09172749 - time (sec): 20.22 - samples/sec: 3385.14 - lr: 0.000029 - momentum: 0.000000
2023-10-17 15:47:17,683 epoch 2 - iter 360/723 - loss 0.08913080 - time (sec): 25.64 - samples/sec: 3384.07 - lr: 0.000028 - momentum: 0.000000
2023-10-17 15:47:23,227 epoch 2 - iter 432/723 - loss 0.08592092 - time (sec): 31.19 - samples/sec: 3395.11 - lr: 0.000028 - momentum: 0.000000
2023-10-17 15:47:28,394 epoch 2 - iter 504/723 - loss 0.08558040 - time (sec): 36.35 - samples/sec: 3375.91 - lr: 0.000028 - momentum: 0.000000
2023-10-17 15:47:33,900 epoch 2 - iter 576/723 - loss 0.08586270 - time (sec): 41.86 - samples/sec: 3355.82 - lr: 0.000027 - momentum: 0.000000
2023-10-17 15:47:39,160 epoch 2 - iter 648/723 - loss 0.08631559 - time (sec): 47.12 - samples/sec: 3341.48 - lr: 0.000027 - momentum: 0.000000
2023-10-17 15:47:44,691 epoch 2 - iter 720/723 - loss 0.08442320 - time (sec): 52.65 - samples/sec: 3337.98 - lr: 0.000027 - momentum: 0.000000
2023-10-17 15:47:44,833 ----------------------------------------------------------------------------------------------------
2023-10-17 15:47:44,834 EPOCH 2 done: loss 0.0845 - lr: 0.000027
2023-10-17 15:47:48,121 DEV : loss 0.07469072937965393 - f1-score (micro avg) 0.8009
2023-10-17 15:47:48,138 saving best model
2023-10-17 15:47:48,600 ----------------------------------------------------------------------------------------------------
2023-10-17 15:47:53,799 epoch 3 - iter 72/723 - loss 0.06061902 - time (sec): 5.20 - samples/sec: 3343.37 - lr: 0.000026 - momentum: 0.000000
2023-10-17 15:47:58,758 epoch 3 - iter 144/723 - loss 0.06246632 - time (sec): 10.16 - samples/sec: 3410.06 - lr: 0.000026 - momentum: 0.000000
2023-10-17 15:48:04,289 epoch 3 - iter 216/723 - loss 0.05836782 - time (sec): 15.69 - samples/sec: 3416.83 - lr: 0.000026 - momentum: 0.000000
2023-10-17 15:48:09,294 epoch 3 - iter 288/723 - loss 0.06024793 - time (sec): 20.69 - samples/sec: 3418.32 - lr: 0.000025 - momentum: 0.000000
2023-10-17 15:48:14,102 epoch 3 - iter 360/723 - loss 0.05975732 - time (sec): 25.50 - samples/sec: 3431.01 - lr: 0.000025 - momentum: 0.000000
2023-10-17 15:48:19,583 epoch 3 - iter 432/723 - loss 0.05914523 - time (sec): 30.98 - samples/sec: 3399.25 - lr: 0.000025 - momentum: 0.000000
2023-10-17 15:48:24,777 epoch 3 - iter 504/723 - loss 0.05796925 - time (sec): 36.17 - samples/sec: 3373.49 - lr: 0.000024 - momentum: 0.000000
2023-10-17 15:48:29,852 epoch 3 - iter 576/723 - loss 0.05802209 - time (sec): 41.25 - samples/sec: 3381.28 - lr: 0.000024 - momentum: 0.000000
2023-10-17 15:48:35,266 epoch 3 - iter 648/723 - loss 0.05926008 - time (sec): 46.66 - samples/sec: 3378.08 - lr: 0.000024 - momentum: 0.000000
2023-10-17 15:48:40,649 epoch 3 - iter 720/723 - loss 0.05870783 - time (sec): 52.05 - samples/sec: 3377.67 - lr: 0.000023 - momentum: 0.000000
2023-10-17 15:48:40,818 ----------------------------------------------------------------------------------------------------
2023-10-17 15:48:40,818 EPOCH 3 done: loss 0.0589 - lr: 0.000023
2023-10-17 15:48:44,093 DEV : loss 0.06352876126766205 - f1-score (micro avg) 0.8631
2023-10-17 15:48:44,110 saving best model
2023-10-17 15:48:44,564 ----------------------------------------------------------------------------------------------------
2023-10-17 15:48:49,941 epoch 4 - iter 72/723 - loss 0.03603168 - time (sec): 5.37 - samples/sec: 3396.04 - lr: 0.000023 - momentum: 0.000000
2023-10-17 15:48:55,217 epoch 4 - iter 144/723 - loss 0.04465175 - time (sec): 10.65 - samples/sec: 3338.97 - lr: 0.000023 - momentum: 0.000000
2023-10-17 15:49:00,169 epoch 4 - iter 216/723 - loss 0.03818496 - time (sec): 15.60 - samples/sec: 3391.80 - lr: 0.000022 - momentum: 0.000000
2023-10-17 15:49:05,250 epoch 4 - iter 288/723 - loss 0.04002032 - time (sec): 20.68 - samples/sec: 3399.27 - lr: 0.000022 - momentum: 0.000000
2023-10-17 15:49:10,037 epoch 4 - iter 360/723 - loss 0.04084612 - time (sec): 25.47 - samples/sec: 3412.22 - lr: 0.000022 - momentum: 0.000000
2023-10-17 15:49:15,257 epoch 4 - iter 432/723 - loss 0.04029838 - time (sec): 30.69 - samples/sec: 3405.78 - lr: 0.000021 - momentum: 0.000000
2023-10-17 15:49:20,634 epoch 4 - iter 504/723 - loss 0.03969386 - time (sec): 36.07 - samples/sec: 3391.39 - lr: 0.000021 - momentum: 0.000000
2023-10-17 15:49:25,857 epoch 4 - iter 576/723 - loss 0.03975612 - time (sec): 41.29 - samples/sec: 3388.35 - lr: 0.000021 - momentum: 0.000000
2023-10-17 15:49:31,175 epoch 4 - iter 648/723 - loss 0.03963063 - time (sec): 46.61 - samples/sec: 3381.79 - lr: 0.000020 - momentum: 0.000000
2023-10-17 15:49:36,726 epoch 4 - iter 720/723 - loss 0.04156451 - time (sec): 52.16 - samples/sec: 3370.35 - lr: 0.000020 - momentum: 0.000000
2023-10-17 15:49:36,898 ----------------------------------------------------------------------------------------------------
2023-10-17 15:49:36,899 EPOCH 4 done: loss 0.0415 - lr: 0.000020
2023-10-17 15:49:41,118 DEV : loss 0.06799901276826859 - f1-score (micro avg) 0.8746
2023-10-17 15:49:41,141 saving best model
2023-10-17 15:49:41,679 ----------------------------------------------------------------------------------------------------
2023-10-17 15:49:47,167 epoch 5 - iter 72/723 - loss 0.02376358 - time (sec): 5.48 - samples/sec: 3226.12 - lr: 0.000020 - momentum: 0.000000
2023-10-17 15:49:52,615 epoch 5 - iter 144/723 - loss 0.02594860 - time (sec): 10.93 - samples/sec: 3287.37 - lr: 0.000019 - momentum: 0.000000
2023-10-17 15:49:57,962 epoch 5 - iter 216/723 - loss 0.02802592 - time (sec): 16.28 - samples/sec: 3301.19 - lr: 0.000019 - momentum: 0.000000
2023-10-17 15:50:02,837 epoch 5 - iter 288/723 - loss 0.02819628 - time (sec): 21.15 - samples/sec: 3331.14 - lr: 0.000019 - momentum: 0.000000
2023-10-17 15:50:08,174 epoch 5 - iter 360/723 - loss 0.02715481 - time (sec): 26.49 - samples/sec: 3314.58 - lr: 0.000018 - momentum: 0.000000
2023-10-17 15:50:13,371 epoch 5 - iter 432/723 - loss 0.02990002 - time (sec): 31.69 - samples/sec: 3310.19 - lr: 0.000018 - momentum: 0.000000
2023-10-17 15:50:18,582 epoch 5 - iter 504/723 - loss 0.03021850 - time (sec): 36.90 - samples/sec: 3307.86 - lr: 0.000018 - momentum: 0.000000
2023-10-17 15:50:24,122 epoch 5 - iter 576/723 - loss 0.03040793 - time (sec): 42.44 - samples/sec: 3313.10 - lr: 0.000017 - momentum: 0.000000
2023-10-17 15:50:29,149 epoch 5 - iter 648/723 - loss 0.03142248 - time (sec): 47.47 - samples/sec: 3325.10 - lr: 0.000017 - momentum: 0.000000
2023-10-17 15:50:34,566 epoch 5 - iter 720/723 - loss 0.03099853 - time (sec): 52.88 - samples/sec: 3321.54 - lr: 0.000017 - momentum: 0.000000
2023-10-17 15:50:34,755 ----------------------------------------------------------------------------------------------------
2023-10-17 15:50:34,755 EPOCH 5 done: loss 0.0309 - lr: 0.000017
2023-10-17 15:50:38,119 DEV : loss 0.07522039115428925 - f1-score (micro avg) 0.8791
2023-10-17 15:50:38,139 saving best model
2023-10-17 15:50:38,758 ----------------------------------------------------------------------------------------------------
2023-10-17 15:50:44,341 epoch 6 - iter 72/723 - loss 0.02298660 - time (sec): 5.58 - samples/sec: 3136.20 - lr: 0.000016 - momentum: 0.000000
2023-10-17 15:50:49,596 epoch 6 - iter 144/723 - loss 0.02602275 - time (sec): 10.84 - samples/sec: 3140.88 - lr: 0.000016 - momentum: 0.000000
2023-10-17 15:50:54,847 epoch 6 - iter 216/723 - loss 0.02355676 - time (sec): 16.09 - samples/sec: 3207.25 - lr: 0.000016 - momentum: 0.000000
2023-10-17 15:51:00,517 epoch 6 - iter 288/723 - loss 0.02151277 - time (sec): 21.76 - samples/sec: 3207.74 - lr: 0.000015 - momentum: 0.000000
2023-10-17 15:51:06,018 epoch 6 - iter 360/723 - loss 0.02188892 - time (sec): 27.26 - samples/sec: 3228.23 - lr: 0.000015 - momentum: 0.000000
2023-10-17 15:51:10,863 epoch 6 - iter 432/723 - loss 0.02227604 - time (sec): 32.10 - samples/sec: 3252.96 - lr: 0.000015 - momentum: 0.000000
2023-10-17 15:51:16,271 epoch 6 - iter 504/723 - loss 0.02192958 - time (sec): 37.51 - samples/sec: 3267.25 - lr: 0.000014 - momentum: 0.000000
2023-10-17 15:51:21,264 epoch 6 - iter 576/723 - loss 0.02222078 - time (sec): 42.50 - samples/sec: 3271.97 - lr: 0.000014 - momentum: 0.000000
2023-10-17 15:51:26,468 epoch 6 - iter 648/723 - loss 0.02271175 - time (sec): 47.71 - samples/sec: 3285.50 - lr: 0.000014 - momentum: 0.000000
2023-10-17 15:51:31,981 epoch 6 - iter 720/723 - loss 0.02243575 - time (sec): 53.22 - samples/sec: 3297.48 - lr: 0.000013 - momentum: 0.000000
2023-10-17 15:51:32,161 ----------------------------------------------------------------------------------------------------
2023-10-17 15:51:32,161 EPOCH 6 done: loss 0.0224 - lr: 0.000013
2023-10-17 15:51:35,411 DEV : loss 0.0899529755115509 - f1-score (micro avg) 0.8669
2023-10-17 15:51:35,428 ----------------------------------------------------------------------------------------------------
2023-10-17 15:51:40,615 epoch 7 - iter 72/723 - loss 0.02639271 - time (sec): 5.19 - samples/sec: 3330.13 - lr: 0.000013 - momentum: 0.000000
2023-10-17 15:51:45,594 epoch 7 - iter 144/723 - loss 0.02336250 - time (sec): 10.16 - samples/sec: 3347.60 - lr: 0.000013 - momentum: 0.000000
2023-10-17 15:51:51,011 epoch 7 - iter 216/723 - loss 0.02344656 - time (sec): 15.58 - samples/sec: 3334.04 - lr: 0.000012 - momentum: 0.000000
2023-10-17 15:51:56,570 epoch 7 - iter 288/723 - loss 0.02259809 - time (sec): 21.14 - samples/sec: 3312.26 - lr: 0.000012 - momentum: 0.000000
2023-10-17 15:52:01,822 epoch 7 - iter 360/723 - loss 0.02097885 - time (sec): 26.39 - samples/sec: 3308.99 - lr: 0.000012 - momentum: 0.000000
2023-10-17 15:52:07,186 epoch 7 - iter 432/723 - loss 0.02029519 - time (sec): 31.76 - samples/sec: 3338.21 - lr: 0.000011 - momentum: 0.000000
2023-10-17 15:52:12,489 epoch 7 - iter 504/723 - loss 0.01871245 - time (sec): 37.06 - samples/sec: 3322.53 - lr: 0.000011 - momentum: 0.000000
2023-10-17 15:52:17,453 epoch 7 - iter 576/723 - loss 0.01758688 - time (sec): 42.02 - samples/sec: 3336.67 - lr: 0.000011 - momentum: 0.000000
2023-10-17 15:52:22,949 epoch 7 - iter 648/723 - loss 0.01786632 - time (sec): 47.52 - samples/sec: 3326.85 - lr: 0.000010 - momentum: 0.000000
2023-10-17 15:52:28,120 epoch 7 - iter 720/723 - loss 0.01774989 - time (sec): 52.69 - samples/sec: 3336.16 - lr: 0.000010 - momentum: 0.000000
2023-10-17 15:52:28,292 ----------------------------------------------------------------------------------------------------
2023-10-17 15:52:28,292 EPOCH 7 done: loss 0.0177 - lr: 0.000010
2023-10-17 15:52:31,922 DEV : loss 0.11688686162233353 - f1-score (micro avg) 0.8596
2023-10-17 15:52:31,938 ----------------------------------------------------------------------------------------------------
2023-10-17 15:52:37,064 epoch 8 - iter 72/723 - loss 0.00919413 - time (sec): 5.12 - samples/sec: 3151.46 - lr: 0.000010 - momentum: 0.000000
2023-10-17 15:52:42,635 epoch 8 - iter 144/723 - loss 0.01083441 - time (sec): 10.70 - samples/sec: 3230.53 - lr: 0.000009 - momentum: 0.000000
2023-10-17 15:52:47,538 epoch 8 - iter 216/723 - loss 0.00989999 - time (sec): 15.60 - samples/sec: 3353.03 - lr: 0.000009 - momentum: 0.000000
2023-10-17 15:52:52,785 epoch 8 - iter 288/723 - loss 0.01247336 - time (sec): 20.85 - samples/sec: 3311.76 - lr: 0.000009 - momentum: 0.000000
2023-10-17 15:52:58,034 epoch 8 - iter 360/723 - loss 0.01235182 - time (sec): 26.09 - samples/sec: 3299.45 - lr: 0.000008 - momentum: 0.000000
2023-10-17 15:53:03,779 epoch 8 - iter 432/723 - loss 0.01184060 - time (sec): 31.84 - samples/sec: 3284.45 - lr: 0.000008 - momentum: 0.000000
2023-10-17 15:53:08,859 epoch 8 - iter 504/723 - loss 0.01254325 - time (sec): 36.92 - samples/sec: 3315.05 - lr: 0.000008 - momentum: 0.000000
2023-10-17 15:53:13,896 epoch 8 - iter 576/723 - loss 0.01301608 - time (sec): 41.96 - samples/sec: 3319.64 - lr: 0.000007 - momentum: 0.000000
2023-10-17 15:53:19,337 epoch 8 - iter 648/723 - loss 0.01254319 - time (sec): 47.40 - samples/sec: 3330.23 - lr: 0.000007 - momentum: 0.000000
2023-10-17 15:53:24,575 epoch 8 - iter 720/723 - loss 0.01246900 - time (sec): 52.64 - samples/sec: 3333.78 - lr: 0.000007 - momentum: 0.000000
2023-10-17 15:53:24,822 ----------------------------------------------------------------------------------------------------
2023-10-17 15:53:24,823 EPOCH 8 done: loss 0.0124 - lr: 0.000007
2023-10-17 15:53:28,074 DEV : loss 0.13734176754951477 - f1-score (micro avg) 0.8504
2023-10-17 15:53:28,090 ----------------------------------------------------------------------------------------------------
2023-10-17 15:53:33,549 epoch 9 - iter 72/723 - loss 0.00909466 - time (sec): 5.46 - samples/sec: 3518.82 - lr: 0.000006 - momentum: 0.000000
2023-10-17 15:53:38,524 epoch 9 - iter 144/723 - loss 0.00940547 - time (sec): 10.43 - samples/sec: 3364.03 - lr: 0.000006 - momentum: 0.000000
2023-10-17 15:53:44,259 epoch 9 - iter 216/723 - loss 0.01048964 - time (sec): 16.17 - samples/sec: 3349.94 - lr: 0.000006 - momentum: 0.000000
2023-10-17 15:53:49,765 epoch 9 - iter 288/723 - loss 0.00977033 - time (sec): 21.67 - samples/sec: 3342.19 - lr: 0.000005 - momentum: 0.000000
2023-10-17 15:53:54,956 epoch 9 - iter 360/723 - loss 0.01032154 - time (sec): 26.86 - samples/sec: 3334.04 - lr: 0.000005 - momentum: 0.000000
2023-10-17 15:53:59,778 epoch 9 - iter 432/723 - loss 0.00954858 - time (sec): 31.69 - samples/sec: 3331.01 - lr: 0.000005 - momentum: 0.000000
2023-10-17 15:54:05,113 epoch 9 - iter 504/723 - loss 0.00934623 - time (sec): 37.02 - samples/sec: 3327.57 - lr: 0.000004 - momentum: 0.000000
2023-10-17 15:54:10,656 epoch 9 - iter 576/723 - loss 0.00919239 - time (sec): 42.56 - samples/sec: 3320.20 - lr: 0.000004 - momentum: 0.000000
2023-10-17 15:54:15,842 epoch 9 - iter 648/723 - loss 0.00916243 - time (sec): 47.75 - samples/sec: 3328.89 - lr: 0.000004 - momentum: 0.000000
2023-10-17 15:54:20,519 epoch 9 - iter 720/723 - loss 0.00958573 - time (sec): 52.43 - samples/sec: 3346.41 - lr: 0.000003 - momentum: 0.000000
2023-10-17 15:54:20,785 ----------------------------------------------------------------------------------------------------
2023-10-17 15:54:20,785 EPOCH 9 done: loss 0.0095 - lr: 0.000003
2023-10-17 15:54:24,408 DEV : loss 0.1309017837047577 - f1-score (micro avg) 0.868
2023-10-17 15:54:24,424 ----------------------------------------------------------------------------------------------------
2023-10-17 15:54:29,477 epoch 10 - iter 72/723 - loss 0.01046618 - time (sec): 5.05 - samples/sec: 3453.62 - lr: 0.000003 - momentum: 0.000000
2023-10-17 15:54:34,691 epoch 10 - iter 144/723 - loss 0.00825176 - time (sec): 10.27 - samples/sec: 3414.55 - lr: 0.000003 - momentum: 0.000000
2023-10-17 15:54:39,628 epoch 10 - iter 216/723 - loss 0.00756983 - time (sec): 15.20 - samples/sec: 3400.23 - lr: 0.000002 - momentum: 0.000000
2023-10-17 15:54:44,560 epoch 10 - iter 288/723 - loss 0.00728100 - time (sec): 20.14 - samples/sec: 3374.43 - lr: 0.000002 - momentum: 0.000000
2023-10-17 15:54:50,251 epoch 10 - iter 360/723 - loss 0.00697531 - time (sec): 25.83 - samples/sec: 3356.39 - lr: 0.000002 - momentum: 0.000000
2023-10-17 15:54:55,716 epoch 10 - iter 432/723 - loss 0.00709396 - time (sec): 31.29 - samples/sec: 3368.09 - lr: 0.000001 - momentum: 0.000000
2023-10-17 15:55:00,829 epoch 10 - iter 504/723 - loss 0.00639323 - time (sec): 36.40 - samples/sec: 3352.43 - lr: 0.000001 - momentum: 0.000000
2023-10-17 15:55:06,443 epoch 10 - iter 576/723 - loss 0.00638631 - time (sec): 42.02 - samples/sec: 3334.30 - lr: 0.000001 - momentum: 0.000000
2023-10-17 15:55:11,652 epoch 10 - iter 648/723 - loss 0.00661275 - time (sec): 47.23 - samples/sec: 3347.94 - lr: 0.000000 - momentum: 0.000000
2023-10-17 15:55:16,942 epoch 10 - iter 720/723 - loss 0.00661983 - time (sec): 52.52 - samples/sec: 3346.42 - lr: 0.000000 - momentum: 0.000000
2023-10-17 15:55:17,097 ----------------------------------------------------------------------------------------------------
2023-10-17 15:55:17,098 EPOCH 10 done: loss 0.0066 - lr: 0.000000
2023-10-17 15:55:20,489 DEV : loss 0.13723282516002655 - f1-score (micro avg) 0.8639
2023-10-17 15:55:20,889 ----------------------------------------------------------------------------------------------------
2023-10-17 15:55:20,891 Loading model from best epoch ...
2023-10-17 15:55:22,430 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-17 15:55:25,740
Results:
- F-score (micro) 0.855
- F-score (macro) 0.7682
- Accuracy 0.7567
By class:
              precision    recall  f1-score   support
         PER     0.8568    0.8320    0.8442       482
         LOC     0.9278    0.8974    0.9123       458
         ORG     0.5606    0.5362    0.5481        69
   micro avg     0.8690    0.8414    0.8550      1009
   macro avg     0.7817    0.7552    0.7682      1009
weighted avg     0.8688    0.8414    0.8549      1009
2023-10-17 15:55:25,740 ----------------------------------------------------------------------------------------------------
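The averaged F1 scores in the results above can be cross-checked against the per-class rows. The sketch below recomputes macro and weighted F1 from the per-class values, and micro F1 from the micro-averaged precision and recall. All numbers are copied from the log; the variable names are illustrative only.

```python
# (precision, recall, f1, support) per class, copied from the log.
per_class = {
    "PER": (0.8568, 0.8320, 0.8442, 482),
    "LOC": (0.9278, 0.8974, 0.9123, 458),
    "ORG": (0.5606, 0.5362, 0.5481, 69),
}

total_support = sum(support for *_, support in per_class.values())

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)

# Weighted F1: per-class F1 weighted by the class support.
weighted_f1 = (
    sum(f1 * support for _, _, f1, support in per_class.values()) / total_support
)

# Micro F1: harmonic mean of the micro-averaged precision and recall.
micro_p, micro_r = 0.8690, 0.8414
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

print(f"macro={macro_f1:.4f} weighted={weighted_f1:.4f} micro={micro_f1:.4f}")
```

Rounded to four decimals, these come out to 0.7682, 0.8549 and 0.8550, agreeing with the logged values. The gap between macro and micro F1 reflects the weak ORG class, which accounts for only 69 of the 1009 test entities.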