bert-philosophy-classifier

This model is a fine-tuned version of maximuspowers/bert-philosophy-adapted (the training dataset is not specified in this card). It achieves the following results on the evaluation set; a sketch of how such multi-label metrics can be computed is shown after the list:

  • Loss: 0.5565
  • Exact Match Accuracy: 0.2430
  • Macro Precision: 0.5046
  • Macro Recall: 0.2169
  • Macro F1: 0.2688
  • Micro Precision: 0.8130
  • Micro Recall: 0.3380
  • Micro F1: 0.4775
  • Hamming Loss: 0.0709
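
The metric names (exact match accuracy, macro/micro averages, Hamming loss) indicate a multi-label classification setup. The evaluation script is not included in this card, so the following is only a minimal sketch of how numbers like these can be computed with scikit-learn; the `multilabel_metrics` helper, the 0.5 decision threshold, and the array shapes are assumptions for illustration.

```python
# Hedged sketch: multi-label metrics like those reported above, via scikit-learn.
# The actual evaluation code for this model is not published; threshold and
# shapes below are assumptions.
import numpy as np
from sklearn.metrics import (
    accuracy_score,   # on binary indicator matrices this is exact-match (subset) accuracy
    precision_score,
    recall_score,
    f1_score,
    hamming_loss,
)

def multilabel_metrics(y_true: np.ndarray, y_prob: np.ndarray, threshold: float = 0.5):
    """y_true, y_prob: arrays of shape (num_examples, num_labels)."""
    y_pred = (y_prob >= threshold).astype(int)
    return {
        "exact_match_accuracy": accuracy_score(y_true, y_pred),
        "macro_precision": precision_score(y_true, y_pred, average="macro", zero_division=0),
        "macro_recall": recall_score(y_true, y_pred, average="macro", zero_division=0),
        "macro_f1": f1_score(y_true, y_pred, average="macro", zero_division=0),
        "micro_precision": precision_score(y_true, y_pred, average="micro", zero_division=0),
        "micro_recall": recall_score(y_true, y_pred, average="micro", zero_division=0),
        "micro_f1": f1_score(y_true, y_pred, average="micro", zero_division=0),
        "hamming_loss": hamming_loss(y_true, y_pred),
    }
```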

Model description

More information needed

Intended uses & limitations

More information needed
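
Until more details are documented, the snippet below is a minimal inference sketch, assuming the checkpoint is a standard multi-label BERT sequence classifier loadable with Transformers; the sigmoid activation, the 0.5 threshold, and the reliance on `id2label` in the config are assumptions, not documented behavior.

```python
# Hedged sketch: multi-label inference with Transformers. Sigmoid + 0.5 threshold
# and AutoModelForSequenceClassification are assumptions; check the model config
# (problem_type, id2label) before relying on this.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "maximuspowers/bert-philosophy-classifier"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

text = "The unexamined life is not worth living."
inputs = tokenizer(text, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits

probs = torch.sigmoid(logits)[0]
predicted = [model.config.id2label[i] for i, p in enumerate(probs.tolist()) if p >= 0.5]
print(predicted)
```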

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training; a sketch of an equivalent TrainingArguments configuration follows the list:

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 500
  • mixed_precision_training: Native AMP
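
The original training script is not included in this card. The block below is only an illustrative reconstruction of how the hyperparameters above might map onto a `transformers.TrainingArguments` object; the `output_dir` value is a placeholder and the `fp16` flag is an assumed equivalent of "Native AMP".

```python
# Hedged sketch: mapping the listed hyperparameters onto TrainingArguments.
# This is a reconstruction for illustration, not the original training script.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="bert-philosophy-classifier",  # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,   # effective train batch size of 16
    seed=42,
    optim="adamw_torch",             # AdamW with default betas=(0.9, 0.999), eps=1e-8
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=500,
    fp16=True,                       # native AMP mixed precision
)
```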

Training results

| Training Loss | Epoch | Step | Validation Loss | Exact Match Accuracy | Macro Precision | Macro Recall | Macro F1 | Micro Precision | Micro Recall | Micro F1 | Hamming Loss |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1.9545 | 0.3521 | 100 | 1.0206 | 0.0071 | 0.0171 | 0.0027 | 0.0047 | 0.25 | 0.0096 | 0.0185 | 0.0992 |
| 1.4947 | 0.7042 | 200 | 0.9205 | 0.0 | 0.0588 | 0.0003 | 0.0006 | 1.0 | 0.0011 | 0.0021 | 0.0972 |
| 1.2688 | 1.0563 | 300 | 0.8579 | 0.0 | 0.0588 | 0.0003 | 0.0006 | 1.0 | 0.0011 | 0.0021 | 0.0972 |
| 1.2271 | 1.4085 | 400 | 0.9072 | 0.0071 | 0.0588 | 0.0030 | 0.0058 | 1.0 | 0.0107 | 0.0211 | 0.0963 |
| 1.1877 | 1.7606 | 500 | 0.7930 | 0.0353 | 0.0551 | 0.0136 | 0.0219 | 0.9375 | 0.0480 | 0.0913 | 0.0930 |
| 1.1545 | 2.1127 | 600 | 0.7768 | 0.0670 | 0.0537 | 0.0255 | 0.0346 | 0.9130 | 0.0896 | 0.1631 | 0.0894 |
| 1.1276 | 2.4648 | 700 | 0.7173 | 0.0864 | 0.0521 | 0.0303 | 0.0383 | 0.8850 | 0.1066 | 0.1903 | 0.0883 |
| 1.1083 | 2.8169 | 800 | 0.7093 | 0.0758 | 0.1126 | 0.0298 | 0.0394 | 0.9143 | 0.1023 | 0.1841 | 0.0883 |
| 1.0268 | 3.1690 | 900 | 0.6733 | 0.1041 | 0.1640 | 0.0517 | 0.0644 | 0.8057 | 0.1503 | 0.2534 | 0.0862 |
| 1.0161 | 3.5211 | 1000 | 0.6472 | 0.1164 | 0.1559 | 0.0634 | 0.0861 | 0.8533 | 0.1674 | 0.2799 | 0.0838 |
| 0.9917 | 3.8732 | 1100 | 0.7055 | 0.1358 | 0.2132 | 0.0736 | 0.0970 | 0.8465 | 0.1940 | 0.3157 | 0.0819 |
| 0.9533 | 4.2254 | 1200 | 0.6556 | 0.1834 | 0.2694 | 0.1242 | 0.1646 | 0.8812 | 0.2452 | 0.3837 | 0.0767 |
| 0.9747 | 4.5775 | 1300 | 0.6144 | 0.2011 | 0.2716 | 0.1285 | 0.1690 | 0.8773 | 0.2591 | 0.4 | 0.0756 |
| 0.9275 | 4.9296 | 1400 | 0.6027 | 0.2063 | 0.2682 | 0.1408 | 0.1804 | 0.8513 | 0.2868 | 0.4290 | 0.0743 |
| 0.8702 | 5.2817 | 1500 | 0.6040 | 0.2240 | 0.3197 | 0.1559 | 0.1977 | 0.8542 | 0.3060 | 0.4505 | 0.0726 |
| 0.8582 | 5.6338 | 1600 | 0.6104 | 0.2293 | 0.3684 | 0.1697 | 0.2177 | 0.8426 | 0.3081 | 0.4512 | 0.0729 |
| 0.8783 | 5.9859 | 1700 | 0.5885 | 0.2328 | 0.3749 | 0.1646 | 0.2117 | 0.8657 | 0.3092 | 0.4556 | 0.0719 |
| 0.8147 | 6.3380 | 1800 | 0.5681 | 0.2469 | 0.4728 | 0.1941 | 0.2427 | 0.8215 | 0.3337 | 0.4746 | 0.0719 |
| 0.8155 | 6.6901 | 1900 | 0.5858 | 0.2399 | 0.3577 | 0.1873 | 0.2337 | 0.8144 | 0.3369 | 0.4766 | 0.0720 |
| 0.812 | 7.0423 | 2000 | 0.5932 | 0.2434 | 0.5377 | 0.2240 | 0.2870 | 0.8285 | 0.3348 | 0.4768 | 0.0715 |
| 0.7735 | 7.3944 | 2100 | 0.5969 | 0.2504 | 0.4537 | 0.2217 | 0.2802 | 0.7844 | 0.3529 | 0.4868 | 0.0724 |
| 0.7747 | 7.7465 | 2200 | 0.5980 | 0.2734 | 0.5684 | 0.2460 | 0.3142 | 0.7941 | 0.3699 | 0.5047 | 0.0707 |
| 0.6935 | 8.0986 | 2300 | 0.5834 | 0.2822 | 0.4822 | 0.2493 | 0.3069 | 0.7669 | 0.3859 | 0.5135 | 0.0712 |
| 0.7359 | 8.4507 | 2400 | 0.5643 | 0.2875 | 0.5755 | 0.2854 | 0.3535 | 0.7991 | 0.3987 | 0.5320 | 0.0683 |
| 0.6547 | 8.8028 | 2500 | 0.5672 | 0.2875 | 0.5700 | 0.2989 | 0.3656 | 0.7878 | 0.4115 | 0.5406 | 0.0681 |
| 0.6568 | 9.1549 | 2600 | 0.5804 | 0.2857 | 0.5921 | 0.2826 | 0.3611 | 0.8174 | 0.3913 | 0.5292 | 0.0677 |
| 0.683 | 9.5070 | 2700 | 0.5911 | 0.2787 | 0.5610 | 0.2682 | 0.3399 | 0.7577 | 0.3934 | 0.5179 | 0.0713 |
| 0.6916 | 9.8592 | 2800 | 0.5553 | 0.2892 | 0.6354 | 0.3208 | 0.3899 | 0.7882 | 0.4126 | 0.5416 | 0.0680 |
| 0.6112 | 10.2113 | 2900 | 0.5829 | 0.3228 | 0.6405 | 0.3521 | 0.4351 | 0.7911 | 0.4563 | 0.5788 | 0.0646 |
| 0.6032 | 10.5634 | 3000 | 0.6113 | 0.3069 | 0.6247 | 0.3173 | 0.3949 | 0.7556 | 0.4350 | 0.5521 | 0.0687 |
| 0.5927 | 10.9155 | 3100 | 0.5666 | 0.3016 | 0.6423 | 0.3289 | 0.4154 | 0.8065 | 0.4222 | 0.5542 | 0.0661 |
| 0.5639 | 11.2676 | 3200 | 0.5527 | 0.3086 | 0.5956 | 0.3482 | 0.4169 | 0.7522 | 0.4563 | 0.5680 | 0.0675 |
| 0.5965 | 11.6197 | 3300 | 0.5370 | 0.3192 | 0.6174 | 0.3337 | 0.4061 | 0.7692 | 0.4584 | 0.5745 | 0.0661 |
| 0.5809 | 11.9718 | 3400 | 0.5517 | 0.3175 | 0.6677 | 0.3737 | 0.4510 | 0.7676 | 0.4542 | 0.5707 | 0.0665 |

Framework versions

  • Transformers 4.52.4
  • PyTorch 2.6.0+cu124
  • Datasets 3.6.0
  • Tokenizers 0.21.2