bert-philosophy-classifier

This model is a fine-tuned version of maximuspowers/bert-philosophy-adapted (the training dataset is not specified in this card). It achieves the following results on the evaluation set; a sketch of how such multi-label metrics can be computed is shown after the list:

  • Loss: 0.5565
  • Exact Match Accuracy: 0.2430
  • Macro Precision: 0.5046
  • Macro Recall: 0.2169
  • Macro F1: 0.2688
  • Micro Precision: 0.8130
  • Micro Recall: 0.3380
  • Micro F1: 0.4775
  • Hamming Loss: 0.0709
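
The metric names (exact match accuracy, macro/micro averages, Hamming loss) indicate a multi-label classification setup. The evaluation script is not included in this card, so the following is only a minimal sketch of how numbers like these can be computed with scikit-learn; the `multilabel_metrics` helper, the 0.5 decision threshold, and the array shapes are assumptions for illustration.

```python
# Hedged sketch: multi-label metrics like those reported above, via scikit-learn.
# The actual evaluation code for this model is not published; threshold and
# shapes below are assumptions.
import numpy as np
from sklearn.metrics import (
    accuracy_score,   # on binary indicator matrices this is exact-match (subset) accuracy
    precision_score,
    recall_score,
    f1_score,
    hamming_loss,
)

def multilabel_metrics(y_true: np.ndarray, y_prob: np.ndarray, threshold: float = 0.5):
    """y_true, y_prob: arrays of shape (num_examples, num_labels)."""
    y_pred = (y_prob >= threshold).astype(int)
    return {
        "exact_match_accuracy": accuracy_score(y_true, y_pred),
        "macro_precision": precision_score(y_true, y_pred, average="macro", zero_division=0),
        "macro_recall": recall_score(y_true, y_pred, average="macro", zero_division=0),
        "macro_f1": f1_score(y_true, y_pred, average="macro", zero_division=0),
        "micro_precision": precision_score(y_true, y_pred, average="micro", zero_division=0),
        "micro_recall": recall_score(y_true, y_pred, average="micro", zero_division=0),
        "micro_f1": f1_score(y_true, y_pred, average="micro", zero_division=0),
        "hamming_loss": hamming_loss(y_true, y_pred),
    }
```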

Model description

More information needed

Intended uses & limitations

More information needed
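
Until more details are documented, the snippet below is a minimal inference sketch, assuming the checkpoint is a standard multi-label BERT sequence classifier loadable with Transformers; the sigmoid activation, the 0.5 threshold, and the reliance on `id2label` in the config are assumptions, not documented behavior.

```python
# Hedged sketch: multi-label inference with Transformers. Sigmoid + 0.5 threshold
# and AutoModelForSequenceClassification are assumptions; check the model config
# (problem_type, id2label) before relying on this.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "maximuspowers/bert-philosophy-classifier"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

text = "The unexamined life is not worth living."
inputs = tokenizer(text, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits

probs = torch.sigmoid(logits)[0]
predicted = [model.config.id2label[i] for i, p in enumerate(probs.tolist()) if p >= 0.5]
print(predicted)
```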

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training; a sketch of an equivalent TrainingArguments configuration follows the list:

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 500
  • mixed_precision_training: Native AMP
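
The original training script is not included in this card. The block below is only an illustrative reconstruction of how the hyperparameters above might map onto a `transformers.TrainingArguments` object; the `output_dir` value is a placeholder and the `fp16` flag is an assumed equivalent of "Native AMP".

```python
# Hedged sketch: mapping the listed hyperparameters onto TrainingArguments.
# This is a reconstruction for illustration, not the original training script.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="bert-philosophy-classifier",  # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,   # effective train batch size of 16
    seed=42,
    optim="adamw_torch",             # AdamW with default betas=(0.9, 0.999), eps=1e-8
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=500,
    fp16=True,                       # native AMP mixed precision
)
```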

Training results

| Training Loss | Epoch | Step | Validation Loss | Exact Match Accuracy | Macro Precision | Macro Recall | Macro F1 | Micro Precision | Micro Recall | Micro F1 | Hamming Loss |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1.9545 | 0.3521 | 100 | 1.0206 | 0.0071 | 0.0171 | 0.0027 | 0.0047 | 0.25 | 0.0096 | 0.0185 | 0.0992 |
| 1.4947 | 0.7042 | 200 | 0.9205 | 0.0 | 0.0588 | 0.0003 | 0.0006 | 1.0 | 0.0011 | 0.0021 | 0.0972 |
| 1.2688 | 1.0563 | 300 | 0.8579 | 0.0 | 0.0588 | 0.0003 | 0.0006 | 1.0 | 0.0011 | 0.0021 | 0.0972 |
| 1.2271 | 1.4085 | 400 | 0.9072 | 0.0071 | 0.0588 | 0.0030 | 0.0058 | 1.0 | 0.0107 | 0.0211 | 0.0963 |
| 1.1877 | 1.7606 | 500 | 0.7930 | 0.0353 | 0.0551 | 0.0136 | 0.0219 | 0.9375 | 0.0480 | 0.0913 | 0.0930 |
| 1.1545 | 2.1127 | 600 | 0.7768 | 0.0670 | 0.0537 | 0.0255 | 0.0346 | 0.9130 | 0.0896 | 0.1631 | 0.0894 |
| 1.1276 | 2.4648 | 700 | 0.7173 | 0.0864 | 0.0521 | 0.0303 | 0.0383 | 0.8850 | 0.1066 | 0.1903 | 0.0883 |
| 1.1083 | 2.8169 | 800 | 0.7093 | 0.0758 | 0.1126 | 0.0298 | 0.0394 | 0.9143 | 0.1023 | 0.1841 | 0.0883 |
| 1.0268 | 3.1690 | 900 | 0.6733 | 0.1041 | 0.1640 | 0.0517 | 0.0644 | 0.8057 | 0.1503 | 0.2534 | 0.0862 |
| 1.0161 | 3.5211 | 1000 | 0.6472 | 0.1164 | 0.1559 | 0.0634 | 0.0861 | 0.8533 | 0.1674 | 0.2799 | 0.0838 |
| 0.9917 | 3.8732 | 1100 | 0.7055 | 0.1358 | 0.2132 | 0.0736 | 0.0970 | 0.8465 | 0.1940 | 0.3157 | 0.0819 |
| 0.9533 | 4.2254 | 1200 | 0.6556 | 0.1834 | 0.2694 | 0.1242 | 0.1646 | 0.8812 | 0.2452 | 0.3837 | 0.0767 |
| 0.9747 | 4.5775 | 1300 | 0.6144 | 0.2011 | 0.2716 | 0.1285 | 0.1690 | 0.8773 | 0.2591 | 0.4 | 0.0756 |
| 0.9275 | 4.9296 | 1400 | 0.6027 | 0.2063 | 0.2682 | 0.1408 | 0.1804 | 0.8513 | 0.2868 | 0.4290 | 0.0743 |
| 0.8702 | 5.2817 | 1500 | 0.6040 | 0.2240 | 0.3197 | 0.1559 | 0.1977 | 0.8542 | 0.3060 | 0.4505 | 0.0726 |
| 0.8582 | 5.6338 | 1600 | 0.6104 | 0.2293 | 0.3684 | 0.1697 | 0.2177 | 0.8426 | 0.3081 | 0.4512 | 0.0729 |
| 0.8783 | 5.9859 | 1700 | 0.5885 | 0.2328 | 0.3749 | 0.1646 | 0.2117 | 0.8657 | 0.3092 | 0.4556 | 0.0719 |
| 0.8147 | 6.3380 | 1800 | 0.5681 | 0.2469 | 0.4728 | 0.1941 | 0.2427 | 0.8215 | 0.3337 | 0.4746 | 0.0719 |
| 0.8155 | 6.6901 | 1900 | 0.5858 | 0.2399 | 0.3577 | 0.1873 | 0.2337 | 0.8144 | 0.3369 | 0.4766 | 0.0720 |
| 0.812 | 7.0423 | 2000 | 0.5932 | 0.2434 | 0.5377 | 0.2240 | 0.2870 | 0.8285 | 0.3348 | 0.4768 | 0.0715 |
| 0.7735 | 7.3944 | 2100 | 0.5969 | 0.2504 | 0.4537 | 0.2217 | 0.2802 | 0.7844 | 0.3529 | 0.4868 | 0.0724 |
| 0.7747 | 7.7465 | 2200 | 0.5980 | 0.2734 | 0.5684 | 0.2460 | 0.3142 | 0.7941 | 0.3699 | 0.5047 | 0.0707 |
| 0.6935 | 8.0986 | 2300 | 0.5834 | 0.2822 | 0.4822 | 0.2493 | 0.3069 | 0.7669 | 0.3859 | 0.5135 | 0.0712 |
| 0.7359 | 8.4507 | 2400 | 0.5643 | 0.2875 | 0.5755 | 0.2854 | 0.3535 | 0.7991 | 0.3987 | 0.5320 | 0.0683 |
| 0.6547 | 8.8028 | 2500 | 0.5672 | 0.2875 | 0.5700 | 0.2989 | 0.3656 | 0.7878 | 0.4115 | 0.5406 | 0.0681 |
| 0.6568 | 9.1549 | 2600 | 0.5804 | 0.2857 | 0.5921 | 0.2826 | 0.3611 | 0.8174 | 0.3913 | 0.5292 | 0.0677 |
| 0.683 | 9.5070 | 2700 | 0.5911 | 0.2787 | 0.5610 | 0.2682 | 0.3399 | 0.7577 | 0.3934 | 0.5179 | 0.0713 |
| 0.6916 | 9.8592 | 2800 | 0.5553 | 0.2892 | 0.6354 | 0.3208 | 0.3899 | 0.7882 | 0.4126 | 0.5416 | 0.0680 |
| 0.6112 | 10.2113 | 2900 | 0.5829 | 0.3228 | 0.6405 | 0.3521 | 0.4351 | 0.7911 | 0.4563 | 0.5788 | 0.0646 |
| 0.6032 | 10.5634 | 3000 | 0.6113 | 0.3069 | 0.6247 | 0.3173 | 0.3949 | 0.7556 | 0.4350 | 0.5521 | 0.0687 |
| 0.5927 | 10.9155 | 3100 | 0.5666 | 0.3016 | 0.6423 | 0.3289 | 0.4154 | 0.8065 | 0.4222 | 0.5542 | 0.0661 |
| 0.5639 | 11.2676 | 3200 | 0.5527 | 0.3086 | 0.5956 | 0.3482 | 0.4169 | 0.7522 | 0.4563 | 0.5680 | 0.0675 |
| 0.5965 | 11.6197 | 3300 | 0.5370 | 0.3192 | 0.6174 | 0.3337 | 0.4061 | 0.7692 | 0.4584 | 0.5745 | 0.0661 |
| 0.5809 | 11.9718 | 3400 | 0.5517 | 0.3175 | 0.6677 | 0.3737 | 0.4510 | 0.7676 | 0.4542 | 0.5707 | 0.0665 |

Framework versions

  • Transformers 4.52.4
  • PyTorch 2.6.0+cu124
  • Datasets 3.6.0
  • Tokenizers 0.21.2