openai-fineweb-edu-scorer-mdeberta-multilabel-lr5e-05-20250411_133317

This model is a fine-tuned version of microsoft/mdeberta-v3-base on an unspecified dataset. It achieves the following results on the evaluation set (a usage sketch follows the metrics):

  • Loss: 0.3799
  • Precision: 0.6114
  • Recall: 0.5426
  • F1 Macro: 0.5665
  • Accuracy: 0.6358
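
The checkpoint is a multilabel sequence-classification head on top of mdeberta-v3-base, so per-label scores come from the logits rather than a single softmax class. Below is a minimal inference sketch; the sigmoid scoring, the 0.5 decision threshold, and the example text are assumptions (not documented in this card), and the label names are read from the model's own config:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "whoisjones/openai-fineweb-edu-scorer-mdeberta-multilabel-lr5e-05-20250411_133317"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

text = "Photosynthesis converts light energy into chemical energy stored in glucose."
inputs = tokenizer(text, return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits

# Multilabel head: score each label independently with a sigmoid.
# The 0.5 threshold is an assumption, not documented in this card.
probs = torch.sigmoid(logits)[0]
for label_id, p in enumerate(probs):
    label = model.config.id2label[label_id]
    flag = " (active)" if p >= 0.5 else ""
    print(f"{label}: {p.item():.3f}{flag}")
```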

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a matching Trainer configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 32
  • eval_batch_size: 64
  • seed: 0
  • optimizer: AdamW (torch implementation, OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 20
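
Based on the values above, the run could be reproduced with a Hugging Face Trainer configuration along these lines. This is a sketch only: the dataset, the number of labels, and the evaluation cadence of every 1000 steps are assumptions (the cadence is inferred from the step column in the results table below):

```python
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    TrainingArguments,
    Trainer,
)

model = AutoModelForSequenceClassification.from_pretrained(
    "microsoft/mdeberta-v3-base",
    problem_type="multi_label_classification",  # multilabel head, as in the model name
    num_labels=4,  # assumption: the actual label count is not stated in this card
)
tokenizer = AutoTokenizer.from_pretrained("microsoft/mdeberta-v3-base")

args = TrainingArguments(
    output_dir="openai-fineweb-edu-scorer-mdeberta-multilabel-lr5e-05",
    learning_rate=5e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=64,
    seed=0,
    num_train_epochs=20,
    lr_scheduler_type="linear",
    optim="adamw_torch",        # OptimizerNames.ADAMW_TORCH
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    eval_strategy="steps",      # assumption: evaluation roughly every 1000 steps
    eval_steps=1000,
    logging_steps=1000,
)

# The training/evaluation data are not documented in this card, so the Trainer
# construction is left as a placeholder:
# trainer = Trainer(model=model, args=args, train_dataset=..., eval_dataset=...,
#                   processing_class=tokenizer)
# trainer.train()
```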

Training results

Training Loss Epoch Step Validation Loss Precision Recall F1 Macro Accuracy
No log 0 0 7.6590 0.0260 0.25 0.0471 0.1041
0.324 0.3908 1000 0.3145 0.6383 0.5127 0.5396 0.6306
0.3066 0.7816 2000 0.3437 0.4684 0.5213 0.4920 0.6305
0.2673 1.1723 3000 0.2894 0.6710 0.5131 0.5397 0.6596
0.2641 1.5631 4000 0.2950 0.4970 0.5227 0.5058 0.6626
0.2563 1.9539 5000 0.3521 0.5996 0.5442 0.5075 0.6229
0.188 2.3447 6000 0.3011 0.6249 0.5593 0.5775 0.6560
0.1961 2.7354 7000 0.2987 0.6092 0.5853 0.5958 0.6483
0.1361 3.1262 8000 0.3377 0.5869 0.5882 0.5864 0.6246
0.1242 3.5170 9000 0.3181 0.6063 0.5402 0.5578 0.6440
0.1254 3.9078 10000 0.3182 0.6054 0.5625 0.5738 0.6490
0.0808 4.2986 11000 0.3553 0.6027 0.5538 0.5712 0.6263
0.0803 4.6893 12000 0.3728 0.5814 0.5931 0.5850 0.6146
0.0526 5.0801 13000 0.3444 0.5992 0.5378 0.5528 0.6393
0.0494 5.4709 14000 0.3456 0.6000 0.5533 0.5705 0.6382
0.0588 5.8617 15000 0.3447 0.5988 0.5267 0.5495 0.6319
0.0491 6.2524 16000 0.3647 0.5927 0.5429 0.5610 0.6238
0.044 6.6432 17000 0.3815 0.5714 0.5699 0.5705 0.6155
0.0305 7.0340 18000 0.3736 0.5925 0.5519 0.5670 0.6244
0.0319 7.4248 19000 0.3744 0.5909 0.5563 0.5700 0.6224
0.0412 7.8156 20000 0.3686 0.5880 0.5728 0.5780 0.6209
0.0314 8.2063 21000 0.3899 0.5786 0.5412 0.5547 0.6051
0.0334 8.5971 22000 0.3555 0.5961 0.5458 0.5629 0.6367
0.0293 8.9879 23000 0.3639 0.6088 0.5379 0.5618 0.6300
0.0219 9.3787 24000 0.3951 0.5767 0.5631 0.5680 0.6111
0.02 9.7694 25000 0.3751 0.5837 0.5728 0.5779 0.6286
0.0173 10.1602 26000 0.3717 0.6052 0.5244 0.5493 0.6271
0.0222 10.5510 27000 0.3716 0.5946 0.5605 0.5742 0.6237
0.0211 10.9418 28000 0.3647 0.5988 0.5431 0.5611 0.6381
0.0184 11.3326 29000 0.3838 0.6045 0.5320 0.5551 0.6206
0.018 11.7233 30000 0.3677 0.5948 0.5490 0.5657 0.6360
0.0205 12.1141 31000 0.3910 0.5917 0.5643 0.5729 0.6178
0.0146 12.5049 32000 0.3904 0.5806 0.5568 0.5671 0.6219
0.0149 12.8957 33000 0.3994 0.5890 0.5331 0.5506 0.6094
0.0181 13.2864 34000 0.3717 0.6000 0.5364 0.5582 0.6344
0.014 13.6772 35000 0.3752 0.5937 0.5604 0.5741 0.6345
0.0118 14.0680 36000 0.3852 0.5999 0.5532 0.5705 0.6266
0.0126 14.4588 37000 0.3776 0.6073 0.5260 0.5507 0.6347
0.0105 14.8496 38000 0.3751 0.6060 0.5463 0.5680 0.6342
0.0077 15.2403 39000 0.4115 0.5874 0.5554 0.5644 0.6066
0.0114 15.6311 40000 0.3693 0.6058 0.5475 0.5644 0.6459
0.0102 16.0219 41000 0.3725 0.6053 0.5555 0.5736 0.6425
0.0073 16.4127 42000 0.3947 0.6056 0.5494 0.5684 0.6248
0.0118 16.8034 43000 0.3728 0.6043 0.5517 0.5711 0.6401
0.0063 17.1942 44000 0.3771 0.6026 0.5569 0.5747 0.6378
0.0073 17.5850 45000 0.3835 0.6021 0.5467 0.5670 0.6302
0.0101 17.9758 46000 0.3832 0.5999 0.5500 0.5689 0.6306
0.006 18.3665 47000 0.3793 0.6044 0.5526 0.5722 0.6342
0.0063 18.7573 48000 0.3902 0.6090 0.5311 0.5559 0.6251
0.0062 19.1481 49000 0.3822 0.6016 0.5480 0.5681 0.6315
0.0059 19.5389 50000 0.3753 0.6121 0.5407 0.5647 0.6392
0.0098 19.9297 51000 0.3799 0.6114 0.5426 0.5665 0.6358
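
The Precision, Recall, and F1 Macro columns are macro-averaged over labels. The exact metric code is not documented in this card, but for a multilabel head they are commonly computed from thresholded sigmoid outputs roughly as follows (a sketch; the 0.5 threshold and the use of scikit-learn are assumptions):

```python
import numpy as np
from sklearn.metrics import precision_recall_fscore_support, accuracy_score

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    # Sigmoid + 0.5 threshold for multilabel predictions (threshold is an assumption).
    probs = 1 / (1 + np.exp(-logits))
    preds = (probs >= 0.5).astype(int)

    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="macro", zero_division=0
    )
    # accuracy_score on multilabel arrays is subset (exact-match) accuracy:
    # every label of a sample must be predicted correctly.
    accuracy = accuracy_score(labels, preds)
    return {"precision": precision, "recall": recall, "f1_macro": f1, "accuracy": accuracy}
```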

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.6.0+cu124
  • Datasets 3.3.2
  • Tokenizers 0.21.1

Model size: 279M params (Safetensors, F32 tensors)