whisper-large-v2-ft-Jana-BTU6567_mix8_snr6_base-on-car350-250428-v2

This model is a fine-tuned version of openai/whisper-large-v2 on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7334
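
As a usage illustration (not part of the original card), loading this adapter on top of the base model might look like the sketch below. The repo id comes from the "Model tree" section at the bottom of this card, and the loading pattern is the standard PeftModel one; nothing here is verified against this specific checkpoint.

```python
# Minimal loading sketch, assuming this repo hosts a standard PEFT adapter
# for openai/whisper-large-v2. Not verified against this checkpoint.
from transformers import WhisperForConditionalGeneration, WhisperProcessor
from peft import PeftModel

base = WhisperForConditionalGeneration.from_pretrained("openai/whisper-large-v2")
processor = WhisperProcessor.from_pretrained("openai/whisper-large-v2")

# Attach the fine-tuned adapter weights on top of the frozen base model.
model = PeftModel.from_pretrained(
    base,
    "dylanewbie/whisper-large-v2-ft-Jana-BTU6567_mix8_snr6_base-on-car350-250428-v2",
)
model.eval()

# Transcription would then follow the usual Whisper pattern, e.g. for a
# 16 kHz waveform `audio` (a numpy array):
#   inputs = processor(audio, sampling_rate=16000, return_tensors="pt")
#   ids = model.generate(input_features=inputs.input_features)
#   text = processor.batch_decode(ids, skip_special_tokens=True)[0]
```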

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a reconstruction as Trainer arguments is sketched after this list):

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 8
  • total_train_batch_size: 64
  • total_eval_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.2
  • num_epochs: 100
  • mixed_precision_training: Native AMP
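
The list above maps onto transformers training arguments roughly as follows. This is a hedged reconstruction, not the author's actual training script: Seq2SeqTrainingArguments is assumed because this is a Whisper seq2seq fine-tune, and the output_dir is hypothetical.

```python
# Hedged reconstruction of the training configuration from the list above.
# Assumptions: transformers Seq2SeqTrainingArguments; per-device batch size 8
# across 8 GPUs yields the reported total batch size of 8 * 8 = 64.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="whisper-large-v2-ft",  # hypothetical path
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.2,
    num_train_epochs=100,
    fp16=True,  # "Native AMP" mixed-precision training
)
```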

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 10.3916 | 1.0 | 1 | 10.5424 |
| 10.3581 | 2.0 | 2 | 10.5424 |
| 10.3671 | 3.0 | 3 | 10.5424 |
| 10.401 | 4.0 | 4 | 10.5424 |
| 10.3032 | 5.0 | 5 | 10.5424 |
| 10.3819 | 6.0 | 6 | 10.4732 |
| 10.3083 | 7.0 | 7 | 10.2188 |
| 10.148 | 8.0 | 8 | 9.7890 |
| 9.6059 | 9.0 | 9 | 9.7890 |
| 9.7222 | 10.0 | 10 | 9.7890 |
| 9.6653 | 11.0 | 11 | 9.7890 |
| 9.7595 | 12.0 | 12 | 9.1966 |
| 9.1529 | 13.0 | 13 | 8.4308 |
| 8.4287 | 14.0 | 14 | 7.4996 |
| 7.5551 | 15.0 | 15 | 6.6037 |
| 6.601 | 16.0 | 16 | 5.9718 |
| 5.9646 | 17.0 | 17 | 5.6182 |
| 5.5941 | 18.0 | 18 | 5.1765 |
| 5.1806 | 19.0 | 19 | 4.8658 |
| 4.8027 | 20.0 | 20 | 4.6209 |
| 4.543 | 21.0 | 21 | 4.6209 |
| 4.5359 | 22.0 | 22 | 4.5022 |
| 4.4192 | 23.0 | 23 | 4.4469 |
| 4.3674 | 24.0 | 24 | 4.3916 |
| 4.3205 | 25.0 | 25 | 4.3285 |
| 4.256 | 26.0 | 26 | 4.2549 |
| 4.1802 | 27.0 | 27 | 4.1693 |
| 4.0941 | 28.0 | 28 | 4.0703 |
| 3.9866 | 29.0 | 29 | 3.9497 |
| 3.8535 | 30.0 | 30 | 3.8104 |
| 3.7066 | 31.0 | 31 | 3.6442 |
| 3.5391 | 32.0 | 32 | 3.4423 |
| 3.3251 | 33.0 | 33 | 3.2242 |
| 3.0761 | 34.0 | 34 | 3.0329 |
| 2.9123 | 35.0 | 35 | 2.8894 |
| 2.7919 | 36.0 | 36 | 2.7920 |
| 2.7079 | 37.0 | 37 | 2.7183 |
| 2.6552 | 38.0 | 38 | 2.6559 |
| 2.5932 | 39.0 | 39 | 2.6002 |
| 2.5476 | 40.0 | 40 | 2.5471 |
| 2.4928 | 41.0 | 41 | 2.4972 |
| 2.4491 | 42.0 | 42 | 2.4524 |
| 2.4105 | 43.0 | 43 | 2.4124 |
| 2.3736 | 44.0 | 44 | 2.3768 |
| 2.3434 | 45.0 | 45 | 2.3443 |
| 2.3184 | 46.0 | 46 | 2.3126 |
| 2.286 | 47.0 | 47 | 2.2816 |
| 2.2498 | 48.0 | 48 | 2.2509 |
| 2.2296 | 49.0 | 49 | 2.2202 |
| 2.1987 | 50.0 | 50 | 2.1894 |
| 2.1668 | 51.0 | 51 | 2.1583 |
| 2.1356 | 52.0 | 52 | 2.1268 |
| 2.1045 | 53.0 | 53 | 2.0963 |
| 2.0735 | 54.0 | 54 | 2.0655 |
| 2.042 | 55.0 | 55 | 2.0355 |
| 2.009 | 56.0 | 56 | 2.0060 |
| 1.9836 | 57.0 | 57 | 1.9768 |
| 1.9536 | 58.0 | 58 | 1.9486 |
| 1.9251 | 59.0 | 59 | 1.9202 |
| 1.8913 | 60.0 | 60 | 1.8916 |
| 1.8655 | 61.0 | 61 | 1.8631 |
| 1.8379 | 62.0 | 62 | 1.8348 |
| 1.8093 | 63.0 | 63 | 1.8061 |
| 1.7794 | 64.0 | 64 | 1.7771 |
| 1.7491 | 65.0 | 65 | 1.7476 |
| 1.7222 | 66.0 | 66 | 1.7176 |
| 1.6909 | 67.0 | 67 | 1.6878 |
| 1.659 | 68.0 | 68 | 1.6585 |
| 1.6295 | 69.0 | 69 | 1.6287 |
| 1.6019 | 70.0 | 70 | 1.5988 |
| 1.5732 | 71.0 | 71 | 1.5692 |
| 1.5425 | 72.0 | 72 | 1.5388 |
| 1.5123 | 73.0 | 73 | 1.5086 |
| 1.4763 | 74.0 | 74 | 1.4782 |
| 1.4454 | 75.0 | 75 | 1.4472 |
| 1.4168 | 76.0 | 76 | 1.4163 |
| 1.3882 | 77.0 | 77 | 1.3848 |
| 1.3562 | 78.0 | 78 | 1.3539 |
| 1.328 | 79.0 | 79 | 1.3222 |
| 1.2881 | 80.0 | 80 | 1.2905 |
| 1.2618 | 81.0 | 81 | 1.2594 |
| 1.2282 | 82.0 | 82 | 1.2275 |
| 1.199 | 83.0 | 83 | 1.1964 |
| 1.1677 | 84.0 | 84 | 1.1643 |
| 1.1363 | 85.0 | 85 | 1.1334 |
| 1.104 | 86.0 | 86 | 1.1026 |
| 1.0739 | 87.0 | 87 | 1.0723 |
| 1.0453 | 88.0 | 88 | 1.0424 |
| 1.0107 | 89.0 | 89 | 1.0125 |
| 0.9816 | 90.0 | 90 | 0.9829 |
| 0.9487 | 91.0 | 91 | 0.9538 |
| 0.9232 | 92.0 | 92 | 0.9252 |
| 0.8976 | 93.0 | 93 | 0.8977 |
| 0.8621 | 94.0 | 94 | 0.8708 |
| 0.8342 | 95.0 | 95 | 0.8447 |
| 0.8169 | 96.0 | 96 | 0.8200 |
| 0.7841 | 97.0 | 97 | 0.7964 |
| 0.7613 | 98.0 | 98 | 0.7739 |
| 0.7386 | 99.0 | 99 | 0.7530 |
| 0.7085 | 100.0 | 100 | 0.7334 |
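
Two features of this curve are worth noting, both consistent with the hyperparameters above: the step counter advances by exactly one per epoch (a single optimizer step per epoch, implying at most one 64-example batch of training data), and the validation loss barely moves over the first several steps, which would be expected from a linear schedule whose 0.2 warmup ratio keeps the learning rate tiny early on. A small sketch of that schedule, assuming the standard linear warmup-then-decay used by transformers' get_linear_schedule_with_warmup:

```python
# Sketch of the learning-rate schedule implied by the hyperparameters above
# (peak lr 5e-5, warmup_ratio 0.2 over 100 total steps). Assumption: the
# standard linear warmup + linear decay rule.
peak_lr, total_steps, warmup_steps = 5e-5, 100, 20

def lr_at(step: int) -> float:
    if step < warmup_steps:
        return peak_lr * step / warmup_steps  # linear ramp up
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)  # linear decay

print(lr_at(5), lr_at(20), lr_at(100))  # 1.25e-05, 5e-05, 0.0: tiny early lr explains the flat start
```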

Framework versions

  • PEFT 0.13.0
  • Transformers 4.45.1
  • Pytorch 2.5.0+cu124
  • Datasets 2.21.0
  • Tokenizers 0.20.0
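
To reproduce this environment, pinning the versions above might look like the following sketch (assumption: the cu124 PyTorch wheel comes from the usual PyTorch package index):

```bash
pip install peft==0.13.0 transformers==4.45.1 datasets==2.21.0 tokenizers==0.20.0
pip install torch==2.5.0 --index-url https://download.pytorch.org/whl/cu124
```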

Model tree for dylanewbie/whisper-large-v2-ft-Jana-BTU6567_mix8_snr6_base-on-car350-250428-v2

  • Base model: openai/whisper-large-v2 (this model is a PEFT adapter on top of it)