mms-1b-all-bigcgen-male-5hrs-42

This model is a fine-tuned version of facebook/mms-1b-all on the BIGCGEN - BEM dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4452
  • WER: 0.4722
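
The reported WER (word error rate) is the word-level edit distance between the model transcript and the reference, divided by the number of reference words. Evaluation pipelines for this model family typically use a library such as `jiwer`; the following is a minimal pure-Python sketch of the metric for illustration only.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level Levenshtein distance / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Dynamic-programming table for edit distance over word sequences
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # deleting i reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # inserting j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            deletion = d[i - 1][j] + 1
            insertion = d[i][j - 1] + 1
            d[i][j] = min(substitution, deletion, insertion)
    return d[len(ref)][len(hyp)] / len(ref)

# One deleted word out of six reference words → 1/6 ≈ 0.1667
score = wer("the cat sat on the mat", "the cat sat on mat")
```

A WER of 0.4722 therefore means roughly 47 word-level errors (substitutions, insertions, deletions) per 100 reference words.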

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 8
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 30.0
  • mixed_precision_training: Native AMP
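
Two of these settings can be sketched concretely. The effective batch size of 16 is the per-device train batch size multiplied by the gradient-accumulation steps, and the linear scheduler ramps the learning rate up over the 100 warmup steps and then decays it linearly toward zero. The sketch below assumes a total step count of 2900 (the last step logged in the results table below); the actual decay horizon depends on the configured number of epochs.

```python
def linear_warmup_lr(step: int, base_lr: float = 3e-4,
                     warmup_steps: int = 100, total_steps: int = 2900) -> float:
    """Linear warmup to base_lr, then linear decay to 0
    (mirrors lr_scheduler_type: linear with lr_scheduler_warmup_steps: 100)."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))

# total_train_batch_size = train_batch_size * gradient_accumulation_steps
effective_batch = 8 * 2  # → 16
```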

Training results

| Training Loss | Epoch   | Step | Validation Loss | WER    |
|:-------------:|:-------:|:----:|:---------------:|:------:|
| 8.1401        | 0.6211  | 100  | 4.4180          | 1.0094 |
| 4.1168        | 1.2422  | 200  | 3.8054          | 1.0019 |
| 3.7512        | 1.8634  | 300  | 3.5376          | 1.0002 |
| 3.3903        | 2.4845  | 400  | 3.1647          | 1.0019 |
| 3.1996        | 3.1056  | 500  | 2.9068          | 0.9995 |
| 2.9987        | 3.7267  | 600  | 2.9281          | 0.9995 |
| 2.9207        | 4.3478  | 700  | 2.6511          | 1.0    |
| 1.6925        | 4.9689  | 800  | 0.6456          | 0.6278 |
| 0.7698        | 5.5901  | 900  | 0.5464          | 0.5850 |
| 0.6804        | 6.2112  | 1000 | 0.5268          | 0.5374 |
| 0.6414        | 6.8323  | 1100 | 0.4935          | 0.5256 |
| 0.5936        | 7.4534  | 1200 | 0.4801          | 0.5061 |
| 0.6344        | 8.0745  | 1300 | 0.4744          | 0.5028 |
| 0.5952        | 8.6957  | 1400 | 0.4729          | 0.4963 |
| 0.5922        | 9.3168  | 1500 | 0.4647          | 0.4850 |
| 0.5804        | 9.9379  | 1600 | 0.4512          | 0.4809 |
| 0.5836        | 10.5590 | 1700 | 0.4538          | 0.4749 |
| 0.5617        | 11.1801 | 1800 | 0.4510          | 0.4730 |
| 0.5602        | 11.8012 | 1900 | 0.4526          | 0.4744 |
| 0.556         | 12.4224 | 2000 | 0.4452          | 0.4725 |
| 0.5725        | 13.0435 | 2100 | 0.4523          | 0.4722 |
| 0.5615        | 13.6646 | 2200 | 0.4438          | 0.4655 |
| 0.5542        | 14.2857 | 2300 | 0.4423          | 0.4684 |
| 0.5278        | 14.9068 | 2400 | 0.4457          | 0.4684 |
| 0.5314        | 15.5280 | 2500 | 0.4389          | 0.4604 |
| 0.5609        | 16.1491 | 2600 | 0.4421          | 0.4597 |
| 0.5383        | 16.7702 | 2700 | 0.4401          | 0.4520 |
| 0.5271        | 17.3913 | 2800 | 0.4407          | 0.4631 |
| 0.527         | 18.0124 | 2900 | 0.4401          | 0.4467 |

Framework versions

  • Transformers 4.53.0.dev0
  • Pytorch 2.6.0+cu124
  • Datasets 3.6.0
  • Tokenizers 0.21.0