mms-1b-all-bigcgen-male-5hrs-52

This model is a fine-tuned version of facebook/mms-1b-all on the BIGCGEN - BEM dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4392
  • Wer: 0.4520
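Since this checkpoint is a CTC fine-tune of facebook/mms-1b-all, it can be loaded like any Wav2Vec2-style model. Below is a minimal inference sketch, not the authors' evaluation code; the audio filename is a placeholder, and it assumes the processor/tokenizer files are bundled with this repository:

```python
import torch
import librosa
from transformers import AutoProcessor, Wav2Vec2ForCTC

model_id = "csikasote/mms-1b-all-bigcgen-male-5hrs-52"
processor = AutoProcessor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)
model.eval()

# MMS models expect 16 kHz mono audio.
speech, _ = librosa.load("example.wav", sr=16_000, mono=True)  # placeholder file
inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Greedy (argmax) CTC decoding.
ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(ids)[0])
```

Greedy decoding is shown for simplicity; pairing the model with an external language-model decoder may reduce the Wer further.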

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 0.0003
  • train_batch_size: 8
  • eval_batch_size: 4
  • seed: 52
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 30.0
  • mixed_precision_training: Native AMP
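These settings map roughly onto transformers.TrainingArguments as sketched below. This is a hypothetical reconstruction, not the actual training script; output_dir is a placeholder, and the effective batch size of 16 assumes a single device:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./mms-1b-all-bigcgen-male-5hrs-52",  # placeholder
    learning_rate=3e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=4,
    seed=52,
    gradient_accumulation_steps=2,  # 8 * 2 = 16 effective (single device assumed)
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=30.0,
    fp16=True,  # "Native AMP" mixed precision
)
```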

Training results

| Training Loss | Epoch   | Step | Validation Loss | Wer    |
|---------------|---------|------|-----------------|--------|
| 8.9818        | 0.6211  | 100  | 4.9435          | 1.0139 |
| 4.8534        | 1.2422  | 200  | 4.2897          | 1.0    |
| 4.1334        | 1.8634  | 300  | 2.5552          | 1.0    |
| 1.0089        | 2.4845  | 400  | 0.6157          | 0.5893 |
| 0.7657        | 3.1056  | 500  | 0.5726          | 0.5458 |
| 0.6709        | 3.7267  | 600  | 0.5124          | 0.5251 |
| 0.6495        | 4.3478  | 700  | 0.4878          | 0.4989 |
| 0.633         | 4.9689  | 800  | 0.4747          | 0.4965 |
| 0.5976        | 5.5901  | 900  | 0.4695          | 0.4900 |
| 0.6031        | 6.2112  | 1000 | 0.4598          | 0.4910 |
| 0.6004        | 6.8323  | 1100 | 0.4564          | 0.4775 |
| 0.575         | 7.4534  | 1200 | 0.4562          | 0.4689 |
| 0.5721        | 8.0745  | 1300 | 0.4521          | 0.4684 |
| 0.5812        | 8.6957  | 1400 | 0.4502          | 0.4645 |
| 0.5445        | 9.3168  | 1500 | 0.4557          | 0.4650 |
| 0.5552        | 9.9379  | 1600 | 0.4427          | 0.4619 |
| 0.5403        | 10.5590 | 1700 | 0.4443          | 0.4556 |
| 0.5677        | 11.1801 | 1800 | 0.4502          | 0.4568 |
| 0.5519        | 11.8012 | 1900 | 0.4409          | 0.4542 |
| 0.5404        | 12.4224 | 2000 | 0.4392          | 0.4516 |
| 0.537         | 13.0435 | 2100 | 0.4394          | 0.4547 |
| 0.5411        | 13.6646 | 2200 | 0.4416          | 0.4604 |
| 0.5292        | 14.2857 | 2300 | 0.4355          | 0.4487 |
| 0.5381        | 14.9068 | 2400 | 0.4357          | 0.4451 |
| 0.5267        | 15.5280 | 2500 | 0.4341          | 0.4475 |
| 0.5212        | 16.1491 | 2600 | 0.4381          | 0.4484 |
| 0.5311        | 16.7702 | 2700 | 0.4402          | 0.4470 |
| 0.4983        | 17.3913 | 2800 | 0.4355          | 0.4484 |
| 0.5205        | 18.0124 | 2900 | 0.4356          | 0.4400 |
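The Wer column reports the word error rate on the validation set. As a reference for how such numbers are produced, the evaluate library computes it from paired transcripts; a toy sketch with made-up English strings (not from the BIGCGEN data):

```python
import evaluate

wer_metric = evaluate.load("wer")

references  = ["the cat sat on the mat"]
predictions = ["the cat sat on mat"]  # one word deleted

# 1 deletion over 6 reference words -> WER of about 0.167
print(wer_metric.compute(predictions=predictions, references=references))
```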

Framework versions

  • Transformers 4.53.0.dev0
  • Pytorch 2.6.0+cu124
  • Datasets 3.6.0
  • Tokenizers 0.21.0