mms-1b-all-swagen-female-15hrs-42

This model is a fine-tuned version of facebook/mms-1b-all on the SWAGEN - SWA dataset. It achieves the following results on the evaluation set (a minimal inference sketch follows the results):

  • Loss: 0.2061
  • Wer: 0.1855
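
For convenience, a minimal inference sketch, assuming the standard Wav2Vec2 CTC interface that MMS checkpoints use in Transformers; the audio file path and 16 kHz resampling are illustrative:

```python
import torch
import librosa
from transformers import AutoProcessor, Wav2Vec2ForCTC

model_id = "csikasote/mms-1b-all-swagen-female-15hrs-42"

processor = AutoProcessor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)

# Load a 16 kHz mono waveform; "sample.wav" is a placeholder path.
speech, _ = librosa.load("sample.wav", sr=16_000, mono=True)

inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding of the most likely token at each frame.
predicted_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(predicted_ids)[0])
```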

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto TrainingArguments follows the list):

  • learning_rate: 0.0003
  • train_batch_size: 8
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 5.0
  • mixed_precision_training: Native AMP
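
A sketch of how these values map onto transformers.TrainingArguments, assuming the standard Trainer setup; the output_dir is a placeholder, not taken from the card:

```python
from transformers import TrainingArguments

# Mirrors the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="mms-1b-all-swagen-female-15hrs-42",  # placeholder
    learning_rate=3e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=2,  # effective train batch size: 8 * 2 = 16
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=5.0,
    fp16=True,  # native AMP mixed-precision training
)
```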

Training results

Training Loss | Epoch  | Step | Validation Loss | Wer
------------- | ------ | ---- | --------------- | ------
7.959         | 0.1572 | 100  | 4.5467          | 1.0
4.1239        | 0.3145 | 200  | 3.8855          | 1.0
3.2097        | 0.4717 | 300  | 2.8369          | 0.9861
0.6263        | 0.6289 | 400  | 0.2378          | 0.1940
0.2794        | 0.7862 | 500  | 0.2318          | 0.1907
0.2517        | 0.9434 | 600  | 0.2285          | 0.1898
0.2385        | 1.1006 | 700  | 0.2243          | 0.1888
0.2469        | 1.2579 | 800  | 0.2187          | 0.1888
0.2474        | 1.4151 | 900  | 0.2183          | 0.1861
0.2379        | 1.5723 | 1000 | 0.2135          | 0.1884
0.2345        | 1.7296 | 1100 | 0.2091          | 0.1859
0.2262        | 1.8868 | 1200 | 0.2060          | 0.1855
0.2318        | 2.0440 | 1300 | 0.2053          | 0.1830
0.2244        | 2.2013 | 1400 | 0.2051          | 0.1861
0.2222        | 2.3585 | 1500 | 0.2031          | 0.1844
0.2264        | 2.5157 | 1600 | 0.2042          | 0.1853
0.2121        | 2.6730 | 1700 | 0.2001          | 0.1838
0.2234        | 2.8302 | 1800 | 0.2032          | 0.1853
0.2315        | 2.9874 | 1900 | 0.2033          | 0.1840
0.2135        | 3.1447 | 2000 | 0.2003          | 0.1838
0.2154        | 3.3019 | 2100 | 0.2023          | 0.1836
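
For reference, Wer above denotes word error rate (substitutions, insertions, and deletions divided by the number of reference words). A minimal sketch of computing it with the evaluate library; the reference and prediction strings are illustrative, not from the evaluation set:

```python
import evaluate

wer_metric = evaluate.load("wer")

# Illustrative strings; in practice these come from the eval set
# references and the model's decoded transcriptions.
references = ["habari ya asubuhi"]
predictions = ["habari ya asubui"]

print(wer_metric.compute(references=references, predictions=predictions))
```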

Framework versions

  • Transformers 4.53.0.dev0
  • PyTorch 2.6.0+cu124
  • Datasets 3.6.0
  • Tokenizers 0.21.0
