w2v-bert-2.0-hausa_250_250h

This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set (see the metric sketch after this list):

  • Loss: 0.2732
  • Wer: 0.3433
  • Cer: 0.1936
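For reference, WER and CER are word- and character-level edit distances normalized by reference length. Below is a minimal sketch of how such metrics are typically computed with the Hugging Face `evaluate` library; the transcript strings are made-up placeholders, not drawn from this model's evaluation set:

```python
# Minimal sketch: computing WER/CER with the `evaluate` library.
# Requires: pip install evaluate jiwer
import evaluate

wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

# Placeholder transcripts, not from this model's evaluation data.
predictions = ["ina son ruwa"]
references = ["ina son ruwan sha"]

print("WER:", wer_metric.compute(predictions=predictions, references=references))
print("CER:", cer_metric.compute(predictions=predictions, references=references))
```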

Model description

More information needed

Intended uses & limitations

More information needed
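No usage example is given in the card. As a hedged sketch, assuming this checkpoint is a CTC fine-tune of w2v-bert-2.0 for Hausa speech recognition (suggested by the model name and the WER/CER metrics, not stated explicitly), transcription would look roughly like this; the repository id and audio path are placeholders:

```python
# Hedged sketch: transcribing 16 kHz audio, assuming a CTC head on w2v-bert-2.0.
# The model id and audio path below are placeholders.
import torch
import librosa
from transformers import AutoProcessor, Wav2Vec2BertForCTC

model_id = "path/to/w2v-bert-2.0-hausa_250_250h"  # placeholder repo id
processor = AutoProcessor.from_pretrained(model_id)
model = Wav2Vec2BertForCTC.from_pretrained(model_id)

speech, _ = librosa.load("sample.wav", sr=16_000)  # placeholder audio file

inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

pred_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(pred_ids)[0])
```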

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (an equivalent TrainingArguments sketch follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 160
  • eval_batch_size: 160
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 2
  • total_train_batch_size: 320
  • total_eval_batch_size: 320
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 50.0
  • mixed_precision_training: Native AMP
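A hedged reconstruction of these settings as a transformers TrainingArguments object; the output directory is a placeholder, and the actual training script is not part of this card:

```python
# Sketch: TrainingArguments mirroring the hyperparameters listed above.
# output_dir is a placeholder; the original training script is not included here.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./w2v-bert-2.0-hausa_250_250h",  # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=160,  # x2 GPUs -> total train batch size 320
    per_device_eval_batch_size=160,   # x2 GPUs -> total eval batch size 320
    seed=42,
    optim="adamw_torch",              # AdamW, betas=(0.9, 0.999), eps=1e-08
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=50.0,
    fp16=True,                        # Native AMP mixed precision
)
```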

Training results

| Training Loss | Epoch   | Step  | Validation Loss | Wer    | Cer    |
|:-------------:|:-------:|:-----:|:---------------:|:------:|:------:|
| 0.5401        | 0.6406  | 1000  | 0.3666          | 0.3864 | 0.2055 |
| 0.1643        | 1.2812  | 2000  | 0.3028          | 0.3591 | 0.1978 |
| 0.0957        | 1.9218  | 3000  | 0.2884          | 0.3526 | 0.1956 |
| 0.0689        | 2.5625  | 4000  | 0.2853          | 0.3522 | 0.1948 |
| 0.2318        | 3.2031  | 5000  | 0.2871          | 0.3680 | 0.1992 |
| 0.1863        | 3.8437  | 6000  | 0.2880          | 0.3629 | 0.1985 |
| 0.0662        | 4.4843  | 7000  | 0.3047          | 0.3826 | 0.2037 |
| 0.234         | 5.1249  | 8000  | 0.2872          | 0.3585 | 0.1978 |
| 0.2175        | 5.7655  | 9000  | 0.2786          | 0.3546 | 0.1969 |
| 0.0557        | 6.4061  | 10000 | 0.2873          | 0.3668 | 0.2017 |
| 0.1808        | 7.0468  | 11000 | 0.2740          | 0.3486 | 0.1956 |
| 0.2526        | 7.6874  | 12000 | 0.2779          | 0.3553 | 0.1970 |
| 0.0698        | 8.3280  | 13000 | 0.2765          | 0.3520 | 0.1969 |
| 0.1459        | 8.9686  | 14000 | 0.2823          | 0.3546 | 0.1965 |
| 0.1818        | 9.6092  | 15000 | 0.2699          | 0.3441 | 0.1942 |
| 0.1141        | 10.2498 | 16000 | 0.2737          | 0.3515 | 0.1965 |
| 0.0851        | 10.8905 | 17000 | 0.2654          | 0.3494 | 0.1957 |
| 0.0612        | 11.5311 | 18000 | 0.2636          | 0.3478 | 0.1946 |
| 0.1456        | 12.1717 | 19000 | 0.2618          | 0.3431 | 0.1937 |
| 0.1322        | 12.8123 | 20000 | 0.2659          | 0.3495 | 0.1952 |
| 0.0377        | 13.4529 | 21000 | 0.2696          | 0.3462 | 0.1950 |
| 0.1161        | 14.0935 | 22000 | 0.2655          | 0.3435 | 0.1943 |
| 0.1446        | 14.7341 | 23000 | 0.2561          | 0.3418 | 0.1931 |
| 0.059         | 15.3748 | 24000 | 0.2668          | 0.3447 | 0.1937 |
| 0.1723        | 16.0154 | 25000 | 0.2654          | 0.3410 | 0.1940 |
| 0.1659        | 16.6560 | 26000 | 0.2635          | 0.3461 | 0.1947 |
| 0.0688        | 17.2966 | 27000 | 0.2602          | 0.3416 | 0.1928 |
| 0.0738        | 17.9372 | 28000 | 0.2732          | 0.3433 | 0.1936 |

Framework versions

  • Transformers 4.48.1
  • Pytorch 2.6.0+cu126
  • Datasets 3.5.0
  • Tokenizers 0.21.1