# w2v-bert-2.0-hausa_250_250h
This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.2732
- Wer: 0.3433
- Cer: 0.1936
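
For reference, a minimal transcription sketch using the transformers CTC API. The repo id, audio file, and librosa-based loading are assumptions for illustration, not part of this card:

```python
import torch
import librosa
from transformers import AutoModelForCTC, AutoProcessor

# Hypothetical Hub id; replace with the actual path of this checkpoint.
MODEL_ID = "your-org/w2v-bert-2.0-hausa_250_250h"

processor = AutoProcessor.from_pretrained(MODEL_ID)
model = AutoModelForCTC.from_pretrained(MODEL_ID)
model.eval()

# w2v-BERT 2.0 expects 16 kHz mono audio.
speech, _ = librosa.load("hausa_sample.wav", sr=16_000, mono=True)

inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding: pick the most likely token at each frame,
# then let the tokenizer collapse repeats and remove blanks.
pred_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(pred_ids)[0])
```
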
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 160
- eval_batch_size: 160
- seed: 42
- distributed_type: multi-GPU
- num_devices: 2
- total_train_batch_size: 320
- total_eval_batch_size: 320
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 50.0
- mixed_precision_training: Native AMP
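
A sketch of a `TrainingArguments` configuration matching the values above; `output_dir` and any logging/saving details are assumptions not stated in this card:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="w2v-bert-2.0-hausa_250_250h",  # hypothetical
    learning_rate=1e-4,
    per_device_train_batch_size=160,  # 2 GPUs -> total train batch size 320
    per_device_eval_batch_size=160,   # 2 GPUs -> total eval batch size 320
    seed=42,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=50.0,
    fp16=True,  # "Native AMP" mixed-precision training
)
```
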
### Training results
| Training Loss | Epoch | Step | Validation Loss | Wer | Cer |
|:-------------:|:-------:|:-----:|:---------------:|:------:|:------:|
| 0.5401 | 0.6406 | 1000 | 0.3666 | 0.3864 | 0.2055 |
| 0.1643 | 1.2812 | 2000 | 0.3028 | 0.3591 | 0.1978 |
| 0.0957 | 1.9218 | 3000 | 0.2884 | 0.3526 | 0.1956 |
| 0.0689 | 2.5625 | 4000 | 0.2853 | 0.3522 | 0.1948 |
| 0.2318 | 3.2031 | 5000 | 0.2871 | 0.3680 | 0.1992 |
| 0.1863 | 3.8437 | 6000 | 0.2880 | 0.3629 | 0.1985 |
| 0.0662 | 4.4843 | 7000 | 0.3047 | 0.3826 | 0.2037 |
| 0.234 | 5.1249 | 8000 | 0.2872 | 0.3585 | 0.1978 |
| 0.2175 | 5.7655 | 9000 | 0.2786 | 0.3546 | 0.1969 |
| 0.0557 | 6.4061 | 10000 | 0.2873 | 0.3668 | 0.2017 |
| 0.1808 | 7.0468 | 11000 | 0.2740 | 0.3486 | 0.1956 |
| 0.2526 | 7.6874 | 12000 | 0.2779 | 0.3553 | 0.1970 |
| 0.0698 | 8.3280 | 13000 | 0.2765 | 0.3520 | 0.1969 |
| 0.1459 | 8.9686 | 14000 | 0.2823 | 0.3546 | 0.1965 |
| 0.1818 | 9.6092 | 15000 | 0.2699 | 0.3441 | 0.1942 |
| 0.1141 | 10.2498 | 16000 | 0.2737 | 0.3515 | 0.1965 |
| 0.0851 | 10.8905 | 17000 | 0.2654 | 0.3494 | 0.1957 |
| 0.0612 | 11.5311 | 18000 | 0.2636 | 0.3478 | 0.1946 |
| 0.1456 | 12.1717 | 19000 | 0.2618 | 0.3431 | 0.1937 |
| 0.1322 | 12.8123 | 20000 | 0.2659 | 0.3495 | 0.1952 |
| 0.0377 | 13.4529 | 21000 | 0.2696 | 0.3462 | 0.1950 |
| 0.1161 | 14.0935 | 22000 | 0.2655 | 0.3435 | 0.1943 |
| 0.1446 | 14.7341 | 23000 | 0.2561 | 0.3418 | 0.1931 |
| 0.059 | 15.3748 | 24000 | 0.2668 | 0.3447 | 0.1937 |
| 0.1723 | 16.0154 | 25000 | 0.2654 | 0.3410 | 0.1940 |
| 0.1659 | 16.6560 | 26000 | 0.2635 | 0.3461 | 0.1947 |
| 0.0688 | 17.2966 | 27000 | 0.2602 | 0.3416 | 0.1928 |
| 0.0738 | 17.9372 | 28000 | 0.2732 | 0.3433 | 0.1936 |
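
The Wer and Cer columns are word- and character-error rates. A minimal sketch of how such scores can be computed with the `evaluate` library; the transcript strings are placeholders, not data from this run:

```python
import evaluate

wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

# Placeholder transcripts; in practice these come from model predictions
# and the evaluation set's reference texts.
predictions = ["ina son ruwa"]
references = ["ina son ruwan"]

print("WER:", wer_metric.compute(predictions=predictions, references=references))
print("CER:", cer_metric.compute(predictions=predictions, references=references))
```
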
### Framework versions
- Transformers 4.48.1
- Pytorch 2.6.0+cu126
- Datasets 3.5.0
- Tokenizers 0.21.1