---
library_name: transformers
license: apache-2.0
base_model: openai/whisper-small
tags:
  - audio
  - automatic-speech-recognition
  - generated_from_trainer
widget:
  - example_title: Librispeech sample 1
    src: https://cdn-media.huggingface.co/speech_samples/sample1.flac
  - example_title: Librispeech sample 2
    src: https://cdn-media.huggingface.co/speech_samples/sample2.flac
metrics:
  - wer
  - cer
model-index:
  - name: whisper-small-ru-v4
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: Common Voice 21.0
          type: mozilla-foundation/common_voice_21_0
          config: ru
          split: test
          args:
            language: ru
        metrics:
          - name: Wer
            type: wer
            value: 2.065
          - name: Cer
            type: cer
            value: 0.9906
language:
  - ru
pipeline_tag: automatic-speech-recognition
datasets:
  - artyomboyko/common_voice_21_0_ru
---

# whisper-small-ru-v4

**NOTE: EXPERIMENTAL MODEL!**
This is the best model obtained at the end of the fine-tuning process. Further inference testing has not yet been performed.

This model is a fine-tuned version of artyomboyko/whisper-small-ru-v3 (itself a fine-tuned version of the base model openai/whisper-small) on the Common Voice 21.0 Russian dataset. It achieves the following results on the evaluation set:

- Loss: 0.0104
- Wer: 2.0650
- Cer: 0.9906
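
A minimal inference sketch using the 🤗 Transformers ASR pipeline. The repository id `artyomboyko/whisper-small-ru-v4` is assumed from the model name; adjust it if the checkpoint lives elsewhere:

```python
import torch
from transformers import pipeline

# Repository id is an assumption inferred from the model name.
asr = pipeline(
    "automatic-speech-recognition",
    model="artyomboyko/whisper-small-ru-v4",
    device=0 if torch.cuda.is_available() else -1,
)

# Transcribe a local audio file (any format ffmpeg/soundfile can decode).
result = asr(
    "sample.wav",
    generate_kwargs={"language": "russian", "task": "transcribe"},
)
print(result["text"])
```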

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

Training was performed on a single MSI Suprim RTX 4090 GPU.
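
For reference, a sketch of loading the dataset listed in the metadata with 🤗 Datasets. The split name and audio column are assumptions; the dataset card is authoritative:

```python
from datasets import Audio, load_dataset

# Split and column names are assumptions; check the dataset card.
ds = load_dataset("artyomboyko/common_voice_21_0_ru", split="test")

# Whisper's feature extractor expects 16 kHz audio.
ds = ds.cast_column("audio", Audio(sampling_rate=16_000))
print(ds[0]["audio"]["array"].shape)
```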

## Training procedure

Model training time: 28h 47m

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 1e-05
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 250
- training_steps: 25000
- mixed_precision_training: Native AMP
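
The same settings expressed as 🤗 Transformers `Seq2SeqTrainingArguments` (a sketch: `output_dir`, the evaluation cadence, and `predict_with_generate` are assumptions inferred from the 500-step results table below):

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-small-ru-v4",  # assumption
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    lr_scheduler_type="linear",
    warmup_steps=250,
    max_steps=25_000,
    fp16=True,  # "Native AMP" mixed-precision training
    eval_strategy="steps",
    eval_steps=500,  # assumption, matches the cadence of the results table
    predict_with_generate=True,  # assumption, needed for WER/CER during eval
)
```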

### Training results

| Training Loss | Epoch  | Step  | Validation Loss | Wer     | Cer    |
|:-------------:|:------:|:-----:|:---------------:|:-------:|:------:|
| 0.0683        | 0.0387 | 500   | 0.1521          | 13.4494 | 4.4901 |
| 0.059         | 0.0774 | 1000  | 0.1434          | 12.1396 | 3.6132 |
| 0.0584        | 0.1161 | 1500  | 0.1382          | 11.9180 | 3.3839 |
| 0.0551        | 0.1547 | 2000  | 0.1314          | 11.2753 | 3.3867 |
| 0.0513        | 0.1934 | 2500  | 0.1242          | 10.6755 | 3.0711 |
| 0.0616        | 0.2321 | 3000  | 0.1199          | 10.8194 | 3.3670 |
| 0.0524        | 0.2708 | 3500  | 0.1130          | 10.0340 | 2.8311 |
| 0.0465        | 0.3095 | 4000  | 0.1057          | 10.0108 | 3.1744 |
| 0.0588        | 0.3482 | 4500  | 0.1026          | 10.1871 | 3.4398 |
| 0.0498        | 0.3868 | 5000  | 0.0951          | 8.9527  | 2.7278 |
| 0.0488        | 0.4255 | 5500  | 0.0915          | 9.2033  | 3.0227 |
| 0.0501        | 0.4642 | 6000  | 0.0876          | 8.8043  | 2.7854 |
| 0.0428        | 0.5029 | 6500  | 0.0835          | 8.3066  | 2.6446 |
| 0.0463        | 0.5416 | 7000  | 0.0793          | 7.5861  | 2.3860 |
| 0.0516        | 0.5803 | 7500  | 0.0752          | 7.8959  | 2.6551 |
| 0.0442        | 0.6190 | 8000  | 0.0702          | 7.5687  | 2.4814 |
| 0.0393        | 0.6576 | 8500  | 0.0655          | 7.0072  | 2.1594 |
| 0.0455        | 0.6963 | 9000  | 0.0606          | 6.4202  | 1.9970 |
| 0.0371        | 0.7350 | 9500  | 0.0567          | 6.7253  | 2.2651 |
| 0.041         | 0.7737 | 10000 | 0.0524          | 6.4851  | 2.1622 |
| 0.0368        | 0.8124 | 10500 | 0.0497          | 5.4596  | 1.5878 |
| 0.0397        | 0.8511 | 11000 | 0.0455          | 5.7566  | 2.1294 |
| 0.0342        | 0.8897 | 11500 | 0.0429          | 5.1382  | 1.6793 |
| 0.0322        | 0.9284 | 12000 | 0.0382          | 4.7786  | 1.5893 |
| 0.0316        | 0.9671 | 12500 | 0.0349          | 5.3842  | 2.3248 |
| 0.008         | 1.0058 | 13000 | 0.0315          | 4.2403  | 1.2860 |
| 0.0122        | 1.0445 | 13500 | 0.0303          | 4.7983  | 2.0351 |
| 0.0118        | 1.0832 | 14000 | 0.0285          | 4.9955  | 2.4634 |
| 0.0121        | 1.1219 | 14500 | 0.0285          | 5.0744  | 2.1732 |
| 0.01          | 1.1605 | 15000 | 0.0271          | 4.5906  | 1.8766 |
| 0.0093        | 1.1992 | 15500 | 0.0261          | 3.6103  | 1.3770 |
| 0.0102        | 1.2379 | 16000 | 0.0251          | 4.0651  | 1.5117 |
| 0.0106        | 1.2766 | 16500 | 0.0242          | 4.3899  | 1.8827 |
| 0.0089        | 1.3153 | 17000 | 0.0234          | 3.7252  | 1.3949 |
| 0.0078        | 1.3540 | 17500 | 0.0223          | 3.7217  | 1.6103 |
| 0.0091        | 1.3926 | 18000 | 0.0216          | 3.8284  | 1.6104 |
| 0.0096        | 1.4313 | 18500 | 0.0200          | 3.2519  | 1.5155 |
| 0.0083        | 1.4700 | 19000 | 0.0188          | 3.3168  | 1.3898 |
| 0.0072        | 1.5087 | 19500 | 0.0176          | 3.1231  | 1.4695 |
| 0.0083        | 1.5474 | 20000 | 0.0166          | 3.6625  | 1.6818 |
| 0.0111        | 1.5861 | 20500 | 0.0155          | 2.5152  | 1.1298 |
| 0.0068        | 1.6248 | 21000 | 0.0149          | 2.4142  | 0.9976 |
| 0.0055        | 1.6634 | 21500 | 0.0141          | 2.6451  | 1.3030 |
| 0.0123        | 1.7021 | 22000 | 0.0132          | 2.6289  | 1.2809 |
| 0.0079        | 1.7408 | 22500 | 0.0126          | 2.2576  | 0.9550 |
| 0.0112        | 1.7795 | 23000 | 0.0119          | 2.6149  | 1.3460 |
| 0.0087        | 1.8182 | 23500 | 0.0114          | 2.2878  | 1.1265 |
| 0.0062        | 1.8569 | 24000 | 0.0109          | 2.1903  | 1.0690 |
| 0.0051        | 1.8956 | 24500 | 0.0106          | 2.1277  | 1.0283 |
| 0.0077        | 1.9342 | 25000 | 0.0104          | 2.0650  | 0.9906 |
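
The Wer and Cer columns are percentages. A minimal sketch of how such scores are typically computed during Whisper fine-tuning with the `evaluate` library (the example strings are illustrative):

```python
import evaluate

wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

# Illustrative reference/prediction pairs, not data from this run.
references = ["привет мир"]
predictions = ["привет мир"]

# evaluate returns error rates as fractions; multiply by 100 for percentages.
wer = 100 * wer_metric.compute(references=references, predictions=predictions)
cer = 100 * cer_metric.compute(references=references, predictions=predictions)
print(f"WER: {wer:.4f}  CER: {cer:.4f}")
```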

### Framework versions

- Transformers 4.49.0
- Pytorch 2.6.0+cu124
- Datasets 3.3.2
- Tokenizers 0.21.1