Mollel/ASR-Swahili-Small

@@ -1,83 +1,35 @@
 ---
-library_name: transformers
-license: apache-2.0
 base_model: openai/whisper-small
-tags:
-- generated_from_trainer
 datasets:
-- common_voice_17_0
-metrics:
-- wer
 model-index:
-- name: ASR-Swahili-Small
   results:
   - task:
-      name: Automatic Speech Recognition
       type: automatic-speech-recognition
     dataset:
-      name: common_voice_17_0
-      type: common_voice_17_0
-      config: sw
-      split: test
-      args: sw
     metrics:
-    - name: Wer
-      type: wer
-      value: 53.40302063717767
 ---
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
-# ASR-Swahili-Small
-This model is a fine-tuned version of [openai/whisper-small](https://huggingface.co/openai/whisper-small) on the common_voice_17_0 dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.8496
-- Model Preparation Time: 0.003
-- Wer: 53.4030
-## Model description
-More information needed
-## Intended uses & limitations
-More information needed
-## Training and evaluation data
-More information needed
-## Training procedure
-### Training hyperparameters
-The following hyperparameters were used during training:
-- learning_rate: 1e-05
-- train_batch_size: 32
-- eval_batch_size: 16
-- seed: 42
-- gradient_accumulation_steps: 4
-- total_train_batch_size: 128
-- optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
-- lr_scheduler_type: linear
-- lr_scheduler_warmup_steps: 50
-- num_epochs: 1
-- mixed_precision_training: Native AMP
-### Training results
-| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Wer     |
-|:-------------:|:-----:|:----:|:---------------:|:----------------------:|:-------:|
-| 1.5828        | 0.32  | 50   | 1.1291          | 0.003                  | 63.7351 |
-| 0.8342        | 0.64  | 100  | 0.8994          | 0.003                  | 56.2661 |
-| 0.7015        | 0.96  | 150  | 0.8496          | 0.003                  | 53.4030 |
-### Framework versions
-- Transformers 4.49.0
-- Pytorch 2.6.0+cu124
-- Datasets 3.3.1
-- Tokenizers 0.21.0

 ---
 base_model: openai/whisper-small
 datasets:
+- mozilla-foundation/common_voice_17_0
+language: sw
+library_name: transformers
+license: apache-2.0
 model-index:
+- name: Finetuned openai/whisper-small on Swahili
   results:
   - task:
       type: automatic-speech-recognition
+      name: Speech-to-Text
     dataset:
+      name: Common Voice (Swahili)
+      type: common_voice
     metrics:
+    - type: wer
+      value: 53.403
 ---
+# Finetuned openai/whisper-small on 20000 Swahili training audio samples from mozilla-foundation/common_voice_17_0.
+This model was created from the Mozilla.ai Blueprint:
+[speech-to-text-finetune](https://github.com/mozilla-ai/speech-to-text-finetune).
+## Evaluation results on 12253 audio samples of Swahili:
+### Baseline model (before finetuning) on Swahili
+- Word Error Rate: 133.795
+- Loss: 2.459
+### Finetuned model (after finetuning) on Swahili
+- Word Error Rate: 53.403
+- Loss: 0.85
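
A baseline Word Error Rate of 133.795 may look like a mistake, but WER is not capped at 100. It is the number of word-level substitutions, deletions, and insertions divided by the number of reference words, so a model that emits many extra words can exceed 100. The sketch below illustrates this with a plain word-level edit distance. It is an illustration only: the function name `word_error_rate` and the example sentences are hypothetical, whitespace tokenization is an assumption, and the evaluation behind the numbers above may apply different text normalization or use a library implementation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: word-level edit distance / reference length."""
    ref = reference.split()  # assumption: plain whitespace tokenization
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Levenshtein distance over words, keeping only one previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# Hypothetical example: a 3-word reference and a 7-word hypothesis with no
# words in common needs 3 substitutions + 4 insertions = 7 edits.
over_100 = word_error_rate("habari ya asubuhi", "have a very yeah as a booey")

# Relative WER reduction implied by the two figures reported in the card.
baseline_wer = 133.795
finetuned_wer = 53.403
relative_reduction = (baseline_wer - finetuned_wer) / baseline_wer

print(f"example WER: {over_100:.1f}")
print(f"relative WER reduction: {relative_reduction:.1%}")
```

Read this way, the reported drop from 133.795 to 53.403 corresponds to a relative reduction of roughly 60%, although a finetuned WER above 50 still means more than one error for every two reference words.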