Mollel committed on
Commit 3ba6654 · verified · 1 Parent(s): 880ea1d

Training in progress, step 50

README.md CHANGED
@@ -1,35 +1,90 @@
  ---
- base_model: openai/whisper-small
- datasets:
- - mozilla-foundation/common_voice_17_0
- language: sw
  library_name: transformers
  license: apache-2.0
  model-index:
- - name: Finetuned openai/whisper-small on Swahili
    results:
    - task:
        type: automatic-speech-recognition
-       name: Speech-to-Text
      dataset:
-       name: Common Voice (Swahili)
-       type: common_voice
      metrics:
-     - type: wer
-       value: 79.082
  ---

- # Finetuned openai/whisper-small on 2000 Swahili training audio samples from mozilla-foundation/common_voice_17_0.

- This model was created from the Mozilla.ai Blueprint:
- [speech-to-text-finetune](https://github.com/mozilla-ai/speech-to-text-finetune).

- ## Evaluation results on 2000 audio samples of Swahili:

- ### Baseline model (before finetuning) on Swahili
- - Word Error Rate: 139.455
- - Loss: 2.576

- ### Finetuned model (after finetuning) on Swahili
- - Word Error Rate: 79.082
- - Loss: 1.477
 
 
  ---
  library_name: transformers
  license: apache-2.0
+ base_model: openai/whisper-small
+ tags:
+ - generated_from_trainer
+ datasets:
+ - common_voice_17_0
+ metrics:
+ - wer
  model-index:
+ - name: ASR-Swahili-Small
    results:
    - task:
+       name: Automatic Speech Recognition
        type: automatic-speech-recognition
      dataset:
+       name: common_voice_17_0
+       type: common_voice_17_0
+       config: sw
+       split: test
+       args: sw
      metrics:
+     - name: Wer
+       type: wer
+       value: 79.08239916791864
  ---

+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # ASR-Swahili-Small
+
+ This model is a fine-tuned version of [openai/whisper-small](https://huggingface.co/openai/whisper-small) on the common_voice_17_0 dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 1.4765
+ - Model Preparation Time: 0.003
+ - Wer: 79.0824
+
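The generated card stops short of a usage example, so here is a minimal inference sketch using the `transformers` automatic-speech-recognition pipeline. The hub ID `Mollel/ASR-Swahili-Small` is an assumption pieced together from the committer name and model name in this diff, not something the commit confirms; substitute the actual repository path.

```python
# Minimal inference sketch for the fine-tuned Whisper checkpoint.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="Mollel/ASR-Swahili-Small",  # hypothetical repo ID, not confirmed by this commit
)

# Whisper expects 16 kHz audio; the pipeline decodes and resamples file inputs for you.
result = asr(
    "sample_swahili.wav",
    generate_kwargs={"language": "sw", "task": "transcribe"},
)
print(result["text"])
```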
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 1e-05
+ - train_batch_size: 4
+ - eval_batch_size: 8
+ - seed: 42
+ - gradient_accumulation_steps: 4
+ - total_train_batch_size: 16
+ - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
+ - lr_scheduler_type: linear
+ - lr_scheduler_warmup_steps: 9
+ - training_steps: 50
+ - mixed_precision_training: Native AMP
+
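As a reproduction aid, the following is a hedged sketch of how the hyperparameters listed above would typically map onto `Seq2SeqTrainingArguments`. The `output_dir` is a placeholder, and anything not stated in the card (evaluation cadence, logging, dataloader settings) is left at its default.

```python
# Sketch only: maps the hyperparameters from the card onto Seq2SeqTrainingArguments.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="ASR-Swahili-Small",   # assumed output directory
    learning_rate=1e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,    # effective train batch size 4 * 4 = 16
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=9,
    max_steps=50,                     # training_steps: 50
    fp16=True,                        # "Native AMP" mixed precision
    optim="adamw_torch",              # ADAMW_TORCH with default betas/epsilon
)
```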
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Wer |
+ |:-------------:|:-----:|:----:|:---------------:|:----------------------:|:--------:|
+ | 2.4744 | 0.04 | 5 | 2.4647 | 0.003 | 133.2948 |
+ | 2.3612 | 0.08 | 10 | 2.1132 | 0.003 | 116.4914 |
+ | 1.9312 | 0.12 | 15 | 1.8395 | 0.003 | 90.6969 |
+ | 1.5686 | 0.16 | 20 | 1.6659 | 0.003 | 87.3628 |
+ | 1.5144 | 0.2 | 25 | 1.5895 | 0.003 | 82.8730 |
+ | 1.4267 | 0.24 | 30 | 1.5432 | 0.003 | 84.9070 |
+ | 1.492 | 0.28 | 35 | 1.5229 | 0.003 | 84.5718 |
+ | 1.3699 | 0.32 | 40 | 1.4999 | 0.003 | 82.3009 |
+ | 1.2679 | 0.36 | 45 | 1.4842 | 0.003 | 80.5848 |
+ | 1.3613 | 0.4 | 50 | 1.4765 | 0.003 | 79.0824 |
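The Wer column above is the word error rate reported as a percentage on the evaluation split (config `sw`, split `test` per the metadata). For context, here is a hedged sketch of how such a score is commonly computed with the `evaluate` library; the example strings are placeholders, not transcripts from this run.

```python
# Illustrative only: WER via the `evaluate` library (returns a fraction, not a percent).
import evaluate

wer_metric = evaluate.load("wer")

# Placeholder strings; in the actual run these would be model transcripts vs. dataset references.
predictions = ["habari ya asubuhi", "karibu tena"]
references = ["habari za asubuhi", "karibu tena"]

wer = wer_metric.compute(predictions=predictions, references=references)
print(f"WER: {100 * wer:.2f}%")  # scale by 100 to match the card's percentage-style numbers
```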
+
+ ### Framework versions
+
+ - Transformers 4.49.0
+ - Pytorch 2.6.0+cu124
+ - Datasets 3.3.1
+ - Tokenizers 0.21.0
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:6d191909ce6fa88fa3008a69517b2c3055bfcdcf59a562aef101356f264f707b
+ oid sha256:ec88b11a1d1ac7410edff5145bb7d994b2ead4a42cadc040d2d763bd59532f89
  size 966995080
runs/Feb27_01-47-15_ai4d-Lambda-Vector/events.out.tfevents.1740610489.ai4d-Lambda-Vector.1052975.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:07e6730f5589180413897b987a45b6440f072f1e91e95e2c34d1cd6ce9c9f73c
+ size 7967
runs/Feb27_02-15-13_ai4d-Lambda-Vector/events.out.tfevents.1740613905.ai4d-Lambda-Vector.1113201.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5d641afed01a135bbee18bfb404f24eb8aeb89d468c5693a2b7dcab914248b1b
+ size 7384
runs/Feb27_03-09-55_ai4d-Lambda-Vector/events.out.tfevents.1740617299.ai4d-Lambda-Vector.1238951.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1cf5ba34a45efc696e171dac424a38e6690dde6c4ca114a819dd4207e70e2435
+ size 7765
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:70b0397ab7f149ad8d901fad9dc0d786c7b921134d84a5f3fadfb2168544b5f6
+ oid sha256:48f921cced55cf902a140bcbd2244853c66a2b1fd78f76f701653cbf83d66c20
  size 5496