metrics:
- wer
pipeline_tag: automatic-speech-recognition
---

# Wolof ASR Model (Based on Whisper-Small)

## Model Overview

This repository hosts an Automatic Speech Recognition (ASR) model for the Wolof language, fine-tuned from OpenAI's Whisper-small model. This model aims to provide accurate transcription of Wolof audio data.
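
The model can be used with the Hugging Face `transformers` ASR pipeline. The snippet below is a minimal sketch; the model identifier is a placeholder and should be replaced by this repository's actual Hub id.

```python
# Minimal inference sketch (assumes the standard Whisper checkpoint layout on the Hub).
from transformers import pipeline

MODEL_ID = "<this-repo-id>"  # placeholder: replace with this repository's Hub id

asr = pipeline(
    "automatic-speech-recognition",
    model=MODEL_ID,
    chunk_length_s=30,  # Whisper processes audio in 30-second windows
)

# Transcribe a local Wolof audio file (any format readable by ffmpeg).
result = asr("path/to/wolof_audio.wav")
print(result["text"])
```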

## Model Details

- **Model Base**: Whisper-small
- **Loss**: 0.123
- **WER**: 0.17
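
WER here stands for word error rate. For reference, a score of this kind can be computed with the `evaluate` library; the snippet below is an illustrative sketch with made-up example transcripts, not the exact evaluation script used for this model.

```python
# Illustrative WER computation (pip install evaluate jiwer).
import evaluate

wer_metric = evaluate.load("wer")

# Hypothetical predictions vs. reference transcripts.
predictions = ["salaam aleekum", "nanga def"]
references = ["salaam aleekum", "naka nga def"]

wer = wer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.2f}")  # lower is better; 0.0 means a perfect transcription
```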

## Dataset

The dataset used for training and evaluating this model is a collection from various sources, ensuring a rich and diverse set of Wolof audio samples. The collection, available in my Hugging Face account, was filtered to keep only audio clips shorter than 6 seconds.

- **Training Dataset**: 57 hours
- **Test Dataset**: 10 hours

For detailed information about the dataset, please refer to [M9and2M/Wolof_ASR_dataset](https://huggingface.co/datasets/M9and2M/Wolof_ASR_dataset).
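
The duration filter described above can be expressed with the `datasets` library as in the sketch below; the `audio` column name and split layout are assumptions about the dataset, not guaranteed to match it exactly.

```python
# Sketch: keep only clips shorter than 6 seconds (column name is an assumption).
from datasets import load_dataset

ds = load_dataset("M9and2M/Wolof_ASR_dataset")

def shorter_than_6s(example):
    audio = example["audio"]  # assumed Audio feature with "array" and "sampling_rate"
    return len(audio["array"]) / audio["sampling_rate"] < 6.0

ds = ds.filter(shorter_than_6s)
```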

## Training

The training process was adapted from the code in the [Finetune Wa2vec 2.0 For Speech Recognition](https://github.com/khanld/ASR-Wa2vec-Finetune) repository, written to fine-tune Wav2Vec 2.0 for speech recognition. Special thanks to the author, Duy Khanh Le, for providing a robust and flexible training framework.

The model was trained with the following configuration:

- **Seed**: 19
- **Training Batch Size**: 1
- **Gradient Accumulation Steps**: 8
- **Number of GPUs**: 2
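
With a per-device batch size of 1, 8 gradient accumulation steps and 2 GPUs, the effective batch size is 1 × 8 × 2 = 16 samples per optimizer step.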

### Optimizer: AdamW

- **Learning Rate**: 1e-7

### Scheduler: OneCycleLR

- **Max Learning Rate**: 5e-5
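
As an illustration, this optimizer and scheduler configuration roughly corresponds to the PyTorch setup sketched below; the model object and the total number of steps are placeholders, and the remaining OneCycleLR arguments are left at their defaults since they are not documented here.

```python
# Illustrative PyTorch setup matching the hyperparameters listed above.
import torch

model = torch.nn.Linear(1, 1)  # stand-in; in practice this is the Whisper model being fine-tuned

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-7)

scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer,
    max_lr=5e-5,         # peak learning rate from the configuration above
    total_steps=10_000,  # placeholder: set to the actual number of optimizer steps
)

# Typical order inside the training loop:
#   loss.backward(); optimizer.step(); scheduler.step(); optimizer.zero_grad()
```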

## Acknowledgements

This model was built using OpenAI's Whisper-small architecture and fine-tuned with a dataset collected from various sources. Special thanks to the creators and contributors of the dataset.

## Citation

**BibTeX:**

[More Information Needed]

**APA:**

[More Information Needed]

## More Information

This model was developed in the context of my Master's Thesis at ETSIT-UPM, Madrid.

## Contact

For any inquiries or questions, please contact [email protected].