Beijuka committed (verified)
Commit 9788c5f · Parent(s): 1dff7ff

Upload folder using huggingface_hub

Files changed (1):
  src/streamlit_app.py (+53 -23)

src/streamlit_app.py CHANGED
@@ -21,7 +21,7 @@ tab1, tab2, tab3, tab4, tab5, tab6, tab7 = st.tabs([
     "Model Collections",
     "Evaluation Scenarios",
     "ASR models demo",
-    "Results",
+    "Quantitative Results",
     "Human Evaluation of ASR Models"
 ])
 
@@ -194,32 +194,62 @@ with tab1:
     We will build a test set that can be used for benchmarking ASR models in some of the 30 most spoken African languages. The benchmark dataset will be structured to consist of unique MP3 files and corresponding text files. We will ensure as much as possible that the benchmark datasets are as diverse as possible with dataset characteristics like gender, age, accent, variant, vocabulary, acoustic characteristics to help improve the accuracy of speech recognition models. The speech benchmark dataset will be reviewed, deemed highly quality, and split into dev, test and train sets. Due to the largely acoustic nature of African languages (mostly tonal, diacritical, etc.), a careful speech analysis of African languages is necessary and the benchmark dataset is important to spur more research in the African context.
 
     """)
-    # Citation
-    CITATION_TEXT = """@misc{asr-africa-2025,
-        title = {Automatic Speech Recognition for African Languages},
-        author = {Dr Joyce Nakatumba-Nabende, Dr Peter Nabende, Dr Andrew Katumba, Alvin Nahabwe},
-        year = 2025,
-        publisher = {Hugging Face},
-        howpublished = "\\url{https://huggingface.co/spaces/asr-africa/Automatic_Speech_Recognition_for_African_Languages}"
-    }"""
+    # # Citation
+    # CITATION_TEXT = """@misc{asr-africa-2025,
+    #     title = {Automatic Speech Recognition for African Languages},
+    #     author = {Dr Joyce Nakatumba-Nabende, Dr Peter Nabende, Dr Andrew Katumba, Alvin Nahabwe},
+    #     year = 2025,
+    #     publisher = {Hugging Face},
+    #     howpublished = "\\url{https://huggingface.co/spaces/asr-africa/Automatic_Speech_Recognition_for_African_Languages}"
+    # }"""
 
-    with st.expander("📙 Citation", expanded=False):
-        st.text_area(
-            "BibTeX snippet to cite this source",
-            value=CITATION_TEXT,
-            height=150,
-            disabled=True
-        )
+    # with st.expander("📙 Citation", expanded=False):
+    #     st.text_area(
+    #         "BibTeX snippet to cite this source",
+    #         value=CITATION_TEXT,
+    #         height=150,
+    #         disabled=True
+    #     )
 
-    if st.button("📋 Copy to Clipboard"):
-        try:
-            pyperclip.copy(CITATION_TEXT)
-            st.success("Citation copied to clipboard!")
-        except pyperclip.PyperclipException:
-            st.error("Could not copy automatically. Please copy manually.")
+    # if st.button("📋 Copy to Clipboard"):
+    #     try:
+    #         pyperclip.copy(CITATION_TEXT)
+    #         st.success("Citation copied to clipboard!")
+    #     except pyperclip.PyperclipException:
+    #         st.error("Could not copy automatically. Please copy manually.")
+
+    # --- Platform preview for About tab ---
+    st.markdown("""
+    ## Platform overview
+
+    A preview of what the platform contains and how to navigate. Use the links and tabs in the top navigation to jump to demos, datasets, results, or evaluation details.
+
+    1. **Benchmark Datasets:**
+    A multilingual collection covering over **17 African languages**, built from open corpora (e.g., Common Voice, Fleurs, NCHLT, ALFFA, Naija Voices).
+    Each dataset is cleaned, validated, and partitioned into training, development, and test splits to ensure fair benchmarking.
+
+    2. **Model Collections:**
+    Fine-tuned ASR models derived from **Wav2Vec2 XLS-R**, **Whisper**, **MMS**, and **W2V-BERT**, adapted for African phonetic, tonal, and orthographic features.
+    These are hosted as public collections on [Hugging Face](https://huggingface.co/asr-africa).
+
+    3. **Evaluation Scenarios:**
+    Designed to test **data efficiency**, **domain adaptation**, and **speech-type robustness** — e.g., how models generalize from read speech to spontaneous dialogue,
+    or from education to agricultural domains.
+
+    4. **ASR Demo Interface:**
+    A **Gradio-powered live testing tool**, allowing users to upload or record audio, view transcriptions, and submit structured feedback via the integrated backend API.
+
+    5. **Quantitative Results:**
+    Comprehensive analysis of model performance across training hours and data scales (1–400 hours), visualized through **Word Error Rate (WER)** and **Character Error Rate (CER)** trends.
+    Findings show clear **data scaling laws**, with XLS-R and W2V-BERT models performing best under low-resource conditions.
 
+    6. **Human Evaluation Framework:**
+    A structured qualitative evaluation conducted with **20 native-language evaluators** across 12 languages.
+    Evaluators assessed **accuracy**, **meaning preservation**, **orthography**, and **error types** (e.g., named entities, punctuation, diacritics).
+    This data is publicly available in the curated [ASR_Evaluation_dataset](https://huggingface.co/datasets/asr-africa/ASR_Evaluation_dataset).
+    """)
 with tab6:
-    st.header("Results: WER vs Dataset Size")
+    st.header("Quantitative Results: WER vs Dataset Size")
 
     # --- Introduction ---
     st.subheader("Introduction")
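
A note on the commented-out clipboard handler: pyperclip manipulates the clipboard of the machine running the script, so in a deployed Space it would act on the server rather than the visitor's browser, which is the usual reason this pattern gets dropped. If the citation block is ever restored, a minimal sketch of a browser-side alternative is `st.code`, which renders its own copy button; this is an illustration, not part of the commit:

```python
import streamlit as st

# Sketch only (not from this commit): st.code provides a client-side
# copy button, so no server-side clipboard access (pyperclip) is needed.
CITATION_TEXT = """@misc{asr-africa-2025,
  title = {Automatic Speech Recognition for African Languages},
  author = {Joyce Nakatumba-Nabende and Peter Nabende and Andrew Katumba and Alvin Nahabwe},
  year = 2025,
  publisher = {Hugging Face},
  howpublished = "\\\\url{https://huggingface.co/spaces/asr-africa/Automatic_Speech_Recognition_for_African_Languages}"
}"""

with st.expander("📙 Citation", expanded=False):
    st.code(CITATION_TEXT, language="bibtex")
```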
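The renamed "Quantitative Results" tab reports Word Error Rate (WER) and Character Error Rate (CER) against training-data size; both are edit distances normalized by reference length. A minimal sketch of how such scores are typically computed, assuming the `jiwer` library (the commit does not show the app's actual metric code):

```python
# pip install jiwer
import jiwer

reference = "omuntu ayogera oluganda"   # ground-truth transcript (invented example)
hypothesis = "omuntu ayogela oluganda"  # ASR output with one substituted word

# WER = (substitutions + deletions + insertions) / number of reference words
print(f"WER: {jiwer.wer(reference, hypothesis):.2f}")  # 1 wrong word of 3 -> 0.33

# CER applies the same edit distance at the character level
print(f"CER: {jiwer.cer(reference, hypothesis):.2f}")
```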
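The overview also links the model collections at huggingface.co/asr-africa and the ASR_Evaluation_dataset. Both are standard Hub artifacts, so they load with the usual `transformers` and `datasets` calls; the model ID below is a placeholder, since the commit names collections rather than a specific checkpoint:

```python
from datasets import load_dataset
from transformers import pipeline

# Placeholder ID: substitute any checkpoint from the asr-africa collections.
asr = pipeline("automatic-speech-recognition", model="asr-africa/PLACEHOLDER_MODEL_ID")
result = asr("sample.wav")  # path to a local audio file
print(result["text"])

# Human-evaluation data referenced in item 6 of the overview.
eval_data = load_dataset("asr-africa/ASR_Evaluation_dataset")
print(eval_data)
```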