Beijuka committed on
Commit 71506d5 · verified · 1 Parent(s): f6cf615

Update src/streamlit_app.py

Files changed (1)
  1. src/streamlit_app.py +24 -8
src/streamlit_app.py CHANGED
@@ -217,15 +217,32 @@ with tab1:
  except pyperclip.PyperclipException:
  st.error("Could not copy automatically. Please copy manually.")

- with tab6:
+ with tab6:
  st.header("Results: WER vs Dataset Size")

+ # --- Introduction ---
+ st.subheader("Introduction")
  st.write("""
- Overall, the Word Error Rate (WER) decreases as the number of training hours increases across all models and languages.
- This highlights the importance of dataset size in improving ASR performance, although the rate of improvement varies
- significantly between models.
+ Automatic Speech Recognition (ASR) for African languages remains challenging due to the scarcity of labeled data and limited methodological guidance for low-resource settings. While interest in multilingual and low-resource ASR is growing, there is still limited understanding of how different pretrained models perform across diverse African languages, data sizes, and decoding strategies.
+
+ In this study, we benchmark four state-of-the-art ASR models (Wav2Vec2 XLS-R, Whisper, MMS, and W2V-BERT) across 17 African languages representing East, West, and Southern Africa. These are Luganda, Swahili, Kinyarwanda, Wolof, Akan, Ewe, Xhosa, Lingala, Amharic, Bambara, Bemba, Zulu, Igbo, Shona, Afrikaans, Hausa, and Fula. Our findings contribute empirical insights into model robustness and data efficiency in low-resource scenarios.
+
  """)

+ # --- Datasets ---
+ st.subheader("Datasets")
+ st.write("""
+ We trained each ASR model on 1, 5, 10, 20, 50, 100, 200, and 400-hour splits, based on the labelled data available per language. For Wav2Vec2-XLS-R and W2V-BERT, we also trained 5-gram language models using available textual data to assess the impact of language model integration.
+
+ """)
+
+ # --- Results ---
+ st.subheader("Results")
+ st.write("""
+ Overall, the Word Error Rate (WER) decreases as the number of training hours increases across all models and
+ languages. This highlights the importance of dataset size in improving ASR performance, although the rate of
+ improvement varies significantly between models.
+ """)
  # XLS-R
  st.subheader("XLS-R")
  st.write("""
@@ -259,9 +276,8 @@ with tab6:
  st.image("src/Images/mmslog.png", caption="Log WER vs Training Hours for MMS")

  # Overall Insight
- st.subheader("Overall Insights")
+ st.subheader("Takeaways")
  st.write("""
- - All models exhibit the largest WER improvements when training data is scarce.
- - Beyond a certain dataset size, adding more data results in marginal gains.
- - Dataset size remains a critical factor, but its impact plateaus once the model is trained on sufficient data.
+ Model performance generally improves with more training data, but the gains become smaller after 100 hours for some languages. Language models are more effective when training data is limited, especially below 50 hours, but their impact diminishes as data increases, with some variation across languages.
+
  """)