Automatic Speech Recognition
Transformers
Safetensors
meralion2
meralion
meralion-2
custom_code
zxl committed on
Commit
ee73e2d
·
verified ·
1 Parent(s): da18003

Update README.md

Files changed (1)
  1. README.md +663 -689
README.md CHANGED
@@ -69,696 +69,670 @@ The model is specifically adapted to handle the linguistic nuances, accents, and
69
  ## 📈 Evaluations:
70
  We benchmark the MERaLiON-2 series models with the extended [AudioBench benchmark](https://github.com/AudioLLMs/AudioBench) | [LeaderBoard](https://huggingface.co/spaces/MERaLiON/AudioBench-Leaderboard) against several recently released open-source multimodal models (SALMONN-7B, the Qwen2.5-Omni series, and Phi-4-Multimodal) as well as two cascade models. The MERaLiON-2 series models show stronger performance across a wide range of audio/speech understanding tasks.
71
 
72
  **Automatic Speech Recognition (ASR) results**
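
  The section does not name the metric behind the ASR scores. Assuming they are word error rates (WER, lower is better), the sketch below shows one common way such a score is computed and why a value can exceed 1.0 when the hypothesis contains many inserted words. The function name `word_error_rate` and the whitespace tokenization are illustrative assumptions, not the AudioBench implementation, which may normalize text differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative sketch only: tokenizes on whitespace and applies no text
    normalization (casing, punctuation), which real benchmarks usually do.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the cat sat", "the cat sat"))        # no errors
    print(word_error_rate("the cat sat", "the dog sat"))        # one substitution
    print(word_error_rate("hello", "hello hello hello hello"))  # insertions push the score above 1.0
```

  Because the denominator is the reference length, a hypothesis that repeats or hallucinates words can accumulate more errors than there are reference words, which would explain scores above 1.0 under this assumption.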
73
- <div class="table*">
74
- <table>
75
- <thead>
76
- <tr>
77
- <th style="text-align: center;"><strong>type</strong></th>
78
- <th style="text-align: center;"><strong>dataset</strong></th>
79
- <th style="text-align: center;"><strong>MERaLiON-1</strong></th>
80
- <th style="text-align: center;"><strong>MERaLiON-2-3B</strong></th>
81
- <th style="text-align: center;"><strong>MERaLiON-2-10B</strong></th>
82
- <th style="text-align: center; background-color: #06a2a2;"><strong>MERaLiON-2-10B-ASR</strong></th>
83
- <th style="text-align: center;"><strong>MERaLiON-2-Whisper</strong></th>
84
- <th style="text-align: center;"><strong>whisper_large_v3</strong></th>
85
- <th style="text-align: center;"><strong>Phi-4-multimodal-instruct</strong></th>
86
- <th style="text-align: center;"><strong>Qwen2.5-Omni-3B</strong></th>
87
- <th style="text-align: center;"><strong>Qwen2.5-Omni-7B</strong></th>
88
- <th style="text-align: center;"><strong>SALMONN-7B</strong></th>
89
- <th style="text-align: center;"><strong>cascade-whisper_v2+sealion</strong></th>
90
- <th style="text-align: center;"><strong>cascade-whisper_v3+llama</strong></th>
91
- </tr>
92
- </thead>
93
- <tbody>
94
- <tr>
95
- <td class="column style5 s style7" rowspan="10">English</td>
96
- <td style="text-align: center;"><strong>common_voice_15_en</strong></td>
97
- <td style="text-align: center;">0.078</td>
98
- <td style="text-align: center;">0.093</td>
99
- <td style="text-align: center;">0.087</td>
100
- <td style="text-align: center; background-color: #06a2a2;"><strong><u>0.076</u></strong></td>
101
- <td style="text-align: center;">0.102</td>
102
- <td style="text-align: center;">0.100</td>
103
- <td style="text-align: center;">0.081</td>
104
- <td style="text-align: center;">0.094</td>
105
- <td style="text-align: center;">0.080</td>
106
- <td style="text-align: center;">0.316</td>
107
- <td style="text-align: center;">0.106</td>
108
- <td style="text-align: center;">0.099</td>
109
- </tr>
110
- <tr>
111
- <td style="text-align: center;"><strong>earnings21</strong></td>
112
- <td style="text-align: center;">0.138</td>
113
- <td style="text-align: center;">0.219</td>
114
- <td style="text-align: center;">0.108</td>
115
- <td style="text-align: center; background-color: #06a2a2;"><strong><u>0.092</u></strong></td>
116
- <td style="text-align: center;">0.130</td>
117
- <td style="text-align: center;">0.132</td>
118
- <td style="text-align: center;">0.131</td>
119
- <td style="text-align: center;">0.147</td>
120
- <td style="text-align: center;">0.189</td>
121
- <td style="text-align: center;">0.277</td>
122
- <td style="text-align: center;">0.141</td>
123
- <td style="text-align: center;">0.109</td>
124
- </tr>
125
- <tr>
126
- <td style="text-align: center;"><strong>earnings22</strong></td>
127
- <td style="text-align: center;">0.166</td>
128
- <td style="text-align: center;">0.239</td>
129
- <td style="text-align: center;">0.151</td>
130
- <td style="text-align: center; background-color: #06a2a2;"><strong><u>0.128</u></strong></td>
131
- <td style="text-align: center;">0.168</td>
132
- <td style="text-align: center;">0.165</td>
133
- <td style="text-align: center;">0.226</td>
134
- <td style="text-align: center;">0.197</td>
135
- <td style="text-align: center;">0.241</td>
136
- <td style="text-align: center;">0.380</td>
137
- <td style="text-align: center;">0.172</td>
138
- <td style="text-align: center;">0.146</td>
139
- </tr>
140
- <tr>
141
- <td style="text-align: center;"><strong>gigaspeech</strong></td>
142
- <td style="text-align: center;">0.145</td>
143
- <td style="text-align: center;">0.092</td>
144
- <td style="text-align: center;">0.090</td>
145
- <td style="text-align: center; background-color: #06a2a2;"><strong><u>0.088</u></strong></td>
146
- <td style="text-align: center;">0.089</td>
147
- <td style="text-align: center;">0.098</td>
148
- <td style="text-align: center;">0.099</td>
149
- <td style="text-align: center;">0.114</td>
150
- <td style="text-align: center;">0.140</td>
151
- <td style="text-align: center;">0.110</td>
152
- <td style="text-align: center;">0.100</td>
153
- <td style="text-align: center;">0.095</td>
154
- </tr>
155
- <tr>
156
- <td style="text-align: center;"><strong>librispeech_clean</strong></td>
157
- <td style="text-align: center;">0.024</td>
158
- <td style="text-align: center;">0.027</td>
159
- <td style="text-align: center;">0.025</td>
160
- <td style="text-align: center; background-color: #06a2a2;">0.021</td>
161
- <td style="text-align: center;">0.020</td>
162
- <td style="text-align: center;">0.022</td>
163
- <td style="text-align: center;"><strong><u>0.017</u></strong></td>
164
- <td style="text-align: center;">0.021</td>
165
- <td style="text-align: center;">0.044</td>
166
- <td style="text-align: center;">0.096</td>
167
- <td style="text-align: center;">0.033</td>
168
- <td style="text-align: center;">0.018</td>
169
- </tr>
170
- <tr>
171
- <td style="text-align: center;"><strong>librispeech_other</strong></td>
172
- <td style="text-align: center;">0.042</td>
173
- <td style="text-align: center;">0.051</td>
174
- <td style="text-align: center;">0.047</td>
175
- <td style="text-align: center; background-color: #06a2a2;">0.040</td>
176
- <td style="text-align: center;">0.044</td>
177
- <td style="text-align: center;">0.039</td>
178
- <td style="text-align: center;">0.039</td>
179
- <td style="text-align: center;">0.045</td>
180
- <td style="text-align: center;">0.069</td>
181
- <td style="text-align: center;">0.118</td>
182
- <td style="text-align: center;">0.054</td>
183
- <td style="text-align: center;"><strong><u>0.036</u></strong></td>
184
- </tr>
185
- <tr>
186
- <td style="text-align: center;"><strong>peoples_speech</strong></td>
187
- <td style="text-align: center;">0.216</td>
188
- <td style="text-align: center;">0.206</td>
189
- <td style="text-align: center;">0.205</td>
190
- <td style="text-align: center; background-color: #06a2a2;">0.196</td>
191
- <td style="text-align: center;">0.197</td>
192
- <td style="text-align: center;">0.150</td>
193
- <td style="text-align: center;">0.215</td>
194
- <td style="text-align: center;">0.262</td>
195
- <td style="text-align: center;">0.312</td>
196
- <td style="text-align: center;">0.242</td>
197
- <td style="text-align: center;">0.203</td>
198
- <td style="text-align: center;"><strong><u>0.145</u></strong></td>
199
- </tr>
200
- <tr>
201
- <td style="text-align: center;"><strong>tedlium3</strong></td>
202
- <td style="text-align: center;">0.082</td>
203
- <td style="text-align: center;">0.035</td>
204
- <td style="text-align: center;">0.035</td>
205
- <td style="text-align: center; background-color: #06a2a2;">0.031</td>
206
- <td style="text-align: center;">0.036</td>
207
- <td style="text-align: center;">0.041</td>
208
- <td style="text-align: center;"><strong><u>0.029</u></strong></td>
209
- <td style="text-align: center;">0.048</td>
210
- <td style="text-align: center;">0.049</td>
211
- <td style="text-align: center;">0.039</td>
212
- <td style="text-align: center;">0.049</td>
213
- <td style="text-align: center;">0.038</td>
214
- </tr>
215
- <tr>
216
- <td style="text-align: center;"><strong>tedlium3_long_form</strong></td>
217
- <td style="text-align: center;">0.105</td>
218
- <td style="text-align: center;">0.138</td>
219
- <td style="text-align: center;">0.044</td>
220
- <td style="text-align: center; background-color: #06a2a2;"><strong><u>0.035</u></strong></td>
221
- <td style="text-align: center;">0.048</td>
222
- <td style="text-align: center;">0.045</td>
223
- <td style="text-align: center;">0.051</td>
224
- <td style="text-align: center;">0.071</td>
225
- <td style="text-align: center;">0.084</td>
226
- <td style="text-align: center;">0.141</td>
227
- <td style="text-align: center;">0.086</td>
228
- <td style="text-align: center;">0.049</td>
229
- </tr>
230
- <tr>
231
- <td style="text-align: center;"><strong>average</strong></td>
232
- <td style="text-align: center;">0.111</td>
233
- <td style="text-align: center;">0.122</td>
234
- <td style="text-align: center;">0.088</td>
235
- <td style="text-align: center; background-color: #06a2a2;"><strong><u>0.079</u></strong></td>
236
- <td style="text-align: center;">0.093</td>
237
- <td style="text-align: center;">0.088</td>
238
- <td style="text-align: center;">0.098</td>
239
- <td style="text-align: center;">0.111</td>
240
- <td style="text-align: center;">0.134</td>
241
- <td style="text-align: center;">0.191</td>
242
- <td style="text-align: center;">0.105</td>
243
- <td style="text-align: center;">0.082</td>
244
- </tr>
245
- <tr>
246
- <td class="column style5 s style7" rowspan="14">Inhouse</td>
247
- <td style="text-align: center;"><strong>cna</strong></td>
248
- <td style="text-align: center;">0.145</td>
249
- <td style="text-align: center;">0.135</td>
250
- <td style="text-align: center;">0.133</td>
251
- <td style="text-align: center; background-color: #06a2a2;"><strong><u>0.127</u></strong></td>
252
- <td style="text-align: center;">0.128</td>
253
- <td style="text-align: center;">0.138</td>
254
- <td style="text-align: center;">0.191</td>
255
- <td style="text-align: center;">0.174</td>
256
- <td style="text-align: center;">0.183</td>
257
- <td style="text-align: center;">0.149</td>
258
- <td style="text-align: center;">0.152</td>
259
- <td style="text-align: center;">0.138</td>
260
- </tr>
261
- <tr>
262
- <td style="text-align: center;"><strong>idpc</strong></td>
263
- <td style="text-align: center;">0.204</td>
264
- <td style="text-align: center;">0.177</td>
265
- <td style="text-align: center;"><strong><u>0.160</u></strong></td>
266
- <td style="text-align: center; background-color: #06a2a2;">0.166</td>
267
- <td style="text-align: center;">0.169</td>
268
- <td style="text-align: center;">0.179</td>
269
- <td style="text-align: center;">0.261</td>
270
- <td style="text-align: center;">0.199</td>
271
- <td style="text-align: center;">0.220</td>
272
- <td style="text-align: center;">0.541</td>
273
- <td style="text-align: center;">0.170</td>
274
- <td style="text-align: center;">0.162</td>
275
- </tr>
276
- <tr>
277
- <td style="text-align: center;"><strong>idpc_short</strong></td>
278
- <td style="text-align: center;">0.165</td>
279
- <td style="text-align: center;">0.151</td>
280
- <td style="text-align: center;">0.157</td>
281
- <td style="text-align: center; background-color: #06a2a2;"><strong><u>0.140</u></strong></td>
282
- <td style="text-align: center;">0.152</td>
283
- <td style="text-align: center;">0.220</td>
284
- <td style="text-align: center;">0.539</td>
285
- <td style="text-align: center;">0.211</td>
286
- <td style="text-align: center;">0.414</td>
287
- <td style="text-align: center;">0.240</td>
288
- <td style="text-align: center;">0.197</td>
289
- <td style="text-align: center;">0.153</td>
290
- </tr>
291
- <tr>
292
- <td style="text-align: center;"><strong>mediacorp</strong></td>
293
- <td style="text-align: center;">0.123</td>
294
- <td style="text-align: center;">0.123</td>
295
- <td style="text-align: center;">0.105</td>
296
- <td style="text-align: center; background-color: #06a2a2;"><strong><u>0.104</u></strong></td>
297
- <td style="text-align: center;">0.116</td>
298
- <td style="text-align: center;">0.129</td>
299
- <td style="text-align: center;">0.198</td>
300
- <td style="text-align: center;">0.152</td>
301
- <td style="text-align: center;">0.235</td>
302
- <td style="text-align: center;">0.364</td>
303
- <td style="text-align: center;">0.158</td>
304
- <td style="text-align: center;">0.151</td>
305
- </tr>
306
- <tr>
307
- <td style="text-align: center;"><strong>mediacorp_short</strong></td>
308
- <td style="text-align: center;">0.128</td>
309
- <td style="text-align: center;">0.121</td>
310
- <td style="text-align: center;">0.117</td>
311
- <td style="text-align: center; background-color: #06a2a2;">0.118</td>
312
- <td style="text-align: center;">0.122</td>
313
- <td style="text-align: center;">0.127</td>
314
- <td style="text-align: center;">0.122</td>
315
- <td style="text-align: center;">0.148</td>
316
- <td style="text-align: center;">0.141</td>
317
- <td style="text-align: center;">0.199</td>
318
- <td style="text-align: center;">0.154</td>
319
- <td style="text-align: center;"><strong><u>0.114</u></strong></td>
320
- </tr>
321
- <tr>
322
- <td style="text-align: center;"><strong>parliament</strong></td>
323
- <td style="text-align: center;">0.059</td>
324
- <td style="text-align: center;">0.185</td>
325
- <td style="text-align: center;">0.060</td>
326
- <td style="text-align: center; background-color: #06a2a2;"><strong><u>0.053</u></strong></td>
327
- <td style="text-align: center;">0.078</td>
328
- <td style="text-align: center;">0.090</td>
329
- <td style="text-align: center;">0.278</td>
330
- <td style="text-align: center;">0.100</td>
331
- <td style="text-align: center;">0.110</td>
332
- <td style="text-align: center;">0.204</td>
333
- <td style="text-align: center;">0.090</td>
334
- <td style="text-align: center;">0.065</td>
335
- </tr>
336
- <tr>
337
- <td style="text-align: center;"><strong>ste</strong></td>
338
- <td style="text-align: center;">0.159</td>
339
- <td style="text-align: center;">0.263</td>
340
- <td style="text-align: center;">0.147</td>
341
- <td style="text-align: center; background-color: #06a2a2;"><strong><u>0.125</u></strong></td>
342
- <td style="text-align: center;">0.151</td>
343
- <td style="text-align: center;">0.298</td>
344
- <td style="text-align: center;">0.297</td>
345
- <td style="text-align: center;">0.287</td>
346
- <td style="text-align: center;">0.288</td>
347
- <td style="text-align: center;">0.422</td>
348
- <td style="text-align: center;">0.132</td>
349
- <td style="text-align: center;">0.144</td>
350
- </tr>
351
- <tr>
352
- <td style="text-align: center;"><strong>ukusnews</strong></td>
353
- <td style="text-align: center;">0.113</td>
354
- <td style="text-align: center;">0.174</td>
355
- <td style="text-align: center;">0.070</td>
356
- <td style="text-align: center; background-color: #06a2a2;"><strong><u>0.056</u></strong></td>
357
- <td style="text-align: center;">0.083</td>
358
- <td style="text-align: center;">0.123</td>
359
- <td style="text-align: center;">0.075</td>
360
- <td style="text-align: center;">0.091</td>
361
- <td style="text-align: center;">0.176</td>
362
- <td style="text-align: center;">0.192</td>
363
- <td style="text-align: center;">0.123</td>
364
- <td style="text-align: center;">0.089</td>
365
- </tr>
366
- <tr>
367
- <td style="text-align: center;"><strong>ytb_asr_batch1</strong></td>
368
- <td style="text-align: center;">0.107</td>
369
- <td style="text-align: center;">0.099</td>
370
- <td style="text-align: center;">0.098</td>
371
- <td style="text-align: center; background-color: #06a2a2;"><strong><u>0.092</u></strong></td>
372
- <td style="text-align: center;">0.112</td>
373
- <td style="text-align: center;">0.133</td>
374
- <td style="text-align: center;">0.169</td>
375
- <td style="text-align: center;">0.162</td>
376
- <td style="text-align: center;">0.174</td>
377
- <td style="text-align: center;">0.221</td>
378
- <td style="text-align: center;">0.125</td>
379
- <td style="text-align: center;">0.108</td>
380
- </tr>
381
- <tr>
382
- <td style="text-align: center;"><strong>ytb_asr_batch2</strong></td>
383
- <td style="text-align: center;">0.133</td>
384
- <td style="text-align: center;">0.160</td>
385
- <td style="text-align: center;">0.111</td>
386
- <td style="text-align: center; background-color: #06a2a2;">0.099</td>
387
- <td style="text-align: center;">0.118</td>
388
- <td style="text-align: center;">0.129</td>
389
- <td style="text-align: center;">0.232</td>
390
- <td style="text-align: center;">0.245</td>
391
- <td style="text-align: center;">0.351</td>
392
- <td style="text-align: center;">0.350</td>
393
- <td style="text-align: center;">0.126</td>
394
- <td style="text-align: center;"><strong><u>0.084</u></strong></td>
395
- </tr>
396
- <tr>
397
- <td style="text-align: center;"><strong>ytb_asr_batch3_chinese</strong></td>
398
- <td style="text-align: center;">0.418</td>
399
- <td style="text-align: center;">0.256</td>
400
- <td style="text-align: center;">0.191</td>
401
- <td style="text-align: center; background-color: #06a2a2;"><strong><u>0.149</u></strong></td>
402
- <td style="text-align: center;">0.177</td>
403
- <td style="text-align: center;">0.266</td>
404
- <td style="text-align: center;">0.440</td>
405
- <td style="text-align: center;">0.250</td>
406
- <td style="text-align: center;">0.206</td>
407
- <td style="text-align: center;">0.886</td>
408
- <td style="text-align: center;">0.347</td>
409
- <td style="text-align: center;">0.270</td>
410
- </tr>
411
- <tr>
412
- <td style="text-align: center;"><strong>ytb_asr_batch3_malay</strong></td>
413
- <td style="text-align: center;">0.290</td>
414
- <td style="text-align: center;">0.280</td>
415
- <td style="text-align: center;">0.209</td>
416
- <td style="text-align: center; background-color: #06a2a2;"><strong><u>0.195</u></strong></td>
417
- <td style="text-align: center;">0.290</td>
418
- <td style="text-align: center;">0.260</td>
419
- <td style="text-align: center;">3.763</td>
420
- <td style="text-align: center;">2.944</td>
421
- <td style="text-align: center;">1.461</td>
422
- <td style="text-align: center;">1.086</td>
423
- <td style="text-align: center;">0.314</td>
424
- <td style="text-align: center;">0.312</td>
425
- </tr>
426
- <tr>
427
- <td style="text-align: center;"><strong>ytb_asr_batch3_tamil</strong></td>
428
- <td style="text-align: center;">0.693</td>
429
- <td style="text-align: center;">0.750</td>
430
- <td style="text-align: center;">0.664</td>
431
- <td style="text-align: center; background-color: #06a2a2;"><strong><u>0.547</u></strong></td>
432
- <td style="text-align: center;">0.927</td>
433
- <td style="text-align: center;">0.841</td>
434
- <td style="text-align: center;">2.750</td>
435
- <td style="text-align: center;">1.461</td>
436
- <td style="text-align: center;">1.362</td>
437
- <td style="text-align: center;">0.985</td>
438
- <td style="text-align: center;">0.967</td>
439
- <td style="text-align: center;">0.898</td>
440
- </tr>
441
- <tr>
442
- <td style="text-align: center;"><strong>average</strong></td>
443
- <td style="text-align: center;">0.210</td>
444
- <td style="text-align: center;">0.221</td>
445
- <td style="text-align: center;">0.171</td>
446
- <td style="text-align: center; background-color: #06a2a2;"><strong><u>0.152</u></strong></td>
447
- <td style="text-align: center;">0.202</td>
448
- <td style="text-align: center;">0.226</td>
449
- <td style="text-align: center;">0.717</td>
450
- <td style="text-align: center;">0.494</td>
451
- <td style="text-align: center;">0.409</td>
452
- <td style="text-align: center;">0.449</td>
453
- <td style="text-align: center;">0.235</td>
454
- <td style="text-align: center;">0.207</td>
455
- </tr>
456
- <tr>
457
- <td class="column style5 s style7" rowspan="3">Mandarin</td>
458
- <td style="text-align: center;"><strong>aishell_asr_zh</strong></td>
459
- <td style="text-align: center;">0.128</td>
460
- <td style="text-align: center;">0.050</td>
461
- <td style="text-align: center;">0.058</td>
462
- <td style="text-align: center; background-color: #06a2a2;">0.043</td>
463
- <td style="text-align: center;">0.056</td>
464
- <td style="text-align: center;">0.123</td>
465
- <td style="text-align: center;">0.122</td>
466
- <td style="text-align: center;">0.028</td>
467
- <td style="text-align: center;"><strong><u>0.024</u></strong></td>
468
- <td style="text-align: center;">0.931</td>
469
- <td style="text-align: center;">0.209</td>
470
- <td style="text-align: center;">0.125</td>
471
- </tr>
472
- <tr>
473
- <td style="text-align: center;"><strong>commonvoice_zh</strong></td>
474
- <td style="text-align: center;">0.327</td>
475
- <td style="text-align: center;">0.131</td>
476
- <td style="text-align: center;">0.147</td>
477
- <td style="text-align: center; background-color: #06a2a2;">0.118</td>
478
- <td style="text-align: center;">0.141</td>
479
- <td style="text-align: center;">0.198</td>
480
- <td style="text-align: center;">0.154</td>
481
- <td style="text-align: center;">0.113</td>
482
- <td style="text-align: center;"><strong><u>0.076</u></strong></td>
483
- <td style="text-align: center;">1.001</td>
484
- <td style="text-align: center;">0.319</td>
485
- <td style="text-align: center;">0.196</td>
486
- </tr>
487
- <tr>
488
- <td style="text-align: center;"><strong>average</strong></td>
489
- <td style="text-align: center;">0.228</td>
490
- <td style="text-align: center;">0.091</td>
491
- <td style="text-align: center;">0.102</td>
492
- <td style="text-align: center; background-color: #06a2a2;">0.081</td>
493
- <td style="text-align: center;">0.098</td>
494
- <td style="text-align: center;">0.161</td>
495
- <td style="text-align: center;">0.138</td>
496
- <td style="text-align: center;">0.071</td>
497
- <td style="text-align: center;"><strong><u>0.050</u></strong></td>
498
- <td style="text-align: center;">0.966</td>
499
- <td style="text-align: center;">0.264</td>
500
- <td style="text-align: center;">0.160</td>
501
- </tr>
502
- <tr>
503
- <td class="column style5 s style7" rowspan="10">SEA languages</td>
504
- <td style="text-align: center;"><strong>commonvoice_id</strong></td>
505
- <td style="text-align: center;">0.260</td>
506
- <td style="text-align: center;">0.085</td>
507
- <td style="text-align: center;">0.113</td>
508
- <td style="text-align: center; background-color: #06a2a2;">0.079</td>
509
- <td style="text-align: center;"><strong><u>0.069</u></strong></td>
510
- <td style="text-align: center;">0.075</td>
511
- <td style="text-align: center;">1.327</td>
512
- <td style="text-align: center;">0.136</td>
513
- <td style="text-align: center;">0.110</td>
514
- <td style="text-align: center;">1.189</td>
515
- <td style="text-align: center;">0.100</td>
516
- <td style="text-align: center;">0.078</td>
517
- </tr>
518
- <tr>
519
- <td style="text-align: center;"><strong>commonvoice_ta</strong></td>
520
- <td style="text-align: center;">0.528</td>
521
- <td style="text-align: center;">0.139</td>
522
- <td style="text-align: center;">0.156</td>
523
- <td style="text-align: center; background-color: #06a2a2;"><strong><u>0.129</u></strong></td>
524
- <td style="text-align: center;">0.195</td>
525
- <td style="text-align: center;">0.271</td>
526
- <td style="text-align: center;">1.178</td>
527
- <td style="text-align: center;">0.831</td>
528
- <td style="text-align: center;">0.847</td>
529
- <td style="text-align: center;">1.427</td>
530
- <td style="text-align: center;">0.238</td>
531
- <td style="text-align: center;">0.244</td>
532
- </tr>
533
- <tr>
534
- <td style="text-align: center;"><strong>commonvoice_th</strong></td>
535
- <td style="text-align: center;">0.847</td>
536
- <td style="text-align: center;">0.307</td>
537
- <td style="text-align: center;">0.466</td>
538
- <td style="text-align: center; background-color: #06a2a2;">0.635</td>
539
- <td style="text-align: center;"><strong><u>0.051</u></strong></td>
540
- <td style="text-align: center;">0.069</td>
541
- <td style="text-align: center;">1.054</td>
542
- <td style="text-align: center;">0.113</td>
543
- <td style="text-align: center;">0.104</td>
544
- <td style="text-align: center;">1.044</td>
545
- <td style="text-align: center;">0.093</td>
546
- <td style="text-align: center;">0.064</td>
547
- </tr>
548
- <tr>
549
- <td style="text-align: center;"><strong>commonvoice_vi</strong></td>
550
- <td style="text-align: center;">0.922</td>
551
- <td style="text-align: center;">0.142</td>
552
- <td style="text-align: center;">0.156</td>
553
- <td style="text-align: center; background-color: #06a2a2;">0.142</td>
554
- <td style="text-align: center;">0.118</td>
555
- <td style="text-align: center;">0.129</td>
556
- <td style="text-align: center;">1.107</td>
557
- <td style="text-align: center;">0.196</td>
558
- <td style="text-align: center;">0.184</td>
559
- <td style="text-align: center;">1.496</td>
560
- <td style="text-align: center;">0.157</td>
561
- <td style="text-align: center;"><strong><u>0.117</u></strong></td>
562
- </tr>
563
- <tr>
564
- <td style="text-align: center;"><strong>fleurs_tamil_ta</strong></td>
565
- <td style="text-align: center;">0.462</td>
566
- <td style="text-align: center;">0.143</td>
567
- <td style="text-align: center;">0.161</td>
568
- <td style="text-align: center; background-color: #06a2a2;"><strong><u>0.138</u></strong></td>
569
- <td style="text-align: center;">0.224</td>
570
- <td style="text-align: center;">0.276</td>
571
- <td style="text-align: center;">1.702</td>
572
- <td style="text-align: center;">1.654</td>
573
- <td style="text-align: center;">0.867</td>
574
- <td style="text-align: center;">1.508</td>
575
- <td style="text-align: center;">0.272</td>
576
- <td style="text-align: center;">0.284</td>
577
- </tr>
578
- <tr>
579
- <td style="text-align: center;"><strong>gigaspeech2_id</strong></td>
580
- <td style="text-align: center;">0.337</td>
581
- <td style="text-align: center;">0.178</td>
582
- <td style="text-align: center;">0.172</td>
583
- <td style="text-align: center; background-color: #06a2a2;"><strong><u>0.163</u></strong></td>
584
- <td style="text-align: center;">0.185</td>
585
- <td style="text-align: center;">0.196</td>
586
- <td style="text-align: center;">5.804</td>
587
- <td style="text-align: center;">0.275</td>
588
- <td style="text-align: center;">0.227</td>
589
- <td style="text-align: center;">2.118</td>
590
- <td style="text-align: center;">0.219</td>
591
- <td style="text-align: center;">0.193</td>
592
- </tr>
593
- <tr>
594
- <td style="text-align: center;"><strong>gigaspeech2_th</strong></td>
595
- <td style="text-align: center;">0.987</td>
596
- <td style="text-align: center;">0.200</td>
597
- <td style="text-align: center;">0.200</td>
598
- <td style="text-align: center; background-color: #06a2a2;">0.182</td>
599
- <td style="text-align: center;"><strong><u>0.171</u></strong></td>
600
- <td style="text-align: center;">0.222</td>
601
- <td style="text-align: center;">1.734</td>
602
- <td style="text-align: center;">0.300</td>
603
- <td style="text-align: center;">0.232</td>
604
- <td style="text-align: center;">1.247</td>
605
- <td style="text-align: center;">0.276</td>
606
- <td style="text-align: center;">0.209</td>
607
- </tr>
608
- <tr>
609
- <td style="text-align: center;"><strong>gigaspeech2_vi</strong></td>
610
- <td style="text-align: center;">0.982</td>
611
- <td style="text-align: center;">0.168</td>
612
- <td style="text-align: center;">0.113</td>
613
- <td style="text-align: center; background-color: #06a2a2;"><strong><u>0.095</u></strong></td>
614
- <td style="text-align: center;">0.127</td>
615
- <td style="text-align: center;">0.177</td>
616
- <td style="text-align: center;">2.504</td>
617
- <td style="text-align: center;">0.177</td>
618
- <td style="text-align: center;">0.227</td>
619
- <td style="text-align: center;">1.546</td>
620
- <td style="text-align: center;">0.171</td>
621
- <td style="text-align: center;">0.155</td>
622
- </tr>
623
- <tr>
624
- <td style="text-align: center;"><strong>lotus_thai_th</strong></td>
625
- <td style="text-align: center;">0.852</td>
626
- <td style="text-align: center;">0.015</td>
627
- <td style="text-align: center;">0.019</td>
628
- <td style="text-align: center; background-color: #06a2a2;"><strong><u>0.011</u></strong></td>
629
- <td style="text-align: center;">0.026</td>
630
- <td style="text-align: center;">0.039</td>
631
- <td style="text-align: center;">1.286</td>
632
- <td style="text-align: center;">0.026</td>
633
- <td style="text-align: center;">0.021</td>
634
- <td style="text-align: center;">1.135</td>
635
- <td style="text-align: center;">0.068</td>
636
- <td style="text-align: center;">0.032</td>
637
- </tr>
638
- <tr>
639
- <td style="text-align: center;"><strong>average</strong></td>
640
- <td style="text-align: center;">0.686</td>
641
- <td style="text-align: center;">0.153</td>
642
- <td style="text-align: center;">0.173</td>
643
- <td style="text-align: center; background-color: #06a2a2;">0.175</td>
644
- <td style="text-align: center;"><strong><u>0.129</u></strong></td>
645
- <td style="text-align: center;">0.162</td>
646
- <td style="text-align: center;">1.966</td>
647
- <td style="text-align: center;">0.412</td>
648
- <td style="text-align: center;">0.313</td>
649
- <td style="text-align: center;">1.412</td>
650
- <td style="text-align: center;">0.177</td>
651
- <td style="text-align: center;">0.153</td>
652
- </tr>
653
- <tr>
654
- <td class="column style5 s style7" rowspan="7">Singlish</td>
655
- <td style="text-align: center;"><strong>imda_part1_asr</strong></td>
656
- <td style="text-align: center;"><strong><u>0.043</u></strong></td>
657
- <td style="text-align: center;">0.049</td>
658
- <td style="text-align: center;">0.052</td>
659
- <td style="text-align: center; background-color: #06a2a2;">0.044</td>
660
- <td style="text-align: center;">0.052</td>
661
- <td style="text-align: center;">0.069</td>
662
- <td style="text-align: center;">0.058</td>
663
- <td style="text-align: center;">0.053</td>
664
- <td style="text-align: center;">0.053</td>
665
- <td style="text-align: center;">0.093</td>
666
- <td style="text-align: center;">0.071</td>
667
- <td style="text-align: center;">0.069</td>
668
- </tr>
669
- <tr>
670
- <td style="text-align: center;"><strong>imda_part2_asr</strong></td>
671
- <td style="text-align: center;"><strong><u>0.047</u></strong></td>
672
- <td style="text-align: center;">0.058</td>
673
- <td style="text-align: center;">0.145</td>
674
- <td style="text-align: center; background-color: #06a2a2;">0.054</td>
675
- <td style="text-align: center;">0.080</td>
676
- <td style="text-align: center;">0.318</td>
677
- <td style="text-align: center;">0.345</td>
678
- <td style="text-align: center;">0.095</td>
679
- <td style="text-align: center;">0.094</td>
680
- <td style="text-align: center;">0.458</td>
681
- <td style="text-align: center;">0.330</td>
682
- <td style="text-align: center;">0.319</td>
683
- </tr>
684
- <tr>
685
- <td style="text-align: center;"><strong>imda_part3_30s_asr</strong></td>
686
- <td style="text-align: center;">0.213</td>
687
- <td style="text-align: center;">0.264</td>
688
- <td style="text-align: center;">0.227</td>
689
- <td style="text-align: center; background-color: #06a2a2;"><strong><u>0.196</u></strong></td>
690
- <td style="text-align: center;">0.211</td>
691
- <td style="text-align: center;">0.320</td>
692
- <td style="text-align: center;">0.438</td>
693
- <td style="text-align: center;">0.475</td>
694
- <td style="text-align: center;">0.535</td>
695
- <td style="text-align: center;">0.681</td>
696
- <td style="text-align: center;">0.281</td>
697
- <td style="text-align: center;">0.277</td>
698
- </tr>
699
- <tr>
700
- <td style="text-align: center;"><strong>imda_part4_30s_asr</strong></td>
701
- <td style="text-align: center;">0.297</td>
702
- <td style="text-align: center;">0.360</td>
703
- <td style="text-align: center;">0.295</td>
704
- <td style="text-align: center; background-color: #06a2a2;"><strong><u>0.246</u></strong></td>
705
- <td style="text-align: center;">0.271</td>
706
- <td style="text-align: center;">0.503</td>
707
- <td style="text-align: center;">1.470</td>
708
- <td style="text-align: center;">1.250</td>
709
- <td style="text-align: center;">1.303</td>
710
- <td style="text-align: center;">0.787</td>
711
- <td style="text-align: center;">0.459</td>
712
- <td style="text-align: center;">0.458</td>
713
- </tr>
714
- <tr>
715
- <td style="text-align: center;"><strong>imda_part5_30s_asr</strong></td>
716
- <td style="text-align: center;">0.154</td>
717
- <td style="text-align: center;">0.202</td>
718
- <td style="text-align: center;">0.168</td>
719
- <td style="text-align: center; background-color: #06a2a2;"><strong><u>0.140</u></strong></td>
720
- <td style="text-align: center;">0.149</td>
721
- <td style="text-align: center;">0.237</td>
722
- <td style="text-align: center;">0.239</td>
723
- <td style="text-align: center;">0.280</td>
724
- <td style="text-align: center;">0.374</td>
725
- <td style="text-align: center;">0.375</td>
726
- <td style="text-align: center;">0.218</td>
727
- <td style="text-align: center;">0.214</td>
728
- </tr>
729
- <tr>
730
- <td style="text-align: center;"><strong>imda_part6_30s_asr</strong></td>
731
- <td style="text-align: center;">0.109</td>
732
- <td style="text-align: center;">0.149</td>
733
- <td style="text-align: center;">0.127</td>
734
- <td style="text-align: center; background-color: #06a2a2;"><strong><u>0.099</u></strong></td>
735
- <td style="text-align: center;">0.110</td>
736
- <td style="text-align: center;">0.198</td>
737
- <td style="text-align: center;">0.144</td>
738
- <td style="text-align: center;">0.183</td>
739
- <td style="text-align: center;">0.275</td>
740
- <td style="text-align: center;">0.255</td>
741
- <td style="text-align: center;">0.175</td>
742
- <td style="text-align: center;">0.172</td>
743
- </tr>
744
- <tr>
745
- <td style="text-align: center;"><strong>average</strong></td>
746
- <td style="text-align: center;">0.144</td>
747
- <td style="text-align: center;">0.180</td>
748
- <td style="text-align: center;">0.169</td>
749
- <td style="text-align: center; background-color: #06a2a2;"><strong><u>0.130</u></strong></td>
750
- <td style="text-align: center;">0.145</td>
751
- <td style="text-align: center;">0.274</td>
752
- <td style="text-align: center;">0.449</td>
753
- <td style="text-align: center;">0.389</td>
754
- <td style="text-align: center;">0.439</td>
755
- <td style="text-align: center;">0.441</td>
756
- <td style="text-align: center;">0.256</td>
757
- <td style="text-align: center;">0.252</td>
758
- </tr>
759
- </tbody>
760
- </table>
761
- </div>
762
 
763
 
764
  ## 🔧 How to Use
 
69
  ## 📈 Evaluations:
70
  We benchmark the MERaLiON-2 series models with the extended [AudioBench benchmark](https://github.com/AudioLLMs/AudioBench) | [LeaderBoard](https://huggingface.co/spaces/MERaLiON/AudioBench-Leaderboard) against several recently released open-source multimodal models (SALMONN-7B, the Qwen2.5-Omni series and Phi-4-Multimodal) as well as two cascade models. The MERaLiON-2 series models show stronger performance on a wide range of audio/speech understanding tasks.
71
 
72
+ <style type="text/css">
73
+
74
+ td.s {
75
+ font-weight: bold;
76
+ text-align: center;
77
+ }
78
+
79
+ td.style3 {
80
+ text-align: center;
81
+ }
82
+
83
+ td.style4 {
84
+ text-align: center;
85
+ font-weight: bold;
86
+ }
87
+
88
+ td.column5 {
89
+ text-align: center;
90
+ background-color: #06a2a2;
91
+ }
92
+
93
+ </style>
94
+
95
  **Automatic Speech Recognition (ASR) results**
96
+ <table class="sheet0 gridlines">
97
+ <tbody>
98
+ <tr class="row0">
99
+ <td class="column0 style5 s style1">task_type</td>
100
+ <td class="column1 style1 s">dataset</td>
101
+ <td class="column2 style2 s">MERaLiON-1</td>
102
+ <td class="column3 style2 s">MERaLiON-2-3B</td>
103
+ <td class="column4 style2 s">MERaLiON-2-10B</td>
104
+ <td class="column5 style2 s">MERaLiON-2-10B-ASR</td>
105
+ <td class="column6 style2 s">Phi-4-multimodal-instruct</td>
106
+ <td class="column7 style2 s">Qwen2.5-Omni-3B</td>
107
+ <td class="column8 style2 s">Qwen2.5-Omni-7B</td>
108
+ <td class="column9 style2 s">SeaLLMs-Audio-7B</td>
109
+ <td class="column10 style2 s">SALMONN-7B</td>
110
+ <td class="column11 style2 s">cascade-whisper_v2+sealion</td>
111
+ <td class="column12 style2 s">cascade-whisper_v3+llama</td>
112
+ </tr>
113
+ <tr class="row1">
114
+ <td class="column0 style5 s" rowspan="10">English</td>
115
+ <td class="column1 style1 s">common_voice_15_en</td>
116
+ <td class="column2 style3 n">0.077</td>
117
+ <td class="column3 style3 n">0.093</td>
118
+ <td class="column4 style3 n">0.090</td>
119
+ <td class="column5 style4 n">0.076</td>
120
+ <td class="column6 style3 n">0.079</td>
121
+ <td class="column7 style3 n">0.088</td>
122
+ <td class="column8 style3 n">0.080</td>
123
+ <td class="column9 style3 n">0.158</td>
124
+ <td class="column10 style3 n">0.320</td>
125
+ <td class="column11 style3 n">0.105</td>
126
+ <td class="column12 style3 n">0.098</td>
127
+ </tr>
128
+ <tr class="row2">
129
+ <td class="column1 style1 s">earnings21</td>
130
+ <td class="column2 style3 n">0.138</td>
131
+ <td class="column3 style3 n">0.219</td>
132
+ <td class="column4 style3 n">0.108</td>
133
+ <td class="column5 style4 n">0.092</td>
134
+ <td class="column6 style3 n">0.131</td>
135
+ <td class="column7 style3 n">0.147</td>
136
+ <td class="column8 style3 n">0.189</td>
137
+ <td class="column9 style3 n">0.379</td>
138
+ <td class="column10 style3 n">0.277</td>
139
+ <td class="column11 style3 n">0.141</td>
140
+ <td class="column12 style3 n">0.109</td>
141
+ </tr>
142
+ <tr class="row3">
143
+ <td class="column1 style1 s">earnings22</td>
144
+ <td class="column2 style3 n">0.166</td>
145
+ <td class="column3 style3 n">0.239</td>
146
+ <td class="column4 style3 n">0.151</td>
147
+ <td class="column5 style4 n">0.128</td>
148
+ <td class="column6 style3 n">0.226</td>
149
+ <td class="column7 style3 n">0.197</td>
150
+ <td class="column8 style3 n">0.241</td>
151
+ <td class="column9 style3 n">0.456</td>
152
+ <td class="column10 style3 n">0.380</td>
153
+ <td class="column11 style3 n">0.172</td>
154
+ <td class="column12 style3 n">0.146</td>
155
+ </tr>
156
+ <tr class="row4">
157
+ <td class="column1 style1 s">gigaspeech</td>
158
+ <td class="column2 style3 n">0.145</td>
159
+ <td class="column3 style3 n">0.092</td>
160
+ <td class="column4 style3 n">0.090</td>
161
+ <td class="column5 style4 n">0.088</td>
162
+ <td class="column6 style3 n">0.099</td>
163
+ <td class="column7 style3 n">0.114</td>
164
+ <td class="column8 style3 n">0.140</td>
165
+ <td class="column9 style3 n">0.127</td>
166
+ <td class="column10 style3 n">0.110</td>
167
+ <td class="column11 style3 n">0.100</td>
168
+ <td class="column12 style3 n">0.095</td>
169
+ </tr>
170
+ <tr class="row5">
171
+ <td class="column1 style1 s">librispeech_clean</td>
172
+ <td class="column2 style3 n">0.024</td>
173
+ <td class="column3 style3 n">0.027</td>
174
+ <td class="column4 style3 n">0.025</td>
175
+ <td class="column5 style3 n">0.021</td>
176
+ <td class="column6 style4 n">0.017</td>
177
+ <td class="column7 style3 n">0.021</td>
178
+ <td class="column8 style3 n">0.044</td>
179
+ <td class="column9 style3 n">0.051</td>
180
+ <td class="column10 style3 n">0.096</td>
181
+ <td class="column11 style3 n">0.033</td>
182
+ <td class="column12 style3 n">0.018</td>
183
+ </tr>
184
+ <tr class="row6">
185
+ <td class="column1 style1 s">librispeech_other</td>
186
+ <td class="column2 style3 n">0.042</td>
187
+ <td class="column3 style3 n">0.051</td>
188
+ <td class="column4 style3 n">0.047</td>
189
+ <td class="column5 style3 n">0.040</td>
190
+ <td class="column6 style3 n">0.039</td>
191
+ <td class="column7 style3 n">0.045</td>
192
+ <td class="column8 style3 n">0.069</td>
193
+ <td class="column9 style3 n">0.097</td>
194
+ <td class="column10 style3 n">0.118</td>
195
+ <td class="column11 style3 n">0.054</td>
196
+ <td class="column12 style4 n">0.036</td>
197
+ </tr>
198
+ <tr class="row7">
199
+ <td class="column1 style1 s">peoples_speech</td>
200
+ <td class="column2 style3 n">0.216</td>
201
+ <td class="column3 style3 n">0.206</td>
202
+ <td class="column4 style3 n">0.205</td>
203
+ <td class="column5 style3 n">0.196</td>
204
+ <td class="column6 style3 n">0.215</td>
205
+ <td class="column7 style3 n">0.262</td>
206
+ <td class="column8 style3 n">0.312</td>
207
+ <td class="column9 style3 n">0.375</td>
208
+ <td class="column10 style3 n">0.242</td>
209
+ <td class="column11 style3 n">0.203</td>
210
+ <td class="column12 style4 n">0.145</td>
211
+ </tr>
212
+ <tr class="row8">
213
+ <td class="column1 style1 s">tedlium3</td>
214
+ <td class="column2 style3 n">0.082</td>
215
+ <td class="column3 style3 n">0.035</td>
216
+ <td class="column4 style3 n">0.035</td>
217
+ <td class="column5 style3 n">0.031</td>
218
+ <td class="column6 style4 n">0.029</td>
219
+ <td class="column7 style3 n">0.048</td>
220
+ <td class="column8 style3 n">0.049</td>
221
+ <td class="column9 style3 n">0.047</td>
222
+ <td class="column10 style3 n">0.039</td>
223
+ <td class="column11 style3 n">0.049</td>
224
+ <td class="column12 style3 n">0.038</td>
225
+ </tr>
226
+ <tr class="row9">
227
+ <td class="column1 style1 s">tedlium3_long_form</td>
228
+ <td class="column2 style3 n">0.105</td>
229
+ <td class="column3 style3 n">0.138</td>
230
+ <td class="column4 style3 n">0.044</td>
231
+ <td class="column5 style4 n">0.035</td>
232
+ <td class="column6 style3 n">0.051</td>
233
+ <td class="column7 style3 n">0.071</td>
234
+ <td class="column8 style3 n">0.084</td>
235
+ <td class="column9 style3 n">0.090</td>
236
+ <td class="column10 style3 n">0.141</td>
237
+ <td class="column11 style3 n">0.086</td>
238
+ <td class="column12 style3 n">0.049</td>
239
+ </tr>
240
+ <tr class="row10">
241
+ <td class="column1 style1 s">average</td>
242
+ <td class="column2 style3 n">0.111</td>
243
+ <td class="column3 style3 n">0.122</td>
244
+ <td class="column4 style3 n">0.088</td>
245
+ <td class="column5 style4 n">0.079</td>
246
+ <td class="column6 style3 n">0.098</td>
247
+ <td class="column7 style3 n">0.110</td>
248
+ <td class="column8 style3 n">0.134</td>
249
+ <td class="column9 style3 n">0.198</td>
250
+ <td class="column10 style3 n">0.191</td>
251
+ <td class="column11 style3 n">0.105</td>
252
+ <td class="column12 style3 n">0.082</td>
253
+ </tr>
254
+ <tr class="row11">
255
+ <td class="column0 style5 s" rowspan="14">Inhouse</td>
256
+ <td class="column1 style1 s">cna</td>
257
+ <td class="column2 style3 n">0.145</td>
258
+ <td class="column3 style3 n">0.135</td>
259
+ <td class="column4 style3 n">0.133</td>
260
+ <td class="column5 style4 n">0.127</td>
261
+ <td class="column6 style3 n">0.191</td>
262
+ <td class="column7 style3 n">0.174</td>
263
+ <td class="column8 style3 n">0.183</td>
264
+ <td class="column9 style3 n">0.273</td>
265
+ <td class="column10 style3 n">0.149</td>
266
+ <td class="column11 style3 n">0.152</td>
267
+ <td class="column12 style3 n">0.138</td>
268
+ </tr>
269
+ <tr class="row12">
270
+ <td class="column1 style1 s">idpc</td>
271
+ <td class="column2 style3 n">0.204</td>
272
+ <td class="column3 style3 n">0.177</td>
273
+ <td class="column4 style4 n">0.160</td>
274
+ <td class="column5 style3 n">0.166</td>
275
+ <td class="column6 style3 n">0.261</td>
276
+ <td class="column7 style3 n">0.199</td>
277
+ <td class="column8 style3 n">0.220</td>
278
+ <td class="column9 style3 n">1.165</td>
279
+ <td class="column10 style3 n">0.541</td>
280
+ <td class="column11 style3 n">0.170</td>
281
+ <td class="column12 style3 n">0.162</td>
282
+ </tr>
283
+ <tr class="row13">
284
+ <td class="column1 style1 s">idpc_short</td>
285
+ <td class="column2 style3 n">0.165</td>
286
+ <td class="column3 style3 n">0.151</td>
287
+ <td class="column4 style3 n">0.157</td>
288
+ <td class="column5 style4 n">0.140</td>
289
+ <td class="column6 style3 n">0.539</td>
290
+ <td class="column7 style3 n">0.211</td>
291
+ <td class="column8 style3 n">0.414</td>
292
+ <td class="column9 style3 n">0.719</td>
293
+ <td class="column10 style3 n">0.240</td>
294
+ <td class="column11 style3 n">0.197</td>
295
+ <td class="column12 style3 n">0.153</td>
296
+ </tr>
297
+ <tr class="row14">
298
+ <td class="column1 style1 s">mediacorp</td>
299
+ <td class="column2 style3 n">0.123</td>
300
+ <td class="column3 style3 n">0.123</td>
301
+ <td class="column4 style3 n">0.105</td>
302
+ <td class="column5 style4 n">0.104</td>
303
+ <td class="column6 style3 n">0.198</td>
304
+ <td class="column7 style3 n">0.152</td>
305
+ <td class="column8 style3 n">0.235</td>
306
+ <td class="column9 style3 n">0.400</td>
307
+ <td class="column10 style3 n">0.364</td>
308
+ <td class="column11 style3 n">0.158</td>
309
+ <td class="column12 style3 n">0.151</td>
310
+ </tr>
311
+ <tr class="row15">
312
+ <td class="column1 style1 s">mediacorp_short</td>
313
+ <td class="column2 style3 n">0.128</td>
314
+ <td class="column3 style3 n">0.121</td>
315
+ <td class="column4 style3 n">0.117</td>
316
+ <td class="column5 style3 n">0.118</td>
317
+ <td class="column6 style3 n">0.122</td>
318
+ <td class="column7 style3 n">0.148</td>
319
+ <td class="column8 style3 n">0.141</td>
320
+ <td class="column9 style3 n">0.188</td>
321
+ <td class="column10 style3 n">0.199</td>
322
+ <td class="column11 style3 n">0.154</td>
323
+ <td class="column12 style4 n">0.114</td>
324
+ </tr>
325
+ <tr class="row16">
326
+ <td class="column1 style1 s">parliament</td>
327
+ <td class="column2 style3 n">0.059</td>
328
+ <td class="column3 style3 n">0.185</td>
329
+ <td class="column4 style3 n">0.060</td>
330
+ <td class="column5 style4 n">0.053</td>
331
+ <td class="column6 style3 n">0.278</td>
332
+ <td class="column7 style3 n">0.100</td>
333
+ <td class="column8 style3 n">0.110</td>
334
+ <td class="column9 style3 n">0.194</td>
335
+ <td class="column10 style3 n">0.204</td>
336
+ <td class="column11 style3 n">0.090</td>
337
+ <td class="column12 style3 n">0.065</td>
338
+ </tr>
339
+ <tr class="row17">
340
+ <td class="column1 style1 s">ste</td>
341
+ <td class="column2 style3 n">0.159</td>
342
+ <td class="column3 style3 n">0.263</td>
343
+ <td class="column4 style3 n">0.147</td>
344
+ <td class="column5 style4 n">0.125</td>
345
+ <td class="column6 style3 n">0.297</td>
346
+ <td class="column7 style3 n">0.287</td>
347
+ <td class="column8 style3 n">0.288</td>
348
+ <td class="column9 style3 n">0.734</td>
349
+ <td class="column10 style3 n">0.422</td>
350
+ <td class="column11 style3 n">0.132</td>
351
+ <td class="column12 style3 n">0.144</td>
352
+ </tr>
353
+ <tr class="row18">
354
+ <td class="column1 style1 s">ukusnews</td>
355
+ <td class="column2 style3 n">0.113</td>
356
+ <td class="column3 style3 n">0.174</td>
357
+ <td class="column4 style3 n">0.070</td>
358
+ <td class="column5 style4 n">0.056</td>
359
+ <td class="column6 style3 n">0.075</td>
360
+ <td class="column7 style3 n">0.091</td>
361
+ <td class="column8 style3 n">0.176</td>
362
+ <td class="column9 style3 n">0.176</td>
363
+ <td class="column10 style3 n">0.192</td>
364
+ <td class="column11 style3 n">0.123</td>
365
+ <td class="column12 style3 n">0.089</td>
366
+ </tr>
367
+ <tr class="row19">
368
+ <td class="column1 style1 s">ytb_asr_batch1</td>
369
+ <td class="column2 style3 n">0.107</td>
370
+ <td class="column3 style3 n">0.099</td>
371
+ <td class="column4 style3 n">0.098</td>
372
+ <td class="column5 style4 n">0.092</td>
373
+ <td class="column6 style3 n">0.169</td>
374
+ <td class="column7 style3 n">0.162</td>
375
+ <td class="column8 style3 n">0.174</td>
376
+ <td class="column9 style3 n">0.450</td>
377
+ <td class="column10 style3 n">0.221</td>
378
+ <td class="column11 style3 n">0.125</td>
379
+ <td class="column12 style3 n">0.108</td>
380
+ </tr>
381
+ <tr class="row20">
382
+ <td class="column1 style1 s">ytb_asr_batch2</td>
383
+ <td class="column2 style3 n">0.133</td>
384
+ <td class="column3 style3 n">0.160</td>
385
+ <td class="column4 style3 n">0.111</td>
386
+ <td class="column5 style3 n">0.099</td>
387
+ <td class="column6 style3 n">0.232</td>
388
+ <td class="column7 style3 n">0.245</td>
389
+ <td class="column8 style3 n">0.351</td>
390
+ <td class="column9 style3 n">0.904</td>
391
+ <td class="column10 style3 n">0.350</td>
392
+ <td class="column11 style3 n">0.126</td>
393
+ <td class="column12 style4 n">0.084</td>
394
+ </tr>
395
+ <tr class="row21">
396
+ <td class="column1 style1 s">ytb_asr_batch3_chinese</td>
397
+ <td class="column2 style3 n">0.418</td>
398
+ <td class="column3 style3 n">0.256</td>
399
+ <td class="column4 style3 n">0.191</td>
400
+ <td class="column5 style4 n">0.149</td>
401
+ <td class="column6 style3 n">0.440</td>
402
+ <td class="column7 style3 n">0.250</td>
403
+ <td class="column8 style3 n">0.206</td>
404
+ <td class="column9 style3 n">0.662</td>
405
+ <td class="column10 style3 n">0.886</td>
406
+ <td class="column11 style3 n">0.347</td>
407
+ <td class="column12 style3 n">0.270</td>
408
+ </tr>
409
+ <tr class="row22">
410
+ <td class="column1 style1 s">ytb_asr_batch3_malay</td>
411
+ <td class="column2 style3 n">0.290</td>
412
+ <td class="column3 style3 n">0.280</td>
413
+ <td class="column4 style3 n">0.209</td>
414
+ <td class="column5 style4 n">0.195</td>
415
+ <td class="column6 style3 n">3.763</td>
416
+ <td class="column7 style3 n">2.944</td>
417
+ <td class="column8 style3 n">1.461</td>
418
+ <td class="column9 style3 n">0.766</td>
419
+ <td class="column10 style3 n">1.086</td>
420
+ <td class="column11 style3 n">0.314</td>
421
+ <td class="column12 style3 n">0.312</td>
422
+ </tr>
423
+ <tr class="row23">
424
+ <td class="column1 style1 s">ytb_asr_batch3_tamil</td>
425
+ <td class="column2 style3 n">0.693</td>
426
+ <td class="column3 style3 n">0.750</td>
427
+ <td class="column4 style3 n">0.664</td>
428
+ <td class="column5 style4 n">0.547</td>
429
+ <td class="column6 style3 n">2.750</td>
430
+ <td class="column7 style3 n">1.461</td>
431
+ <td class="column8 style3 n">1.362</td>
432
+ <td class="column9 style3 n">3.617</td>
433
+ <td class="column10 style3 n">0.985</td>
434
+ <td class="column11 style3 n">0.967</td>
435
+ <td class="column12 style3 n">0.898</td>
436
+ </tr>
437
+ <tr class="row24">
438
+ <td class="column1 style1 s">average</td>
439
+ <td class="column2 style3 n">0.210</td>
440
+ <td class="column3 style3 n">0.221</td>
441
+ <td class="column4 style3 n">0.171</td>
442
+ <td class="column5 style4 n">0.152</td>
443
+ <td class="column6 style3 n">0.717</td>
444
+ <td class="column7 style3 n">0.494</td>
445
+ <td class="column8 style3 n">0.409</td>
446
+ <td class="column9 style3 n">0.788</td>
447
+ <td class="column10 style3 n">0.449</td>
448
+ <td class="column11 style3 n">0.235</td>
449
+ <td class="column12 style3 n">0.207</td>
450
+ </tr>
451
+ <tr class="row25">
452
+ <td class="column0 style5 s" rowspan="3">Mandarin</td>
453
+ <td class="column1 style1 s">aishell_asr_zh</td>
454
+ <td class="column2 style3 n">0.128</td>
455
+ <td class="column3 style3 n">0.050</td>
456
+ <td class="column4 style3 n">0.058</td>
457
+ <td class="column5 style3 n">0.043</td>
458
+ <td class="column6 style3 n">0.122</td>
459
+ <td class="column7 style3 n">0.028</td>
460
+ <td class="column8 style4 n">0.024</td>
461
+ <td class="column9 style3 n">0.178</td>
462
+ <td class="column10 style3 n">0.931</td>
463
+ <td class="column11 style3 n">0.209</td>
464
+ <td class="column12 style3 n">0.125</td>
465
+ </tr>
466
+ <tr class="row26">
467
+ <td class="column1 style1 s">commonvoice_zh</td>
468
+ <td class="column2 style3 n">0.327</td>
469
+ <td class="column3 style3 n">0.131</td>
470
+ <td class="column4 style3 n">0.147</td>
471
+ <td class="column5 style3 n">0.118</td>
472
+ <td class="column6 style3 n">0.154</td>
473
+ <td class="column7 style3 n">0.113</td>
474
+ <td class="column8 style4 n">0.076</td>
475
+ <td class="column9 style3 n">0.090</td>
476
+ <td class="column10 style3 n">1.001</td>
477
+ <td class="column11 style3 n">0.319</td>
478
+ <td class="column12 style3 n">0.196</td>
479
+ </tr>
480
+ <tr class="row27">
481
+ <td class="column1 style1 s">average</td>
482
+ <td class="column2 style3 n">0.228</td>
483
+ <td class="column3 style3 n">0.091</td>
484
+ <td class="column4 style3 n">0.102</td>
485
+ <td class="column5 style3 n">0.081</td>
486
+ <td class="column6 style3 n">0.138</td>
487
+ <td class="column7 style3 n">0.071</td>
488
+ <td class="column8 style4 n">0.050</td>
489
+ <td class="column9 style3 n">0.134</td>
490
+ <td class="column10 style3 n">0.966</td>
491
+ <td class="column11 style3 n">0.264</td>
492
+ <td class="column12 style3 n">0.160</td>
493
+ </tr>
494
+ <tr class="row28">
495
+ <td class="column0 style5 s" rowspan="10">SEA</td>
496
+ <td class="column1 style1 s">commonvoice_id</td>
497
+ <td class="column2 style3 n">0.260</td>
498
+ <td class="column3 style3 n">0.085</td>
499
+ <td class="column4 style3 n">0.113</td>
500
+ <td class="column5 style3 n">0.079</td>
501
+ <td class="column6 style3 n">1.327</td>
502
+ <td class="column7 style3 n">0.136</td>
503
+ <td class="column8 style3 n">0.110</td>
504
+ <td class="column9 style3 n">0.124</td>
505
+ <td class="column10 style3 n">1.189</td>
506
+ <td class="column11 style3 n">0.100</td>
507
+ <td class="column12 style4 n">0.078</td>
508
+ </tr>
509
+ <tr class="row29">
510
+ <td class="column1 style1 s">commonvoice_ta</td>
511
+ <td class="column2 style3 n">0.528</td>
512
+ <td class="column3 style3 n">0.139</td>
513
+ <td class="column4 style3 n">0.156</td>
514
+ <td class="column5 style4 n">0.129</td>
515
+ <td class="column6 style3 n">1.178</td>
516
+ <td class="column7 style3 n">0.831</td>
517
+ <td class="column8 style3 n">0.847</td>
518
+ <td class="column9 style3 n">1.297</td>
519
+ <td class="column10 style3 n">1.427</td>
520
+ <td class="column11 style3 n">0.238</td>
521
+ <td class="column12 style3 n">0.244</td>
522
+ </tr>
523
+ <tr class="row30">
524
+ <td class="column1 style1 s">commonvoice_th</td>
525
+ <td class="column2 style3 n">0.847</td>
526
+ <td class="column3 style3 n">0.307</td>
527
+ <td class="column4 style3 n">0.466</td>
528
+ <td class="column5 style3 n">0.635</td>
529
+ <td class="column6 style3 n">1.054</td>
530
+ <td class="column7 style3 n">0.113</td>
531
+ <td class="column8 style3 n">0.104</td>
532
+ <td class="column9 style4 n">0.047</td>
533
+ <td class="column10 style3 n">1.044</td>
534
+ <td class="column11 style3 n">0.093</td>
535
+ <td class="column12 style3 n">0.064</td>
536
+ </tr>
537
+ <tr class="row31">
538
+ <td class="column1 style1 s">commonvoice_vi</td>
539
+ <td class="column2 style3 n">0.922</td>
540
+ <td class="column3 style3 n">0.142</td>
541
+ <td class="column4 style3 n">0.156</td>
542
+ <td class="column5 style3 n">0.142</td>
543
+ <td class="column6 style3 n">1.107</td>
544
+ <td class="column7 style3 n">0.196</td>
545
+ <td class="column8 style3 n">0.184</td>
546
+ <td class="column9 style3 n">0.255</td>
547
+ <td class="column10 style3 n">1.496</td>
548
+ <td class="column11 style3 n">0.157</td>
549
+ <td class="column12 style4 n">0.117</td>
550
+ </tr>
551
+ <tr class="row32">
552
+ <td class="column1 style1 s">fleurs_tamil_ta</td>
553
+ <td class="column2 style3 n">0.462</td>
554
+ <td class="column3 style3 n">0.143</td>
555
+ <td class="column4 style3 n">0.161</td>
556
+ <td class="column5 style4 n">0.138</td>
557
+ <td class="column6 style3 n">1.702</td>
558
+ <td class="column7 style3 n">1.654</td>
559
+ <td class="column8 style3 n">0.867</td>
560
+ <td class="column9 style3 n">2.062</td>
561
+ <td class="column10 style3 n">1.508</td>
562
+ <td class="column11 style3 n">0.272</td>
563
+ <td class="column12 style3 n">0.284</td>
564
+ </tr>
565
+ <tr class="row33">
566
+ <td class="column1 style1 s">gigaspeech2_id</td>
567
+ <td class="column2 style3 n">0.337</td>
568
+ <td class="column3 style3 n">0.178</td>
569
+ <td class="column4 style3 n">0.172</td>
570
+ <td class="column5 style4 n">0.163</td>
571
+ <td class="column6 style3 n">5.804</td>
572
+ <td class="column7 style3 n">0.275</td>
573
+ <td class="column8 style3 n">0.227</td>
574
+ <td class="column9 style3 n">0.316</td>
575
+ <td class="column10 style3 n">2.118</td>
576
+ <td class="column11 style3 n">0.219</td>
577
+ <td class="column12 style3 n">0.193</td>
578
+ </tr>
579
+ <tr class="row34">
580
+ <td class="column1 style1 s">gigaspeech2_th</td>
581
+ <td class="column2 style3 n">0.987</td>
582
+ <td class="column3 style3 n">0.200</td>
583
+ <td class="column4 style3 n">0.200</td>
584
+ <td class="column5 style4 n">0.182</td>
585
+ <td class="column6 style3 n">1.734</td>
586
+ <td class="column7 style3 n">0.300</td>
587
+ <td class="column8 style3 n">0.232</td>
588
+ <td class="column9 style3 n">0.209</td>
589
+ <td class="column10 style3 n">1.247</td>
590
+ <td class="column11 style3 n">0.276</td>
591
+ <td class="column12 style3 n">0.209</td>
592
+ </tr>
593
+ <tr class="row35">
594
+ <td class="column1 style1 s">gigaspeech2_vi</td>
595
+ <td class="column2 style3 n">0.982</td>
596
+ <td class="column3 style3 n">0.168</td>
597
+ <td class="column4 style3 n">0.113</td>
598
+ <td class="column5 style4 n">0.095</td>
599
+ <td class="column6 style3 n">2.504</td>
600
+ <td class="column7 style3 n">0.177</td>
601
+ <td class="column8 style3 n">0.227</td>
602
+ <td class="column9 style3 n">0.189</td>
603
+ <td class="column10 style3 n">1.546</td>
604
+ <td class="column11 style3 n">0.171</td>
605
+ <td class="column12 style3 n">0.155</td>
606
+ </tr>
607
+ <tr class="row36">
608
+ <td class="column1 style1 s">lotus_thai_th</td>
609
+ <td class="column2 style3 n">0.852</td>
610
+ <td class="column3 style3 n">0.015</td>
611
+ <td class="column4 style3 n">0.019</td>
612
+ <td class="column5 style4 n">0.011</td>
613
+ <td class="column6 style3 n">1.286</td>
614
+ <td class="column7 style3 n">0.026</td>
615
+ <td class="column8 style3 n">0.021</td>
616
+ <td class="column9 style3 n">0.025</td>
617
+ <td class="column10 style3 n">1.135</td>
618
+ <td class="column11 style3 n">0.068</td>
619
+ <td class="column12 style3 n">0.032</td>
620
+ </tr>
621
+ <tr class="row37">
622
+ <td class="column1 style1 s">average</td>
623
+ <td class="column2 style3 n">0.686</td>
624
+ <td class="column3 style3 n">0.153</td>
625
+ <td class="column4 style3 n">0.173</td>
626
+ <td class="column5 style3 n">0.175</td>
627
+ <td class="column6 style3 n">1.966</td>
628
+ <td class="column7 style3 n">0.412</td>
629
+ <td class="column8 style3 n">0.313</td>
630
+ <td class="column9 style3 n">0.503</td>
631
+ <td class="column10 style3 n">1.412</td>
632
+ <td class="column11 style3 n">0.177</td>
633
+ <td class="column12 style4 n">0.153</td>
634
+ </tr>
635
+ <tr class="row38">
636
+ <td class="column0 style5 s" rowspan="7">Singlish</td>
637
+ <td class="column1 style1 s">imda_part1_asr</td>
638
+ <td class="column2 style4 n">0.043</td>
639
+ <td class="column3 style3 n">0.049</td>
640
+ <td class="column4 style3 n">0.052</td>
641
+ <td class="column5 style3 n">0.044</td>
642
+ <td class="column6 style3 n">0.058</td>
643
+ <td class="column7 style3 n">0.053</td>
644
+ <td class="column8 style3 n">0.053</td>
645
+ <td class="column9 style3 n">0.129</td>
646
+ <td class="column10 style3 n">0.093</td>
647
+ <td class="column11 style3 n">0.071</td>
648
+ <td class="column12 style3 n">0.069</td>
649
+ </tr>
650
+ <tr class="row39">
651
+ <td class="column1 style1 s">imda_part2_asr</td>
652
+ <td class="column2 style4 n">0.047</td>
653
+ <td class="column3 style3 n">0.058</td>
654
+ <td class="column4 style3 n">0.145</td>
655
+ <td class="column5 style3 n">0.054</td>
656
+ <td class="column6 style3 n">0.345</td>
657
+ <td class="column7 style3 n">0.095</td>
658
+ <td class="column8 style3 n">0.094</td>
659
+ <td class="column9 style3 n">0.290</td>
660
+ <td class="column10 style3 n">0.458</td>
661
+ <td class="column11 style3 n">0.330</td>
662
+ <td class="column12 style3 n">0.319</td>
663
+ </tr>
664
+ <tr class="row40">
665
+ <td class="column1 style1 s">imda_part3_30s_asr</td>
666
+ <td class="column2 style3 n">0.213</td>
667
+ <td class="column3 style3 n">0.264</td>
668
+ <td class="column4 style3 n">0.227</td>
669
+ <td class="column5 style4 n">0.196</td>
670
+ <td class="column6 style3 n">0.438</td>
671
+ <td class="column7 style3 n">0.475</td>
672
+ <td class="column8 style3 n">0.535</td>
673
+ <td class="column9 style3 n">1.195</td>
674
+ <td class="column10 style3 n">0.681</td>
675
+ <td class="column11 style3 n">0.281</td>
676
+ <td class="column12 style3 n">0.277</td>
677
+ </tr>
678
+ <tr class="row41">
679
+ <td class="column1 style1 s">imda_part4_30s_asr</td>
680
+ <td class="column2 style3 n">0.297</td>
681
+ <td class="column3 style3 n">0.360</td>
682
+ <td class="column4 style3 n">0.295</td>
683
+ <td class="column5 style4 n">0.246</td>
684
+ <td class="column6 style3 n">1.470</td>
685
+ <td class="column7 style3 n">1.250</td>
686
+ <td class="column8 style3 n">1.303</td>
687
+ <td class="column9 style3 n">1.865</td>
688
+ <td class="column10 style3 n">0.787</td>
689
+ <td class="column11 style3 n">0.459</td>
690
+ <td class="column12 style3 n">0.458</td>
691
+ </tr>
692
+ <tr class="row42">
693
+ <td class="column1 style1 s">imda_part5_30s_asr</td>
694
+ <td class="column2 style3 n">0.154</td>
695
+ <td class="column3 style3 n">0.202</td>
696
+ <td class="column4 style3 n">0.168</td>
697
+ <td class="column5 style4 n">0.140</td>
698
+ <td class="column6 style3 n">0.239</td>
699
+ <td class="column7 style3 n">0.280</td>
700
+ <td class="column8 style3 n">0.374</td>
701
+ <td class="column9 style3 n">0.631</td>
702
+ <td class="column10 style3 n">0.375</td>
703
+ <td class="column11 style3 n">0.218</td>
704
+ <td class="column12 style3 n">0.214</td>
705
+ </tr>
706
+ <tr class="row43">
707
+ <td class="column1 style1 s">imda_part6_30s_asr</td>
708
+ <td class="column2 style3 n">0.109</td>
709
+ <td class="column3 style3 n">0.149</td>
710
+ <td class="column4 style3 n">0.127</td>
711
+ <td class="column5 style4 n">0.099</td>
712
+ <td class="column6 style3 n">0.144</td>
713
+ <td class="column7 style3 n">0.183</td>
714
+ <td class="column8 style3 n">0.275</td>
715
+ <td class="column9 style3 n">0.665</td>
716
+ <td class="column10 style3 n">0.255</td>
717
+ <td class="column11 style3 n">0.175</td>
718
+ <td class="column12 style3 n">0.172</td>
719
+ </tr>
720
+ <tr class="row44">
721
+ <td class="column1 style1 s">average</td>
722
+ <td class="column2 style3 n">0.144</td>
723
+ <td class="column3 style3 n">0.180</td>
724
+ <td class="column4 style3 n">0.169</td>
725
+ <td class="column5 style4 n">0.130</td>
726
+ <td class="column6 style3 n">0.449</td>
727
+ <td class="column7 style3 n">0.389</td>
728
+ <td class="column8 style3 n">0.439</td>
729
+ <td class="column9 style3 n">0.796</td>
730
+ <td class="column10 style3 n">0.441</td>
731
+ <td class="column11 style3 n">0.256</td>
732
+ <td class="column12 style3 n">0.252</td>
733
+ </tr>
734
+ </tbody>
735
+ </table>
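
The `average` rows in the table above appear to be the unweighted mean of the per-dataset scores within each task type. The snippet below is a minimal sketch of that arithmetic for the Mandarin group only, using values copied from the table for three of the columns. The `group_average` helper and the dictionary layout are illustrative assumptions, not part of AudioBench or of the MERaLiON code.

```python
# Scores copied from the Mandarin rows of the ASR table (lower is better).
# Only three model columns are included to keep the sketch short.
mandarin = {
    "aishell_asr_zh": {
        "MERaLiON-1": 0.128,
        "MERaLiON-2-10B-ASR": 0.043,
        "Qwen2.5-Omni-7B": 0.024,
    },
    "commonvoice_zh": {
        "MERaLiON-1": 0.327,
        "MERaLiON-2-10B-ASR": 0.118,
        "Qwen2.5-Omni-7B": 0.076,
    },
}


def group_average(rows):
    """Hypothetical helper: unweighted mean per model across the datasets in one group."""
    models = next(iter(rows.values())).keys()
    return {m: sum(r[m] for r in rows.values()) / len(rows) for m in models}


averages = group_average(mandarin)
for model, score in averages.items():
    print(f"{model}: {score:.4f}")
```

Because the published scores are rounded to three decimals, a recomputed mean can differ from the table's `average` row in the last digit.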
736
 
737
 
738
  ## 🔧 How to Use