Transcription problem
Noted, thanks for the feedback.
As a temporary workaround, you could split the audio into 10-minute segments and transcribe each one separately while we work on a fix.
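For anyone who wants to try the workaround, here is a minimal sketch of the splitting step. It assumes the input is an uncompressed PCM WAV file and uses only Python's standard `wave` module. The function name `split_wav` and the output naming scheme are illustrative and are not part of the Space.

```python
import wave
from pathlib import Path


def split_wav(src_path, out_dir, segment_seconds=600):
    """Split a PCM WAV file into consecutive segments of at most
    `segment_seconds` each (600 s = 10 minutes) and return their paths."""
    src_path = Path(src_path)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    with wave.open(str(src_path), "rb") as src:
        params = src.getparams()
        frames_per_segment = int(segment_seconds * src.getframerate())
        index = 0
        while True:
            # readframes returns fewer frames at the end, then b"" when done
            frames = src.readframes(frames_per_segment)
            if not frames:
                break
            out_path = out_dir / f"{src_path.stem}_{index:03d}.wav"
            with wave.open(str(out_path), "wb") as dst:
                dst.setparams(params)  # same channels, sample width, rate
                dst.writeframes(frames)  # header frame count is fixed on close
            paths.append(out_path)
            index += 1
    return paths
```

Each resulting file can then be uploaded and transcribed on its own. Note that cutting at fixed times can split a word at a segment boundary, so the joined transcript may need a quick check around the 10-minute marks.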
This should now be fixed in the Space. Could you re-run and check? We also updated the Space to support transcription of audio up to 3 hours long.
It doesn’t seem to occur on all long-duration segments, and in our samples the issue appears resolved. If possible, could you share a sample for us to test?
I gave you a link to the sample that shows the problem (glued text), but nothing has been fixed since then. We are waiting.
This issue occurs with only a few files. While we understand the cause, I recommend using the chunking method for audio longer than 10 minutes with this script: speech_to_text_buffered_infer_rnnt.py. This should resolve the attention problem. Use a large chunk_len and buffer_length to minimize overlap. We also identified a merging issue; the fix is here: PR #13500.
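To illustrate the general idea behind buffered chunking, the sketch below plans the windows for a long recording: each chunk is transcribed with some extra audio context on both sides, and the context is what overlaps between neighbouring windows. This is an illustrative assumption about how chunk and buffer lengths relate, not the implementation in speech_to_text_buffered_infer_rnnt.py; the function name `plan_buffered_windows` and its parameters are hypothetical.

```python
def plan_buffered_windows(duration, chunk_len, buffer_len):
    """Plan chunk and buffer windows (all values in seconds).

    Each chunk of `chunk_len` seconds is surrounded by a buffer of
    `buffer_len` seconds in total, so the extra context on each side is
    (buffer_len - chunk_len) / 2, clamped to the bounds of the audio.
    """
    if chunk_len <= 0 or buffer_len < chunk_len:
        raise ValueError("need 0 < chunk_len <= buffer_len")
    context = (buffer_len - chunk_len) / 2.0
    windows = []
    start = 0.0
    while start < duration:
        end = min(start + chunk_len, duration)
        windows.append({
            "chunk": (start, end),                # audio whose text is kept
            "buffer": (max(0.0, start - context),  # audio the model sees
                       min(duration, end + context)),
        })
        start = end
    return windows


# Hypothetical example: 25 s of audio, 10 s chunks, 14 s buffers
for w in plan_buffered_windows(25, 10, 14):
    print(w)
```

Under these assumptions, a larger chunk length means fewer windows and therefore fewer boundaries where text has to be merged, which is where glued or duplicated words tend to appear.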