Why is Whisper losing part of the text at the end of the audio I send it?

#24
by wangleineo - opened

I used whisper.cpp to capture speech audio and send it to the Whisper model in segments of around 20 seconds (or whenever VAD marks a pause).
I found that at the end of almost every clip, some text is missing from the recognition result: sometimes a few words, sometimes an entire sentence.
I am sure the audio data is intact: I record the data I send to whisper.cpp, and no segment is missing.
I have tried ggml-small.en.bin and the medium model, with the same problem. I tried inference on both GPU and CPU.

Do you see the same problem? How do you resolve it?

wangleineo changed discussion status to closed

I don't know what the problem is. I have just started using whisper.cpp and it is giving very good transcriptions, including this text, which I am now typing. Actually, I did not type it; I was just talking.
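
For anyone who hits the truncation described in the original post, one thing that may be worth trying is to avoid ending a clip exactly where speech ends: append a short stretch of silence to each segment, and optionally prepend the tail of the previous segment so words cut at a boundary are heard twice. This is only a sketch of that idea and not a confirmed fix for this thread. The helper names `pad_with_silence` and `with_carry_over`, the 16 kHz mono float PCM layout, and the millisecond values are assumptions for illustration; none of them are part of the whisper.cpp API.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Assumption: clips are 16 kHz mono float PCM, the format Whisper models expect.
constexpr int kSampleRate = 16000;

// Convert a duration in milliseconds to a sample count (negative values clamp to 0).
inline std::size_t ms_to_samples(int ms) {
    return ms > 0 ? static_cast<std::size_t>(ms) * kSampleRate / 1000 : 0;
}

// Hypothetical helper: return a copy of `pcm` with `pad_ms` of silence appended,
// so the last spoken word does not sit right at the end of the buffer.
std::vector<float> pad_with_silence(const std::vector<float>& pcm, int pad_ms) {
    std::vector<float> out(pcm);
    out.insert(out.end(), ms_to_samples(pad_ms), 0.0f);
    return out;
}

// Hypothetical helper: prepend the last `keep_ms` of the previous clip to the
// current one, so a word split across the boundary appears whole in one clip.
std::vector<float> with_carry_over(const std::vector<float>& prev,
                                   const std::vector<float>& cur,
                                   int keep_ms) {
    const std::size_t n_keep = std::min(prev.size(), ms_to_samples(keep_ms));
    std::vector<float> out(prev.end() - static_cast<std::ptrdiff_t>(n_keep), prev.end());
    out.insert(out.end(), cur.begin(), cur.end());
    return out;
}
```

A clip would then be passed to the transcription call as, for example, `pad_with_silence(with_carry_over(prev, cur, 200), 500)`. With carry-over, the same words can show up at the end of one result and the start of the next, so the transcripts may need de-duplication.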
