[FEAT][EASY] Add Voice Activity Detection (VAD) to improve overall WER

#1
by mfuntowicz - opened
Inference Endpoints Images org

Currently the endpoint uses a naive chunking strategy with a fixed 30 s window, which can cut the audio in the middle of a word. Splitting chunks on silence instead should significantly improve the WER.
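
To make the request concrete, here is a minimal sketch of silence-aware chunking. It assumes a plain energy-threshold detector rather than a trained VAD model, and mono audio given as a list of floats in [-1, 1]. The names `frame_energies` and `split_on_silence` and the default `energy_threshold` are hypothetical, chosen for illustration; they are not part of the endpoint's code.

```python
def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    energies = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / len(frame))
    return energies


def split_on_silence(samples, sample_rate, max_chunk_s=30.0,
                     frame_ms=30, energy_threshold=1e-4):
    """Return (start, end) sample ranges no longer than max_chunk_s.

    Each cut is placed at the latest low-energy frame inside the window.
    If the window contains no such frame, the cut falls back to the hard
    limit, which matches the fixed-window behaviour.
    energy_threshold is an assumed value and would need tuning on real audio.
    """
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    max_frames = max(1, int(max_chunk_s * 1000 / frame_ms))
    silent = [e < energy_threshold
              for e in frame_energies(samples, frame_len)]

    chunks = []
    n_frames = len(silent)
    start = 0
    while start < n_frames:
        end = min(start + max_frames, n_frames)
        if end < n_frames:
            # Walk backwards from the hard limit to the latest silent frame.
            for i in range(end - 1, start, -1):
                if silent[i]:
                    end = i
                    break
        chunks.append((start * frame_len,
                       min(end * frame_len, len(samples))))
        start = end
    return chunks
```

Each returned range could then be transcribed on its own in place of a fixed 30 s slice. A production version would more likely rely on a trained VAD model, since a fixed energy threshold is fragile on noisy recordings.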
