Diarization

#7
by cospaia - opened

Hello,

Thanks for providing this! Are there any plans to support diarization (speaker change recognition)?

I tried the latest GGML checkpoint with Whisper.cpp, which has a -tdrz flag that should diarize, but I think the model itself needs to support it.

I would also be very interested in this!

National Library of Sweden / KBLab org
edited 10 days ago

Hi

We will not support diarization as part of the Whisper model's inherent functionality. As far as I can see, the support in Whisper.cpp is quite experimental and limited to the English version of whisper.small.

If you want to run our models with diarization, I recommend a multi-step pipeline. WhisperX's README has an example of how to achieve this with the WhisperX library. Load our Whisper and Wav2vec2 models as the first two steps (transcription and alignment), and then run the diarization step as described in WhisperX.
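
To illustrate why diarization can live outside the Whisper model, here is a minimal sketch of the final step of such a pipeline: combining time-aligned transcript segments with speaker turns from a separate diarization model. This is not WhisperX's actual code. The `Segment` and `Turn` structures, the `assign_speakers` function, and the `SPEAKER_00`-style labels are all hypothetical names chosen for illustration, and the rule used (pick the speaker with the largest time overlap) is an assumption about a reasonable way to merge the two outputs.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Segment:
    """A transcribed, time-aligned piece of text (hypothetical structure)."""
    start: float
    end: float
    text: str
    speaker: Optional[str] = None


@dataclass
class Turn:
    """A speaker turn from a separate diarization model (hypothetical structure)."""
    start: float
    end: float
    speaker: str


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Return the length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(segments: List[Segment], turns: List[Turn]) -> List[Segment]:
    """Label each segment with the speaker whose turns overlap it the most.

    Segments that overlap no turn keep speaker=None.
    """
    for seg in segments:
        totals: Dict[str, float] = {}
        for turn in turns:
            shared = overlap(seg.start, seg.end, turn.start, turn.end)
            if shared > 0:
                totals[turn.speaker] = totals.get(turn.speaker, 0.0) + shared
        seg.speaker = max(totals, key=totals.get) if totals else None
    return segments


if __name__ == "__main__":
    # Made-up example data: steps 1-2 (transcription + alignment) would
    # produce the segments, step 3 (diarization) would produce the turns.
    segments = [
        Segment(0.0, 2.0, "Hej, hur mår du?"),
        Segment(2.1, 4.0, "Bra, tack."),
    ]
    turns = [
        Turn(0.0, 2.05, "SPEAKER_00"),
        Turn(2.05, 4.5, "SPEAKER_01"),
    ]
    for seg in assign_speakers(segments, turns):
        print(f"[{seg.start:.1f}-{seg.end:.1f}] {seg.speaker}: {seg.text}")
```

In a real setup the segments and turns would come from the transcription, alignment, and diarization steps described in the WhisperX README, and the library performs this merge for you. The sketch only shows that the speaker labels are attached after transcription, so the Whisper checkpoint itself does not need diarization support.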

Lauler changed discussion status to closed