Diarization
Hello,
Thanks for providing these models! Are there any plans to support diarization (speaker change recognition)?
I tried the latest GGML checkpoint with Whisper.cpp, which has a -tdrz flag that should enable diarization, but I think the model itself needs to support it.
I would also be very interested in this!
Hi
We will not support diarization as part of the Whisper model's inherent functionality. As far as I can see, the support in Whisper.cpp is quite experimental and limited to the English version of whisper.small.
If you want to run our models with diarization, I recommend a multi-step pipeline. WhisperX's README has an example of how to achieve this with the WhisperX library. Load our Whisper and Wav2vec2 models as the first two steps (transcription and alignment), and then run the diarization step as described in WhisperX.
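To make the last step of such a pipeline more concrete, here is a rough, library-independent sketch of how diarization output could be merged with an aligned transcript: each transcript segment gets the speaker whose turns overlap it the most in time. This is not WhisperX's actual code. The `Segment` and `Turn` classes, the `assign_speakers` helper, and the sample timestamps are all made up for illustration; in practice WhisperX handles this merge for you, so treat this only as an explanation of the idea.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Segment:
    """A transcribed, time-aligned piece of text (hypothetical structure)."""
    start: float
    end: float
    text: str
    speaker: Optional[str] = None


@dataclass
class Turn:
    """A speaker turn as a diarization step might report it (hypothetical structure)."""
    start: float
    end: float
    speaker: str


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two time intervals (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(segments: List[Segment], turns: List[Turn]) -> List[Segment]:
    """Label each segment with the speaker that overlaps it the longest."""
    labelled = []
    for seg in segments:
        totals: Dict[str, float] = {}
        for turn in turns:
            shared = overlap(seg.start, seg.end, turn.start, turn.end)
            if shared > 0:
                totals[turn.speaker] = totals.get(turn.speaker, 0.0) + shared
        # Segments with no overlapping turn stay unlabelled (speaker=None).
        speaker = max(totals, key=totals.get) if totals else None
        labelled.append(Segment(seg.start, seg.end, seg.text, speaker))
    return labelled


# Made-up example data, only to show the shape of the inputs.
segments = [
    Segment(0.0, 2.0, "Hello, how are you?"),
    Segment(2.1, 4.0, "Fine, thanks."),
    Segment(10.0, 11.0, "Bye."),
]
turns = [
    Turn(0.0, 2.05, "SPEAKER_00"),
    Turn(2.05, 4.5, "SPEAKER_01"),
]

labelled = assign_speakers(segments, turns)
for seg in labelled:
    print(f"[{seg.start:.2f}-{seg.end:.2f}] {seg.speaker}: {seg.text}")
```

Assigning by longest overlap is just one simple choice; segments that span a speaker change would need word-level timestamps (which is what the alignment step provides) to be split correctly.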