kb-whisper-medium does not have the verbosity `subtitle`
I noticed that kb-whisper-medium does not have the `subtitle` verbosity. Is there a special reason for that?
We ran into training instability with the `medium` model. Gradient norms kept slowly growing, and the loss started diverging after a certain number of steps. We could not stabilize training for the entire duration, no matter which hyperparams we tried.
Some of the other variants of the `medium` model also displayed this behavior, but we managed to stabilize them, or picked a slightly earlier checkpoint step to release.
The `subtitle` variant diverged early and we didn't have time to try to re-train it.
None of the other model sizes displayed this instability during training, so we were left scratching our heads as to why the weights of this particular model were sickly and beyond repair.
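For readers wondering what "picking a slightly earlier checkpoint" can look like in practice, here is a minimal sketch. It is not the procedure we actually used; the function name `last_stable_checkpoint`, the smoothing `window`, and the `tolerance` threshold are all hypothetical choices for illustration. It flags divergence when the smoothed loss rises noticeably above its best value so far, then returns the latest checkpoint saved before that point.

```python
from typing import Optional, Sequence


def last_stable_checkpoint(
    losses: Sequence[float],
    checkpoint_steps: Sequence[int],
    window: int = 3,
    tolerance: float = 1.10,
) -> Optional[int]:
    """Return the latest checkpoint step saved before the loss diverges.

    `losses[i]` is the training loss at step i (hypothetical logging format).
    Divergence is flagged at the first step whose trailing-window mean loss
    exceeds `tolerance` times the best trailing-window mean seen so far.
    """
    best = float("inf")
    diverged_at = len(losses)  # default: no divergence detected
    for step in range(window - 1, len(losses)):
        mean = sum(losses[step - window + 1 : step + 1]) / window
        if mean > tolerance * best:
            diverged_at = step
            break
        best = min(best, mean)
    # Keep only checkpoints saved strictly before the divergence step.
    stable = [s for s in checkpoint_steps if s < diverged_at]
    return max(stable) if stable else None


# Made-up loss curve: improves for a while, then blows up.
example_losses = [5.0, 4.0, 3.0, 2.5, 2.2, 2.0, 1.9, 1.9, 2.1, 2.8, 4.0, 6.0]
example_checkpoints = [2, 4, 6, 8, 10]
chosen = last_stable_checkpoint(example_losses, example_checkpoints)
```

A real run would likely track gradient norms alongside the loss, since in our case the norms started growing before the loss visibly diverged; the same trailing-window check could be applied to that signal.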