Is it possible to use "prompt" or "hotwords" to steer decoding similar to Whisper?
#8, opened by spashii
^title
It should be possible to do with a corpus at least: https://docs.nvidia.com/nemo-framework/user-guide/latest/nemotoolkit/asr/asr_customization/ngpulm_language_modeling_and_customization.html#ngpulm-ngram-modeling
Training the n-gram model is really fast.
But whenever I enable it and transcribe an audio file longer than 40 seconds (still under a minute), the output drops a bunch of sentences.
They also have word boosting, which doesn't seem to work on an AED model like Canary.
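For anyone unsure what the n-gram approach does conceptually, here is a rough, self-contained sketch of LM fusion on a toy bigram model. It is not NeMo code: `train_bigram`, `lm_logprob`, `rescore`, the `alpha` weight, and the sample hypotheses and scores are all made up for illustration. It also rescores a finished n-best list, which is a simplification; as far as I understand, the linked NeMo feature applies the LM during decoding instead.

```python
import math
from collections import Counter

def train_bigram(corpus):
    """Count bigrams and their left contexts over whitespace-tokenized sentences."""
    contexts, bigrams, vocab = Counter(), Counter(), {"<s>", "</s>"}
    for sentence in corpus:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        vocab.update(tokens)
        contexts.update(tokens[:-1])
        bigrams.update(zip(tokens[:-1], tokens[1:]))
    return contexts, bigrams, len(vocab)

def lm_logprob(sentence, model):
    """Log-probability of a sentence under the bigram model (add-one smoothing)."""
    contexts, bigrams, vocab_size = model
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    return sum(
        math.log((bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size))
        for prev, cur in zip(tokens[:-1], tokens[1:])
    )

def rescore(nbest, model, alpha=0.5):
    """Pick the hypothesis maximizing asr_logprob + alpha * lm_logprob."""
    return max(nbest, key=lambda h: h[1] + alpha * lm_logprob(h[0], model))[0]

# Hypothetical domain corpus and ASR hypotheses with made-up acoustic scores.
model = train_bigram(["transcribe with canary", "canary is an asr model"])
nbest = [("transcribe with cannery", -1.0), ("transcribe with canary", -1.2)]

print(rescore(nbest, model, alpha=0.0))  # LM ignored: the ASR's top guess wins
print(rescore(nbest, model, alpha=0.5))  # LM pulls the in-corpus wording ahead
```

The `alpha` weight is the knob that matters: if it is set too high, the LM can overrule the acoustic model. Whether that has anything to do with the dropped sentences on longer files is only a guess.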