espnet's Collections

OWLS: Scaling Laws for Speech Recognition and Translation

A suite of Whisper-style models ranging from 250M to 18B parameters, trained on up to 360K hours of data.