Lite-Whisper
https://github.com/efeslab/LiteASR
Lite-Whisper is a compressed version of OpenAI Whisper, produced with LiteASR. See our GitHub repository and paper for details.
The following table shows the average word error rate (WER) evaluated on the ESB datasets:
| Model | Average WER (↓) | Encoder Size | Decoder Size |
|---|---|---|---|
| whisper-large-v3 | 10.1 | 635M | 907M |
| lite-whisper-large-v3-acc | 10.1 | 429M | 907M |
| lite-whisper-large-v3 | 10.2 | 377M | 907M |
| lite-whisper-large-v3-fast | 11.3 | 308M | 907M |
| whisper-large-v3-turbo | 10.1 | 635M | 172M |
| lite-whisper-large-v3-turbo-acc | 10.2 | 421M | 172M |
| lite-whisper-large-v3-turbo | 12.6 | 374M | 172M |
| lite-whisper-large-v3-turbo-fast | 20.1 | 313M | 172M |
| whisper-medium | 14.8 | 306M | 457M |
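
WER here is the standard word error rate averaged over the ESB test sets. As a rough, hypothetical sketch of how such a score is computed (the `jiwer` package and the example strings below are assumptions, not the actual ESB evaluation harness):

```python
# Minimal WER sketch (assumption: uses the third-party jiwer package,
# not the actual ESB evaluation pipeline).
import jiwer

reference = "the quick brown fox jumps over the lazy dog"   # ground-truth transcript
hypothesis = "the quick brown fox jumped over a lazy dog"   # model output

# jiwer.wer = (substitutions + deletions + insertions) / number of reference words
print(jiwer.wer(reference, hypothesis))  # 2 errors / 9 words ≈ 0.22
```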
The easiest way to run our model is to use our integration with the HuggingFace Transformers library. We provide model weights for the compressed versions of the OpenAI Whisper series here.
```python
import librosa
import torch
from transformers import AutoProcessor, AutoModel

device = "cuda:0"
dtype = torch.float16

# load the compressed Whisper model
model = AutoModel.from_pretrained(
    "efficient-speech/lite-whisper-large-v3-turbo",
    trust_remote_code=True,
)
model.to(dtype).to(device)

# we use the same processor as the original model
processor = AutoProcessor.from_pretrained("openai/whisper-large-v3")

# set the path to your audio file
path = "path/to/audio.wav"
audio, _ = librosa.load(path, sr=16000)

input_features = processor(audio, sampling_rate=16000, return_tensors="pt").input_features
input_features = input_features.to(dtype).to(device)

predicted_ids = model.generate(input_features)
transcription = processor.batch_decode(
    predicted_ids,
    skip_special_tokens=True,
)[0]

print(transcription)
```
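
To transcribe several files at once, the processor also accepts a list of waveforms and pads them into a single batch. The following is a small, hypothetical batch variant of the example above (the file names are placeholders, not files shipped with the model):

```python
# Hypothetical batch variant: pad several waveforms into one batch and decode together.
paths = ["audio1.wav", "audio2.wav"]  # placeholder file names
audios = [librosa.load(p, sr=16000)[0] for p in paths]

inputs = processor(audios, sampling_rate=16000, return_tensors="pt")
input_features = inputs.input_features.to(dtype).to(device)

predicted_ids = model.generate(input_features)
transcriptions = processor.batch_decode(predicted_ids, skip_special_tokens=True)

for p, text in zip(paths, transcriptions):
    print(p, text)
```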
If you use LiteASR in your research, please cite the following paper:
```bibtex
@misc{kamahori2025liteasrefficientautomaticspeech,
  title={LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation},
  author={Keisuke Kamahori and Jungo Kasai and Noriyuki Kojima and Baris Kasikci},
  year={2025},
  eprint={2502.20583},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2502.20583},
}
```