why is the model so large? #12
by WangBicheng - opened
The original whisper-large-v3 model size is about 1.5GB.
Why is the fine-tuned model so large?
Does it mean we need more VRAM for deploying it?
Thanks for your question!
Actually, the original Whisper-large-v3 model in FP32 precision is around 6GB, not 1.5GB. The size you mentioned (1.5GB) is likely a quantized version, most probably INT8.
When fine-tuning models like Whisper, we usually work with the full-precision version (FP32), which explains the larger file size.
If you're concerned about VRAM usage during deployment or inference:
You can use lower-precision versions, such as:
- FP16: ~3GB
- INT8: ~1.5GB
Libraries like faster-whisper provide support for loading and running these quantized models efficiently.
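The size figures above follow directly from the parameter count. A minimal sketch, assuming whisper-large-v3 has roughly 1.55 billion parameters (an approximate public figure):

```python
# Rough checkpoint-size estimate: parameters x bytes per parameter.
# ~1.55B parameters is an approximate figure for whisper-large-v3.
PARAMS = 1.55e9

BYTES_PER_PARAM = {"FP32": 4, "FP16": 2, "INT8": 1}

for precision, nbytes in BYTES_PER_PARAM.items():
    size_gb = PARAMS * nbytes / 1e9  # decimal gigabytes
    print(f"{precision}: ~{size_gb:.1f} GB")
```

This is only a lower bound on VRAM: inference also needs memory for activations and the KV cache, so actual usage will be somewhat higher than the checkpoint size.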