Why is the model so large?

#12 opened by WangBicheng

The original whisper-large-v3 model is about 1.5GB.
Why is the fine-tuned model so large?
Does that mean we need more VRAM for deploying it?

BELLE-2 Group // Be Everyone's Large Language model Engine org

Thanks for your question!

Actually, the original Whisper-large-v3 model in FP32 precision is around 6GB, not 1.5GB. The 1.5GB figure you mention most likely refers to an INT8-quantized version (FP16 would be about 3GB).

When fine-tuning models like Whisper, we usually work with the full-precision version (FP32), which explains the larger file size.
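As a rough sanity check, the file size follows directly from the parameter count times the bytes per parameter. The sketch below assumes ~1.55B parameters for Whisper-large-v3 and ignores non-weight overhead, so the numbers are approximate:

```python
# Rough model-size arithmetic for Whisper-large-v3 (~1.55B parameters).
# Exact on-disk size varies with metadata and tensor layout.
params = 1.55e9

for name, bytes_per_param in [("FP32", 4), ("FP16", 2), ("INT8", 1)]:
    size_gb = params * bytes_per_param / 1e9
    print(f"{name}: ~{size_gb:.1f} GB")

# FP32: ~6.2 GB
# FP16: ~3.1 GB
# INT8: ~1.6 GB
```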

If you're concerned about VRAM usage during deployment or inference:

✅ You can use lower-precision versions, such as:

- FP16: ~3GB
- INT8: ~1.5GB
Libraries like faster-whisper provide support for loading and running these quantized models efficiently.
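For example, here is a minimal sketch of INT8 inference with faster-whisper, assuming you first convert the fine-tuned checkpoint to CTranslate2 format; the repo ID, output directory, and audio file below are placeholders for your own setup:

```python
# Convert the fine-tuned checkpoint to CTranslate2 format with INT8
# quantization (run once, paths are placeholders):
#   ct2-transformers-converter --model <this-repo-id> \
#       --output_dir whisper-large-v3-ct2-int8 --quantization int8

from faster_whisper import WhisperModel

# compute_type="int8" keeps the weight footprint near ~1.5GB.
model = WhisperModel(
    "whisper-large-v3-ct2-int8", device="cuda", compute_type="int8"
)

segments, info = model.transcribe("audio.wav")
for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
```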
