faster-whisper-large-v3-int8-ct2

This repository contains the OpenAI Whisper Large v3 model converted to the CTranslate2 format with int8 quantization.

The conversion makes inference faster and lowers memory use compared with the original checkpoint, at a small cost in accuracy. CTranslate2 is an inference engine for Transformer models developed by the OpenNMT project.
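As a back-of-the-envelope illustration of the memory saving, the sketch below estimates weight storage at different precisions. The parameter count is an approximate figure for Whisper large models and is an assumption here, not a measurement of this repository. The estimate ignores quantization scales, any layers kept in higher precision, and runtime buffers.

```python
# Rough weight-storage estimate only; not a measurement of this repository.
PARAMS = 1_550_000_000  # approximate Whisper large parameter count (assumption)

BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_size_gb(dtype: str, params: int = PARAMS) -> float:
    """Return the approximate weight size in gigabytes (10**9 bytes)."""
    return params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: ~{weight_size_gb(dtype):.2f} GB")
```

Actual file sizes and runtime memory will differ, so treat these numbers as an order-of-magnitude guide.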

Model Details

  • Base Model: openai/whisper-large-v3
  • Format: CTranslate2
  • Quantization: int8

How to Use

You can use this model with the faster-whisper library.

First, install faster-whisper:

pip install faster-whisper

Then, you can use the model in your Python code:

from faster_whisper import WhisperModel

model_path = "groxaxo/faster-whisper-large-v3-int8-ct2"

# Run on GPU with FP16 (the stored int8 weights are converted when the model loads)
# model = WhisperModel(model_path, device="cuda", compute_type="float16")

# or run on GPU with INT8 weights and FP16 for the remaining computation
# model = WhisperModel(model_path, device="cuda", compute_type="int8_float16")

# or run on CPU with INT8
model = WhisperModel(model_path, device="cpu", compute_type="int8")

segments, info = model.transcribe("audio.mp3", beam_size=5)

print("Detected language '%s' with probability %f" % (info.language, info.language_probability))

for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))

Replace "audio.mp3" with the path to your audio file.
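Each segment exposes start, end, and text, so the output can be turned into subtitles. Below is a minimal sketch of SRT formatting. The Segment dataclass is a hypothetical stand-in so the example runs without a model, and srt_timestamp and to_srt are illustrative helpers, not part of faster-whisper.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical stand-in exposing the three fields used in the example above."""
    start: float
    end: float
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments) -> str:
    """Build SRT text from any iterable of objects with start, end, and text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        span = f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}"
        blocks.append(f"{index}\n{span}\n{seg.text.strip()}\n")
    return "\n".join(blocks)


print(to_srt([Segment(0.0, 1.25, " Hello"), Segment(1.25, 3.0, " world")]))
```

With a loaded model, passing the segments returned by model.transcribe to to_srt should work the same way. Note that faster-whisper returns segments as a generator, so the transcription itself runs while the segments are being iterated.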

Conversion

The model was converted using the ct2-transformers-converter tool from the CTranslate2 project.

ct2-transformers-converter --model openai/whisper-large-v3 --output_dir faster-whisper-large-v3-int8-ct2 --quantization int8 --copy_files tokenizer.json preprocessor_config.json
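After running the command, a quick check that the output directory is complete can save a confusing load error later. The sketch below assumes the converter writes model.bin and config.json and that the two files named in --copy_files are copied alongside them. The missing_files helper and the expected-file list are illustrative, not part of CTranslate2.

```python
from pathlib import Path

# Assumed layout: converter output plus the files listed in --copy_files.
EXPECTED_FILES = (
    "model.bin",
    "config.json",
    "tokenizer.json",
    "preprocessor_config.json",
)


def missing_files(model_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_files("faster-whisper-large-v3-int8-ct2")
    print("missing:", missing or "none")
```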

Disclaimer

Quantization can have a small impact on the model's accuracy. While int8 quantization is generally safe and provides a good balance of performance and accuracy, you should evaluate the model on your specific task to ensure it meets your requirements.
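One common way to run such an evaluation is to compare transcripts against reference text using word error rate (WER). The following is a minimal sketch, assuming simple lowercase whitespace tokenization. Real evaluations usually apply fuller text normalization, and the word_error_rate helper is illustrative rather than part of any library.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,         # reference word missing from hypothesis
                current[j - 1] + 1,      # extra word in hypothesis
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1] / len(ref)


print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Computing this score for the same audio with different compute_type settings, or against the unquantized model, gives a direct view of what quantization costs on your data.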
