# faster-whisper-large-v3-int8-ct2
This repository contains the OpenAI Whisper Large v3 model converted to the CTranslate2 format with int8 quantization. The conversion makes inference significantly faster and more memory-efficient, with a minimal trade-off in accuracy. CTranslate2 is an inference engine for Transformer models developed by the OpenNMT project.
## Model Details

- **Base model:** `openai/whisper-large-v3`
- **Format:** CTranslate2
- **Quantization:** int8
## How to Use

You can use this model with the faster-whisper library. First, install it:

```bash
pip install faster-whisper
```

Then use the model in your Python code:
```python
from faster_whisper import WhisperModel

model_path = "groxaxo/faster-whisper-large-v3-int8-ct2"

# Run on GPU with FP16:
# model = WhisperModel(model_path, device="cuda", compute_type="float16")
# or run on GPU with INT8:
# model = WhisperModel(model_path, device="cuda", compute_type="int8_float16")
# or run on CPU with INT8:
model = WhisperModel(model_path, device="cpu", compute_type="int8")

segments, info = model.transcribe("audio.mp3", beam_size=5)

print("Detected language '%s' with probability %f" % (info.language, info.language_probability))

for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))
```
Replace `"audio.mp3"` with the path to your audio file.
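Each segment exposes `start`, `end`, and `text`, so a common next step is writing the transcript to a subtitle file. A minimal sketch of SRT output, assuming segments are `(start, end, text)` tuples like the attributes returned above (the helper names here are illustrative, not part of faster-whisper):

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def segments_to_srt(segments) -> str:
    """Render (start, end, text) tuples as numbered SRT cue blocks."""
    blocks = []
    for i, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{i}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"
        )
    return "\n".join(blocks)

# Hard-coded segments for illustration; a real run would iterate model.transcribe output.
print(segments_to_srt([(0.0, 2.5, "Hello world."), (2.5, 5.0, "Second line.")]))
```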
## Conversion

The model was converted using the `ct2-transformers-converter` tool from the CTranslate2 project:

```bash
ct2-transformers-converter --model openai/whisper-large-v3 \
    --output_dir faster-whisper-large-v3-int8-ct2 \
    --quantization int8 \
    --copy_files tokenizer.json preprocessor_config.json
```
## Disclaimer

Quantization can have a small impact on the model's accuracy. While int8 quantization is generally safe and offers a good balance of performance and accuracy, you should evaluate the model on your specific task to ensure it meets your requirements.
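One simple way to run that evaluation is to compare the int8 model's transcripts against reference transcripts using word error rate (WER). A minimal self-contained WER sketch (libraries such as jiwer provide a more complete implementation, including text normalization):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: (substitutions + insertions + deletions) / reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # Levenshtein distance over words via dynamic programming.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1] / max(len(ref), 1)

print(wer("the quick brown fox", "the quick brown fox"))  # 0.0
print(wer("the quick brown fox", "the quick fox"))        # 0.25 (one deletion)
```

Comparing the WER of this int8 model against the original FP16 model on a held-out sample of your own audio gives a direct measure of what the quantization costs on your task.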