kurianbenoy committed
Commit 7c14177
Parent(s): a31dd7a
add Readme

README.md CHANGED
@@ -18,12 +18,82 @@ This is a conversion of [thennal/whisper-medium-ml](https://huggingface.co/thenn

This model can be used in CTranslate2 or projects based on CTranslate2 such as [faster-whisper](https://github.com/guillaumekln/faster-whisper).

## Installation

- Install [faster-whisper](https://github.com/guillaumekln/faster-whisper). More details about installation can be found in the [faster-whisper documentation](https://github.com/guillaumekln/faster-whisper/tree/master#installation).

```
pip install faster-whisper
```

- Install [git-lfs](https://git-lfs.com/). Note that git-lfs is only needed to download the model from Hugging Face.

```
apt-get install git-lfs
```

- Download the model weights (an alternative that avoids git-lfs is sketched after this list):

```
git lfs install
git clone https://huggingface.co/kurianbenoy/vegam-whisper-medium-ml
```
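
If you would rather not install git-lfs, the same weights can be fetched with the `huggingface_hub` Python package instead. A small sketch, assuming `huggingface_hub` is installed (`pip install huggingface_hub`):

```
from huggingface_hub import snapshot_download

# Download the repository snapshot and return the local directory it was saved to.
model_path = snapshot_download(repo_id="kurianbenoy/vegam-whisper-medium-ml")
print(model_path)
```

The returned path can then be passed to `WhisperModel` in place of the cloned directory.
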
## Usage

```
from faster_whisper import WhisperModel

model_path = "vegam-whisper-medium-ml"

# Run on GPU with FP16
model = WhisperModel(model_path, device="cuda", compute_type="float16")

# or run on GPU with INT8
# model = WhisperModel(model_path, device="cuda", compute_type="int8_float16")
# or run on CPU with INT8
# model = WhisperModel(model_path, device="cpu", compute_type="int8")

segments, info = model.transcribe("audio.mp3", beam_size=5)

print("Detected language '%s' with probability %f" % (info.language, info.language_probability))

for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))
```
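
As noted in the introduction, the converted weights can also be loaded directly with CTranslate2, without faster-whisper. Below is a minimal sketch based on the CTranslate2 Transformers guide linked in the Conversion Details section; it assumes the `ctranslate2`, `transformers`, and `librosa` packages are installed, uses `openai/whisper-medium` only for its feature extractor and tokenizer, and expects an audio file named `audio.mp3`. Exact call signatures may vary between CTranslate2 versions.

```
import ctranslate2
import librosa
import transformers

# Load the audio and resample it to the 16 kHz expected by Whisper.
audio, _ = librosa.load("audio.mp3", sr=16000, mono=True)

# Compute log-Mel features for the first 30-second window.
processor = transformers.WhisperProcessor.from_pretrained("openai/whisper-medium")
inputs = processor(audio, return_tensors="np", sampling_rate=16000)
features = ctranslate2.StorageView.from_array(inputs.input_features)

# Load the converted model directory (use device="cuda" for GPU).
model = ctranslate2.models.Whisper("vegam-whisper-medium-ml", device="cpu")

# Detect the spoken language.
results = model.detect_language(features)
language, probability = results[0][0]
print("Detected language %s with probability %f" % (language, probability))

# Build the decoding prompt and transcribe the window.
prompt = processor.tokenizer.convert_tokens_to_ids(
    [
        "<|startoftranscript|>",
        language,
        "<|transcribe|>",
        "<|notimestamps|>",  # remove this token to also generate timestamps
    ]
)
results = model.generate(features, [prompt])
transcription = processor.decode(results[0].sequences_ids[0])
print(transcription)
```
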
## Example

```
from faster_whisper import WhisperModel

model_path = "vegam-whisper-medium-ml"

model = WhisperModel(model_path, device="cuda", compute_type="float16")

segments, info = model.transcribe("00b38e80-80b8-4f70-babf-566e848879fc.webm", beam_size=5)

print("Detected language '%s' with probability %f" % (info.language, info.language_probability))

for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))
```

> Detected language 'ta' with probability 0.353516
> [0.00s -> 4.74s] പാലം കടുക്കുവോളം നാരായണ പാലം കടന്നാലൊ കൂരായണ

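The language above is detected as 'ta' even though the clip and the transcription are Malayalam. If detection is off, the language can be set explicitly through the `language` argument of faster-whisper's `transcribe`; a small sketch reusing the model object from the example above (`"ml"` is Whisper's code for Malayalam):

```
# Skip automatic language detection and decode as Malayalam.
segments, info = model.transcribe(
    "00b38e80-80b8-4f70-babf-566e848879fc.webm",
    beam_size=5,
    language="ml",
)
```
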
## Conversion Details

This conversion was possible with the wonderful [CTranslate2 library](https://github.com/OpenNMT/CTranslate2), leveraging the [Transformers converter for OpenAI Whisper](https://opennmt.net/CTranslate2/guides/transformers.html#whisper). The original model was converted with the following command:

```
ct2-transformers-converter --model thennal/whisper-medium-ml --output_dir vegam-whisper-medium-ml
```
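
The converter also accepts a `--quantization` option (for example `float16` or `int8`) if a smaller converted model is wanted; the output directory name below is only illustrative:

```
ct2-transformers-converter --model thennal/whisper-medium-ml --output_dir vegam-whisper-medium-ml-fp16 --quantization float16
```
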

## Many Thanks to

- Creators of CTranslate2 and faster-whisper
- Thennal D K
- Santhosh Thottingal