This is a conversion of [thennal/whisper-medium-ml](https://huggingface.co/thennal/whisper-medium-ml) to the CTranslate2 model format.

This model can be used in CTranslate2 or projects based on CTranslate2 such as [faster-whisper](https://github.com/guillaumekln/faster-whisper).
## Installation

- Install [faster-whisper](https://github.com/guillaumekln/faster-whisper). More details can be found in the [faster-whisper installation instructions](https://github.com/guillaumekln/faster-whisper/tree/master#installation).

```
pip install faster-whisper
```

- Install [git-lfs](https://git-lfs.com/). Note that git-lfs is only needed for downloading the model from Hugging Face.

```
apt-get install git-lfs
```

- Download the model weights:

```
git lfs install
git clone https://huggingface.co/kurianbenoy/vegam-whisper-medium-ml
```
## Usage

```
from faster_whisper import WhisperModel

model_path = "vegam-whisper-medium-ml"

# Run on GPU with FP16
model = WhisperModel(model_path, device="cuda", compute_type="float16")

# or run on GPU with INT8
# model = WhisperModel(model_path, device="cuda", compute_type="int8_float16")
# or run on CPU with INT8
# model = WhisperModel(model_path, device="cpu", compute_type="int8")

segments, info = model.transcribe("audio.mp3", beam_size=5)

print("Detected language '%s' with probability %f" % (info.language, info.language_probability))

for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))
```
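Note that `transcribe` returns the segments lazily (as a generator), so they can also be collected into a single transcript string. A small sketch with a hypothetical `join_segments` helper, shown here with stand-in segment objects rather than real faster-whisper ones:

```python
from collections import namedtuple

# Stand-in for a faster-whisper segment; only the fields used below.
Segment = namedtuple("Segment", ["start", "end", "text"])

def join_segments(segments):
    """Concatenate segment texts into one transcript string (hypothetical helper)."""
    return " ".join(seg.text.strip() for seg in segments)

# With faster-whisper you would pass the generator returned by
# model.transcribe(...) instead of this stand-in list.
demo = [
    Segment(0.00, 4.74, " പാലം കടുക്കുവോളം നാരായണ"),
    Segment(4.74, 8.10, " പാലം കടന്നാലൊ കൂരായണ"),
]
print(join_segments(demo))  # -> പാലം കടുക്കുവോളം നാരായണ പാലം കടന്നാലൊ കൂരായണ
```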
## Example

```
from faster_whisper import WhisperModel

model_path = "vegam-whisper-medium-ml"

model = WhisperModel(model_path, device="cuda", compute_type="float16")

segments, info = model.transcribe("00b38e80-80b8-4f70-babf-566e848879fc.webm", beam_size=5)

print("Detected language '%s' with probability %f" % (info.language, info.language_probability))

for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))
```

> Detected language 'ta' with probability 0.353516
>
> [0.00s -> 4.74s] പാലം കടുക്കുവോളം നാരായണ പാലം കടന്നാലൊ കൂരായണ
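A common follow-up is writing the segments out as subtitles. A minimal SRT formatter — the helpers are hypothetical and are shown with plain `(start, end, text)` tuples instead of real faster-whisper segments:

```python
# Hypothetical helpers for turning (start, end, text) triples into SRT subtitles.
def to_srt_time(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return "%02d:%02d:%02d,%03d" % (h, m, s, ms)

def to_srt(segments):
    """Render an iterable of (start, end, text) as an SRT document."""
    blocks = []
    for i, (start, end, text) in enumerate(segments, 1):
        blocks.append("%d\n%s --> %s\n%s" % (i, to_srt_time(start), to_srt_time(end), text.strip()))
    return "\n\n".join(blocks) + "\n"

print(to_srt([(0.00, 4.74, "പാലം കടുക്കുവോളം നാരായണ പാലം കടന്നാലൊ കൂരായണ")]))
```

With faster-whisper, the triples can be built as `(segment.start, segment.end, segment.text)` from the transcription loop above.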
## Conversion Details

This conversion was made possible by the wonderful [CTranslate2 library](https://github.com/OpenNMT/CTranslate2), leveraging the [Transformers converter for OpenAI Whisper](https://opennmt.net/CTranslate2/guides/transformers.html#whisper). The original model was converted with the following command:

```
ct2-transformers-converter --model thennal/whisper-medium-ml --output_dir vegam-whisper-medium-ml
```

## Many Thanks to

- Creators of CTranslate2 and faster-whisper
- Thennal D K
- Santhosh Thottingal