whisperkittools generated README.md
README.md
CHANGED
@@ -17,34 +17,34 @@ tags:
 ## Dataset: `librispeech`
 Short-form Audio (<30s/clip) - 5 hours of English audiobook clips

-| | WER (β) | QoI (β) | File Size (MB) |
-
-| WhisperOpenAIAPI/openai_whisper-large-v2 | [2.35](https://hf.co/datasets/argmaxinc/whisperkit-evals/tree/main/WhisperOpenAIAPI/openai_whisper-large-v2/librispeech) | 100 | 3100 |
-| [WhisperKit/openai_whisper-large-v3](https://hf.co/argmaxinc/whisperkit-coreml/tree/main/openai_whisper-large-v3) | [2.04](https://hf.co/datasets/argmaxinc/whisperkit-evals/tree/main/WhisperKit/openai_whisper-large-v3/librispeech) | 95.2 | 3100 |
-| [WhisperKit/openai_whisper-large-v3_turbo](https://hf.co/argmaxinc/whisperkit-coreml/tree/main/openai_whisper-large-v3_turbo) | [2.03](https://hf.co/datasets/argmaxinc/whisperkit-evals/tree/main/WhisperKit/openai_whisper-large-v3_turbo/librispeech) | 95.4 | 3100 |
-| [WhisperKit/openai_whisper-large-v3_turbo_1018MB](https://hf.co/argmaxinc/whisperkit-coreml/tree/main/openai_whisper-large-v3_turbo_1018MB) | [1.99](https://hf.co/datasets/argmaxinc/whisperkit-evals/tree/main/WhisperKit/openai_whisper-large-v3_turbo_1018MB/librispeech) | 94.8 | 1018 |
-| [WhisperKit/openai_whisper-large-v2](https://hf.co/argmaxinc/whisperkit-coreml/tree/main/openai_whisper-large-v2) | [2.77](https://hf.co/datasets/argmaxinc/whisperkit-evals/tree/main/WhisperKit/openai_whisper-large-v2/librispeech) | 96.6 | 3100 |
-| [WhisperKit/openai_whisper-large-v2_1050MB](https://hf.co/argmaxinc/whisperkit-coreml/tree/main/openai_whisper-large-v2_1050MB) | [2.81](https://hf.co/datasets/argmaxinc/whisperkit-evals/tree/main/WhisperKit/openai_whisper-large-v2_1050MB/librispeech) | 95 | 1050 |
-| [WhisperKit/openai_whisper-large-v2_turbo](https://hf.co/argmaxinc/whisperkit-coreml/tree/main/openai_whisper-large-v2_turbo) | [2.76](https://hf.co/datasets/argmaxinc/whisperkit-evals/tree/main/WhisperKit/openai_whisper-large-v2_turbo/librispeech) | 96.6 | 3100 |
-| [WhisperKit/openai_whisper-large-v2_turbo_1022MB](https://hf.co/argmaxinc/whisperkit-coreml/tree/main/openai_whisper-large-v2_turbo_1022MB) | [2.66](https://hf.co/datasets/argmaxinc/whisperkit-evals/tree/main/WhisperKit/openai_whisper-large-v2_turbo_1022MB/librispeech) | 94.9 | 1022 |
-| [WhisperKit/openai_whisper-small.en](https://hf.co/argmaxinc/whisperkit-coreml/tree/main/openai_whisper-small.en) | [3.12](https://hf.co/datasets/argmaxinc/whisperkit-evals/tree/main/WhisperKit/openai_whisper-small.en/librispeech) | 85.8 | 483 |
-| [WhisperKit/openai_whisper-small](https://hf.co/argmaxinc/whisperkit-coreml/tree/main/openai_whisper-small) | [3.45](https://hf.co/datasets/argmaxinc/whisperkit-evals/tree/main/WhisperKit/openai_whisper-small/librispeech) | 83 | 483 |
-| [WhisperKit/openai_whisper-base.en](https://hf.co/argmaxinc/whisperkit-coreml/tree/main/openai_whisper-base.en) | [3.98](https://hf.co/datasets/argmaxinc/whisperkit-evals/tree/main/WhisperKit/openai_whisper-base.en/librispeech) | 75.3 | 145 |
-| [WhisperKit/openai_whisper-base](https://hf.co/argmaxinc/whisperkit-coreml/tree/main/openai_whisper-base) | [4.97](https://hf.co/datasets/argmaxinc/whisperkit-evals/tree/main/WhisperKit/openai_whisper-base/librispeech) | 67.2 | 145 |
-| [WhisperKit/openai_whisper-tiny.en](https://hf.co/argmaxinc/whisperkit-coreml/tree/main/openai_whisper-tiny.en) | [5.61](https://hf.co/datasets/argmaxinc/whisperkit-evals/tree/main/WhisperKit/openai_whisper-tiny.en/librispeech) | 63.9 | 66 |
-| [WhisperKit/openai_whisper-tiny](https://hf.co/argmaxinc/whisperkit-coreml/tree/main/openai_whisper-tiny) | [7.47](https://hf.co/datasets/argmaxinc/whisperkit-evals/tree/main/WhisperKit/openai_whisper-tiny/librispeech) | 52.5 | 66 |
-| whisper.cpp/openai_whisper-large-v3 | [1.97](https://hf.co/datasets/argmaxinc/whisperkit-evals/tree/main/whisper.cpp/openai_whisper-large-v3/librispeech) | 95.4 | 3100 |
+| | WER (β) | QoI (β) | File Size (MB) | Code Commit |
+|:--------------------------------------------------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------|----------:|-----------------:|:--------------|
+| WhisperOpenAIAPI/openai_whisper-large-v2 | [2.35](https://hf.co/datasets/argmaxinc/whisperkit-evals/tree/main/WhisperOpenAIAPI/openai_whisper-large-v2/librispeech) | 100 | 3100 | N/A |
+| [WhisperKit/openai_whisper-large-v3](https://hf.co/argmaxinc/whisperkit-coreml/tree/main/openai_whisper-large-v3) | [2.04](https://hf.co/datasets/argmaxinc/whisperkit-evals/tree/main/WhisperKit/openai_whisper-large-v3/librispeech) | 95.2 | 3100 | 2846fd9 |
+| [WhisperKit/openai_whisper-large-v3_turbo](https://hf.co/argmaxinc/whisperkit-coreml/tree/main/openai_whisper-large-v3_turbo) | [2.03](https://hf.co/datasets/argmaxinc/whisperkit-evals/tree/main/WhisperKit/openai_whisper-large-v3_turbo/librispeech) | 95.4 | 3100 | 2846fd9 |
+| [WhisperKit/openai_whisper-large-v3_turbo_1018MB](https://hf.co/argmaxinc/whisperkit-coreml/tree/main/openai_whisper-large-v3_turbo_1018MB) | [1.99](https://hf.co/datasets/argmaxinc/whisperkit-evals/tree/main/WhisperKit/openai_whisper-large-v3_turbo_1018MB/librispeech) | 94.8 | 1018 | 2846fd9 |
+| [WhisperKit/openai_whisper-large-v2](https://hf.co/argmaxinc/whisperkit-coreml/tree/main/openai_whisper-large-v2) | [2.77](https://hf.co/datasets/argmaxinc/whisperkit-evals/tree/main/WhisperKit/openai_whisper-large-v2/librispeech) | 96.6 | 3100 | 2846fd9 |
+| [WhisperKit/openai_whisper-large-v2_1050MB](https://hf.co/argmaxinc/whisperkit-coreml/tree/main/openai_whisper-large-v2_1050MB) | [2.81](https://hf.co/datasets/argmaxinc/whisperkit-evals/tree/main/WhisperKit/openai_whisper-large-v2_1050MB/librispeech) | 95 | 1050 | 2846fd9 |
+| [WhisperKit/openai_whisper-large-v2_turbo](https://hf.co/argmaxinc/whisperkit-coreml/tree/main/openai_whisper-large-v2_turbo) | [2.76](https://hf.co/datasets/argmaxinc/whisperkit-evals/tree/main/WhisperKit/openai_whisper-large-v2_turbo/librispeech) | 96.6 | 3100 | 2846fd9 |
+| [WhisperKit/openai_whisper-large-v2_turbo_1022MB](https://hf.co/argmaxinc/whisperkit-coreml/tree/main/openai_whisper-large-v2_turbo_1022MB) | [2.66](https://hf.co/datasets/argmaxinc/whisperkit-evals/tree/main/WhisperKit/openai_whisper-large-v2_turbo_1022MB/librispeech) | 94.9 | 1022 | 2846fd9 |
+| [WhisperKit/openai_whisper-small.en](https://hf.co/argmaxinc/whisperkit-coreml/tree/main/openai_whisper-small.en) | [3.12](https://hf.co/datasets/argmaxinc/whisperkit-evals/tree/main/WhisperKit/openai_whisper-small.en/librispeech) | 85.8 | 483 | 228630c |
+| [WhisperKit/openai_whisper-small](https://hf.co/argmaxinc/whisperkit-coreml/tree/main/openai_whisper-small) | [3.45](https://hf.co/datasets/argmaxinc/whisperkit-evals/tree/main/WhisperKit/openai_whisper-small/librispeech) | 83 | 483 | 228630c |
+| [WhisperKit/openai_whisper-base.en](https://hf.co/argmaxinc/whisperkit-coreml/tree/main/openai_whisper-base.en) | [3.98](https://hf.co/datasets/argmaxinc/whisperkit-evals/tree/main/WhisperKit/openai_whisper-base.en/librispeech) | 75.3 | 145 | 228630c |
+| [WhisperKit/openai_whisper-base](https://hf.co/argmaxinc/whisperkit-coreml/tree/main/openai_whisper-base) | [4.97](https://hf.co/datasets/argmaxinc/whisperkit-evals/tree/main/WhisperKit/openai_whisper-base/librispeech) | 67.2 | 145 | 228630c |
+| [WhisperKit/openai_whisper-tiny.en](https://hf.co/argmaxinc/whisperkit-coreml/tree/main/openai_whisper-tiny.en) | [5.61](https://hf.co/datasets/argmaxinc/whisperkit-evals/tree/main/WhisperKit/openai_whisper-tiny.en/librispeech) | 63.9 | 66 | 228630c |
+| [WhisperKit/openai_whisper-tiny](https://hf.co/argmaxinc/whisperkit-coreml/tree/main/openai_whisper-tiny) | [7.47](https://hf.co/datasets/argmaxinc/whisperkit-evals/tree/main/WhisperKit/openai_whisper-tiny/librispeech) | 52.5 | 66 | 228630c |
+| whisper.cpp/openai_whisper-large-v3 | [1.97](https://hf.co/datasets/argmaxinc/whisperkit-evals/tree/main/whisper.cpp/openai_whisper-large-v3/librispeech) | 95.4 | 3100 | 25d313b |

 ## Dataset: `earnings22`
 Long-Form Audio (>1hr/clip) - 120 hours of earnings call recordings in English with various accents

-| | WER (β) | QoI (β) | File Size (MB) |
-
-| WhisperOpenAIAPI/openai_whisper-large-v2 | [16.27](https://hf.co/datasets/argmaxinc/whisperkit-evals/tree/main/WhisperOpenAIAPI/openai_whisper-large-v2/earnings22) | 100 | 3100 |
-| [WhisperKit/openai_whisper-large-v3](https://hf.co/argmaxinc/whisperkit-coreml/tree/main/openai_whisper-large-v3) | [15.17](https://hf.co/datasets/argmaxinc/whisperkit-evals/tree/main/WhisperKit/openai_whisper-large-v3/earnings22) | 58.5 | 3100 |
-| [WhisperKit/openai_whisper-base.en](https://hf.co/argmaxinc/whisperkit-coreml/tree/main/openai_whisper-base.en) | [23.49](https://hf.co/datasets/argmaxinc/whisperkit-evals/tree/main/WhisperKit/openai_whisper-base.en/earnings22) | 6.5 | 145 |
-| [WhisperKit/openai_whisper-tiny.en](https://hf.co/argmaxinc/whisperkit-coreml/tree/main/openai_whisper-tiny.en) | [28.64](https://hf.co/datasets/argmaxinc/whisperkit-evals/tree/main/WhisperKit/openai_whisper-tiny.en/earnings22) | 5.7 | 66 |
-| whisper.cpp/openai_whisper-large-v3 | [33.58](https://hf.co/datasets/argmaxinc/whisperkit-evals/tree/main/whisper.cpp/openai_whisper-large-v3/earnings22) | 6.5 | 3100 |
+| | WER (β) | QoI (β) | File Size (MB) | Code Commit |
+|:------------------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------|----------:|-----------------:|:--------------|
+| WhisperOpenAIAPI/openai_whisper-large-v2 | [16.27](https://hf.co/datasets/argmaxinc/whisperkit-evals/tree/main/WhisperOpenAIAPI/openai_whisper-large-v2/earnings22) | 100 | 3100 | N/A |
+| [WhisperKit/openai_whisper-large-v3](https://hf.co/argmaxinc/whisperkit-coreml/tree/main/openai_whisper-large-v3) | [15.17](https://hf.co/datasets/argmaxinc/whisperkit-evals/tree/main/WhisperKit/openai_whisper-large-v3/earnings22) | 58.5 | 3100 | 2846fd9 |
+| [WhisperKit/openai_whisper-base.en](https://hf.co/argmaxinc/whisperkit-coreml/tree/main/openai_whisper-base.en) | [23.49](https://hf.co/datasets/argmaxinc/whisperkit-evals/tree/main/WhisperKit/openai_whisper-base.en/earnings22) | 6.5 | 145 | dda6571 |
+| [WhisperKit/openai_whisper-tiny.en](https://hf.co/argmaxinc/whisperkit-coreml/tree/main/openai_whisper-tiny.en) | [28.64](https://hf.co/datasets/argmaxinc/whisperkit-evals/tree/main/WhisperKit/openai_whisper-tiny.en/earnings22) | 5.7 | 66 | dda6571 |
+| whisper.cpp/openai_whisper-large-v3 | [33.58](https://hf.co/datasets/argmaxinc/whisperkit-evals/tree/main/whisper.cpp/openai_whisper-large-v3/earnings22) | 6.5 | 3100 | 25d313b |


 We believe that rigorously measuring the quality of inference is necessary for developers and