drepic committed · Commit 5791b5c · verified · 1 parent: d163045

Update README.md

Files changed (1): README.md (+108 -105)
README.md CHANGED

---
license: apache-2.0
language:
- ja
metrics:
- cer
- wer
base_model:
- openai/whisper-medium
tags:
- ctranslate2
- faster-whisper
- whisper
model-index:
- name: whisper-medium-jp-ct2
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: mozilla-foundation/common_voice_17_0 (ja)
      type: mozilla-foundation/common_voice_17_0
      config: ja
      split: test
      args:
        language: ja
    metrics:
    - name: CER
      type: cer
      value: 0.18572446886192148
---

> **This repository contains the CTranslate2 export of the fine-tuned model.**
>
> • Base Transformers model: [drepic/whisper-medium-jp](https://huggingface.co/drepic/whisper-medium-jp)
> • Use with `faster-whisper`:
>
> ```python
> from faster_whisper import WhisperModel
> model = WhisperModel("drepic/whisper-medium-jp-ct2", device="cuda", compute_type="float16")
> ```

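For a fuller picture of the API, here is a minimal transcription sketch built on the same loader (the audio path and `beam_size` are illustrative; `language="ja"` pins the decoder to Japanese):

```python
from faster_whisper import WhisperModel

# float16 assumes a CUDA GPU; device="cpu" with compute_type="int8" is a
# common fallback.
model = WhisperModel("drepic/whisper-medium-jp-ct2", device="cuda", compute_type="float16")

# transcribe() returns a lazy generator of segments plus run metadata.
# "sample.wav" is a placeholder for your own audio file.
segments, info = model.transcribe("sample.wav", language="ja", beam_size=5)

print(f"Detected language: {info.language} (p={info.language_probability:.2f})")
for segment in segments:
    print(f"[{segment.start:6.2f}s -> {segment.end:6.2f}s] {segment.text}")
```
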
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

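For reference, a CTranslate2 export like this one is typically produced from the base Transformers checkpoint with CTranslate2's converter. The exact conversion settings used for this repository are not documented, so treat the snippet below as a sketch under that assumption:

```python
import ctranslate2

# Convert the fine-tuned Transformers checkpoint to CTranslate2 format.
# The output directory and float16 quantization are assumptions, not the
# documented settings for this repo.
converter = ctranslate2.converters.TransformersConverter("drepic/whisper-medium-jp")
converter.convert("whisper-medium-jp-ct2", quantization="float16")
```

The `ct2-transformers-converter` command-line tool that ships with CTranslate2 performs the same conversion.
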
# Other fine-tunes
- Want something more lightweight? Try [drepic/whisper-small-jp-ct2](https://huggingface.co/drepic/whisper-small-jp-ct2)

# whisper-medium-jp

This model is a fine-tuned version of [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) on an unknown dataset.
It achieves the following results on the evaluation set (a sketch of how these metrics are computed follows the list):
- Loss: 0.4828
- WER: 0.2254
- CER: 0.2254

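The error rates above can be reproduced in form (though not in value, since the evaluation transcripts are not published) with the Hugging Face `evaluate` package; a minimal sketch with made-up reference/prediction pairs:

```python
import evaluate  # these metrics also require the jiwer package

# Hypothetical transcripts; the card does not ship the actual eval data.
references = ["今日はいい天気です"]
predictions = ["今日は良い天気です"]

cer = evaluate.load("cer")  # character error rate: the natural metric for Japanese
wer = evaluate.load("wer")  # word error rate: expects whitespace-separated words

print("CER:", cer.compute(predictions=predictions, references=references))
print("WER:", wer.compute(predictions=predictions, references=references))
```

Note that Japanese is written without spaces, so WER is only meaningful after word segmentation (e.g. with MeCab); that may be why WER and CER coincide throughout this card.
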
## Model description

This fine-tune is better suited than the base model for transcribing Japanese YouTube content.

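Because the target material is long-form YouTube audio, faster-whisper's built-in voice-activity-detection filter is worth enabling so silence and music are skipped before decoding; a sketch with illustrative (not tuned) parameters:

```python
from faster_whisper import WhisperModel

model = WhisperModel("drepic/whisper-medium-jp-ct2", device="cuda", compute_type="float16")

# vad_filter drops non-speech spans; min_silence_duration_ms sets how long a
# pause must be before it is treated as a cut point.
segments, _ = model.transcribe(
    "episode.mp3",  # placeholder path to a long-form recording
    language="ja",
    vad_filter=True,
    vad_parameters={"min_silence_duration_ms": 500},
)
for segment in segments:
    print(f"[{segment.start:8.2f}s] {segment.text}")
```
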
## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto `Seq2SeqTrainingArguments` follows the list):
- learning_rate: 4e-06
- train_batch_size: 4
- eval_batch_size: 2
- seed: 42
- distributed_type: multi-GPU
- num_devices: 2
- gradient_accumulation_steps: 2
- total_train_batch_size: 16
- total_eval_batch_size: 4
- optimizer: fused AdamW (`adamw_torch_fused`) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 400
- num_epochs: 15
- mixed_precision_training: Native AMP

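For readers who want to set up a comparable run, the list above maps roughly onto the following `transformers` configuration. This is a reconstruction from the hyperparameters, not the author's actual training script; `output_dir` and `fp16` (implied by "Native AMP") are assumptions:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="whisper-medium-jp",   # assumed
    learning_rate=4e-6,
    per_device_train_batch_size=4,    # x 2 GPUs x 2 accumulation = 16 effective
    per_device_eval_batch_size=2,
    seed=42,
    gradient_accumulation_steps=2,
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=400,
    num_train_epochs=15,
    fp16=True,                        # Native AMP
)
```
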
### Training results

| Training Loss | Epoch | Step  | Validation Loss | WER    | CER    |
|:-------------:|:-----:|:-----:|:---------------:|:------:|:------:|
| 0.5341        | 1.0   | 7155  | 0.5321          | 0.2416 | 0.2416 |
| 0.5023        | 2.0   | 14310 | 0.5143          | 0.2369 | 0.2369 |
| 0.4990        | 3.0   | 21465 | 0.5063          | 0.2337 | 0.2337 |
| 0.4773        | 4.0   | 28620 | 0.5010          | 0.2310 | 0.2310 |
| 0.4775        | 5.0   | 35775 | 0.4944          | 0.2289 | 0.2289 |
| 0.4709        | 6.0   | 42930 | 0.4886          | 0.2288 | 0.2288 |
| 0.4907        | 7.0   | 50085 | 0.4870          | 0.2271 | 0.2271 |
| 0.4855        | 8.0   | 57240 | 0.4868          | 0.2261 | 0.2261 |
| 0.4487        | 9.0   | 64395 | 0.4828          | 0.2254 | 0.2254 |

### Framework versions

- Transformers 4.56.1
- PyTorch 2.8.0+cu128
- Datasets 4.0.0
- Tokenizers 0.22.0