AudranB committed (verified)
Commit f84a9d3
Parent: 8469512

Update README.md

Files changed (1)
  1. README.md +37 -3
README.md CHANGED
@@ -113,8 +113,6 @@ Compared to the base model, this version:
113
  - Does **not** include punctuation or uppercase letters.
114
  - Was trained on **9,500+ hours** of diverse, manually transcribed French speech.
115
 
116
- The training code is available in the [nemo asr training repository](https://github.com/linagora-labs/nemo_asr_training).
117
-
118
  ---
119
 
120
  ## Performance
@@ -174,7 +172,43 @@ asr_model.change_decoding_strategy(decoder_type="ctc")
174
  asr_model.transcribe([audio_path])
175
  ```
176
 
177
- ## Datasets
175
+ ## Training Details
176
+
177
+ The training code is available in the [nemo_asr_training repository](https://github.com/linagora-labs/nemo_asr_training).
178
+ The full configuration used for fine-tuning is available [here](https://github.com/linagora-labs/nemo_asr_training/blob/main/fastconformer/yamls/nvidia_stt_fr_fastconformer_hybrid_large_pc.yaml).
179
+
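As an illustration, here is a minimal sketch of inspecting that YAML with OmegaConf (the configuration library NeMo relies on). The local file path and the `model.optim` key are assumptions based on typical NeMo fine-tuning configs, not verified against this exact file:

```python
from omegaconf import OmegaConf

# Assumed path inside a local clone of nemo_asr_training; adjust as needed.
cfg = OmegaConf.load("fastconformer/yamls/nvidia_stt_fr_fastconformer_hybrid_large_pc.yaml")

# NeMo fine-tuning configs usually keep optimizer settings under `model.optim`
# (an assumption here, not checked against this particular YAML).
print(OmegaConf.to_yaml(cfg.model.optim))
```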
180
+ ### Hardware
181
+ - 1× NVIDIA H100 GPU (80 GB)
182
+
183
+ ### Training Configuration
184
+ - Precision: BF16 mixed precision
185
+ - Max training steps: 100,000
186
+ - Gradient accumulation: 4 batches
187
+
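For reference, a minimal sketch of how these settings typically map onto the PyTorch Lightning `Trainer` that NeMo builds on; the actual trainer wiring in nemo_asr_training may differ:

```python
import pytorch_lightning as pl

trainer = pl.Trainer(
    devices=1,
    accelerator="gpu",
    precision="bf16-mixed",     # BF16 mixed precision
    max_steps=100_000,          # max training steps
    accumulate_grad_batches=4,  # gradient accumulation over 4 batches
)
```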
188
+ ### Tokenizer
189
+ - Type: SentencePiece
190
+ - Vocabulary size: 1,024 tokens
191
+
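A comparable tokenizer could be trained directly with the `sentencepiece` library, as in the sketch below; the input file and model prefix are placeholders, and the actual tokenizer was produced with NeMo's tokenizer tooling:

```python
import sentencepiece as spm

spm.SentencePieceTrainer.train(
    input="train_transcripts.txt",       # placeholder: one transcript per line
    model_prefix="tokenizer_spe_v1024",  # placeholder output name
    vocab_size=1024,                     # matches the vocabulary size above
    model_type="bpe",                    # assumption: a BPE SentencePiece model
)
```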
192
+ ### Optimization
193
+ - Optimizer: `AdamW`
194
+ - Learning rate: `1e-5`
195
+ - Betas: `[0.9, 0.98]`
196
+ - Weight decay: `1e-3`
197
+ - Scheduler: `CosineAnnealing`
198
+ - Warmup steps: 10,000
199
+ - Minimum learning rate: `1e-6`
200
+
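A rough plain-PyTorch equivalent of these optimizer settings is sketched below for illustration; the training run itself uses NeMo's optimizer and `CosineAnnealing` scheduler, and the warmup-plus-cosine `LambdaLR` here only approximates that schedule (`model` is a stand-in for the ASR model):

```python
import math
import torch

model = torch.nn.Linear(80, 1024)  # stand-in for the actual ASR model
optimizer = torch.optim.AdamW(
    model.parameters(), lr=1e-5, betas=(0.9, 0.98), weight_decay=1e-3
)

max_steps, warmup_steps = 100_000, 10_000
base_lr, min_lr = 1e-5, 1e-6

def lr_lambda(step: int) -> float:
    # Linear warmup, then cosine annealing from base_lr down to min_lr.
    if step < warmup_steps:
        return step / warmup_steps
    progress = min(1.0, (step - warmup_steps) / (max_steps - warmup_steps))
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return (min_lr + (base_lr - min_lr) * cosine) / base_lr

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)
```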
201
+ ### Data Setup
202
+ - 6 duration buckets (ranging from 0.1s to 30s)
203
+ - Batch sizes per bucket:
204
+ - Bucket 1 (shortest segments): batch size 80
205
+ - Bucket 2: batch size 76
206
+ - Bucket 3: batch size 72
207
+ - Bucket 4: batch size 68
208
+ - Bucket 5: batch size 64
209
+ - Bucket 6 (longest segments): batch size 60
210
+
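The bucket boundaries themselves are not listed above; the sketch below only illustrates mapping an utterance duration to one of the 6 buckets and its batch size, with evenly spaced edges as an assumption:

```python
import bisect

BUCKET_EDGES = [5.0, 10.0, 15.0, 20.0, 25.0, 30.0]  # assumed upper bounds (seconds), evenly spaced
BATCH_SIZES = [80, 76, 72, 68, 64, 60]              # per-bucket batch sizes from the list above

def bucket_batch_size(duration_s: float) -> int:
    """Return the batch size for an utterance of the given duration."""
    if not 0.1 <= duration_s <= 30.0:
        raise ValueError("duration outside the 0.1 s to 30 s training range")
    return BATCH_SIZES[bisect.bisect_left(BUCKET_EDGES, duration_s)]

print(bucket_batch_size(3.2))   # shortest bucket -> 80
print(bucket_batch_size(28.0))  # longest bucket  -> 60
```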
211
+ ### Training Datasets
212
 
213
  The data were transformed, processed and converted using [NeMo tools from the SSAK repository](https://github.com/linagora-labs/ssak/tree/main/tools/nemo)
214