# Whisper Large v3 Turbo (Albanian Fine-Tuned) - v2

This is a fine-tuned version of the Whisper Large v3 Turbo model, optimized for Albanian speech-to-text transcription. It achieves a Word Error Rate (WER) of **6.98%** on a held-out evaluation set.

## Model Details
- **Base Model**: `openai/whisper-large-v3-turbo`
- **Language**: Albanian (`sq`)

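A minimal inference sketch using the `transformers` pipeline API. The Hub repo id and audio filename below are placeholders, not confirmed by this card:

```python
import torch
from transformers import pipeline

# Hypothetical Hub id for illustration; replace with this model's actual repo id.
MODEL_ID = "Flutra/whisper-large-v3-turbo-sq-v2"

asr = pipeline(
    "automatic-speech-recognition",
    model=MODEL_ID,
    torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
    device="cuda:0" if torch.cuda.is_available() else "cpu",
)

# Transcribe a local Albanian audio file; pinning the language
# avoids Whisper's automatic language detection.
result = asr("sample_sq.wav", generate_kwargs={"language": "albanian"})
print(result["text"])
```
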
## Training Dataset
- **Source**: Mozilla Common Voice version 19 (available on the Hugging Face Hub as `Kushtrim/common_voice_19_sq`)
- **Description**: Audio clips of spoken Albanian, ranging from 5 to 30 seconds.

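A quick way to inspect the data; the split and column names below follow Common Voice conventions and are assumptions, so check the dataset card:

```python
from datasets import Audio, load_dataset

# Load the Albanian Common Voice 19 set named above
# (split name "train" is an assumption).
cv_sq = load_dataset("Kushtrim/common_voice_19_sq", split="train")

# Whisper's feature extractor expects 16 kHz audio.
cv_sq = cv_sq.cast_column("audio", Audio(sampling_rate=16_000))

# "sentence" is the usual Common Voice transcript column; also an assumption.
print(cv_sq[0]["sentence"])
```
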
## Training Details
The model was fine-tuned on an NVIDIA A100 GPU (40GB) using the `transformers` library. Below are the key training arguments:

| Argument | Value | Description |
|--------------------------------|---------|-----------------------------------------------------------|
| `per_device_train_batch_size`  | 8       | Training batch size per GPU                               |
| `per_device_eval_batch_size`   | 2       | Evaluation batch size per GPU                             |
| `gradient_accumulation_steps`  | 1       | Steps to accumulate gradients (effective batch size = 8)  |
| `num_train_epochs`             | 3       | Number of training epochs                                 |
| `learning_rate`                | 1e-5    | Initial learning rate                                     |
| `warmup_steps`                 | 300     | Number of learning-rate warmup steps                      |
| `evaluation_strategy`          | "steps" | Evaluate every `eval_steps` during training               |
| `eval_steps`                   | 250     | Frequency of evaluation (every 250 steps)                 |
| `fp16`                         | True    | Use mixed-precision training (16-bit floats)              |

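For reference, the table above corresponds to a `Seq2SeqTrainingArguments` configuration roughly like the sketch below; `output_dir` and `predict_with_generate` are not stated in this card and are assumptions:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-large-v3-turbo-sq",  # assumed; not stated in this card
    per_device_train_batch_size=8,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=1,
    num_train_epochs=3,
    learning_rate=1e-5,
    warmup_steps=300,
    evaluation_strategy="steps",   # argument name as of transformers 4.38
    eval_steps=250,
    fp16=True,
    predict_with_generate=True,    # assumed: required to compute WER during eval
)
```
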
- **Total Steps**: ~3,540 (completed 3,500)
- **Hardware**: NVIDIA A100 (40GB)
- **Libraries**:
  - `transformers==4.38.2`
  - `torch==2.2.1`

## Performance

| Step | Training Loss | Validation Loss | WER    |
|------|---------------|-----------------|--------|
| 250  | 0.4744        | 0.3991          | 34.03% |
| 500  | 0.3421        | 0.3426          | 30.42% |
| 750  | 0.2871        | 0.2808          | 26.09% |
| 1000 | 0.2401        | 0.2258          | 21.31% |
| 1250 | 0.1809        | 0.1998          | 19.15% |
| 1500 | 0.1142        | 0.1827          | 17.33% |
| 1750 | 0.1051        | 0.1611          | 15.19% |
| 2000 | 0.0930        | 0.1464          | 13.82% |
| 2250 | 0.0827        | 0.1313          | 11.79% |
| 2500 | 0.0420        | 0.1139          | 10.50% |
| 2750 | 0.0330        | 0.1124          | 9.58%  |
| 3000 | 0.0255        | 0.1006          | 8.38%  |
| 3250 | 0.0256        | 0.0905          | 7.48%  |
| 3500 | 0.0204        | 0.0889          | 6.98%  |

- **Final WER**: **6.98%** (at step 3500)
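
WER is the standard word error rate. A minimal sketch of how such a score can be computed with the `evaluate` library; the strings below are toy examples, not drawn from the evaluation set:

```python
import evaluate

wer_metric = evaluate.load("wer")

# Toy strings for illustration only.
predictions = ["përshëndetje si jeni"]
references = ["përshëndetje si jeni sot"]

# One deleted word out of four reference words -> WER = 0.25.
print(wer_metric.compute(predictions=predictions, references=references))
```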