Training in progress, step 2000

Files changed (9)

README.md +14 -43
all_results.json +12 -12
config.json +1 -1
eval_results.json +7 -7
indicwav2vec_trainwtags_MUCS_warmup2000_s300shuff100_2217857.out +0 -0
predictionswtags_indicw2v_ad0_3_hd_02_featd_0_2_lr6e-4_warmup2000_s300_shuff100_v2.txt +0 -0
train_results.json +6 -6
trainer_state.json +0 -0
training_args.bin +1 -1

README.md CHANGED

@@ -11,14 +11,14 @@ model-index:
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/priyanshipal/huggingface/runs/ovuq4xne)
 # output
 This model was trained from scratch on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.0007
-- Cer: 0.3962
-- Wer: 0.6069
 ## Model description
@@ -45,49 +45,20 @@ The following hyperparameters were used during training:
 - total_train_batch_size: 32
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- lr_scheduler_warmup_steps: 1500
-- training_steps: 3500
 - mixed_precision_training: Native AMP
 ### Training results
-| Training Loss | Epoch  | Step | Validation Loss | Cer    | Wer    |
-|:-------------:|:------:|:----:|:---------------:|:------:|:------:|
-| 5.2909        | 0.16   | 100  | 5.7883          | 0.9933 | 1.0    |
-| 4.4471        | 0.32   | 200  | 3.7597          | 0.9812 | 1.0    |
-| 3.6763        | 0.48   | 300  | 3.7718          | 0.9769 | 1.0    |
-| 3.1272        | 0.64   | 400  | 3.4647          | 0.9826 | 1.0    |
-| 3.0699        | 0.8    | 500  | 2.7480          | 0.6002 | 0.8337 |
-| 1.7389        | 0.96   | 600  | 2.4468          | 0.4717 | 0.6782 |
-| 1.7013        | 1.12   | 700  | 2.1151          | 0.4489 | 0.6820 |
-| 1.2995        | 1.28   | 800  | 2.0757          | 0.4160 | 0.6245 |
-| 1.6852        | 1.44   | 900  | 1.9870          | 0.4112 | 0.6154 |
-| 1.3997        | 1.6    | 1000 | 2.0007          | 0.3962 | 0.6069 |
-| 1.768         | 1.76   | 1100 | 2.0712          | 0.4123 | 0.6448 |
-| 2.5192        | 1.92   | 1200 | 2.5729          | 0.6884 | 0.9178 |
-| 2.6077        | 2.08   | 1300 | 2.4078          | 0.4816 | 0.8066 |
-| 2.6928        | 2.24   | 1400 | 2.3596          | 0.4904 | 0.7915 |
-| 0.0           | 2.4    | 1500 | 2.4471          | 0.6019 | 0.8782 |
-| 0.0           | 2.56   | 1600 | 2.4490          | 0.6112 | 0.8888 |
-| 0.0           | 2.7200 | 1700 | nan             | 1.0    | 1.0    |
-| 0.0           | 2.88   | 1800 | nan             | 1.0    | 1.0    |
-| 0.0           | 3.04   | 1900 | nan             | 1.0    | 1.0    |
-| 0.0           | 3.2    | 2000 | nan             | 1.0    | 1.0    |
-| 0.0           | 3.36   | 2100 | nan             | 1.0    | 1.0    |
-| 0.0           | 3.52   | 2200 | nan             | 1.0    | 1.0    |
-| 0.0           | 3.68   | 2300 | nan             | 1.0    | 1.0    |
-| 0.0           | 3.84   | 2400 | nan             | 1.0    | 1.0    |
-| 0.0           | 4.0    | 2500 | nan             | 1.0    | 1.0    |
-| 0.0           | 4.16   | 2600 | nan             | 1.0    | 1.0    |
-| 0.0           | 4.32   | 2700 | nan             | 1.0    | 1.0    |
-| 0.0           | 4.48   | 2800 | nan             | 1.0    | 1.0    |
-| 0.0           | 4.64   | 2900 | nan             | 1.0    | 1.0    |
-| 0.0           | 4.8    | 3000 | nan             | 1.0    | 1.0    |
-| 0.0           | 4.96   | 3100 | nan             | 1.0    | 1.0    |
-| 0.0           | 5.12   | 3200 | nan             | 1.0    | 1.0    |
-| 0.0           | 5.28   | 3300 | nan             | 1.0    | 1.0    |
-| 0.0           | 5.44   | 3400 | nan             | 1.0    | 1.0    |
-| 0.0           | 5.6    | 3500 | nan             | 1.0    | 1.0    |
 ### Framework versions

 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/priyanshipal/huggingface/runs/hvo6b5jz)
 # output
 This model was trained from scratch on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 1.9947
+- Cer: 0.4133
+- Wer: 0.6195
 ## Model description
 - total_train_batch_size: 32
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 2000
+- training_steps: 6000
 - mixed_precision_training: Native AMP
 ### Training results
+| Training Loss | Epoch | Step | Validation Loss | Cer    | Wer    |
+|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|
+| 1.5851        | 1.6   | 1000 | 1.9947          | 0.4133 | 0.6195 |
+| 1.8352        | 3.2   | 2000 | 2.1491          | 0.4724 | 0.7895 |
+| 2.3755        | 4.8   | 3000 | 2.3793          | 0.4433 | 0.7270 |
+| 3.3134        | 6.4   | 4000 | 3.3025          | 0.5204 | 0.8033 |
+| 3.4098        | 8.0   | 5000 | 3.2885          | 0.5196 | 0.8050 |
+| 3.1155        | 9.6   | 6000 | 3.2885          | 0.5196 | 0.8050 |
 ### Framework versions
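
The README hunk above changes `lr_scheduler_warmup_steps` from 1500 to 2000 and `training_steps` from 3500 to 6000 while keeping `lr_scheduler_type: linear`. As a rough illustration of what that implies, the sketch below computes the learning-rate multiplier under the assumption that the schedule ramps linearly from 0 to 1 over the warmup steps and then decays linearly to 0 at the final step. The function name `linear_warmup_multiplier` is hypothetical and is not taken from the training code; the peak learning rate is not shown in this diff, so only the multiplier is computed.

```python
def linear_warmup_multiplier(step: int, warmup_steps: int, total_steps: int) -> float:
    """Multiplier applied to the peak learning rate at a given optimizer step.

    Assumed shape: linear ramp 0 -> 1 during warmup, then linear decay 1 -> 0
    until total_steps. This is an illustration, not the run's actual code.
    """
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))


# Evaluation steps listed in the new results table (warmup 2000, 6000 steps).
for step in (1000, 2000, 3000, 4000, 5000, 6000):
    print(step, linear_warmup_multiplier(step, warmup_steps=2000, total_steps=6000))
```

Under these assumptions, the step-1000 evaluation in the new run falls halfway through warmup and the peak rate is reached at step 2000, the step named in the commit title.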

all_results.json CHANGED

@@ -1,16 +1,16 @@
 {
-    "epoch": 5.6,
-    "eval_cer": 0.39621716426834613,
-    "eval_loss": 2.00065016746521,
-    "eval_runtime": 157.5611,
     "eval_samples": 3136,
-    "eval_samples_per_second": 19.903,
-    "eval_steps_per_second": 1.244,
-    "eval_wer": 0.6068801160501502,
-    "total_flos": 2.158464150901847e+19,
-    "train_loss": 1.3790143143790108,
-    "train_runtime": 12492.843,
     "train_samples": 20000,
-    "train_samples_per_second": 8.965,
-    "train_steps_per_second": 0.28
 }

 {
+    "epoch": 9.6,
+    "eval_cer": 0.4133216406903974,
+    "eval_loss": 1.9947007894515991,
+    "eval_runtime": 158.2803,
     "eval_samples": 3136,
+    "eval_samples_per_second": 19.813,
+    "eval_steps_per_second": 1.238,
+    "eval_wer": 0.6194798466480157,
+    "total_flos": 3.700768773245485e+19,
+    "train_loss": 2.825458660195271,
+    "train_runtime": 12931.7591,
     "train_samples": 20000,
+    "train_samples_per_second": 14.847,
+    "train_steps_per_second": 0.464
 }
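
The `epoch` and throughput fields in `all_results.json` can be cross-checked against the README hyperparameters. The sketch below assumes that every optimizer step consumes exactly `total_train_batch_size` (32) samples and that throughput is samples or steps divided by `train_runtime`; the helper name `run_stats` is hypothetical.

```python
def run_stats(steps: int, batch_size: int, train_samples: int, runtime_s: float) -> dict:
    """Derive epoch count and throughput from step count, batch size, and runtime."""
    samples_seen = steps * batch_size
    return {
        "epoch": samples_seen / train_samples,
        "train_samples_per_second": round(samples_seen / runtime_s, 3),
        "train_steps_per_second": round(steps / runtime_s, 3),
    }


# Values taken from the diff: old run (3500 steps) and new run (6000 steps).
old_run = run_stats(steps=3500, batch_size=32, train_samples=20000, runtime_s=12492.843)
new_run = run_stats(steps=6000, batch_size=32, train_samples=20000, runtime_s=12931.7591)
print(old_run)
print(new_run)
```

With these inputs the derived values agree with the reported ones to the precision shown in the diff (epochs 5.6 and 9.6, and the per-second rates on both sides).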

config.json CHANGED

@@ -1,5 +1,5 @@
 {
-  "_name_or_path": "/scratch/elec/puhe/p/palp3/MUCS/mucs_language_segregated_data/trainwithtags_warmup1500_s300_shuff100/output",
   "activation_dropout": 0.0,
   "adapter_attn_dim": null,
   "adapter_kernel_size": 3,

 {
+  "_name_or_path": "/scratch/elec/puhe/p/palp3/MUCS/mucs_language_segregated_data/trainwithtags_warmup2000_s300_shuff100/output",
   "activation_dropout": 0.0,
   "adapter_attn_dim": null,
   "adapter_kernel_size": 3,

eval_results.json CHANGED

@@ -1,10 +1,10 @@
 {
-    "epoch": 5.6,
-    "eval_cer": 0.39621716426834613,
-    "eval_loss": 2.00065016746521,
-    "eval_runtime": 157.5611,
     "eval_samples": 3136,
-    "eval_samples_per_second": 19.903,
-    "eval_steps_per_second": 1.244,
-    "eval_wer": 0.6068801160501502
 }

 {
+    "epoch": 9.6,
+    "eval_cer": 0.4133216406903974,
+    "eval_loss": 1.9947007894515991,
+    "eval_runtime": 158.2803,
     "eval_samples": 3136,
+    "eval_samples_per_second": 19.813,
+    "eval_steps_per_second": 1.238,
+    "eval_wer": 0.6194798466480157
 }

indicwav2vec_trainwtags_MUCS_warmup2000_s300shuff100_2217857.out CHANGED

The diff for this file is too large to render. See raw diff

predictionswtags_indicw2v_ad0_3_hd_02_featd_0_2_lr6e-4_warmup2000_s300_shuff100_v2.txt CHANGED

The diff for this file is too large to render. See raw diff

train_results.json CHANGED

@@ -1,9 +1,9 @@
 {
-    "epoch": 5.6,
-    "total_flos": 2.158464150901847e+19,
-    "train_loss": 1.3790143143790108,
-    "train_runtime": 12492.843,
     "train_samples": 20000,
-    "train_samples_per_second": 8.965,
-    "train_steps_per_second": 0.28
 }

 {
+    "epoch": 9.6,
+    "total_flos": 3.700768773245485e+19,
+    "train_loss": 2.825458660195271,
+    "train_runtime": 12931.7591,
     "train_samples": 20000,
+    "train_samples_per_second": 14.847,
+    "train_steps_per_second": 0.464
 }

trainer_state.json CHANGED

The diff for this file is too large to render. See raw diff

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:225187752662e438525246e50709b06bc314c78e599731793e3282deaa2b7361
 size 5496

 version https://git-lfs.github.com/spec/v1
+oid sha256:822e5af645804ffdd10b6af8f294a5fe41b2d528c7008dddd02ad135f812ddc0
 size 5496