wandb: Currently logged in as: priyanshi-pal (priyanshipal). Use `wandb login --relogin` to force relogin
wandb: wandb version 0.17.7 is available!  To upgrade, please run:
wandb:  $ pip install wandb --upgrade
wandb: Tracking run with wandb version 0.17.6
wandb: Run data is saved locally in /scratch/elec/t405-puhe/p/palp3/MUCS/wandb/run-20240822_150154-1kodfy70
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run eval_pd2000_s300_shuff100_hindi
wandb: ⭐️ View project at https://wandb.ai/priyanshipal/huggingface
wandb: 🚀 View run at https://wandb.ai/priyanshipal/huggingface/runs/1kodfy70
/scratch/work/palp3/myenv/lib/python3.11/site-packages/transformers/training_args.py:1525: FutureWarning: `evaluation_strategy` is deprecated and will be removed in version 4.46 of 🤗 Transformers. Use `eval_strategy` instead
  warnings.warn(
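
NOTE: the fix this FutureWarning asks for is a one-word rename when constructing TrainingArguments. A minimal sketch with placeholder values (not the ones this run used):

    from transformers import TrainingArguments

    training_args = TrainingArguments(
        output_dir="./out",      # placeholder
        eval_strategy="steps",   # new name; replaces the deprecated `evaluation_strategy`
        eval_steps=500,          # placeholder
    )
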
/scratch/work/palp3/myenv/lib/python3.11/site-packages/transformers/models/auto/configuration_auto.py:957: FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers. Please use `token` instead.
  warnings.warn(
/scratch/work/palp3/myenv/lib/python3.11/site-packages/transformers/models/auto/feature_extraction_auto.py:329: FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers. Please use `token` instead.
  warnings.warn(
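
NOTE: both `use_auth_token` warnings come from `from_pretrained` calls; passing `token` instead silences them. A minimal sketch (the checkpoint id is a placeholder, not necessarily the one used in this run):

    from transformers import AutoConfig, AutoFeatureExtractor

    model_id = "facebook/wav2vec2-large-xlsr-53"  # placeholder checkpoint id
    config = AutoConfig.from_pretrained(model_id, token=True)  # was: use_auth_token=True
    feature_extractor = AutoFeatureExtractor.from_pretrained(model_id, token=True)
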
/scratch/work/palp3/myenv/lib/python3.11/site-packages/accelerate/accelerator.py:488: FutureWarning: `torch.cuda.amp.GradScaler(args...)` is deprecated. Please use `torch.amp.GradScaler('cuda', args...)` instead.
  self.scaler = torch.cuda.amp.GradScaler(**kwargs)
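
NOTE: this deprecation fires inside accelerate itself rather than user code, so upgrading accelerate is the practical fix; for reference, the constructor the warning suggests is:

    import torch

    scaler = torch.amp.GradScaler("cuda")  # replaces torch.cuda.amp.GradScaler()
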
max_steps is given, it will override any value given in num_train_epochs
Wav2Vec2CTCTokenizer(name_or_path='', vocab_size=149, model_max_length=1000000000000000019884624838656, is_fast=False, padding_side='right', truncation_side='right', special_tokens={'bos_token': '<s>', 'eos_token': '</s>', 'unk_token': '[UNK]', 'pad_token': '[PAD]'}, clean_up_tokenization_spaces=True),  added_tokens_decoder={
	147: AddedToken("[UNK]", rstrip=True, lstrip=True, single_word=False, normalized=False, special=False),
	148: AddedToken("[PAD]", rstrip=True, lstrip=True, single_word=False, normalized=False, special=False),
	149: AddedToken("<s>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	150: AddedToken("</s>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
}
CHECK MODEL PARAMS Wav2Vec2ForCTC(
  (wav2vec2): Wav2Vec2Model(
    (feature_extractor): Wav2Vec2FeatureEncoder(
      (conv_layers): ModuleList(
        (0): Wav2Vec2LayerNormConvLayer(
          (conv): Conv1d(1, 512, kernel_size=(10,), stride=(5,))
          (layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
          (activation): GELUActivation()
        )
        (1-4): 4 x Wav2Vec2LayerNormConvLayer(
          (conv): Conv1d(512, 512, kernel_size=(3,), stride=(2,))
          (layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
          (activation): GELUActivation()
        )
        (5-6): 2 x Wav2Vec2LayerNormConvLayer(
          (conv): Conv1d(512, 512, kernel_size=(2,), stride=(2,))
          (layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
          (activation): GELUActivation()
        )
      )
    )
    (feature_projection): Wav2Vec2FeatureProjection(
      (layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
      (projection): Linear(in_features=512, out_features=1024, bias=True)
      (dropout): Dropout(p=0.0, inplace=False)
    )
    (encoder): Wav2Vec2EncoderStableLayerNorm(
      (pos_conv_embed): Wav2Vec2PositionalConvEmbedding(
        (conv): ParametrizedConv1d(
          1024, 1024, kernel_size=(128,), stride=(1,), padding=(64,), groups=16
          (parametrizations): ModuleDict(
            (weight): ParametrizationList(
              (0): _WeightNorm()
            )
          )
        )
        (padding): Wav2Vec2SamePadLayer()
        (activation): GELUActivation()
      )
      (layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
      (dropout): Dropout(p=0.0, inplace=False)
      (layers): ModuleList(
        (0-23): 24 x Wav2Vec2EncoderLayerStableLayerNorm(
          (attention): Wav2Vec2SdpaAttention(
            (k_proj): Linear(in_features=1024, out_features=1024, bias=True)
            (v_proj): Linear(in_features=1024, out_features=1024, bias=True)
            (q_proj): Linear(in_features=1024, out_features=1024, bias=True)
            (out_proj): Linear(in_features=1024, out_features=1024, bias=True)
          )
          (dropout): Dropout(p=0.0, inplace=False)
          (layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
          (feed_forward): Wav2Vec2FeedForward(
            (intermediate_dropout): Dropout(p=0.0, inplace=False)
            (intermediate_dense): Linear(in_features=1024, out_features=4096, bias=True)
            (intermediate_act_fn): GELUActivation()
            (output_dense): Linear(in_features=4096, out_features=1024, bias=True)
            (output_dropout): Dropout(p=0.0, inplace=False)
          )
          (final_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
        )
      )
    )
  )
  (dropout): Dropout(p=0.0, inplace=False)
  (lm_head): Linear(in_features=1024, out_features=151, bias=True)
)
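
NOTE: the dump above is internally consistent: len(tokenizer) is the 149-entry base vocabulary plus the two extra added tokens (<s> at 149, </s> at 150), i.e. 151, which matches lm_head's out_features=151. A minimal sanity check, assuming `model` and `tokenizer` are the objects printed above:

    # len(tokenizer) counts the base vocab plus added tokens outside it
    assert model.lm_head.out_features == len(tokenizer), \
        (model.lm_head.out_features, len(tokenizer))
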
08/22/2024 15:02:06 - INFO - __main__ - *** Evaluate ***
/scratch/work/palp3/myenv/lib/python3.11/site-packages/transformers/models/wav2vec2/processing_wav2vec2.py:157: UserWarning: `as_target_processor` is deprecated and will be removed in v5 of Transformers. You can process your labels by using the argument `text` of the regular `__call__` method (either in the same call as your audio inputs, or in a separate call).
  warnings.warn(
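
NOTE: the replacement API this UserWarning points at is the `text` argument of the processor's regular call. A minimal sketch (variable names are placeholders):

    # deprecated:
    #   with processor.as_target_processor():
    #       labels = processor(transcript).input_ids
    # replacement: pass transcripts through `text=`; the output carries a `labels` field
    batch = processor(audio=audio_array, sampling_rate=16000, text=transcript)
    labels = batch["labels"]
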

100%|██████████| 36/36 [00:30<00:00,  1.20it/s]
Printing predictions for a few samples:
Sample 1:
  Reference: हम उनका उपयोग ऐसे ही कर सकते हैं या आवश्यकता अनुसार कुछ बदलाव करके उपयोग कर सकते हैं
  Prediction: 
Sample 2:
  Reference: अतः शीर्षक इस तरह से जोड़ सकते हैं
  Prediction: 
Sample 3:
  Reference: प्रेसेंटेशन के अंत में आपने स्लाइड की एक कॉपी बना ली है
  Prediction: 
Sample 4:
  Reference: चलिए अब फोंट्स और फोंट्स को फॉर्मेट करने के कुछ तरीके देखते हैं
  Prediction: 
Sample 5:
  Reference: यह एक डायलॉग बॉक्स खोलेगा जिसमें हम अपनी आवश्यकतानुसार फॉन्ट स्टाइल और साइज़ सेट कर सकते हैं
  Prediction: 
last Reference string यह स्क्रिप्ट लता द्वारा अनुवादित है आईआईटी मुंबई की ओर से मैं रवि कुमार अब आपसे विदा लेता हूँहमसे जुड़ने के लिए धन्यवाद
last prediction string 
***** eval metrics *****
  eval_cer                    =        1.0
  eval_loss                   =        nan
  eval_model_preparation_time =     0.0046
  eval_runtime                = 0:00:39.88
  eval_samples                =        572
  eval_samples_per_second     =      14.34
  eval_steps_per_second       =      0.902
  eval_wer                    =        1.0
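
NOTE: WER = CER = 1.0 is exactly what uniformly empty hypotheses produce, since every reference word and character counts as a deletion; the nan eval_loss alongside them suggests the failure is upstream of decoding (e.g. NaN logits), not in the metric computation itself. A minimal sketch reproducing the scores with the `evaluate` package (which wraps jiwer), using the Sample 1 reference above:

    import evaluate  # pip install evaluate jiwer

    wer = evaluate.load("wer")
    cer = evaluate.load("cer")
    refs = ["हम उनका उपयोग ऐसे ही कर सकते हैं या आवश्यकता अनुसार कुछ बदलाव करके उपयोग कर सकते हैं"]
    hyps = [""]  # every prediction in this run decoded to the empty string
    print(wer.compute(predictions=hyps, references=refs))  # 1.0
    print(cer.compute(predictions=hyps, references=refs))  # 1.0
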

training_args.bin: 100%|██████████| 5.43k/5.43k [00:00<00:00, 28.4kB/s]
wandb: \ 0.036 MB of 0.036 MB uploaded
wandb: 
wandb: Run history:
wandb:                    eval/cer ▁
wandb: eval/model_preparation_time ▁
wandb:                eval/runtime ▁
wandb:     eval/samples_per_second ▁
wandb:       eval/steps_per_second ▁
wandb:                    eval/wer ▁
wandb:                    eval_cer ▁
wandb: eval_model_preparation_time ▁
wandb:                eval_runtime ▁
wandb:                eval_samples ▁
wandb:     eval_samples_per_second ▁
wandb:       eval_steps_per_second ▁
wandb:                    eval_wer ▁
wandb:           train/global_step ▁▁
wandb: 
wandb: Run summary:
wandb:                    eval/cer 1.0
wandb:                   eval/loss nan
wandb: eval/model_preparation_time 0.0046
wandb:                eval/runtime 39.8895
wandb:     eval/samples_per_second 14.34
wandb:       eval/steps_per_second 0.902
wandb:                    eval/wer 1.0
wandb:                    eval_cer 1.0
wandb:                   eval_loss nan
wandb: eval_model_preparation_time 0.0046
wandb:                eval_runtime 39.8895
wandb:                eval_samples 572
wandb:     eval_samples_per_second 14.34
wandb:       eval_steps_per_second 0.902
wandb:                    eval_wer 1.0
wandb:           train/global_step 0
wandb: 
wandb: 🚀 View run eval_pd2000_s300_shuff100_hindi at: https://wandb.ai/priyanshipal/huggingface/runs/1kodfy70
wandb: ⭐️ View project at: https://wandb.ai/priyanshipal/huggingface
wandb: Synced 6 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s)
wandb: Find logs at: ./wandb/run-20240822_150154-1kodfy70/logs
wandb: WARNING The new W&B backend becomes opt-out in version 0.18.0; try it out with `wandb.require("core")`! See https://wandb.me/wandb-core for more information.