wandb: Currently logged in as: priyanshi-pal (priyanshipal). Use `wandb login --relogin` to force relogin
wandb: wandb version 0.17.7 is available!  To upgrade, please run:
wandb:  $ pip install wandb --upgrade
wandb: Tracking run with wandb version 0.17.6
wandb: Run data is saved locally in /scratch/elec/t405-puhe/p/palp3/MUCS/wandb/run-20240822_151437-2b363w6i
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run eval_pd2000_s300_shuff100_hindi
wandb: ⭐️ View project at https://wandb.ai/priyanshipal/huggingface
wandb: πŸš€ View run at https://wandb.ai/priyanshipal/huggingface/runs/2b363w6i
/scratch/work/palp3/myenv/lib/python3.11/site-packages/transformers/training_args.py:1525: FutureWarning: `evaluation_strategy` is deprecated and will be removed in version 4.46 of πŸ€— Transformers. Use `eval_strategy` instead
  warnings.warn(
/scratch/work/palp3/myenv/lib/python3.11/site-packages/transformers/models/auto/configuration_auto.py:957: FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers. Please use `token` instead.
  warnings.warn(
/scratch/work/palp3/myenv/lib/python3.11/site-packages/transformers/models/auto/feature_extraction_auto.py:329: FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers. Please use `token` instead.
  warnings.warn(
Wav2Vec2CTCTokenizer(name_or_path='', vocab_size=149, model_max_length=1000000000000000019884624838656, is_fast=False, padding_side='right', truncation_side='right', special_tokens={'bos_token': '<s>', 'eos_token': '</s>', 'unk_token': '[UNK]', 'pad_token': '[PAD]'}, clean_up_tokenization_spaces=True),  added_tokens_decoder={
	147: AddedToken("[UNK]", rstrip=True, lstrip=True, single_word=False, normalized=False, special=False),
	148: AddedToken("[PAD]", rstrip=True, lstrip=True, single_word=False, normalized=False, special=False),
	149: AddedToken("<s>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	150: AddedToken("</s>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
}
CHECK MODEL PARAMS Wav2Vec2ForCTC(
  (wav2vec2): Wav2Vec2Model(
    (feature_extractor): Wav2Vec2FeatureEncoder(
      (conv_layers): ModuleList(
        (0): Wav2Vec2LayerNormConvLayer(
          (conv): Conv1d(1, 512, kernel_size=(10,), stride=(5,))
          (layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
          (activation): GELUActivation()
        )
        (1-4): 4 x Wav2Vec2LayerNormConvLayer(
          (conv): Conv1d(512, 512, kernel_size=(3,), stride=(2,))
          (layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
          (activation): GELUActivation()
        )
        (5-6): 2 x Wav2Vec2LayerNormConvLayer(
          (conv): Conv1d(512, 512, kernel_size=(2,), stride=(2,))
          (layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
          (activation): GELUActivation()
        )
      )
    )
    (feature_projection): Wav2Vec2FeatureProjection(
      (layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
      (projection): Linear(in_features=512, out_features=1024, bias=True)
      (dropout): Dropout(p=0.0, inplace=False)
    )
    (encoder): Wav2Vec2EncoderStableLayerNorm(
      (pos_conv_embed): Wav2Vec2PositionalConvEmbedding(
        (conv): ParametrizedConv1d(
          1024, 1024, kernel_size=(128,), stride=(1,), padding=(64,), groups=16
          (parametrizations): ModuleDict(
            (weight): ParametrizationList(
              (0): _WeightNorm()
            )
          )
        )
        (padding): Wav2Vec2SamePadLayer()
        (activation): GELUActivation()
      )
      (layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
      (dropout): Dropout(p=0.0, inplace=False)
      (layers): ModuleList(
        (0-23): 24 x Wav2Vec2EncoderLayerStableLayerNorm(
          (attention): Wav2Vec2SdpaAttention(
            (k_proj): Linear(in_features=1024, out_features=1024, bias=True)
            (v_proj): Linear(in_features=1024, out_features=1024, bias=True)
            (q_proj): Linear(in_features=1024, out_features=1024, bias=True)
            (out_proj): Linear(in_features=1024, out_features=1024, bias=True)
          )
          (dropout): Dropout(p=0.0, inplace=False)
          (layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
          (feed_forward): Wav2Vec2FeedForward(
            (intermediate_dropout): Dropout(p=0.0, inplace=False)
            (intermediate_dense): Linear(in_features=1024, out_features=4096, bias=True)
            (intermediate_act_fn): GELUActivation()
            (output_dense): Linear(in_features=4096, out_features=1024, bias=True)
            (output_dropout): Dropout(p=0.0, inplace=False)
          )
          (final_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
        )
      )
    )
  )
  (dropout): Dropout(p=0.0, inplace=False)
  (lm_head): Linear(in_features=1024, out_features=151, bias=True)
)

preprocess datasets:   0%|          | 0/572 [00:00<?, ? examples/s]
preprocess datasets: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 572/572 [00:13<00:00, 42.21 examples/s]
Traceback (most recent call last):
  File "/scratch/elec/puhe/p/palp3/MUCS/eval_script_indicwav2vec.py", line 790, in <module>
    main()
  File "/scratch/elec/puhe/p/palp3/MUCS/eval_script_indicwav2vec.py", line 637, in main
    print("check the eval set length", len(vectorized_datasets["eval"]["audio_id"]))
                                           ~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^
  File "/scratch/work/palp3/myenv/lib/python3.11/site-packages/datasets/arrow_dataset.py", line 2866, in __getitem__
    return self._getitem(key)
           ^^^^^^^^^^^^^^^^^^
  File "/scratch/work/palp3/myenv/lib/python3.11/site-packages/datasets/arrow_dataset.py", line 2850, in _getitem
    pa_subtable = query_table(self._data, key, indices=self._indices)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/scratch/work/palp3/myenv/lib/python3.11/site-packages/datasets/formatting/formatting.py", line 584, in query_table
    _check_valid_column_key(key, table.column_names)
  File "/scratch/work/palp3/myenv/lib/python3.11/site-packages/datasets/formatting/formatting.py", line 521, in _check_valid_column_key
    raise KeyError(f"Column {key} not in the dataset. Current columns in the dataset: {columns}")
KeyError: "Column audio_id not in the dataset. Current columns in the dataset: ['input_values', 'input_length', 'labels']"
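The KeyError above is the usual symptom of `datasets.Dataset.map(..., remove_columns=...)` dropping every original column, so `audio_id` no longer exists by the time the script indexes it; only the mapped outputs (`input_values`, `input_length`, `labels`) remain. A minimal sketch of one way to avoid it, using plain dicts to stand in for the Dataset since the actual eval script and its map call are not shown in this log (the column names are taken from the traceback; everything else is illustrative):

```python
def preprocess(example):
    # Stand-in for the feature-extraction step: it returns only model inputs,
    # which is why map(..., remove_columns=...) leaves just these columns.
    return {"input_values": [0.0], "input_length": 1, "labels": [1]}

raw = {"audio_id": ["utt1", "utt2"], "audio": ["a.wav", "b.wav"]}

# 1) Capture the ids BEFORE preprocessing drops the column.
audio_ids = list(raw["audio_id"])

# 2) Simulate vectorized = raw.map(preprocess, remove_columns=raw.column_names):
#    every original column ("audio_id", "audio") is gone afterwards.
rows = [preprocess(ex) for ex in zip(*raw.values())]
vectorized = {k: [r[k] for r in rows] for k in rows[0]}

assert "audio_id" not in vectorized   # hence the KeyError in the log
print("check the eval set length", len(audio_ids))  # use the saved copy
```

Alternatives, if only the length is needed: call `len(vectorized_datasets["eval"])` directly, or pass `remove_columns` a list that excludes `"audio_id"` so the column survives the map.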
wandb: \ 0.011 MB of 0.028 MB uploaded
wandb: πŸš€ View run eval_pd2000_s300_shuff100_hindi at: https://wandb.ai/priyanshipal/huggingface/runs/2b363w6i
wandb: ⭐️ View project at: https://wandb.ai/priyanshipal/huggingface
wandb: Synced 6 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s)
wandb: Find logs at: ./wandb/run-20240822_151437-2b363w6i/logs
wandb: WARNING The new W&B backend becomes opt-out in version 0.18.0; try it out with `wandb.require("core")`! See https://wandb.me/wandb-core for more information.