alibabasglab committed on
Commit a2d0eb0 · verified · 1 Parent(s): 9f484dd

Upload 15 files

Files changed (15):
  1. checkpoints/.DS_Store +0 -0
  2. checkpoints/log_VoxCeleb2_lip_mossformer2_3spk/config.yaml +54 -0
  3. checkpoints/log_VoxCeleb2_lip_mossformer2_3spk/last_best_checkpoint.pt +3 -0
  4. checkpoints/log_VoxCeleb2_lip_mossformer2_3spk/last_checkpoint.pt +3 -0
  5. checkpoints/log_VoxCeleb2_lip_mossformer2_3spk/log_2024-11-12(09:27:22).txt +617 -0
  6. checkpoints/log_VoxCeleb2_lip_mossformer2_3spk/tensorboard/events.out.tfevents.1731374941.dlcf4k2knsh01f6k-master-0.28.0 +3 -0
  7. checkpoints/log_VoxCeleb2_lip_mossformer2_3spk/tensorboard/events.out.tfevents.1731483665.dlcf4k2knsh01f6k-master-0.26.0 +3 -0
  8. checkpoints/log_VoxCeleb2_lip_mossformer2_3spk/tensorboard/events.out.tfevents.1731487228.dlcf4k2knsh01f6k-master-0.26.0 +3 -0
  9. checkpoints/log_VoxCeleb2_lip_mossformer2_3spk/tensorboard/events.out.tfevents.1731492013.dlcf4k2knsh01f6k-master-0.26.0 +3 -0
  10. checkpoints/log_VoxCeleb2_lip_mossformer2_3spk/tensorboard/events.out.tfevents.1731546582.dlc199i687psn18d-master-0.26.0 +3 -0
  11. checkpoints/log_VoxCeleb2_lip_mossformer2_3spk/tensorboard/events.out.tfevents.1731548123.dlc9mw1l3osem0g9-master-0.1183013.0 +3 -0
  12. checkpoints/log_VoxCeleb2_lip_mossformer2_3spk/tensorboard/events.out.tfevents.1731548338.dlc9mw1l3osem0g9-master-0.1187202.0 +3 -0
  13. checkpoints/log_VoxCeleb2_lip_mossformer2_3spk/tensorboard/events.out.tfevents.1731557306.dlc1yk2tc721dlue-master-0.26.0 +3 -0
  14. checkpoints/log_VoxCeleb2_lip_mossformer2_3spk/tensorboard/events.out.tfevents.1732075802.dlc1g7yr5z0x4h2g-master-0.26.0 +3 -0
  15. checkpoints/log_VoxCeleb2_lip_mossformer2_3spk/tensorboard/events.out.tfevents.1732583305.dlc1gzcv7row61qr-master-0.26.0 +3 -0
checkpoints/.DS_Store ADDED
Binary file (6.15 kB).
 
checkpoints/log_VoxCeleb2_lip_mossformer2_3spk/config.yaml ADDED
@@ -0,0 +1,54 @@
+ ## Config file
+
+ # Log
+ seed: 777
+ use_cuda: 1 # 1 for True, 0 for False
+
+ # dataset
+ speaker_no: 3
+ mix_lst_path: ./data/VoxCeleb2/mixture_data_list_3mix.csv
+ audio_direc: /mnt/nas_sg/wulanchabu/zexu.pan/datasets/VoxCeleb2/audio_clean/
+ reference_direc: /mnt/nas_sg/wulanchabu/zexu.pan/datasets/VoxCeleb2/orig/
+ audio_sr: 16000
+ ref_sr: 25
+
+ # dataloader
+ num_workers: 4
+ batch_size: 2 # 4-GPU training with a total effective batch size of 8
+ accu_grad: 0
+ effec_batch_size: 2 # per GPU; only used if accu_grad is set to 1, and must be a multiple of batch_size
+ max_length: 3 # truncate utterances in the dataloader, in seconds
+
+ # network settings
+ init_from: None # 'None' or a log name 'log_2024-07-22(18:12:13)'
+ causal: 0 # 1 for True, 0 for False
+ network_reference:
+   cue: lip # lip or speech or gesture or EEG
+   backbone: resnet18 # resnet18 or shufflenetV2 or blazenet64
+   emb_size: 256 # resnet18:256
+ network_audio:
+   backbone: av_mossformer2
+   encoder_kernel_size: 16
+   encoder_out_nchannels: 512
+   encoder_in_nchannels: 1
+
+   masknet_numspks: 1
+   masknet_chunksize: 250
+   masknet_numlayers: 1
+   masknet_norm: "ln"
+   masknet_useextralinearlayer: False
+   masknet_extraskipconnection: True
+
+   intra_numlayers: 24
+   intra_nhead: 8
+   intra_dffn: 1024
+   intra_dropout: 0
+   intra_use_positional: True
+   intra_norm_before: True
+
+
+ # optimizer
+ loss_type: sisdr # "snr", "sisdr", "hybrid"
+ init_learning_rate: 0.00015
+ max_epoch: 150
+ clip_grad_norm: 5
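As the comments above note, `effec_batch_size` only matters when `accu_grad` is 1. A minimal sketch of how these settings combine into an effective batch size (hypothetical helpers for illustration, not functions from this repo):

```python
# Hypothetical helpers illustrating the config comments above: with accu_grad=1,
# gradients are accumulated until effec_batch_size samples have been seen per
# GPU, so effec_batch_size must be a multiple of batch_size.
def accumulation_steps(batch_size: int, effec_batch_size: int, accu_grad: int) -> int:
    if not accu_grad:
        return 1  # update every step; effec_batch_size is ignored
    assert effec_batch_size % batch_size == 0, "must be a multiple of batch_size"
    return effec_batch_size // batch_size

def total_effective_batch(batch_size: int, effec_batch_size: int,
                          accu_grad: int, world_size: int) -> int:
    per_gpu = effec_batch_size if accu_grad else batch_size
    return per_gpu * world_size

# e.g. batch_size=2, accu_grad=0, 4 GPUs -> total effective batch of 8,
# matching the "4-GPU training" comment in this copy of the config.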
checkpoints/log_VoxCeleb2_lip_mossformer2_3spk/last_best_checkpoint.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b43436d73ce61d73b3879db8d28a13c0ecbaef8ab35b576f7fb816bb51e3e82c
+ size 734561014
checkpoints/log_VoxCeleb2_lip_mossformer2_3spk/last_checkpoint.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0fdaf3c8f319bd29825df4211bd120b2b8dcf74ccbc02c0b0cfb345716380465
+ size 734537584
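The two `.pt` entries above are Git LFS pointer files: three `key value` lines (`version`, `oid`, `size`) standing in for the actual checkpoint blobs. A small sketch of parsing one:

```python
# Parse a Git LFS pointer file (version / oid / size lines) into a dict.
def parse_lfs_pointer(text: str) -> dict:
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    if not fields.get("version", "").startswith("https://git-lfs.github.com/spec/"):
        raise ValueError("not a Git LFS pointer file")
    fields["size"] = int(fields["size"])  # blob size in bytes
    return fields

pointer = (
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:0fdaf3c8f319bd29825df4211bd120b2b8dcf74ccbc02c0b0cfb345716380465\n"
    "size 734537584\n"
)
info = parse_lfs_pointer(pointer)  # info["size"] == 734537584
```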
checkpoints/log_VoxCeleb2_lip_mossformer2_3spk/log_2024-11-12(09:27:22).txt ADDED
@@ -0,0 +1,617 @@
+ ## Config file
+
+ # Log
+ seed: 777
+ use_cuda: 1 # 1 for True, 0 for False
+
+ # dataset
+ speaker_no: 3
+ mix_lst_path: ./data/VoxCeleb2/mixture_data_list_3mix.csv
+ audio_direc: /mnt/nas_sg/wulanchabu/zexu.pan/datasets/VoxCeleb2/audio_clean/
+ reference_direc: /mnt/nas_sg/wulanchabu/zexu.pan/datasets/VoxCeleb2/orig/
+ audio_sr: 16000
+ ref_sr: 25
+
+ # dataloader
+ num_workers: 4
+ batch_size: 2 # 2-GPU training with a total effective batch size of 8
+ accu_grad: 1
+ effec_batch_size: 4 # per GPU; only used if accu_grad is set to 1, and must be a multiple of batch_size
+ max_length: 3 # truncate utterances in the dataloader, in seconds
+
+ # network settings
+ init_from: None # 'None' or a log name 'log_2024-07-22(18:12:13)'
+ causal: 0 # 1 for True, 0 for False
+ network_reference:
+   cue: lip # lip or speech or gesture or EEG
+   backbone: resnet18 # resnet18 or shufflenetV2 or blazenet64
+   emb_size: 256 # resnet18:256
+ network_audio:
+   backbone: av_mossformer2
+   encoder_kernel_size: 16
+   encoder_out_nchannels: 512
+   encoder_in_nchannels: 1
+
+   masknet_numspks: 1
+   masknet_chunksize: 250
+   masknet_numlayers: 1
+   masknet_norm: "ln"
+   masknet_useextralinearlayer: False
+   masknet_extraskipconnection: True
+
+   intra_numlayers: 24
+   intra_nhead: 8
+   intra_dffn: 1024
+   intra_dropout: 0
+   intra_use_positional: True
+   intra_norm_before: True
+
+
+ # optimizer
+ loss_type: sisdr # "snr", "sisdr", "hybrid"
+ init_learning_rate: 0.00015
+ max_epoch: 150
+ clip_grad_norm: 5
+ W1112 09:27:54.432088 139873319929664 torch/distributed/run.py:779]
+ W1112 09:27:54.432088 139873319929664 torch/distributed/run.py:779] *****************************************
+ W1112 09:27:54.432088 139873319929664 torch/distributed/run.py:779] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
+ W1112 09:27:54.432088 139873319929664 torch/distributed/run.py:779] *****************************************
+ started on checkpoints/log_2024-11-12(09:27:22)
+
+ namespace(accu_grad=1, audio_direc='/mnt/nas_sg/wulanchabu/zexu.pan/datasets/VoxCeleb2/audio_clean/', audio_sr=16000, batch_size=2, causal=0, checkpoint_dir='checkpoints/log_2024-11-12(09:27:22)', clip_grad_norm=5.0, config=[<yamlargparse.Path object at 0x7ff03ef79c10>], device=device(type='cuda'), distributed=True, effec_batch_size=4, evaluate_only=0, init_from='None', init_learning_rate=0.00015, local_rank=0, loss_type='sisdr', lr_warmup=0, max_epoch=150, max_length=3, mix_lst_path='./data/VoxCeleb2/mixture_data_list_3mix.csv', network_audio=namespace(backbone='av_mossformer2', encoder_in_nchannels=1, encoder_kernel_size=16, encoder_out_nchannels=512, intra_dffn=1024, intra_dropout=0, intra_nhead=8, intra_norm_before=True, intra_numlayers=24, intra_use_positional=True, masknet_chunksize=250, masknet_extraskipconnection=True, masknet_norm='ln', masknet_numlayers=1, masknet_numspks=1, masknet_useextralinearlayer=False), network_reference=namespace(backbone='resnet18', cue='lip', emb_size=256), num_workers=4, ref_sr=25, reference_direc='/mnt/nas_sg/wulanchabu/zexu.pan/datasets/VoxCeleb2/orig/', seed=777, speaker_no=3, train_from_last_checkpoint=0, use_cuda=1, world_size=2)
+ network_wrapper(
+ (sep_network): av_Mossformer(
+ (encoder): Encoder(
+ (conv1d_U): Conv1d(1, 512, kernel_size=(16,), stride=(8,), bias=False)
+ )
+ (separator): Separator(
+ (layer_norm): GroupNorm(1, 512, eps=1e-08, affine=True)
+ (bottleneck_conv1x1): Conv1d(512, 512, kernel_size=(1,), stride=(1,), bias=False)
+ (masknet): Dual_Path_Model(
+ (pos_enc): ScaledSinuEmbedding()
+ (dual_mdl): ModuleList(
+ (0): Dual_Computation_Block(
+ (intra_mdl): SBFLASHBlock_DualA(
+ (mdl): TransformerEncoder_FLASH_DualA_FSMN(
+ (flashT): FLASHTransformer_DualA_FSMN(
+ (fsmn): ModuleList(
+ (0-23): 24 x Gated_FSMN_Block_Dilated(
+ (conv1): Sequential(
+ (0): Conv1d(512, 256, kernel_size=(1,), stride=(1,))
+ (1): PReLU(num_parameters=1)
+ )
+ (norm1): CLayerNorm((256,), eps=1e-05, elementwise_affine=True)
+ (gated_fsmn): Gated_FSMN_dilated(
+ (to_u): FFConvM(
+ (mdl): Sequential(
+ (0): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
+ (1): Linear(in_features=256, out_features=256, bias=True)
+ (2): SiLU()
+ (3): ConvModule(
+ (sequential): Sequential(
+ (0): Transpose()
+ (1): DepthwiseConv1d(
+ (conv): Conv1d(256, 256, kernel_size=(17,), stride=(1,), padding=(8,), groups=256, bias=False)
+ )
+ )
+ )
+ (4): Dropout(p=0.1, inplace=False)
+ )
+ )
+ (to_v): FFConvM(
+ (mdl): Sequential(
+ (0): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
+ (1): Linear(in_features=256, out_features=256, bias=True)
+ (2): SiLU()
+ (3): ConvModule(
+ (sequential): Sequential(
+ (0): Transpose()
+ (1): DepthwiseConv1d(
+ (conv): Conv1d(256, 256, kernel_size=(17,), stride=(1,), padding=(8,), groups=256, bias=False)
+ )
+ )
+ )
+ (4): Dropout(p=0.1, inplace=False)
+ )
+ )
+ (fsmn): UniDeepFsmn_dilated(
+ (linear): Linear(in_features=256, out_features=256, bias=True)
+ (project): Linear(in_features=256, out_features=256, bias=False)
+ (conv): DilatedDenseNet(
+ (pad): ConstantPad2d(padding=(1, 1, 1, 0), value=0.0)
+ (pad1): ConstantPad2d(padding=(0, 0, 19, 19), value=0.0)
+ (conv1): Conv2d(256, 256, kernel_size=(39, 1), stride=(1, 1), groups=256, bias=False)
+ (norm1): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=False)
+ (prelu1): PReLU(num_parameters=256)
+ (pad2): ConstantPad2d(padding=(0, 0, 38, 38), value=0.0)
+ (conv2): Conv2d(512, 256, kernel_size=(39, 1), stride=(1, 1), dilation=(2, 1), groups=256, bias=False)
+ (norm2): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=False)
+ (prelu2): PReLU(num_parameters=256)
+ )
+ )
+ )
+ (norm2): CLayerNorm((256,), eps=1e-05, elementwise_affine=True)
+ (conv2): Conv1d(256, 512, kernel_size=(1,), stride=(1,))
+ )
+ )
+ (layers): ModuleList(
+ (0-23): 24 x FLASH_ShareA_FFConvM(
+ (rotary_pos_emb): RotaryEmbedding()
+ (dropout): Dropout(p=0.1, inplace=False)
+ (to_hidden): FFConvM(
+ (mdl): Sequential(
+ (0): ScaleNorm()
+ (1): Linear(in_features=512, out_features=2048, bias=True)
+ (2): SiLU()
+ (3): ConvModule(
+ (sequential): Sequential(
+ (0): Transpose()
+ (1): DepthwiseConv1d(
+ (conv): Conv1d(2048, 2048, kernel_size=(17,), stride=(1,), padding=(8,), groups=2048, bias=False)
+ )
+ )
+ )
+ (4): Dropout(p=0.1, inplace=False)
+ )
+ )
+ (to_qk): FFConvM(
+ (mdl): Sequential(
+ (0): ScaleNorm()
+ (1): Linear(in_features=512, out_features=128, bias=True)
+ (2): SiLU()
+ (3): ConvModule(
+ (sequential): Sequential(
+ (0): Transpose()
+ (1): DepthwiseConv1d(
+ (conv): Conv1d(128, 128, kernel_size=(17,), stride=(1,), padding=(8,), groups=128, bias=False)
+ )
+ )
+ )
+ (4): Dropout(p=0.1, inplace=False)
+ )
+ )
+ (qk_offset_scale): OffsetScale()
+ (to_out): FFConvM(
+ (mdl): Sequential(
+ (0): ScaleNorm()
+ (1): Linear(in_features=1024, out_features=512, bias=True)
+ (2): SiLU()
+ (3): ConvModule(
+ (sequential): Sequential(
+ (0): Transpose()
+ (1): DepthwiseConv1d(
+ (conv): Conv1d(512, 512, kernel_size=(17,), stride=(1,), padding=(8,), groups=512, bias=False)
+ )
+ )
+ )
+ (4): Dropout(p=0.1, inplace=False)
+ )
+ )
+ (gateActivate): Sigmoid()
+ )
+ )
+ )
+ (norm): LayerNorm(
+ (norm): LayerNorm((512,), eps=1e-06, elementwise_affine=True)
+ )
+ )
+ )
+ (intra_norm): GroupNorm(1, 512, eps=1e-08, affine=True)
+ )
+ )
+ (conv1d_out): Conv1d(512, 512, kernel_size=(1,), stride=(1,))
+ (conv1_decoder): Conv1d(512, 512, kernel_size=(1,), stride=(1,), bias=False)
+ (prelu): PReLU(num_parameters=1)
+ (activation): ReLU()
+ (output): Sequential(
+ (0): Conv1d(512, 512, kernel_size=(1,), stride=(1,))
+ (1): Tanh()
+ )
+ (output_gate): Sequential(
+ (0): Conv1d(512, 512, kernel_size=(1,), stride=(1,))
+ (1): Sigmoid()
+ )
+ )
+ (av_conv): Conv1d(768, 512, kernel_size=(1,), stride=(1,))
+ )
+ (decoder): Decoder(
+ (basis_signals): Linear(in_features=512, out_features=16, bias=False)
+ )
+ )
+ (ref_encoder): Visual_encoder(
+ (v_frontend): VisualFrontend(
+ (frontend3D): Sequential(
+ (0): Conv3d(1, 64, kernel_size=(5, 7, 7), stride=(1, 2, 2), padding=(2, 3, 3), bias=False)
+ (1): SyncBatchNorm(64, eps=0.001, momentum=0.01, affine=True, track_running_stats=True)
+ (2): ReLU()
+ (3): MaxPool3d(kernel_size=(1, 3, 3), stride=(1, 2, 2), padding=(0, 1, 1), dilation=1, ceil_mode=False)
+ )
+ (resnet): ResNet(
+ (layer1): ResNetLayer(
+ (conv1a): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
+ (bn1a): SyncBatchNorm(64, eps=0.001, momentum=0.01, affine=True, track_running_stats=True)
+ (conv2a): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
+ (downsample): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
+ (outbna): SyncBatchNorm(64, eps=0.001, momentum=0.01, affine=True, track_running_stats=True)
+ (conv1b): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
+ (bn1b): SyncBatchNorm(64, eps=0.001, momentum=0.01, affine=True, track_running_stats=True)
+ (conv2b): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
+ (outbnb): SyncBatchNorm(64, eps=0.001, momentum=0.01, affine=True, track_running_stats=True)
+ )
+ (layer2): ResNetLayer(
+ (conv1a): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
+ (bn1a): SyncBatchNorm(128, eps=0.001, momentum=0.01, affine=True, track_running_stats=True)
+ (conv2a): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
+ (downsample): Conv2d(64, 128, kernel_size=(1, 1), stride=(2, 2), bias=False)
+ (outbna): SyncBatchNorm(128, eps=0.001, momentum=0.01, affine=True, track_running_stats=True)
+ (conv1b): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
+ (bn1b): SyncBatchNorm(128, eps=0.001, momentum=0.01, affine=True, track_running_stats=True)
+ (conv2b): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
+ (outbnb): SyncBatchNorm(128, eps=0.001, momentum=0.01, affine=True, track_running_stats=True)
+ )
+ (layer3): ResNetLayer(
+ (conv1a): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
+ (bn1a): SyncBatchNorm(256, eps=0.001, momentum=0.01, affine=True, track_running_stats=True)
+ (conv2a): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
+ (downsample): Conv2d(128, 256, kernel_size=(1, 1), stride=(2, 2), bias=False)
+ (outbna): SyncBatchNorm(256, eps=0.001, momentum=0.01, affine=True, track_running_stats=True)
+ (conv1b): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
+ (bn1b): SyncBatchNorm(256, eps=0.001, momentum=0.01, affine=True, track_running_stats=True)
+ (conv2b): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
+ (outbnb): SyncBatchNorm(256, eps=0.001, momentum=0.01, affine=True, track_running_stats=True)
+ )
+ (layer4): ResNetLayer(
+ (conv1a): Conv2d(256, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
+ (bn1a): SyncBatchNorm(512, eps=0.001, momentum=0.01, affine=True, track_running_stats=True)
+ (conv2a): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
+ (downsample): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
+ (outbna): SyncBatchNorm(512, eps=0.001, momentum=0.01, affine=True, track_running_stats=True)
+ (conv1b): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
+ (bn1b): SyncBatchNorm(512, eps=0.001, momentum=0.01, affine=True, track_running_stats=True)
+ (conv2b): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
+ (outbnb): SyncBatchNorm(512, eps=0.001, momentum=0.01, affine=True, track_running_stats=True)
+ )
+ (avgpool): AvgPool2d(kernel_size=(4, 4), stride=(1, 1), padding=0)
+ )
+ )
+ (v_ds): Conv1d(512, 256, kernel_size=(1,), stride=(1,), bias=False)
+ (visual_conv): Sequential(
+ (0): VisualConv1D(
+ (relu_0): ReLU()
+ (norm_0): SyncBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
+ (conv1x1): Conv1d(256, 512, kernel_size=(1,), stride=(1,), bias=False)
+ (relu): ReLU()
+ (norm_1): SyncBatchNorm(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
+ (dsconv): Conv1d(512, 512, kernel_size=(3,), stride=(1,), padding=(1,), groups=512)
+ (prelu): PReLU(num_parameters=1)
+ (norm_2): SyncBatchNorm(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
+ (pw_conv): Conv1d(512, 256, kernel_size=(1,), stride=(1,), bias=False)
+ )
+ (1): VisualConv1D(
+ (relu_0): ReLU()
+ (norm_0): SyncBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
+ (conv1x1): Conv1d(256, 512, kernel_size=(1,), stride=(1,), bias=False)
+ (relu): ReLU()
+ (norm_1): SyncBatchNorm(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
+ (dsconv): Conv1d(512, 512, kernel_size=(3,), stride=(1,), padding=(1,), groups=512)
+ (prelu): PReLU(num_parameters=1)
+ (norm_2): SyncBatchNorm(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
+ (pw_conv): Conv1d(512, 256, kernel_size=(1,), stride=(1,), bias=False)
+ )
+ (2): VisualConv1D(
+ (relu_0): ReLU()
+ (norm_0): SyncBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
+ (conv1x1): Conv1d(256, 512, kernel_size=(1,), stride=(1,), bias=False)
+ (relu): ReLU()
+ (norm_1): SyncBatchNorm(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
+ (dsconv): Conv1d(512, 512, kernel_size=(3,), stride=(1,), padding=(1,), groups=512)
+ (prelu): PReLU(num_parameters=1)
+ (norm_2): SyncBatchNorm(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
+ (pw_conv): Conv1d(512, 256, kernel_size=(1,), stride=(1,), bias=False)
+ )
+ (3): VisualConv1D(
+ (relu_0): ReLU()
+ (norm_0): SyncBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
+ (conv1x1): Conv1d(256, 512, kernel_size=(1,), stride=(1,), bias=False)
+ (relu): ReLU()
+ (norm_1): SyncBatchNorm(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
+ (dsconv): Conv1d(512, 512, kernel_size=(3,), stride=(1,), padding=(1,), groups=512)
+ (prelu): PReLU(num_parameters=1)
+ (norm_2): SyncBatchNorm(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
+ (pw_conv): Conv1d(512, 256, kernel_size=(1,), stride=(1,), bias=False)
+ )
+ (4): VisualConv1D(
+ (relu_0): ReLU()
+ (norm_0): SyncBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
+ (conv1x1): Conv1d(256, 512, kernel_size=(1,), stride=(1,), bias=False)
+ (relu): ReLU()
+ (norm_1): SyncBatchNorm(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
+ (dsconv): Conv1d(512, 512, kernel_size=(3,), stride=(1,), padding=(1,), groups=512)
+ (prelu): PReLU(num_parameters=1)
+ (norm_2): SyncBatchNorm(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
+ (pw_conv): Conv1d(512, 256, kernel_size=(1,), stride=(1,), bias=False)
+ )
+ )
+ )
+ )
+
+ Total number of parameters: 68516407
+
+
+ Total number of trainable parameters: 57331303
+
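The encoder in the model printout above is a `Conv1d(1, 512, kernel_size=(16,), stride=(8,), bias=False)` on 16 kHz audio. As a quick worked example, here is the resulting frame count for a 3 s utterance (the config's `max_length`), assuming no padding, via the standard strided-conv output-length formula:

```python
# Output length of a 1-D convolution with no padding:
# floor((n - kernel) / stride) + 1, as used by the encoder above.
def conv1d_out_len(n_samples: int, kernel: int = 16, stride: int = 8) -> int:
    return (n_samples - kernel) // stride + 1

frames = conv1d_out_len(3 * 16000)  # 3 s at 16 kHz -> 5999 encoder frames
```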
+ dlcf4k2knsh01f6k-master-0:28:28 [0] NCCL INFO NCCL_SOCKET_IFNAME set by environment to eth
+ dlcf4k2knsh01f6k-master-0:28:28 [0] NCCL INFO Bootstrap : Using eth0:22.6.223.106<0>
+ dlcf4k2knsh01f6k-master-0:28:28 [0] NCCL INFO Plugin name set by env to libnccl-net-none.so
+ dlcf4k2knsh01f6k-master-0:28:28 [0] NCCL INFO NET/Plugin : dlerror=libnccl-net-none.so: cannot open shared object file: No such file or directory No plugin found (libnccl-net-none.so), using internal implementation
+ dlcf4k2knsh01f6k-master-0:28:28 [0] NCCL INFO cudaDriverVersion 11040
+ dlcf4k2knsh01f6k-master-0:29:29 [1] NCCL INFO cudaDriverVersion 11040
+ NCCL version 2.20.5+cuda11.8
+ dlcf4k2knsh01f6k-master-0:29:29 [1] NCCL INFO NCCL_SOCKET_IFNAME set by environment to eth
+ dlcf4k2knsh01f6k-master-0:29:29 [1] NCCL INFO Bootstrap : Using eth0:22.6.223.106<0>
+ dlcf4k2knsh01f6k-master-0:29:29 [1] NCCL INFO Plugin name set by env to libnccl-net-none.so
+ dlcf4k2knsh01f6k-master-0:29:29 [1] NCCL INFO NET/Plugin : dlerror=libnccl-net-none.so: cannot open shared object file: No such file or directory No plugin found (libnccl-net-none.so), using internal implementation
+ dlcf4k2knsh01f6k-master-0:29:48 [1] NCCL INFO NCCL_SOCKET_IFNAME set by environment to eth
+ dlcf4k2knsh01f6k-master-0:28:47 [0] NCCL INFO NCCL_SOCKET_IFNAME set by environment to eth
+ dlcf4k2knsh01f6k-master-0:29:48 [1] NCCL INFO NCCL_IB_HCA set to mlx5
+ dlcf4k2knsh01f6k-master-0:28:47 [0] NCCL INFO NCCL_IB_HCA set to mlx5
+ libibverbs: Warning: couldn't load driver 'libhfi1verbs-rdmav25.so': libhfi1verbs-rdmav25.so: cannot open shared object file: No such file or directory
+ libibverbs: Warning: couldn't load driver 'libhfi1verbs-rdmav25.so': libhfi1verbs-rdmav25.so: cannot open shared object file: No such file or directory
+ libibverbs: Warning: couldn't load driver 'librxe-rdmav25.so': librxe-rdmav25.so: cannot open shared object file: No such file or directory
+ libibverbs: Warning: couldn't load driver 'librxe-rdmav25.so': librxe-rdmav25.so: cannot open shared object file: No such file or directory
+ libibverbs: Warning: couldn't load driver 'libmthca-rdmav25.so': libmthca-rdmav25.so: cannot open shared object file: No such file or directory
+ libibverbs: Warning: couldn't load driver 'libmthca-rdmav25.so': libmthca-rdmav25.so: cannot open shared object file: No such file or directory
+ libibverbs: Warning: couldn't load driver 'libvmw_pvrdma-rdmav25.so': libvmw_pvrdma-rdmav25.so: cannot open shared object file: No such file or directory
+ libibverbs: Warning: couldn't load driver 'libvmw_pvrdma-rdmav25.so': libvmw_pvrdma-rdmav25.so: cannot open shared object file: No such file or directory
+ libibverbs: Warning: couldn't load driver 'libhns-rdmav25.so': libhns-rdmav25.so: cannot open shared object file: No such file or directory
+ libibverbs: Warning: couldn't load driver 'libhns-rdmav25.so': libhns-rdmav25.so: cannot open shared object file: No such file or directory
+ libibverbs: Warning: couldn't load driver 'libipathverbs-rdmav25.so': libipathverbs-rdmav25.so: cannot open shared object file: No such file or directory
+ libibverbs: Warning: couldn't load driver 'libipathverbs-rdmav25.so': libipathverbs-rdmav25.so: cannot open shared object file: No such file or directory
+ libibverbs: Warning: couldn't load driver 'libsiw-rdmav25.so': libsiw-rdmav25.so: cannot open shared object file: No such file or directory
+ libibverbs: Warning: couldn't load driver 'libsiw-rdmav25.so': libsiw-rdmav25.so: cannot open shared object file: No such file or directory
+ libibverbs: Warning: couldn't load driver 'libbnxt_re-rdmav25.so': libbnxt_re-rdmav25.so: cannot open shared object file: No such file or directory
+ libibverbs: Warning: couldn't load driver 'libbnxt_re-rdmav25.so': libbnxt_re-rdmav25.so: cannot open shared object file: No such file or directory
+ libibverbs: Warning: couldn't load driver 'libocrdma-rdmav25.so': libocrdma-rdmav25.so: cannot open shared object file: No such file or directory
+ libibverbs: Warning: couldn't load driver 'libocrdma-rdmav25.so': libocrdma-rdmav25.so: cannot open shared object file: No such file or directory
+ libibverbs: Warning: couldn't load driver 'libmlx4-rdmav25.so': libmlx4-rdmav25.so: cannot open shared object file: No such file or directory
+ libibverbs: Warning: couldn't load driver 'libmlx4-rdmav25.so': libmlx4-rdmav25.so: cannot open shared object file: No such file or directory
+ libibverbs: Warning: couldn't load driver 'libqedr-rdmav25.so': libqedr-rdmav25.so: cannot open shared object file: No such file or directory
+ libibverbs: Warning: couldn't load driver 'libqedr-rdmav25.so': libqedr-rdmav25.so: cannot open shared object file: No such file or directory
+ libibverbs: Warning: couldn't load driver 'libcxgb4-rdmav25.so': libcxgb4-rdmav25.so: cannot open shared object file: No such file or directory
+ libibverbs: Warning: couldn't load driver 'libcxgb4-rdmav25.so': libcxgb4-rdmav25.so: cannot open shared object file: No such file or directory
+ libibverbs: Warning: couldn't load driver 'libi40iw-rdmav25.so': libi40iw-rdmav25.so: cannot open shared object file: No such file or directory
+ libibverbs: Warning: couldn't load driver 'libi40iw-rdmav25.so': libi40iw-rdmav25.so: cannot open shared object file: No such file or directory
+ libibverbs: Warning: couldn't load driver 'libefa-rdmav25.so': libefa-rdmav25.so: cannot open shared object file: No such file or directory
+ libibverbs: Warning: couldn't load driver 'libefa-rdmav25.so': libefa-rdmav25.so: cannot open shared object file: No such file or directory
+ dlcf4k2knsh01f6k-master-0:29:48 [1] NCCL INFO NET/IB : Using [0]mlx5_0:1/RoCE [RO]; OOB eth0:22.6.223.106<0>
+ dlcf4k2knsh01f6k-master-0:29:48 [1] NCCL INFO Using non-device net plugin version 0
+ dlcf4k2knsh01f6k-master-0:29:48 [1] NCCL INFO Using network IB
+ dlcf4k2knsh01f6k-master-0:28:47 [0] NCCL INFO NET/IB : Using [0]mlx5_0:1/RoCE [RO]; OOB eth0:22.6.223.106<0>
+ dlcf4k2knsh01f6k-master-0:28:47 [0] NCCL INFO Using non-device net plugin version 0
+ dlcf4k2knsh01f6k-master-0:28:47 [0] NCCL INFO Using network IB
+ dlcf4k2knsh01f6k-master-0:28:47 [0] NCCL INFO comm 0xf843110 rank 0 nranks 2 cudaDev 0 nvmlDev 0 busId 10 commId 0x10c1230423267b73 - Init START
+ dlcf4k2knsh01f6k-master-0:29:48 [1] NCCL INFO comm 0xdeaf580 rank 1 nranks 2 cudaDev 1 nvmlDev 1 busId 20 commId 0x10c1230423267b73 - Init START
+ dlcf4k2knsh01f6k-master-0:29:48 [1] NCCL INFO Setting affinity for GPU 1 to 0fff
+ dlcf4k2knsh01f6k-master-0:28:47 [0] NCCL INFO Setting affinity for GPU 0 to 0fff
+ dlcf4k2knsh01f6k-master-0:29:48 [1] NCCL INFO comm 0xdeaf580 rank 1 nRanks 2 nNodes 1 localRanks 2 localRank 1 MNNVL 0
+ dlcf4k2knsh01f6k-master-0:28:47 [0] NCCL INFO comm 0xf843110 rank 0 nRanks 2 nNodes 1 localRanks 2 localRank 0 MNNVL 0
+ dlcf4k2knsh01f6k-master-0:29:48 [1] NCCL INFO NCCL_MIN_NCHANNELS set by environment to 4.
+ dlcf4k2knsh01f6k-master-0:28:47 [0] NCCL INFO NCCL_MIN_NCHANNELS set by environment to 4.
+ dlcf4k2knsh01f6k-master-0:28:47 [0] NCCL INFO Channel 00/04 : 0 1
+ dlcf4k2knsh01f6k-master-0:29:48 [1] NCCL INFO Trees [0] -1/-1/-1->1->0 [1] 0/-1/-1->1->-1 [2] -1/-1/-1->1->0 [3] 0/-1/-1->1->-1
+ dlcf4k2knsh01f6k-master-0:28:47 [0] NCCL INFO Channel 01/04 : 0 1
+ dlcf4k2knsh01f6k-master-0:29:48 [1] NCCL INFO P2P Chunksize set to 524288
+ dlcf4k2knsh01f6k-master-0:28:47 [0] NCCL INFO Channel 02/04 : 0 1
+ dlcf4k2knsh01f6k-master-0:28:47 [0] NCCL INFO Channel 03/04 : 0 1
+ dlcf4k2knsh01f6k-master-0:28:47 [0] NCCL INFO Trees [0] 1/-1/-1->0->-1 [1] -1/-1/-1->0->1 [2] 1/-1/-1->0->-1 [3] -1/-1/-1->0->1
+ dlcf4k2knsh01f6k-master-0:28:47 [0] NCCL INFO P2P Chunksize set to 524288
+ dlcf4k2knsh01f6k-master-0:29:48 [1] NCCL INFO Channel 00/0 : 1[1] -> 0[0] via P2P/IPC/read
+ dlcf4k2knsh01f6k-master-0:28:47 [0] NCCL INFO Channel 00/0 : 0[0] -> 1[1] via P2P/IPC/read
+ dlcf4k2knsh01f6k-master-0:29:48 [1] NCCL INFO Channel 01/0 : 1[1] -> 0[0] via P2P/IPC/read
+ dlcf4k2knsh01f6k-master-0:28:47 [0] NCCL INFO Channel 01/0 : 0[0] -> 1[1] via P2P/IPC/read
+ dlcf4k2knsh01f6k-master-0:29:48 [1] NCCL INFO Channel 02/0 : 1[1] -> 0[0] via P2P/IPC/read
+ dlcf4k2knsh01f6k-master-0:28:47 [0] NCCL INFO Channel 02/0 : 0[0] -> 1[1] via P2P/IPC/read
+ dlcf4k2knsh01f6k-master-0:29:48 [1] NCCL INFO Channel 03/0 : 1[1] -> 0[0] via P2P/IPC/read
+ dlcf4k2knsh01f6k-master-0:28:47 [0] NCCL INFO Channel 03/0 : 0[0] -> 1[1] via P2P/IPC/read
+ dlcf4k2knsh01f6k-master-0:29:48 [1] NCCL INFO Connected all rings
+ dlcf4k2knsh01f6k-master-0:29:48 [1] NCCL INFO Connected all trees
+ dlcf4k2knsh01f6k-master-0:28:47 [0] NCCL INFO Connected all rings
+ dlcf4k2knsh01f6k-master-0:29:48 [1] NCCL INFO threadThresholds 8/8/64 | 16/8/64 | 512 | 512
+ dlcf4k2knsh01f6k-master-0:29:48 [1] NCCL INFO 4 coll channels, 0 collnet channels, 0 nvls channels, 4 p2p channels, 4 p2p channels per peer
+ dlcf4k2knsh01f6k-master-0:28:47 [0] NCCL INFO Connected all trees
+ dlcf4k2knsh01f6k-master-0:28:47 [0] NCCL INFO threadThresholds 8/8/64 | 16/8/64 | 512 | 512
+ dlcf4k2knsh01f6k-master-0:28:47 [0] NCCL INFO 4 coll channels, 0 collnet channels, 0 nvls channels, 4 p2p channels, 4 p2p channels per peer
+ dlcf4k2knsh01f6k-master-0:28:47 [0] NCCL INFO comm 0xf843110 rank 0 nranks 2 cudaDev 0 nvmlDev 0 busId 10 commId 0x10c1230423267b73 - Init COMPLETE
+ dlcf4k2knsh01f6k-master-0:29:48 [1] NCCL INFO comm 0xdeaf580 rank 1 nranks 2 cudaDev 1 nvmlDev 1 busId 20 commId 0x10c1230423267b73 - Init COMPLETE
426
+ Start new training from scratch
427
+ [rank0]:[W1112 09:29:26.544802480 reducer.cpp:1400] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator())
428
+ [rank1]:[W1112 09:29:26.544925757 reducer.cpp:1400] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator())
429
+ Train Summary | End of Epoch 1 | Time 43669.98s | Train Loss 1.247
+ Valid Summary | End of Epoch 1 | Time 901.08s | Valid Loss -1.256
+ Test Summary | End of Epoch 1 | Time 541.46s | Test Loss -1.199
+ Found new best model, dict saved
+ Train Summary | End of Epoch 2 | Time 43713.98s | Train Loss -2.610
+ Valid Summary | End of Epoch 2 | Time 901.18s | Valid Loss -3.806
+ Test Summary | End of Epoch 2 | Time 540.83s | Test Loss -3.796
+ Found new best model, dict saved
+ Train Summary | End of Epoch 3 | Time 25073.10s | Train Loss -4.489
+ Valid Summary | End of Epoch 3 | Time 450.91s | Valid Loss -5.026
+ Test Summary | End of Epoch 3 | Time 271.13s | Test Loss -5.058
+ Found new best model, dict saved
+ Train Summary | End of Epoch 4 | Time 25054.36s | Train Loss -5.677
+ Valid Summary | End of Epoch 4 | Time 450.08s | Valid Loss -6.083
+ Test Summary | End of Epoch 4 | Time 270.47s | Test Loss -6.067
+ Found new best model, dict saved
+ Train Summary | End of Epoch 5 | Time 25076.36s | Train Loss -6.544
+ Valid Summary | End of Epoch 5 | Time 450.20s | Valid Loss -6.711
+ Test Summary | End of Epoch 5 | Time 270.51s | Test Loss -6.693
+ Found new best model, dict saved
+ Train Summary | End of Epoch 6 | Time 25075.03s | Train Loss -7.214
+ Valid Summary | End of Epoch 6 | Time 450.78s | Valid Loss -7.022
+ Test Summary | End of Epoch 6 | Time 270.56s | Test Loss -7.022
+ Found new best model, dict saved
+ Train Summary | End of Epoch 7 | Time 25072.53s | Train Loss -7.771
+ Valid Summary | End of Epoch 7 | Time 450.92s | Valid Loss -7.599
+ Test Summary | End of Epoch 7 | Time 270.66s | Test Loss -7.569
+ Found new best model, dict saved
+ Train Summary | End of Epoch 8 | Time 25091.66s | Train Loss -8.231
+ Valid Summary | End of Epoch 8 | Time 450.72s | Valid Loss -7.935
+ Test Summary | End of Epoch 8 | Time 270.46s | Test Loss -7.773
+ Found new best model, dict saved
+ Train Summary | End of Epoch 9 | Time 25093.91s | Train Loss -8.638
+ Valid Summary | End of Epoch 9 | Time 450.64s | Valid Loss -8.063
+ Test Summary | End of Epoch 9 | Time 270.51s | Test Loss -7.896
+ Found new best model, dict saved
+ Train Summary | End of Epoch 10 | Time 25096.69s | Train Loss -8.994
+ Valid Summary | End of Epoch 10 | Time 450.53s | Valid Loss -8.630
+ Test Summary | End of Epoch 10 | Time 270.49s | Test Loss -8.485
+ Found new best model, dict saved
+ Train Summary | End of Epoch 11 | Time 25093.05s | Train Loss -9.327
+ Valid Summary | End of Epoch 11 | Time 450.49s | Valid Loss -8.803
+ Test Summary | End of Epoch 11 | Time 270.54s | Test Loss -8.571
+ Found new best model, dict saved
+ Train Summary | End of Epoch 12 | Time 25090.36s | Train Loss -9.571
+ Valid Summary | End of Epoch 12 | Time 450.87s | Valid Loss -8.775
+ Test Summary | End of Epoch 12 | Time 270.66s | Test Loss -8.658
+ Train Summary | End of Epoch 13 | Time 25087.64s | Train Loss -9.798
+ Valid Summary | End of Epoch 13 | Time 450.32s | Valid Loss -9.016
+ Test Summary | End of Epoch 13 | Time 270.48s | Test Loss -8.783
+ Found new best model, dict saved
+ Train Summary | End of Epoch 14 | Time 25089.46s | Train Loss -10.035
+ Valid Summary | End of Epoch 14 | Time 451.11s | Valid Loss -9.213
+ Test Summary | End of Epoch 14 | Time 271.00s | Test Loss -9.057
+ Found new best model, dict saved
+ Train Summary | End of Epoch 15 | Time 25087.48s | Train Loss -10.248
+ Valid Summary | End of Epoch 15 | Time 450.58s | Valid Loss -9.285
+ Test Summary | End of Epoch 15 | Time 270.56s | Test Loss -9.162
+ Found new best model, dict saved
+ Train Summary | End of Epoch 16 | Time 25086.70s | Train Loss -10.433
+ Valid Summary | End of Epoch 16 | Time 449.93s | Valid Loss -9.573
+ Test Summary | End of Epoch 16 | Time 270.22s | Test Loss -9.301
+ Found new best model, dict saved
+ Train Summary | End of Epoch 17 | Time 25069.00s | Train Loss -10.590
+ Valid Summary | End of Epoch 17 | Time 450.28s | Valid Loss -9.660
+ Test Summary | End of Epoch 17 | Time 270.11s | Test Loss -9.497
+ Found new best model, dict saved
+ Train Summary | End of Epoch 18 | Time 25070.59s | Train Loss -10.769
+ Valid Summary | End of Epoch 18 | Time 449.77s | Valid Loss -9.758
+ Test Summary | End of Epoch 18 | Time 270.33s | Test Loss -9.576
+ Found new best model, dict saved
+ Train Summary | End of Epoch 19 | Time 25083.60s | Train Loss -10.908
+ Valid Summary | End of Epoch 19 | Time 450.18s | Valid Loss -9.806
+ Test Summary | End of Epoch 19 | Time 270.57s | Test Loss -9.576
+ Found new best model, dict saved
+ Train Summary | End of Epoch 20 | Time 25077.18s | Train Loss -11.032
+ Valid Summary | End of Epoch 20 | Time 450.19s | Valid Loss -9.904
+ Test Summary | End of Epoch 20 | Time 270.30s | Test Loss -9.614
+ Found new best model, dict saved
+ Train Summary | End of Epoch 21 | Time 25076.42s | Train Loss -11.159
+ Valid Summary | End of Epoch 21 | Time 450.44s | Valid Loss -9.932
+ Test Summary | End of Epoch 21 | Time 270.29s | Test Loss -9.735
+ Found new best model, dict saved
+ Train Summary | End of Epoch 22 | Time 25071.05s | Train Loss -11.284
+ Valid Summary | End of Epoch 22 | Time 449.99s | Valid Loss -10.161
+ Test Summary | End of Epoch 22 | Time 270.03s | Test Loss -9.847
+ Found new best model, dict saved
+ Train Summary | End of Epoch 23 | Time 25179.65s | Train Loss -11.413
+ Valid Summary | End of Epoch 23 | Time 451.74s | Valid Loss -10.168
+ Test Summary | End of Epoch 23 | Time 271.32s | Test Loss -10.012
+ Found new best model, dict saved
+ Train Summary | End of Epoch 24 | Time 25184.85s | Train Loss -11.510
+ Valid Summary | End of Epoch 24 | Time 450.85s | Valid Loss -10.270
+ Test Summary | End of Epoch 24 | Time 270.98s | Test Loss -9.983
+ Found new best model, dict saved
+ Train Summary | End of Epoch 25 | Time 25186.36s | Train Loss -11.605
+ Valid Summary | End of Epoch 25 | Time 451.40s | Valid Loss -10.422
+ Test Summary | End of Epoch 25 | Time 271.20s | Test Loss -10.065
+ Found new best model, dict saved
+ Train Summary | End of Epoch 26 | Time 25201.47s | Train Loss -11.704
+ Valid Summary | End of Epoch 26 | Time 453.12s | Valid Loss -10.403
+ Test Summary | End of Epoch 26 | Time 272.61s | Test Loss -10.034
+ Train Summary | End of Epoch 27 | Time 25220.07s | Train Loss -11.780
+ Valid Summary | End of Epoch 27 | Time 451.85s | Valid Loss -10.491
+ Test Summary | End of Epoch 27 | Time 271.49s | Test Loss -10.278
+ Found new best model, dict saved
+ Train Summary | End of Epoch 28 | Time 25198.71s | Train Loss -11.887
+ Valid Summary | End of Epoch 28 | Time 451.45s | Valid Loss -10.491
+ Test Summary | End of Epoch 28 | Time 271.39s | Test Loss -10.255
+ Found new best model, dict saved
+ Train Summary | End of Epoch 29 | Time 25196.59s | Train Loss -11.961
+ Valid Summary | End of Epoch 29 | Time 452.05s | Valid Loss -10.628
+ Test Summary | End of Epoch 29 | Time 271.64s | Test Loss -10.229
+ Found new best model, dict saved
+ Train Summary | End of Epoch 30 | Time 25177.23s | Train Loss -12.047
+ Valid Summary | End of Epoch 30 | Time 451.87s | Valid Loss -10.646
+ Test Summary | End of Epoch 30 | Time 271.62s | Test Loss -10.514
+ Found new best model, dict saved
+ Train Summary | End of Epoch 31 | Time 25189.61s | Train Loss -12.115
+ Valid Summary | End of Epoch 31 | Time 451.54s | Valid Loss -10.731
+ Test Summary | End of Epoch 31 | Time 271.47s | Test Loss -10.316
+ Found new best model, dict saved
+ Train Summary | End of Epoch 32 | Time 25192.95s | Train Loss -12.189
+ Valid Summary | End of Epoch 32 | Time 451.81s | Valid Loss -10.682
+ Test Summary | End of Epoch 32 | Time 271.43s | Test Loss -10.461
+ Train Summary | End of Epoch 33 | Time 25213.18s | Train Loss -12.261
+ Valid Summary | End of Epoch 33 | Time 452.28s | Valid Loss -10.742
+ Test Summary | End of Epoch 33 | Time 271.82s | Test Loss -10.376
+ Found new best model, dict saved
+ Train Summary | End of Epoch 34 | Time 25239.52s | Train Loss -12.326
+ Valid Summary | End of Epoch 34 | Time 452.05s | Valid Loss -10.735
+ Test Summary | End of Epoch 34 | Time 271.85s | Test Loss -10.476
+ Train Summary | End of Epoch 35 | Time 25201.77s | Train Loss -12.387
+ Valid Summary | End of Epoch 35 | Time 452.37s | Valid Loss -10.790
+ Test Summary | End of Epoch 35 | Time 271.61s | Test Loss -10.545
+ Found new best model, dict saved
+ Train Summary | End of Epoch 36 | Time 25214.39s | Train Loss -12.448
+ Valid Summary | End of Epoch 36 | Time 452.24s | Valid Loss -10.900
+ Test Summary | End of Epoch 36 | Time 271.32s | Test Loss -10.561
+ Found new best model, dict saved
+ Train Summary | End of Epoch 37 | Time 25189.86s | Train Loss -12.525
+ Valid Summary | End of Epoch 37 | Time 451.71s | Valid Loss -10.881
+ Test Summary | End of Epoch 37 | Time 271.45s | Test Loss -10.628
+ Train Summary | End of Epoch 38 | Time 25183.78s | Train Loss -12.573
+ Valid Summary | End of Epoch 38 | Time 451.63s | Valid Loss -10.884
+ Test Summary | End of Epoch 38 | Time 271.73s | Test Loss -10.616
+ Train Summary | End of Epoch 39 | Time 25016.31s | Train Loss -12.616
+ Valid Summary | End of Epoch 39 | Time 450.59s | Valid Loss -10.945
+ Test Summary | End of Epoch 39 | Time 270.98s | Test Loss -10.626
+ Found new best model, dict saved
+ Train Summary | End of Epoch 40 | Time 25025.16s | Train Loss -12.672
+ Valid Summary | End of Epoch 40 | Time 449.51s | Valid Loss -10.916
+ Test Summary | End of Epoch 40 | Time 270.11s | Test Loss -10.699
+ Train Summary | End of Epoch 41 | Time 24994.60s | Train Loss -12.720
+ Valid Summary | End of Epoch 41 | Time 449.37s | Valid Loss -10.990
+ Test Summary | End of Epoch 41 | Time 270.28s | Test Loss -10.662
+ Found new best model, dict saved
+ Train Summary | End of Epoch 42 | Time 25011.89s | Train Loss -12.777
+ Valid Summary | End of Epoch 42 | Time 449.97s | Valid Loss -10.992
+ Test Summary | End of Epoch 42 | Time 270.13s | Test Loss -10.579
+ Found new best model, dict saved
+ Train Summary | End of Epoch 43 | Time 25003.18s | Train Loss -12.816
+ Valid Summary | End of Epoch 43 | Time 449.46s | Valid Loss -11.100
+ Test Summary | End of Epoch 43 | Time 270.14s | Test Loss -10.774
+ Found new best model, dict saved
+ Train Summary | End of Epoch 44 | Time 25008.71s | Train Loss -12.872
+ Valid Summary | End of Epoch 44 | Time 449.52s | Valid Loss -11.058
+ Test Summary | End of Epoch 44 | Time 270.12s | Test Loss -10.800
+ Train Summary | End of Epoch 45 | Time 25001.41s | Train Loss -12.915
+ Valid Summary | End of Epoch 45 | Time 449.84s | Valid Loss -11.000
+ Test Summary | End of Epoch 45 | Time 270.30s | Test Loss -10.702
+ Train Summary | End of Epoch 46 | Time 25044.45s | Train Loss -12.961
+ Valid Summary | End of Epoch 46 | Time 449.92s | Valid Loss -11.090
+ Test Summary | End of Epoch 46 | Time 270.63s | Test Loss -10.808
+ Train Summary | End of Epoch 47 | Time 25096.63s | Train Loss -13.003
+ Valid Summary | End of Epoch 47 | Time 450.40s | Valid Loss -11.108
+ Test Summary | End of Epoch 47 | Time 270.46s | Test Loss -10.729
+ Found new best model, dict saved
+ Train Summary | End of Epoch 48 | Time 25066.48s | Train Loss -13.052
+ Valid Summary | End of Epoch 48 | Time 450.61s | Valid Loss -11.084
+ Test Summary | End of Epoch 48 | Time 270.58s | Test Loss -10.807
+ Train Summary | End of Epoch 49 | Time 25060.37s | Train Loss -13.081
+ Valid Summary | End of Epoch 49 | Time 449.97s | Valid Loss -11.180
+ Test Summary | End of Epoch 49 | Time 270.10s | Test Loss -10.787
+ Found new best model, dict saved
+ Avg SISNRi: tensor([15.5134], device='cuda:0')
+ Avg SNRi: 15.982982163172778
+ Avg PESQi: 0.9377079456647237
+ Avg STOIi: 0.3805843040861404
checkpoints/log_VoxCeleb2_lip_mossformer2_3spk/tensorboard/events.out.tfevents.1731374941.dlcf4k2knsh01f6k-master-0.28.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:03b88a1a8717ab1bfb7de8b7da6d0c74f3bcd3cc66acb1b454503c393becb878
+ size 384
checkpoints/log_VoxCeleb2_lip_mossformer2_3spk/tensorboard/events.out.tfevents.1731483665.dlcf4k2knsh01f6k-master-0.26.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:bcabdda15ce5d977fe7ee92a657542541518c7c02c24dcd6682a80cb12656321
+ size 88
checkpoints/log_VoxCeleb2_lip_mossformer2_3spk/tensorboard/events.out.tfevents.1731487228.dlcf4k2knsh01f6k-master-0.26.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:73a06be5c174f7ac57d165c3b995466916a10a0a6cc5a588996768b86e868388
+ size 88
checkpoints/log_VoxCeleb2_lip_mossformer2_3spk/tensorboard/events.out.tfevents.1731492013.dlcf4k2knsh01f6k-master-0.26.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1a87252d8a3672fe1cd328740d19bc87965d34221282468aad1f18527994f85c
+ size 88
checkpoints/log_VoxCeleb2_lip_mossformer2_3spk/tensorboard/events.out.tfevents.1731546582.dlc199i687psn18d-master-0.26.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ab592c940a0c4587bd3f6d3808883dcc4c0e948d4d46105c28a3a772cacbcbb9
+ size 88
checkpoints/log_VoxCeleb2_lip_mossformer2_3spk/tensorboard/events.out.tfevents.1731548123.dlc9mw1l3osem0g9-master-0.1183013.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:bfbc0bc93f4e00ea6f68dc3af86e916bfe6795e3d10072e2cbcfb000224d8572
+ size 88
checkpoints/log_VoxCeleb2_lip_mossformer2_3spk/tensorboard/events.out.tfevents.1731548338.dlc9mw1l3osem0g9-master-0.1187202.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:011a3bd08cf6325d461eef7d830aa7b07e77f77751efbbd4b00ce88dde2ca762
+ size 88
checkpoints/log_VoxCeleb2_lip_mossformer2_3spk/tensorboard/events.out.tfevents.1731557306.dlc1yk2tc721dlue-master-0.26.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:05641bb5d193dd14babc4450c0aac41eb299ab64ad1035f0621804a728895b33
+ size 3048
checkpoints/log_VoxCeleb2_lip_mossformer2_3spk/tensorboard/events.out.tfevents.1732075802.dlc1g7yr5z0x4h2g-master-0.26.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c9c8d8e2ff4cceb649d8a7ac6fe0ba2c19b8ba81ea36d3bb14aa9fd0a00588a2
+ size 2456
checkpoints/log_VoxCeleb2_lip_mossformer2_3spk/tensorboard/events.out.tfevents.1732583305.dlc1gzcv7row61qr-master-0.26.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3dbf2602a6c0dc68e378107dd8a606e67da383caff0a65179e37243954bff650
+ size 1716