Inference with this model is resource-intensive, so the inference code was modified to split the input into smaller chunks. This reduces peak resource usage at the cost of longer inference time.
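
As a rough illustration of the chunking idea, here is a minimal sketch. It is not the code from `inference.py`: the names `infer_in_chunks`, `infer_fn`, and `chunk_size` are hypothetical, and the real script may add padding, overlap, or batching that this sketch leaves out.

```python
from typing import Callable, List, Sequence


def infer_in_chunks(
    samples: Sequence[float],
    infer_fn: Callable[[Sequence[float]], List[float]],
    chunk_size: int,
) -> List[float]:
    """Run infer_fn on consecutive chunks and concatenate the results.

    infer_fn stands in for the model call (hypothetical); it only ever
    sees chunk_size samples at a time, which bounds peak memory use.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")

    outputs: List[float] = []
    for start in range(0, len(samples), chunk_size):
        chunk = samples[start:start + chunk_size]  # last chunk may be shorter
        outputs.extend(infer_fn(chunk))
    return outputs


if __name__ == "__main__":
    # Toy stand-in for a model: one output value per chunk.
    audio = [float(i) for i in range(10)]
    print(infer_in_chunks(audio, lambda chunk: [sum(chunk)], chunk_size=4))
```

Each chunk is processed in its own call, so the total running time grows with the number of chunks, which matches the slower inference noted above.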
The model may not be fully trained, so output quality may not be optimal. Training is currently paused at checkpoint 22080.
The `djcm.pt` model has been trimmed by removing `to_wav`, which is not used.
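
One common way to trim a checkpoint like this is to drop every weight whose name belongs to the unused submodule. The sketch below shows that idea on a plain dictionary. It is an assumption about the approach, not the contents of `export_small_model.py`: the helper `drop_prefixed_keys` and the key names in the example are hypothetical.

```python
from typing import Any, Dict


def drop_prefixed_keys(state_dict: Dict[str, Any], prefix: str) -> Dict[str, Any]:
    """Return a copy of state_dict without entries under the given prefix.

    An entry is dropped if its key equals prefix or starts with "prefix.",
    so "to_wav.conv.weight" is removed but "to_wav_extra.weight" is kept.
    """
    dotted = prefix + "."
    return {
        key: value
        for key, value in state_dict.items()
        if key != prefix and not key.startswith(dotted)
    }


if __name__ == "__main__":
    # Hypothetical key names, used only to illustrate the filtering.
    weights = {
        "encoder.weight": [0.1, 0.2],
        "to_wav.conv.weight": [0.3],
        "to_wav.conv.bias": [0.4],
    }
    print(sorted(drop_prefixed_keys(weights, "to_wav")))
```

With a real PyTorch checkpoint, the same filter could be applied to the loaded state dictionary before saving it again, provided the model definition used at load time no longer expects those keys.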
Training with the original DJCM code (WIN LENGTH 2048 and HOP LENGTH 320). An asterisk marks the best value in each column.

index | sdr±std | gnsdr | rpa±std | rca±std | oa±std |
---|---|---|---|---|---|
414 | 5.83±2.09 | 2.82 | 81.52±13.59 | 84.06±10.77 | 78.85±10.99 |
828 | 7.78±1.86 | 4.77 | 90.73±7.32 | 91.91±6.13 | 88.2±6.61 |
1242 | 8.67±1.71 | 5.65 | 90.65±7.33 | 91.37±6.3 | 89.57±6.09 |
1656 | 8.94±1.82 | 5.92 | 92.65±5.61 | 93.14±5.09 | 87.16±6.68 |
2484 | 10.16±2.05 | 7.15 | 92.17±5.9 | 92.42±5.67 | 91.04±5.16 |
2898 | 10.03±1.92 | 7.02 | 92.74±5.74 | 93.08±5.57 | 89.4±5.59 |
3312 | 10.56±1.83 | 7.55 | 92.85±5.57 | 93.18±5.25 | 91.85±4.82 |
4554 | 11.28±2.19 | 8.27 | 93.57±5.42 | 93.81±5.18 | 92.07±4.62 |
5796 | 11.27±2.06 | 8.25 | 93.59±5.53 | 93.97±5.0 | 91.65±5.08 |
6624 | 11.57±2.05 | 8.56 | 93.86±5.16 | 94.04±4.98 | *92.93±4.26 |
7038 | 11.63±1.98 | 8.62 | 94.05±4.82 | 94.29±4.44 | 91.79±4.63 |
7452 | 11.0±1.88 | 7.99 | 93.43±5.39 | 93.68±4.89 | 91.48±5.16 |
7866 | 11.55±2.22 | 8.54 | 93.15±5.43 | 93.37±5.14 | 91.62±4.88 |
8280 | 11.76±2.2 | 8.75 | 93.7±5.26 | 93.88±4.99 | 92.62±4.35 |
8694 | *12.25±2.16 | *9.23 | 93.32±5.45 | 93.49±5.15 | 92.55±4.3 |
9086 | 9.13±1.34 | 6.12 | 94.76±4.0 | 95.16±3.56 | 70.17±9.26 |
9499 | 8.75±1.19 | 5.74 | 95.06±3.76 | 95.49±3.4 | 71.86±9.23 |
9912 | 9.16±1.16 | 6.15 | 95.48±3.68 | 95.73±3.46 | 73.65±8.35 |
10325 | 8.66±0.97 | 5.65 | 95.35±3.6 | 95.72±3.06 | 74.58±7.88 |
10738 | 8.2±0.85 | 5.19 | *95.58±3.52 | *95.87±3.23 | 75.25±7.87 |

Training with my modified DJCM code (WIN LENGTH 1024 and HOP LENGTH 160) on a mixture of 5 datasets, with 20% of each dataset held out for testing.

index | sdr±std | gnsdr | rpa±std | rca±std | oa±std |
---|---|---|---|---|---|
11776 | NONE | NONE | 88.87±7.25 | 89.9±6.74 | 34.2±22.21 |
14720 | NONE | NONE | 89.09±7.47 | 90.15±6.88 | 39.83±24.26 |
17664 | NONE | NONE | 89.47±7.16 | 90.44±6.65 | 37.56±25.77 |
20608 | NONE | NONE | 90.49±6.63 | 91.29±6.24 | 37.71±25.67 |
22080 | NONE | NONE | *90.65±6.69 | *91.56±6.15 | *41.01±24.69 |
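
To make the two window/hop settings easier to compare, the sketch below converts them into time units. The sample rate is not stated in this document, so the 16 kHz default is an assumption, and `frame_timing` is a hypothetical helper rather than part of the training code.

```python
from typing import Tuple


def frame_timing(
    win_length: int, hop_length: int, sample_rate: int = 16000
) -> Tuple[float, float, float]:
    """Return (window duration in ms, hop duration in ms, frames per second).

    sample_rate defaults to 16 kHz, which is an assumption.
    """
    window_ms = 1000.0 * win_length / sample_rate
    hop_ms = 1000.0 * hop_length / sample_rate
    frames_per_second = sample_rate / hop_length
    return window_ms, hop_ms, frames_per_second


if __name__ == "__main__":
    for name, win, hop in [("original", 2048, 320), ("modified", 1024, 160)]:
        window_ms, hop_ms, fps = frame_timing(win, hop)
        print(f"{name}: window {window_ms:.0f} ms, hop {hop_ms:.0f} ms, {fps:.0f} frames/s")
```

Under that assumption, halving the hop length doubles the number of pitch frames per second, so the two tables evaluate predictions at different time resolutions in addition to different test data.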

Training Dataset: https://huggingface.co/datasets/AnhP/Mir-1k-use-DJCM-training/resolve/main/dataset.zip

Inference Code: https://github.com/PhamHuynhAnh16/DJCM/blob/main/inference.py

Training Code: https://github.com/PhamHuynhAnh16/DJCM/blob/main/train.py

Small Model Export Code: https://github.com/PhamHuynhAnh16/DJCM/blob/main/export_small_model.py

ONNX Export Code: https://github.com/PhamHuynhAnh16/DJCM/blob/main/export_onnx.py