mattricesound committed · commit ad7127b (verified) · 1 parent: ccbf851

Update README.md

Files changed (1): README.md (+245, -245)

---
title: RemFx
app_file: app.py
sdk: gradio
sdk_version: 5.34.1
---
<div align="center">

# RemFx
General Purpose Audio Effect Removal

[![arXiv](https://img.shields.io/badge/arXiv-1234.56789-b31b1b.svg)](https://arxiv.org/abs/1234.56789)
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1LoLgL1YHzIQfILEayDmRUZzDZzJpD6rD)
[![Dataset](https://zenodo.org/badge/DOI/10.5281/zenodo.8187288.svg)](https://zenodo.org/record/8187288)
[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)


Listening examples can be found [here](https://csteinmetz1.github.io/RemFX/).


<img width="450px" src="remfx-headline.jpg">

## Abstract
</div>

Although the design and application of audio effects is well understood, the inverse problem of removing these effects is significantly more challenging and far less studied. Recently, deep learning has been applied to audio effect removal; however, existing approaches have focused on narrow formulations considering only one effect or source type at a time. In realistic scenarios, multiple effects are applied with varying source content. This motivates a more general task, which we refer to as general purpose audio effect removal. We developed a dataset for this task using five audio effects across four different sources and used it to train and evaluate a set of existing architectures. We found that no single model performed optimally on all effect types and sources. To address this, we introduced <b>RemFX</b>, an approach designed to mirror the compositionality of applied effects. We first trained a set of the best-performing effect-specific removal models and then leveraged an audio effect classification model to dynamically construct a graph of our models at inference. We found our approach to outperform single model baselines, although examples with many effects present remain challenging.

```bibtex
@inproceedings{rice2023remfx,
  title={General Purpose Audio Effect Removal},
  author={Rice, Matthew and Steinmetz, Christian J. and Fazekas, George and Reiss, Joshua D.},
  booktitle={IEEE Workshop on Applications of Signal Processing to Audio and Acoustics},
  year={2023}
}
```


## Setup
```
git clone https://github.com/mhrice/RemFx.git
cd RemFx
git submodule update --init --recursive
pip install -e . ./umx
pip install --no-deps hearbaseline
```
Due to incompatibilities between hearbaseline's dependencies (namely numpy/numba) and our other packages, we need to install hearbaseline without its dependencies.
<b>Please run the setup code before running any scripts.</b>
All scripts should be launched from the top level of the repository after installing.

## Usage
This repo can be used for many different tasks. Here are some examples. Ensure you have run the setup code before running any scripts.

### Run RemFX Detect on a single file
Here we will attempt to detect, and then remove, the effects present in an audio file. For the best results, use a file from our [evaluation dataset](https://zenodo.org/record/8187288). We support detection and removal of the following effects: chorus, delay, distortion, dynamic range compression, and reverb.

First, download the PyTorch checkpoints from [Zenodo](https://zenodo.org/record/8218621):
```
scripts/download_ckpts.sh
```
Then run the detect script. This repo contains an example file `example.wav` from our test dataset, which has two effects (chorus and delay) applied to a guitar recording.
```
scripts/remfx_detect.sh example.wav -o dry.wav
```
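To confirm that processing ran, you can compare the wet input with the dry output. The following is a minimal sketch (not part of the repo's scripts), assuming `torchaudio` is available in your environment and using the filenames from the command above:
```python
# Rough sanity check: compare the wet input with the output of remfx_detect.sh.
import torchaudio

wet, sr_in = torchaudio.load("example.wav")
dry, sr_out = torchaudio.load("dry.wav")

rms = lambda x: x.pow(2).mean().sqrt().item()
print(f"input : {sr_in} Hz, shape {tuple(wet.shape)}, RMS {rms(wet):.4f}")
print(f"output: {sr_out} Hz, shape {tuple(dry.shape)}, RMS {rms(dry):.4f}")
```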
### Download the [General Purpose Audio Effect Removal evaluation datasets](https://zenodo.org/record/8187288)
We provide a script to download and unzip the datasets used in Table 4 of the paper.
```
scripts/download_eval_datasets.sh
```

### Download the starter datasets

If you'd like to train your own model and/or generate a dataset, you can download the starter datasets using the following command:

```
python scripts/download.py vocalset guitarset dsd100 idmt-smt-drums
```
By default, the starter datasets are downloaded to `./data/remfx-data`. To change this, pass `--output_dir={path/to/datasets}` to `download.py`.

Then set the dataset root:
```
export DATASET_ROOT={path/to/datasets}
```

These starter datasets come from the following sources:
- Vocals: [VocalSet](https://zenodo.org/record/1442513)
- Guitars: [GuitarSet](https://zenodo.org/record/3371780)
- Bass: [DSD100](https://sigsep.github.io/datasets/dsd100.html)
- Drums: [IDMT-SMT-Drums](https://zenodo.org/record/7544164)
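Before moving on to training or dataset generation, you can quickly verify that the root is set. This is an optional sketch, not part of the repo's scripts:
```python
# Optional sanity check: DATASET_ROOT must be set and point at a real directory
# before the training and dataset-generation scripts can find the starter data.
import os
from pathlib import Path

root = os.environ.get("DATASET_ROOT")
assert root, "DATASET_ROOT is not set"
assert Path(root).is_dir(), f"DATASET_ROOT does not exist: {root}"
print("Starter datasets expected under:", root)
```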

## Training
Before training, it is important that you have downloaded the starter datasets (see above) and set `$DATASET_ROOT`.
This project uses the [PyTorch Lightning](https://www.pytorchlightning.ai/index.html) framework and [Hydra](https://hydra.cc/) for configuration management. All experiments are defined in `cfg/exp/`. To train with an existing experiment, run
```
python scripts/train.py +exp={experiment_name}
```

At the end of training, the train script will automatically evaluate the test set using the best checkpoint (by validation loss). If training has not completed at least one epoch, this step will throw an error. To evaluate a specific checkpoint, run

```
python scripts/test.py +exp={experiment_name} +ckpt_path="{path/to/checkpoint}" render_files=False
```

### Experiments
Here are some selected experiment types from the paper, which use different datasets and configurations. See `cfg/exp/` for a full list of experiments and parameters.

| Experiment Type         | Config Name  | Example           |
| ----------------------- | ------------ | ----------------- |
| Effect-specific         | {effect}     | +exp=chorus       |
| Effect-specific + FXAug | {effect}_aug | +exp=chorus_aug   |
| Monolithic (1 FX)       | 5-1          | +exp=5-1          |
| Monolithic (<=5 FX)     | 5-5_full     | +exp=5-5_full     |
| Classifier              | 5-5_full_cls | +exp=5-5_full_cls |

To change the configuration, edit the experiment file or override the configuration on the command line. Descriptions of some of these variables are given in the Experimental parameters section below.
You can also create a custom experiment by creating a new experiment file in `cfg/exp/` and overriding the default parameters in `config.yaml`.

### Logging
By default, training uses the PyTorch Lightning CSV logger.
Metrics and hyperparameters will be logged in `./lightning_logs/{timestamp}`.

[Weights and Biases](https://wandb.ai/) logging can also be used, and will log audio during training and testing. To use Weights and Biases, set `logger=wandb` in the config or on the command line. Make sure you have an account and are logged in.

Then set the project and entity:
```
export WANDB_PROJECT={desired_wandb_project}
export WANDB_ENTITY={your_wandb_username}
```

The checkpoints will be saved in `./logs/ckpts/{timestamp}`.

### Misc.
- By default, the dataset needed for the experiment is generated before training.
  If you have generated the dataset separately (see the Generate other datasets section), be sure to set `render_files=False` in the config or on the command line, and set `render_root={path/to/dataset}` if it is in a custom location.

- Training assumes you have a CUDA GPU. To train on CPU, set `accelerator=null` in the config or on the command line (a quick availability check is sketched after this list).

- If training with the pretrained PANNs model, download the pretrained model from [here](https://zenodo.org/record/6332525) or run `wget https://zenodo.org/record/6332525/files/hear2021-panns_hear.pth`. Place this file in the root of the repo.
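Regarding the GPU note above, a quick way to check whether a CUDA device is visible (an optional snippet, not part of the repo's scripts):
```python
# Check for a CUDA GPU before choosing between accelerator='gpu' and accelerator=null.
import torch

print("CUDA available:", torch.cuda.is_available())
```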

## Evaluate models on the General Purpose Audio Effect Removal evaluation datasets (Table 4 from the paper)
We provide a way to replicate the results of Table 4 from our paper. First download the <b>General Purpose Audio Effect Removal evaluation datasets</b> (see above).
To use the pretrained RemFX model, download the checkpoints:
```
scripts/download_ckpts.sh
```
Then run the evaluation script. First, select the RemFX configuration from `remfx_oracle`, `remfx_detect`, and `remfx_all`. As a reminder, `remfx_oracle` uses the ground-truth labels of the present effects to determine which removal models to apply, `remfx_detect` detects which effects are present, and `remfx_all` assumes all effects are present.
```
scripts/eval.sh remfx_detect 0-0
scripts/eval.sh remfx_detect 1-1
scripts/eval.sh remfx_detect 2-2
scripts/eval.sh remfx_detect 3-3
scripts/eval.sh remfx_detect 4-4
scripts/eval.sh remfx_detect 5-5
```
Here, `N-N` refers to the number of effects present in each example of the dataset.


To evaluate a custom monolithic model, first train a model (see Training), then run the evaluation script with the config you used and the checkpoint path:
```
scripts/eval.sh distortion_aug 0-0 -ckpt "{path/to/checkpoint}"
```

To evaluate a custom effect-specific model as part of the inference chain, first train a model (see Training), then edit `cfg/exp/remfx_{desired_configuration}.yaml -> ckpts -> {effect}`. Select between `remfx_detect`, `remfx_oracle`, and `remfx_all`.
Then run the evaluation script.
```
scripts/eval.sh remfx_detect 0-0
```
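The checkpoint mapping you would edit can also be inspected programmatically. This is a read-only sketch, assuming PyYAML is available and that the experiment file stores per-effect checkpoint entries under a top-level `ckpts` key (as the path notation above suggests); check the file itself for the exact schema:
```python
# Print which checkpoint each effect-specific removal model points at in the
# remfx_detect configuration (assumed layout: a top-level `ckpts` mapping).
import yaml  # PyYAML

with open("cfg/exp/remfx_detect.yaml") as f:
    cfg = yaml.safe_load(f)

for effect, ckpt in cfg["ckpts"].items():
    print(f"{effect}: {ckpt}")
```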

The script assumes that `RemFX_eval_datasets` is in the top-level directory.
Metrics and hyperparameters will be logged in `./lightning_logs/{timestamp}`.

## Generate other datasets
The datasets used in the experiments are generated from the starter datasets. In short, for each training/validation/test example, we select a random 5.5 s segment from one of the starter datasets and apply a random number of effects to it. The number of effects applied is controlled by the `num_kept_effects` and `num_removed_effects` parameters. The effects applied are controlled by the `effects_to_keep` and `effects_to_remove` parameters.

Before generating datasets, it is important that you have downloaded the starter datasets (see above) and set `$DATASET_ROOT`.

To generate one of the datasets used in the paper, use one of the experiments defined in `cfg/exp/`.
For example, to generate the `chorus` FXAug dataset, which includes files with 5 possible effects, up to 4 kept effects (distortion, reverb, compression, delay), and 1 removed effect (chorus), run
```
python scripts/generate_dataset.py +exp=chorus_aug
```

See the Experimental parameters section below for a description of the parameters.
By default, files are rendered to `{render_root} / processed / {string_of_effects} / {train|val|test}`.
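To see what has been rendered, here is a small sketch (not part of the repo's scripts), assuming the default `render_root=./data` and the layout above:
```python
# Count rendered examples per split, assuming the default layout
# {render_root}/processed/{string_of_effects}/{train|val|test}.
from pathlib import Path

render_root = Path("./data")  # default render_root
for effects_dir in sorted((render_root / "processed").iterdir()):
    for split in ("train", "val", "test"):
        split_dir = effects_dir / split
        if split_dir.is_dir():
            n_examples = sum(1 for p in split_dir.iterdir() if p.is_dir())
            print(f"{effects_dir.name}/{split}: {n_examples} examples")
```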

The generated dataset contains 8000 train examples, 1000 validation examples, and 1000 test examples. Each example is contained in a folder labeled by its id number (e.g., 0-7999 for train examples) with 4 files, like so:
```
.
└── train
    ├── 0
    │   ├── dry_effects.pt
    │   ├── input.wav
    │   ├── target.wav
    │   └── wet_effects.pt
    ├── 1
    │   └── ...
    ├── ...
    ├── 7999
    │   └── ...
```
### File descriptions
- `dry_effects.pt` = serialized PyTorch file that contains a list of the effects applied to the dry audio file
- `input.wav` = the wet audio file
- `target.wav` = the dry audio file
- `wet_effects.pt` = serialized PyTorch file that contains a list of the effects applied to the wet audio file

The effects list is in the order Reverb, Chorus, Delay, Distortion, Compressor.
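To illustrate the layout, here is a minimal sketch that loads one training example after a dataset has been rendered. It assumes `torchaudio` is available and that the `.pt` files deserialize with `torch.load` to the effect lists described above:
```python
# Load the four files of one rendered example (id 0 of the train split),
# using the default render location ./data/processed/{string_of_effects}/.
from pathlib import Path
import torch
import torchaudio

example_dir = next(Path("./data/processed").glob("*/train/0"))

wet, sr = torchaudio.load(example_dir / "input.wav")       # wet (effected) audio
dry, _ = torchaudio.load(example_dir / "target.wav")       # dry (clean) audio
dry_effects = torch.load(example_dir / "dry_effects.pt")   # effects applied to the dry audio
wet_effects = torch.load(example_dir / "wet_effects.pt")   # effects applied to the wet audio

print("sample rate:", sr, "| audio shape:", tuple(wet.shape))
print("dry effects:", dry_effects)
print("wet effects:", wet_effects)
```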

Note: if training, this process will be done automatically at the start of training. To disable this, set `render_files=False` in the config or on the command line, and set `render_root={path/to/dataset}` if it is in a custom location.


## Experimental parameters
Descriptions of some relevant dataset/training parameters (a combined example follows this list):
- `num_kept_effects={[min, max]}` range of <b>kept</b> effects to apply to each file. Inclusive.
- `num_removed_effects={[min, max]}` range of <b>removed</b> effects to apply to each file. Inclusive.
- `model={model}` architecture to use (see Effect Removal Models/Effect Classification Models).
- `effects_to_keep={[effect]}` effects to apply but not remove (see Effects). Used for FXAug.
- `effects_to_remove={[effect]}` effects to remove (see Effects).
- `accelerator=null/'gpu'` use a GPU (1 device) (default: null).
- `render_files=True/False` render files. Disable to skip the rendering stage (default: True).
- `render_root={path/to/dir}` root directory to render files to (default: ./data).
- `datamodule.train_batch_size={batch_size}` change the batch size (default: varies).
- `logger=wandb` use the Weights and Biases logger (default: csv). Ensure you have set the wandb environment variables (see the Logging section).
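Several of these overrides can be combined in one run. Below is an illustrative sketch only; the values are examples, with `chorus_aug` and `demucs` taken from the tables above:
```python
# Launch a training run with several of the documented overrides combined.
# Equivalent to typing the same command directly in a shell.
import subprocess

subprocess.run(
    [
        "python", "scripts/train.py",
        "+exp=chorus_aug",                 # experiment defined in cfg/exp/
        "model=demucs",                    # effect-removal architecture
        "datamodule.train_batch_size=16",  # example batch size
        "render_root=./data",              # where the rendered dataset lives
    ],
    check=True,
)
```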

### Effect Removal Models
- `umx`
- `demucs`
- `tcn`
- `dcunet`
- `dptnet`

### Effect Classification Models
- `cls_vggish`
- `cls_panns_pt`
- `cls_wav2vec2`
- `cls_wav2clip`

### Effects
- `delay`
- `distortion`
- `chorus`
- `compressor`
- `reverb`