https://twitter.com/_lyraaaa_/status/1819145905972691227
model config is identical to the stock stable_audio_2.0_vae included in the stable-audio-tools repo
finetuned stable audio open's vae for 100k steps to try and fix its habit of colorizing gritty sounds
the blue and orange runs are near-identical (same seed etc.), except that the orange run had the encoder and bottleneck frozen while the blue run was a full train. the orange model therefore keeps the original latent space and can be swapped straight into any stable audio open model. the blue model has slightly higher fidelity, but in exchange it will require further training before it works with an existing model.
to use the blue vae, pass it to your train command with --pretransform-ckpt-path. to use the orange vae, you'll need to load the stable audio open model (with original vae), load the new vae, and then replace model.pretransform.model with it.
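a minimal sketch of the orange swap described above, assuming you have already loaded both objects yourself. the helper name swap_vae is hypothetical and not part of stable-audio-tools; the only thing taken from the text is the model.pretransform.model attribute path. the stand-in objects at the bottom exist purely so the snippet runs on its own.

```python
from types import SimpleNamespace


def swap_vae(model, new_vae):
    """Replace model.pretransform.model with new_vae and return the old vae.

    Hypothetical helper: it only assumes the attribute path named in the text
    (model.pretransform.model) and does no loading of its own.
    """
    pretransform = getattr(model, "pretransform", None)
    if pretransform is None or not hasattr(pretransform, "model"):
        raise AttributeError("expected model.pretransform.model to exist")
    old_vae = pretransform.model
    pretransform.model = new_vae  # plain attribute replacement, as described above
    return old_vae


# Stand-ins for the real objects (assumption: in practice `sao` would be the
# loaded stable audio open model and the second argument the tuned orange vae).
sao = SimpleNamespace(pretransform=SimpleNamespace(model="original vae"))
old = swap_vae(sao, "tuned vae")
```

returning the old vae makes it easy to A/B the two decoders on the same latents. with real modules you would probably also want the new vae on the same device and dtype as the rest of the model, but that part is left out here.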
further instructions may be written at some point, but i highly recommend you play with the code and figure it out yourself!
Model tree for bleepybloops/sao_vae_tuned_100k: base model stabilityai/stable-audio-open-1.0