Variational Autoencoder Conditioned Diffusion Model

This model generates music tracks from input playlists. A Variational Autoencoder (VAE) extracts a latent "taste" representation from the playlists' audio, and a diffusion model conditioned on that latent generates new tracks.

Model Details

  • VAE: Learns a compressed latent space representation of the input data, specifically mel spectrogram images of audio samples.
  • Diffusion Model: Generates new data points by progressively refining random noise into meaningful data, conditioned on the VAE's latent space.
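The two components above can be sketched in PyTorch. This is a minimal illustrative sketch, not the released architecture: the layer sizes, the MLP backbone, and the way the timestep and latent are concatenated into the denoiser input are all assumptions for clarity. The VAE maps a flattened mel-spectrogram to a Gaussian latent, and the denoiser predicts noise given a noisy sample, a timestep, and that latent as the condition.

```python
import torch
import torch.nn as nn


class SpectrogramVAE(nn.Module):
    """Encodes flattened mel-spectrogram frames into a latent 'taste' vector."""

    def __init__(self, input_dim: int = 128 * 128, latent_dim: int = 64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, 256), nn.ReLU())
        self.to_mu = nn.Linear(256, latent_dim)
        self.to_logvar = nn.Linear(256, latent_dim)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, input_dim)
        )

    def encode(self, x: torch.Tensor):
        h = self.encoder(x)
        return self.to_mu(h), self.to_logvar(h)

    def reparameterize(self, mu: torch.Tensor, logvar: torch.Tensor):
        # Sample z = mu + sigma * eps so gradients flow through mu and logvar.
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)

    def forward(self, x: torch.Tensor):
        mu, logvar = self.encode(x)
        z = self.reparameterize(mu, logvar)
        return self.decoder(z), mu, logvar


class ConditionedDenoiser(nn.Module):
    """Predicts the noise in x_t, conditioned on timestep t and VAE latent z."""

    def __init__(self, data_dim: int = 128 * 128, latent_dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(data_dim + latent_dim + 1, 256),
            nn.ReLU(),
            nn.Linear(256, data_dim),
        )

    def forward(self, x_t: torch.Tensor, t: torch.Tensor, z: torch.Tensor):
        # Normalize the integer timestep and concatenate it with the
        # noisy sample and the conditioning latent.
        t = t.float().unsqueeze(-1) / 1000.0
        return self.net(torch.cat([x_t, t, z], dim=-1))
```

At generation time, playlist spectrograms would be encoded to `z` with the VAE, and the denoiser would iteratively refine Gaussian noise into a new spectrogram under that condition.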
