---
license: mit
tags:
- audio-generation
library_name: diffusers
base_model: harmonai/jmann-small-190k
---
Blunstron is a model I made for Harmonai's Dance Diffusion.
The dataset is less than five minutes of the song "Old and Wise" by The Alan Parsons Project, yet the model performs very well and does not overfit.
"Old and Wise" is sung by Colin Blunstone, hence the name Blunstron.
# Why
I put music out for free on YouTube that contains lots of tiny samples (in imitation of a musician named Todd Edwards),
and I've sampled "Old and Wise" a LOT because I like the auditory textures in it. I'm kind of running out of potential chops,
so I decided to generate a practically infinite supply of them with AI.
# How
I fine-tuned this on Google Colab for around two hours. I kind of dislike how the word "fine-tune" is used for Dance Diffusion, since unlike
DreamBooth with Stable Diffusion, Dance Diffusion models (including this one) effectively
become an entirely different model when fine-tuned for long enough.
# Audio Characteristics
- Deep, ethereal, soft auditory texture
- Mostly chords, not much melody
- Colin Blunstone-like vocals
- Occasional drum hits

A few examples are included in this repo as files named blunstron-test-x.wav.
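
To generate more chops than the bundled examples, something like the sketch below may work. It assumes the weights load with the `DanceDiffusionPipeline` class from diffusers (the metadata lists diffusers as the library), which has not been checked against this repo. The helper names `to_pcm16`, `save_wav`, and `generate`, the output file names, and the default step count are all hypothetical choices for illustration. The commented usage points at the base model `harmonai/jmann-small-190k` from the metadata, since this repo's own id is not stated here.

```python
import struct
import wave


def to_pcm16(samples):
    """Clamp floats to [-1.0, 1.0] and scale them to signed 16-bit integers."""
    return [int(max(-1.0, min(1.0, float(s))) * 32767) for s in samples]


def save_wav(path, audio, sample_rate):
    """Write one clip, shaped (channels, samples), as a 16-bit PCM WAV file."""
    # Interleave the channels frame by frame: L0, R0, L1, R1, ...
    interleaved = [s for frame in zip(*audio) for s in frame]
    pcm = to_pcm16(interleaved)
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(len(audio))
        wav_file.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))


def generate(model_id, count=4, steps=100, seed=0):
    """Sample `count` clips from a Dance Diffusion model; returns (audios, sample_rate)."""
    # Imported here so the WAV helpers work without torch or diffusers installed.
    import torch
    from diffusers import DanceDiffusionPipeline

    pipe = DanceDiffusionPipeline.from_pretrained(model_id)
    pipe = pipe.to("cuda" if torch.cuda.is_available() else "cpu")
    generator = torch.Generator().manual_seed(seed)  # fixed seed for repeatable output
    output = pipe(batch_size=count, num_inference_steps=steps, generator=generator)
    # output.audios is a NumPy array shaped (batch, channels, samples)
    return output.audios, pipe.unet.config.sample_rate


# Example usage (downloads weights, so it is left commented out):
# audios, rate = generate("harmonai/jmann-small-190k")  # base model; swap in this repo's id
# for i, clip in enumerate(audios):
#     save_wav(f"generated-chop-{i}.wav", clip, rate)
```

The WAV writing uses only the standard library, so the only heavy dependencies are torch and diffusers inside `generate`. Changing `seed` gives a different batch of chops each run.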