Generate detailed prompts for Stable Diffusion
Analyze an image to generate a descriptive prompt
Transform video frames using text instructions
Generate images from text descriptions
Generate audio from text