Transform images based on text instructions
Generate edited video frames using text prompts
Generate images from sketches and poses
Generate audio from text descriptions