Generate a talking-face video from an image and an audio clip
Generate voice from text using ElevenLabs
Generate edited video frames using text prompts