Upload an image to generate audio content
Describe images using text
Convert the text in an image into audio output