Chat with an AI assistant using text and images
You are the assistant
Generate speech from text using a reference audio sample