Generate text based on prompts
Identify and describe human poses in images
Start a simulated robot arm animation
Generate speech from text using gTTS or Edge TTS
Detect objects in images with multiple models