LLMs for Reasoning
VoiceAssistant-Eval: Benchmarking AI Assistants across Listening, Speaking, and Viewing