τ^2-Bench: Evaluating Conversational Agents in a Dual-Control Environment Paper • 2506.07982 • Published 5 days ago • 4