view article Article Agent Leaderboard: Evaluating AI Agents in Multi-Domain Scenarios By pratikbhavsar and 1 other • Feb 12 • 22
M$^3$FinMeeting: A Multilingual, Multi-Sector, and Multi-Task Financial Meeting Understanding Evaluation Dataset Paper • 2506.02510 • Published 12 days ago • 3
DianJin-R1: Evaluating and Enhancing Financial Reasoning in Large Language Models Paper • 2504.15716 • Published Apr 22 • 10