NPHardEval Leaderboard: Unveiling the Reasoning Abilities of Large Language Models through Complexity Classes and Dynamic Updates • Feb 2, 2024
RuleArena: A Benchmark for Rule-Guided Reasoning with LLMs in Real-World Scenarios • Paper • 2412.08972 • Published Dec 12, 2024
Game-theoretic LLM: Agent Workflow for Negotiation Games • Paper • 2411.05990 • Published Nov 8, 2024