The FACTS Grounding Leaderboard: Benchmarking LLMs' Ability to Ground Responses to Long-Form Input Paper • 2501.03200 • Published Jan 6 • 1
Position: AI Competitions Provide the Gold Standard for Empirical Rigor in GenAI Evaluation Paper • 2505.00612 • Published May 1 • 9