TempoSum: Evaluating the Temporal Generalization of Abstractive Summarization • Paper • 2305.01951 • Published May 3, 2023
CultureVLM: Characterizing and Improving Cultural Understanding of Vision-Language Models for over 100 Countries • Paper • 2501.01282 • Published Jan 2
Deciphering Trajectory-Aided LLM Reasoning: An Optimization Perspective • Paper • 2505.19815 • Published May 26 • 37
Harnessing the Reasoning Economy: A Survey of Efficient Reasoning for Large Language Models • Paper • 2503.24377 • Published Mar 31 • 17
Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist • Paper • 2407.08733 • Published Jul 11, 2024 • 23