Don't Overthink it. Preferring Shorter Thinking Chains for Improved LLM Reasoning Paper • 2505.17813 • Published 16 days ago • 55 • 4
Running 2.67k 2.67k The Ultra-Scale Playbook 🌌 The ultimate guide to training LLM on large GPU Clusters
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models Paper • 2206.04615 • Published Jun 9, 2022 • 5
OpenMathInstruct-2: Accelerating AI for Math with Massive Open-Source Instruction Data Paper • 2410.01560 • Published Oct 2, 2024 • 4
Nemotron-H: A Family of Accurate and Efficient Hybrid Mamba-Transformer Models Paper • 2504.03624 • Published Apr 4 • 13
AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset Paper • 2504.16891 • Published Apr 23 • 21
OpenMathReasoning Collection Models and datasets from "AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset" • 7 items • Updated 2 days ago • 40
AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset Paper • 2504.16891 • Published Apr 23 • 21