Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective Paper • 2506.14965 • Published Jun 17 • 49
Mathesis: Towards Formal Theorem Proving from Natural Languages Paper • 2506.07047 • Published Jun 8 • 5
Institutional Books 1.0: A 242B token dataset from Harvard Library's collections, refined for accuracy and usability Paper • 2506.08300 • Published Jun 10 • 8
Give Me FP32 or Give Me Death? Challenges and Solutions for Reproducible Reasoning Paper • 2506.09501 • Published Jun 11 • 18
Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures Paper • 2505.09343 • Published May 14 • 68
Beyond Theorem Proving: Formulation, Framework and Benchmark for Formal Problem-Solving Paper • 2505.04528 • Published May 7 • 12
Formal Problem-Solving Collection This collection is part of the official implementation of Beyond Theorem Proving: Formulation, Framework and Benchmark for Formal Problem-Solving. • 5 items • Updated May 8 • 3
TrustGeoGen: Scalable and Formal-Verified Data Engine for Trustworthy Multi-modal Geometric Problem Solving Paper • 2504.15780 • Published Apr 22 • 6
xVerify: Efficient Answer Verifier for Reasoning Model Evaluations Paper • 2504.10481 • Published Apr 14 • 84
Can Language Models Falsify? Evaluating Algorithmic Reasoning with Counterexample Creation Paper • 2502.19414 • Published Feb 26 • 20
Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning Paper • 2502.14768 • Published Feb 20 • 48
OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference Paper • 2502.18411 • Published Feb 25 • 75
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22 • 414
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions Paper • 2412.09596 • Published Dec 12, 2024 • 99
ProcessBench: Identifying Process Errors in Mathematical Reasoning Paper • 2412.06559 • Published Dec 9, 2024 • 84
Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering Paper • 2411.11504 • Published Nov 18, 2024 • 24