OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens Paper • 2504.07096 • Published Apr 9 • 74
Don't throw away your value model! Making PPO even better via Value-Guided Monte-Carlo Tree Search decoding Paper • 2309.15028 • Published Sep 26, 2023 • 1
MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts Paper • 2310.02255 • Published Oct 3, 2023 • 2
Crystal: Introspective Reasoners Reinforced with Self-Feedback Paper • 2310.04921 • Published Oct 7, 2023 • 1
NaturalProofs: Mathematical Theorem Proving in Natural Language Paper • 2104.01112 • Published Mar 24, 2021
Generated Knowledge Prompting for Commonsense Reasoning Paper • 2110.08387 • Published Oct 15, 2021
Minds versus Machines: Rethinking Entailment Verification with Language Models Paper • 2402.03686 • Published Feb 6, 2024 • 1
NaturalProver: Grounded Mathematical Proof Generation with Language Models Paper • 2205.12910 • Published May 25, 2022
Rainier: Reinforced Knowledge Introspector for Commonsense Question Answering Paper • 2210.03078 • Published Oct 6, 2022 • 1
Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback Paper • 2406.09279 • Published Jun 13, 2024 • 3
AI as Humanity's Salieri: Quantifying Linguistic Creativity of Language Models via Systematic Attribution of Machine Text against Web Text Paper • 2410.04265 • Published Oct 5, 2024
Establishing Task Scaling Laws via Compute-Efficient Model Ladders Paper • 2412.04403 • Published Dec 5, 2024 • 3
Infini-gram: Scaling Unbounded n-gram Language Models to a Trillion Tokens Paper • 2401.17377 • Published Jan 30, 2024 • 38
Vera: A General-Purpose Plausibility Estimation Model for Commonsense Statements Paper • 2305.03695 • Published May 5, 2023 • 4