The Invisible Leash: Why RLVR May Not Escape Its Origin Paper • 2507.14843 • Published Jul 20 • 84
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models Paper • 2505.24864 • Published May 30 • 136
massive-serve Collection One command to download and serve a datastore---that's it. https://github.com/RulinShao/massive-serve • 8 items • Updated Jun 6 • 2
DataDecide: How to Predict Best Pretraining Data with Small Experiments Paper • 2504.11393 • Published Apr 15 • 18
OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens Paper • 2504.07096 • Published Apr 9 • 77
EvalTree: Profiling Language Model Weaknesses via Hierarchical Capability Trees Paper • 2503.08893 • Published Mar 11 • 5
Establishing Task Scaling Laws via Compute-Efficient Model Ladders Paper • 2412.04403 • Published Dec 5, 2024 • 3
Infini-gram: Scaling Unbounded n-gram Language Models to a Trillion Tokens Paper • 2401.17377 • Published Jan 30, 2024 • 38