STP: Self-play LLM Theorem Provers with Iterative Conjecturing and Proving • Paper • 2502.00212 • Published Jan 31, 2025
VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment • Paper • 2410.01679 • Published Oct 2, 2024
Repeat After Me: Transformers are Better than State Space Models at Copying • Paper • 2402.01032 • Published Feb 1, 2024