Deepseek Papers Collection Deepseek papers collection • 24 items • Updated about 21 hours ago • 269
Kimi k1.5: Scaling Reinforcement Learning with LLMs Paper • 2501.12599 • Published Jan 22 • 123
HelpSteer2: Open-source dataset for training top-performing reward models Paper • 2406.08673 • Published Jun 12, 2024 • 19
Standard-format-preference-dataset Collection We collect the open-source datasets and process them into the standard format. • 14 items • Updated May 8, 2024 • 25
Scaling Relationship on Learning Mathematical Reasoning with Large Language Models Paper • 2308.01825 • Published Aug 3, 2023 • 21