Absolute Zero: Reinforced Self-play Reasoning with Zero Data Paper • 2505.03335 • Published May 6 • 169
Post 1575 Looking for someone with 10+ years of experience training Deep Kolmogorov-Arnold Networks. Any suggestions?
Wukong: Towards a Scaling Law for Large-Scale Recommendation Paper • 2403.02545 • Published Mar 4, 2024 • 17
Secrets of RLHF in Large Language Models Part I: PPO Paper • 2307.04964 • Published Jul 11, 2023 • 29
Running on CPU Upgrade • Open LLM Leaderboard 🏆 Track, rank and evaluate open LLMs and chatbots • 13.1k