Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures Paper • 2505.09343 • Published May 14 • 70
Article What is MoE 2.0? Update Your Knowledge about Mixture-of-experts By Kseniase and 1 other • Apr 27 • 9