HeartMuLa: A Family of Open Sourced Music Foundation Models Paper • 2601.10547 • Published 16 days ago • 40
Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models Paper • 2601.07372 • Published 19 days ago • 40
view article Article The 1 Billion Token Challenge: Finding the Perfect Pre-training Mix Nov 3, 2025 • 56
ChronoEdit: Towards Temporal Reasoning for Image Editing and World Simulation Paper • 2510.04290 • Published Oct 5, 2025 • 19
The End of Manual Decoding: Towards Truly End-to-End Language Models Paper • 2510.26697 • Published Oct 30, 2025 • 117
Running on CPU Upgrade Featured 2.95k The Smol Training Playbook 📚 2.95k The secrets to building world-class LLMs