Less is More: Recursive Reasoning with Tiny Networks Paper • 2510.04871 • Published 19 days ago • 439
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models Paper • 2504.10449 • Published Apr 14 • 15
Tiny Language Model Datasets Collection Collection of Synthetic Datasets that can be used in pretraining of any the Tiny Language Model • 14 items • Updated Sep 21 • 29
Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks Paper • 2508.18672 • Published Aug 26 • 10
view article Article NVIDIA Releases 6 Million Multi-Lingual Reasoning Dataset By nvidia and 4 others • Aug 20 • 18
BeyondWeb: Lessons from Scaling Synthetic Data for Trillion-scale Pretraining Paper • 2508.10975 • Published Aug 14 • 59
EXAONE 4.0: Unified Large Language Models Integrating Non-reasoning and Reasoning Modes Paper • 2507.11407 • Published Jul 15 • 57
view article Article NVIDIA Releases 3 Million Sample Dataset for OCR, Visual Question Answering, and Captioning Tasks By nvidia and 4 others • Aug 11 • 74