MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published 4 days ago • 258
Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps Paper • 2501.09732 • Published 1 day ago • 40
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published 10 days ago • 230
An Empirical Study of Autoregressive Pre-training from Videos Paper • 2501.05453 • Published 9 days ago • 36
Progressive Multimodal Reasoning via Active Retrieval Paper • 2412.14835 • Published 30 days ago • 73
RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response Paper • 2412.14922 • Published 30 days ago • 85
EnerVerse: Envisioning Embodied Future Space for Robotics Manipulation Paper • 2501.01895 • Published 15 days ago • 48
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining Paper • 2501.00958 • Published 17 days ago • 95