high-quality Chinese training datasets Collection A suite of high-quality Chinese datasets for pretraining, fine-tuning, or preference alignment, along with the models trained on them. • 12 items • Updated 1 day ago • 8
Article MiniMax-01 is Now Open-Source: Scaling Lightning Attention for the AI Agent Era By MiniMax-AI • 3 days ago • 32
MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published 4 days ago • 256
The Lessons of Developing Process Reward Models in Mathematical Reasoning Paper • 2501.07301 • Published 5 days ago • 72
O1 Replication Journey -- Part 3: Inference-time Scaling for Medical Reasoning Paper • 2501.06458 • Published 7 days ago • 29
Magpie Reasoning Datasets Collection Reasoning datasets built by Magpie and its friends! • 6 items • Updated 5 days ago • 8
Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey Paper • 2412.18619 • Published Dec 16, 2024 • 54
LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token Paper • 2501.03895 • Published 11 days ago • 48
Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos Paper • 2501.04001 • Published 11 days ago • 40
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published 10 days ago • 230
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining Paper • 2501.00958 • Published 16 days ago • 95
EnerVerse: Envisioning Embodied Future Space for Robotics Manipulation Paper • 2501.01895 • Published 15 days ago • 48
YuLan-Mini: An Open Data-efficient Language Model Paper • 2412.17743 • Published 26 days ago • 64
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs Paper • 2412.18925 • Published 24 days ago • 94