Cosmos World Foundation Model Platform for Physical AI • Paper • 2501.03575 • Published Jan 7 • 76
Physical AI • Collection • Collection of commercial-grade datasets for physical AI developers • 10 items • Updated 4 days ago • 34
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training • Paper • 2501.17161 • Published Jan 28 • 118
PixMo • Collection • A set of vision-language datasets built by Ai2 and used to train the Molmo family of models. Read more at https://molmo.allenai.org/blog • 10 items • Updated 17 days ago • 68
LLaVA-o1: Let Vision Language Models Reason Step-by-Step • Paper • 2411.10440 • Published Nov 15, 2024 • 123
Theia • Collection • Distilling Diverse Vision Foundation Models for Robot Learning • 6 items • Updated Sep 30, 2024 • 9
3D-VLA: A 3D Vision-Language-Action Generative World Model • Paper • 2403.09631 • Published Mar 14, 2024 • 10
Minitron • Collection • A family of compressed models obtained via pruning and knowledge distillation • 12 items • Updated 4 days ago • 60
OpenResearcher: Unleashing AI for Accelerated Scientific Research • Paper • 2408.06941 • Published Aug 13, 2024 • 32
Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks • Paper • 2408.03615 • Published Aug 7, 2024 • 31
Achieving Human Level Competitive Robot Table Tennis • Paper • 2408.03906 • Published Aug 7, 2024 • 27
Zero-Shot Metric Depth with a Field-of-View Conditioned Diffusion Model • Paper • 2312.13252 • Published Dec 20, 2023 • 28