MindJourney: Test-Time Scaling with World Models for Spatial Reasoning Paper • 2507.12508 • Published about 1 month ago • 26
Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens Paper • 2506.17218 • Published Jun 20 • 27
Satori-SWE: Evolutionary Test-Time Scaling for Sample-Efficient Software Engineering Paper • 2505.23604 • Published May 29 • 24
3D-VLA: A 3D Vision-Language-Action Generative World Model Paper • 2403.09631 • Published Mar 14, 2024 • 10
3D-LLM: Injecting the 3D World into Large Language Models Paper • 2307.12981 • Published Jul 24, 2023 • 37