arxiv:2506.04120

Splatting Physical Scenes: End-to-End Real-to-Sim from Imperfect Robot Data

Published on Jun 4 · Submitted by MauroC on Jun 9
Abstract

AI-generated summary: A novel real-to-sim framework merges 3D Gaussian Splatting and object meshes for accurate physics simulation, refining geometry, appearance, and robot poses from raw trajectories.

Creating accurate, physical simulations directly from real-world robot motion holds great value for safe, scalable, and affordable robot learning, yet remains exceptionally challenging. Real robot data suffers from occlusions, noisy camera poses, and dynamic scene elements, which hinder the creation of geometrically accurate and photorealistic digital twins of unseen objects. We introduce a novel real-to-sim framework tackling all these challenges at once. Our key insight is a hybrid scene representation merging the photorealistic rendering of 3D Gaussian Splatting with explicit object meshes suitable for physics simulation within a single representation. We propose an end-to-end optimization pipeline that leverages differentiable rendering and differentiable physics within MuJoCo to jointly refine all scene components, from object geometry and appearance to robot poses and physical parameters, directly from raw and imprecise robot trajectories. This unified optimization allows us to simultaneously achieve high-fidelity object mesh reconstruction, generate photorealistic novel views, and perform annotation-free robot pose calibration. We demonstrate the effectiveness of our approach both in simulation and on challenging real-world sequences using an ALOHA 2 bi-manual manipulator, enabling more practical and robust real-to-simulation pipelines.
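
To make the hybrid representation concrete, below is a minimal sketch of one way Gaussian splats could be coupled to an explicit triangle mesh so that a single set of geometry parameters drives both physics and rendering. This is an illustrative assumption, not the paper's actual data structure: the class name `HybridObject` and all of its fields (`face_ids`, `barycentric`, `offsets`, ...) are hypothetical, and the paper's own coupling between splats and mesh is described in the full text.

```python
# Hypothetical sketch: Gaussians anchored to an explicit triangle mesh,
# so refining the mesh geometry also moves the splats used for rendering.
from dataclasses import dataclass
import jax.numpy as jnp


@dataclass
class HybridObject:
    """Illustrative mesh-plus-Gaussians object (not the paper's API)."""

    # Explicit triangle mesh, suitable for a physics engine such as MuJoCo.
    vertices: jnp.ndarray     # (V, 3) vertex positions in the object frame
    faces: jnp.ndarray        # (F, 3) integer vertex indices per triangle
    # Gaussian splats for photorealistic rendering, parameterized relative
    # to the mesh so mesh refinement and splat refinement share parameters.
    face_ids: jnp.ndarray     # (G,) index of the face each Gaussian sits on
    barycentric: jnp.ndarray  # (G, 3) barycentric coordinates on that face
    offsets: jnp.ndarray      # (G,) signed offset along the face normal
    scales: jnp.ndarray       # (G, 3) per-axis Gaussian scales
    colors: jnp.ndarray       # (G, 3) RGB (spherical harmonics in practice)
    opacities: jnp.ndarray    # (G,) splat opacities

    def gaussian_means(self) -> jnp.ndarray:
        """Gaussian centers derived from the current mesh geometry."""
        tri = self.vertices[self.faces[self.face_ids]]            # (G, 3, 3)
        on_surface = jnp.einsum("gi,gij->gj", self.barycentric, tri)
        normals = jnp.cross(tri[:, 1] - tri[:, 0], tri[:, 2] - tri[:, 0])
        normals = normals / (jnp.linalg.norm(normals, axis=-1, keepdims=True) + 1e-8)
        return on_surface + self.offsets[:, None] * normals
```

Because the Gaussian centers are a function of the mesh vertices in this sketch, gradients from a photometric rendering loss can flow back into the same geometry that the physics engine simulates.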

Community

Paper submitter

We're excited to share our work on bridging the real-to-sim gap in robotics. Creating accurate, physics-ready simulations from real robot data is very challenging, especially when using low-cost hardware. Methods like 3D Gaussian Splatting (3DGS) are fantastic for photorealism, but their representations aren't directly compatible with physics engines. In our paper we introduce SplatMesh, a hybrid scene representation that combines 3DGS with explicit, physics-ready triangle meshes. We embed this in a fully differentiable, end-to-end framework that uses both differentiable rendering and differentiable physics with MuJoCo MJX.
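
As a rough illustration of what "differentiable physics with MuJoCo MJX" enables, the sketch below backpropagates a trajectory loss through `mjx.step` to refine the initial joint configuration of a toy pendulum. It is a minimal example under stated assumptions, not the paper's pipeline: the MJCF model, the loss, and the plain gradient-descent loop are placeholders, and the differentiable-rendering term is stubbed out.

```python
# Minimal sketch: differentiate through MuJoCo MJX physics steps to refine
# a scene parameter (here, only an initial joint angle) from observations.
import jax
import jax.numpy as jnp
import mujoco
from mujoco import mjx

# Toy single-hinge pendulum (placeholder for the real scene model).
XML = """
<mujoco>
  <option timestep="0.005"/>
  <worldbody>
    <body>
      <joint name="hinge" type="hinge" axis="0 1 0"/>
      <geom type="capsule" fromto="0 0 0 0 0 -0.3" size="0.02" mass="0.1"/>
    </body>
  </worldbody>
</mujoco>
"""
model = mujoco.MjModel.from_xml_string(XML)
mjx_model = mjx.put_model(model)


def rollout(qpos0, n_steps=50):
    """Simulate n_steps of MJX physics from an initial joint configuration."""
    data = mjx.make_data(mjx_model).replace(qpos=qpos0)

    def step(d, _):
        d = mjx.step(mjx_model, d)
        return d, d.qpos

    _, qpos_traj = jax.lax.scan(step, data, None, length=n_steps)
    return qpos_traj


# Synthetic "observation": a trajectory generated from the true initial angle.
observed = rollout(jnp.array([0.7]))


def loss(qpos0):
    simulated = rollout(qpos0)
    physics_term = jnp.mean((simulated - observed) ** 2)
    render_term = 0.0  # placeholder for the differentiable 3DGS rendering loss
    return physics_term + render_term


# Plain gradient descent on the initial state, differentiating through physics.
# Learning rate and iteration count are arbitrary for this sketch.
qpos0 = jnp.array([0.3])
grad_fn = jax.jit(jax.grad(loss))
for _ in range(100):
    qpos0 = qpos0 - 0.5 * grad_fn(qpos0)
```

In the actual method, the physics term is combined with a differentiable rendering loss over the raw RGB frames, and the optimized variables cover object geometry, appearance, robot poses, and physical parameters rather than a single initial angle.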

This allows us to use raw RGB images to simultaneously refine everything: the object's geometry and appearance, the robot's pose, and the camera parameters. We also show that our approach and representation are comprehensive, allowing us to generate new assets both as Gaussian splats and as meshes. We demonstrate our method on a real ALOHA 2 manipulator, successfully reconstructing high-fidelity 3D assets from its imperfect trajectory data.
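
Because the refined object geometry lives in an explicit mesh, exporting it as a reusable simulation asset is straightforward. The snippet below is only a hedged illustration that writes a Wavefront OBJ by hand; the paper does not prescribe a specific export format or tooling.

```python
# Illustrative export of a refined mesh to a standard asset format.
import numpy as np


def export_obj(path, vertices, faces):
    """Write a (V, 3) vertex array and (F, 3) 0-indexed face array as OBJ."""
    with open(path, "w") as f:
        for v in np.asarray(vertices):
            f.write(f"v {v[0]} {v[1]} {v[2]}\n")
        for tri in np.asarray(faces):
            # OBJ face indices are 1-based.
            f.write(f"f {tri[0] + 1} {tri[1] + 1} {tri[2] + 1}\n")


# The exported mesh can then be referenced from an MJCF model, e.g.:
# <asset><mesh name="object" file="object.obj"/></asset>
# <geom type="mesh" mesh="object"/>
```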

We hope this work makes creating high-fidelity digital twins more practical and robust!
