EmbodiedGen: Towards a Generative 3D World Engine for Embodied Intelligence
Abstract
EmbodiedGen is a platform that generates high-quality, photorealistic 3D assets at low cost, enabling scalable and realistic embodied AI research through generative AI techniques.
Constructing a physically realistic and accurately scaled simulated 3D world is crucial for training and evaluating embodied intelligence tasks. The diversity, realism, and affordability of 3D assets are critical for achieving generalization and scalability in embodied AI. However, most current embodied intelligence tasks still rely heavily on traditional 3D computer graphics assets that are manually created and annotated, which suffer from high production costs and limited realism. These limitations significantly hinder the scalability of data-driven approaches. We present EmbodiedGen, a foundational platform for interactive 3D world generation. It enables the scalable, low-cost generation of high-quality, controllable, and photorealistic 3D assets with accurate physical properties and real-world scale, exported in the Unified Robotics Description Format (URDF). These assets can be imported directly into various physics simulation engines for fine-grained physical control, supporting downstream training and evaluation tasks. EmbodiedGen is an easy-to-use, full-featured toolkit composed of six key modules: Image-to-3D, Text-to-3D, Texture Generation, Articulated Object Generation, Scene Generation, and Layout Generation. By leveraging generative AI, EmbodiedGen produces diverse, interactive 3D worlds built from generative 3D assets, addressing the generalization and evaluation challenges of embodied intelligence research. Code is available at https://horizonrobotics.github.io/robot_lab/embodied_gen/index.html.
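Because each generated asset is exported as a URDF file, its physical properties (mass, inertia, mesh scale) travel with it into any simulator that reads the format. As a minimal sketch of what such a file looks like and how its properties can be inspected with the Python standard library (the tag values and mesh paths below are illustrative placeholders, not actual EmbodiedGen output):

```python
import xml.etree.ElementTree as ET

# A minimal single-link URDF of the kind a generated rigid asset might use.
# Mass, inertia, and mesh paths here are illustrative placeholders.
URDF = """<?xml version="1.0"?>
<robot name="generated_asset">
  <link name="base_link">
    <inertial>
      <mass value="0.35"/>
      <inertia ixx="1e-4" ixy="0" ixz="0" iyy="1e-4" iyz="0" izz="1e-4"/>
    </inertial>
    <visual>
      <geometry>
        <mesh filename="mesh/visual.obj" scale="1.0 1.0 1.0"/>
      </geometry>
    </visual>
    <collision>
      <geometry>
        <mesh filename="mesh/collision.obj" scale="1.0 1.0 1.0"/>
      </geometry>
    </collision>
  </link>
</robot>"""

def read_physical_properties(urdf_text: str) -> dict:
    """Extract mass and visual-mesh scale from a single-link URDF."""
    root = ET.fromstring(urdf_text)
    link = root.find("link")
    mass = float(link.find("inertial/mass").get("value"))
    scale = [float(s) for s in
             link.find("visual/geometry/mesh").get("scale").split()]
    return {"mass_kg": mass, "scale": scale}

props = read_physical_properties(URDF)
print(props)  # {'mass_kg': 0.35, 'scale': [1.0, 1.0, 1.0]}
```

The same file can be handed to a physics engine directly, e.g. `p.loadURDF("asset.urdf")` in PyBullet, so the real-world scale and mass encoded at generation time carry over into simulation unchanged.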
Community
EmbodiedGen is a toolkit that generates diverse and interactive 3D worlds composed of generative 3D assets with plausible physics, leveraging generative AI to address the challenges of generalization in embodied intelligence research. It is composed of six key modules: Image-to-3D, Text-to-3D, Texture Generation, Articulated Object Generation, Scene Generation, and Layout Generation. The project has been open-sourced on GitHub. You're welcome to follow our work and try out the demo on Hugging Face Spaces.
Related papers recommended by the Semantic Scholar API:
- ArtVIP: Articulated Digital Assets of Visual Realism, Modular Interaction, and Physical Fidelity for Robot Learning (2025)
- DIPO: Dual-State Images Controlled Articulated Object Generation Powered by Diverse Data (2025)
- ORV: 4D Occupancy-centric Robot Video Generation (2025)
- Dreamland: Controllable World Creation with Simulator and Generative Models (2025)
- Agentic 3D Scene Generation with Spatially Contextualized VLMs (2025)
- RoboTransfer: Geometry-Consistent Video Diffusion for Robotic Visual Policy Transfer (2025)
- Scenethesis: A Language and Vision Agentic Framework for 3D Scene Generation (2025)