Huiwenshi committed
Commit b727378 · verified · 1 Parent(s): 13ea048
Files changed (4)
  1. .gitattributes +2 -0
  2. README.md +142 -0
  3. assets/framework.jpg +3 -0
  4. assets/omni_teaser.png +3 -0
.gitattributes CHANGED
@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ assets/framework.jpg filter=lfs diff=lfs merge=lfs -text
+ assets/omni_teaser.png filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,142 @@
+ ---
+ library_name: hunyuan3d-2
+ license: other
+ license_name: tencent-hunyuan-community
+ license_link: https://github.com/Tencent-Hunyuan/Hunyuan3D-Omni/blob/main/LICENSE
+ language:
+ - en
+ - zh
+ tags:
+ - image-to-3d
+ - text-to-3d
+ pipeline_tag: image-to-3d
+ extra_gated_eu_disallowed: true
+ ---
+
+ <p align="center">
+ <img src="assets/omni_teaser.png">
+ </p>
+
+ <div align="center">
+ <a href=https://3d.hunyuan.tencent.com target="_blank"><img src=https://img.shields.io/badge/Official%20Site-333399.svg?logo=homepage height=22px></a>
+ <a href=https://huggingface.co/tencent/Hunyuan3D-Omni target="_blank"><img src=https://img.shields.io/badge/%F0%9F%A4%97%20Models-d96902.svg height=22px></a>
+ <a href=https://3d-models.hunyuan.tencent.com/ target="_blank"><img src= https://img.shields.io/badge/Page-bb8a2e.svg?logo=github height=22px></a>
+ <a href=https://discord.gg/dNBrdrGGMa target="_blank"><img src= https://img.shields.io/badge/Discord-white.svg?logo=discord height=22px></a>
+ <a href=https://arxiv.org/pdf/2506.15442 target="_blank"><img src=https://img.shields.io/badge/Report-b5212f.svg?logo=arxiv height=22px></a>
+ <a href=https://x.com/TencentHunyuan target="_blank"><img src=https://img.shields.io/badge/Hunyuan-black.svg?logo=x height=22px></a>
+ <a href="#community-resources" target="_blank"><img src=https://img.shields.io/badge/Community-lavender.svg?logo=homeassistantcommunitystore height=22px></a>
+ </div>
+
+ [//]: # ( <a href=# target="_blank"><img src=https://img.shields.io/badge/Report-b5212f.svg?logo=arxiv height=22px></a>)
+
+ [//]: # ( <a href=# target="_blank"><img src= https://img.shields.io/badge/Colab-8f2628.svg?logo=googlecolab height=22px></a>)
+
+ [//]: # ( <a href="#"><img alt="PyPI - Downloads" src="https://img.shields.io/pypi/v/mulankit?logo=pypi" height=22px></a>)
+ <br>
+
+ # Hunyuan3D-Omni
+
+ Hunyuan3D-Omni is a unified framework for controllable 3D asset generation that inherits the structure of Hunyuan3D 2.1. Unlike Hunyuan3D 2.1, it builds a unified control encoder to introduce additional control signals: point clouds, voxels, skeletons, and bounding boxes.
+
+ <p align="left">
+ <img src="assets/framework.jpg">
+ </p>
+
+ ### Multi-Modal Conditional Control
+ - **Bounding Box Control**: Generate 3D models constrained by 3D bounding boxes
+ - **Pose Control**: Create 3D human models with specific skeletal poses
+ - **Point Cloud Control**: Generate 3D models guided by input point clouds
+ - **Voxel Control**: Create 3D models from voxel representations
+
+ ## 🎁 Model Zoo
+
+ Generation requires 10 GB of VRAM.
+
+
+ | Model          | Description                                   | Date       | Size | Hugging Face                                                        |
+ |----------------|-----------------------------------------------|------------|------|---------------------------------------------------------------------|
+ | Hunyuan3D-Omni | Image-to-shape model with multi-modal control | 2025-09-25 | 3.3B | [Download](https://huggingface.co/tencent/Hunyuan3D-Omni/tree/main) |
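
The table's download link points at the `tencent/Hunyuan3D-Omni` repository on the Hugging Face Hub. As a minimal sketch that is not part of the committed README, the repository could be fetched with `huggingface_hub.snapshot_download`. The `local_dir` default below is a hypothetical path, and nothing in this commit says where `inference.py` expects the weights, so treat the target directory as an assumption. Access may also be restricted by region, since the front matter sets `extra_gated_eu_disallowed: true`.

```python
from pathlib import Path

# Repository id taken from the download link in the model table.
REPO_ID = "tencent/Hunyuan3D-Omni"


def download_weights(local_dir: str = "weights/Hunyuan3D-Omni") -> Path:
    """Download a snapshot of the model repository and return its local path.

    `local_dir` is a hypothetical location chosen for illustration only.
    """
    # Imported lazily so the module can be loaded without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    path = snapshot_download(repo_id=REPO_ID, local_dir=local_dir)
    return Path(path)
```

Calling `download_weights()` requires `huggingface_hub` to be installed and network access to the Hub.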
+
+
+ ## Installation
+
+ ### Requirements
+ The model has been tested with Python 3.10.
+ ```bash
+ pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu124
+ pip install -r requirements.txt
+ ```
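
The requirements above (Python 3.10, a CUDA build of PyTorch, and the 10 GB VRAM figure from the model zoo section) can be sanity-checked before running inference. The sketch below is an illustration, not part of the committed README. The helper names are hypothetical, and reading "10 GB" as 10 × 1024³ bytes of total device memory is an assumption; the README does not say how the figure was measured.

```python
import sys

# VRAM figure quoted in the README; interpreting it as GiB is an assumption.
REQUIRED_VRAM_GB = 10


def python_matches(version_info=sys.version_info, expected=(3, 10)) -> bool:
    """Return True when major.minor equals the version the model was tested with."""
    return tuple(version_info[:2]) == expected


def has_enough_vram(total_bytes: int, required_gb: float = REQUIRED_VRAM_GB) -> bool:
    """Compare a device's total memory in bytes against the required amount."""
    return total_bytes >= required_gb * 1024**3


def check_environment() -> None:
    """Print a short report; torch is imported lazily because it may be missing."""
    print("Python 3.10:", python_matches())
    try:
        import torch
    except ImportError:
        print("torch is not installed")
        return
    print("torch version:", torch.__version__)
    if not torch.cuda.is_available():
        print("CUDA is not available")
        return
    total = torch.cuda.get_device_properties(0).total_memory
    print("Enough VRAM on device 0:", has_enough_vram(total))
```

Calling `check_environment()` after installation prints the report. Total device memory is only an upper bound on what is free, so passing this check does not guarantee that generation will fit.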
+
+ ## Usage
+
+ ### Inference
+
+ #### Multi-Modal Inference
+ ```bash
+ python inference.py --control_type <control_type> [--use_ema] [--flashvdm]
+ ```
+ The `--control_type` parameter accepts one of four values:
+
+ - `point`: Use point cloud control for inference.
+ - `voxel`: Use voxel control for inference.
+ - `bbox`: Use bounding box control for inference.
+ - `pose`: Use pose control for inference.
+
+ The `--use_ema` flag uses the Exponential Moving Average (EMA) model for more stable inference.
+
+ The `--flashvdm` flag enables the FlashVDM optimization for faster inference.
+
+ Choose the `control_type` that fits your requirements. For example, to use the `point` control type, run one of:
+ ```bash
+ python inference.py --control_type point
+ python inference.py --control_type point --use_ema
+ python inference.py --control_type point --flashvdm
+ ```
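
The command line documented above can also be assembled programmatically, for example to run all four control types in a batch. The sketch below is an illustration, not part of the committed README. It uses only the flags shown in the usage line, and `build_command` is a hypothetical helper name. It prints the commands and does not execute them.

```python
import shlex

# The four values documented for --control_type.
CONTROL_TYPES = ("point", "voxel", "bbox", "pose")


def build_command(control_type: str, use_ema: bool = False, flashvdm: bool = False) -> list[str]:
    """Build the argument list for inference.py from the documented flags."""
    if control_type not in CONTROL_TYPES:
        raise ValueError(f"control_type must be one of {CONTROL_TYPES}, got {control_type!r}")
    cmd = ["python", "inference.py", "--control_type", control_type]
    if use_ema:
        cmd.append("--use_ema")
    if flashvdm:
        cmd.append("--flashvdm")
    return cmd


# Print one shell-quoted command per control type, with FlashVDM enabled.
for control_type in CONTROL_TYPES:
    print(shlex.join(build_command(control_type, flashvdm=True)))
```

To run a command rather than print it, the list can be passed to `subprocess.run(cmd, check=True)` from the repository root, where `inference.py` is assumed to live.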
+
+ ## Acknowledgements
+
+ We would like to thank the contributors to
+ the [TripoSG](https://github.com/VAST-AI-Research/TripoSG), [Trellis](https://github.com/microsoft/TRELLIS), [DINOv2](https://github.com/facebookresearch/dinov2), [Stable Diffusion](https://github.com/Stability-AI/stablediffusion), [FLUX](https://github.com/black-forest-labs/flux), [diffusers](https://github.com/huggingface/diffusers), [HuggingFace](https://huggingface.co), [CraftsMan3D](https://github.com/wyysf-98/CraftsMan3D), [Michelangelo](https://github.com/NeuralCarver/Michelangelo/tree/main), [Hunyuan-DiT](https://github.com/Tencent-Hunyuan/HunyuanDiT), [HunyuanVideo](https://github.com/Tencent-Hunyuan/HunyuanVideo), [HunyuanWorld-1.0](https://github.com/Tencent-Hunyuan/HunyuanWorld-1.0), and [HunyuanWorld-Voyager](https://github.com/Tencent-Hunyuan/HunyuanWorld-Voyager) repositories, for their open research and
+ exploration.
+
+ ## Citation
+
+ If you use this code in your research, please cite:
+ ```bibtex
+ @misc{hunyuan3d2025hunyuan3d,
+ title={Hunyuan3D 2.1: From Images to High-Fidelity 3D Assets with Production-Ready PBR Material},
+ author={Tencent Hunyuan3D Team},
+ year={2025},
+ eprint={2506.15442},
+ archivePrefix={arXiv},
+ primaryClass={cs.CV}
+ }
+
+ @misc{hunyuan3d22025tencent,
+ title={Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D Assets Generation},
+ author={Tencent Hunyuan3D Team},
+ year={2025},
+ eprint={2501.12202},
+ archivePrefix={arXiv},
+ primaryClass={cs.CV}
+ }
+
+ @misc{yang2024hunyuan3d,
+ title={Hunyuan3D 1.0: A Unified Framework for Text-to-3D and Image-to-3D Generation},
+ author={Tencent Hunyuan3D Team},
+ year={2024},
+ eprint={2411.02293},
+ archivePrefix={arXiv},
+ primaryClass={cs.CV}
+ }
+ ```
+
+ ## Star History
+
+ <a href="https://star-history.com/#Tencent-Hunyuan/Hunyuan3D-Omni&Date">
+ <picture>
+ <source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=Tencent-Hunyuan/Hunyuan3D-Omni&type=Date&theme=dark" />
+ <source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=Tencent-Hunyuan/Hunyuan3D-Omni&type=Date" />
+ <img alt="Star History Chart" src="https://api.star-history.com/svg?repos=Tencent-Hunyuan/Hunyuan3D-Omni&type=Date" />
+ </picture>
+ </a>
assets/framework.jpg ADDED

Git LFS Details

  • SHA256: 921345d3312b8ea7b07d0e0f5fa296b32b8d7445f96f982f256ba560dc62a0a3
  • Pointer size: 132 Bytes
  • Size of remote file: 1.49 MB
assets/omni_teaser.png ADDED

Git LFS Details

  • SHA256: 9688805c9e927d451d81b435130a5f951be770aafb1ccb3fc886962aeb224d39
  • Pointer size: 132 Bytes
  • Size of remote file: 9.59 MB