
Collections


Collections including paper arxiv:2501.12202

Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D Assets Generation

Paper • 2501.12202 • Published 9 days ago • 31
tencent/Hunyuan3D-2

Image-to-3D • Updated 7 days ago • 24.2k • 655

2025 LLM Papers on Hugging Face with Japanese Memos

MotionBench: Benchmarking and Improving Fine-grained Video Motion Understanding for Vision Language Models

Paper • 2501.02955 • Published 25 days ago • 40
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

Paper • 2501.00958 • Published 29 days ago • 99
MMVU: Measuring Expert-Level Multi-Discipline Video Understanding

Paper • 2501.12380 • Published 9 days ago • 79
VideoWorld: Exploring Knowledge Learning from Unlabeled Videos

Paper • 2501.09781 • Published 14 days ago • 24

about 22 hours ago

Video Creation by Demonstration

Paper • 2412.09551 • Published Dec 12, 2024 • 9
DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation

Paper • 2412.07589 • Published Dec 10, 2024 • 45
Unraveling the Complexity of Memory in RL Agents: an Approach for Classification and Evaluation

Paper • 2412.06531 • Published Dec 9, 2024 • 71
APOLLO: SGD-like Memory, AdamW-level Performance

Paper • 2412.05270 • Published Dec 6, 2024 • 38

about 11 hours ago

FLUX LoRa Lab

Space • Runtime error • 451
black-forest-labs/FLUX.1-schnell

Text-to-Image • Updated Aug 16, 2024 • 795k • 3.3k
Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D Assets Generation

Paper • 2501.12202 • Published 9 days ago • 31
Hunyuan3D-2.0

Space • Running on Zero • 990 • Text-to-3D and Image-to-3D Generation

MambaVision: A Hybrid Mamba-Transformer Vision Backbone

Paper • 2407.08083 • Published Jul 10, 2024 • 28
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

Paper • 2408.11039 • Published Aug 20, 2024 • 58
The Mamba in the Llama: Distilling and Accelerating Hybrid Models

Paper • 2408.15237 • Published Aug 27, 2024 • 39
Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think

Paper • 2409.11355 • Published Sep 17, 2024 • 29

CraftsMan: High-fidelity Mesh Generation with 3D Native Generation and Interactive Geometry Refiner

Paper • 2405.14979 • Published May 23, 2024 • 17
PLA4D: Pixel-Level Alignments for Text-to-4D Gaussian Splatting

Paper • 2405.19957 • Published May 30, 2024 • 10
GECO: Generative Image-to-3D within a SECOnd

Paper • 2405.20327 • Published May 30, 2024 • 10
gsplat: An Open-Source Library for Gaussian Splatting

Paper • 2409.06765 • Published Sep 10, 2024 • 15

about 9 hours ago

FLAME: Factuality-Aware Alignment for Large Language Models

Paper • 2405.01525 • Published May 2, 2024 • 26
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data

Paper • 2405.14333 • Published May 23, 2024 • 37
Transformers Can Do Arithmetic with the Right Embeddings

Paper • 2405.17399 • Published May 27, 2024 • 52
EasyAnimate: A High-Performance Long Video Generation Method based on Transformer Architecture

Paper • 2405.18991 • Published May 29, 2024 • 12

about 12 hours ago

TextureDreamer: Image-guided Texture Synthesis through Geometry-aware Diffusion

Paper • 2401.09416 • Published Jan 17, 2024 • 11
SHINOBI: Shape and Illumination using Neural Object Decomposition via BRDF Optimization In-the-wild

Paper • 2401.10171 • Published Jan 18, 2024 • 14
DMV3D: Denoising Multi-View Diffusion using 3D Large Reconstruction Model

Paper • 2311.09217 • Published Nov 15, 2023 • 22
GALA: Generating Animatable Layered Assets from a Single Scan

Paper • 2401.12979 • Published Jan 23, 2024 • 9
