arxiv:2508.16401

Audio2Face-3D: Audio-driven Realistic Facial Animation For Digital Avatars

Published on Aug 22

Authors:

Abstract

NVIDIA Audio2Face-3D provides a real-time system for generating realistic facial animations driven by audio input, offering tools for creators and developers.

AI-generated summary

Audio-driven facial animation presents an effective solution for animating digital avatars. In this paper, we detail the technical aspects of NVIDIA Audio2Face-3D, including data acquisition, network architecture, retargeting methodology, evaluation metrics, and use cases. Audio2Face-3D system enables real-time interaction between human users and interactive avatars, facilitating facial animation authoring for game characters. To assist digital avatar creators and game developers in generating realistic facial animations, we have open-sourced Audio2Face-3D networks, SDK, training framework, and example dataset.

View arXiv page View PDF Add to collection

Community

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment

Upvote

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2508.16401 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2508.16401 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2508.16401 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.