arxiv:2412.11224

GenLit: Reformulating Single-Image Relighting as Video Generation

Published on Dec 15, 2024

Authors:

Abstract

GenLit uses a video diffusion model to enable single-image relighting by inserting and manipulating point lights, generating realistic video sequences without explicit 3D reconstruction or ray tracing.

AI-generated summary

Manipulating the illumination of a 3D scene within a single image represents a fundamental challenge in computer vision and graphics. This problem has traditionally been addressed using inverse rendering techniques, which involve explicit 3D asset reconstruction and costly ray-tracing simulations. Meanwhile, recent advancements in visual foundation models suggest that a new paradigm could soon be possible -- one that replaces explicit physical models with networks that are trained on large amounts of image and video data. In this paper, we exploit the physical world understanding of a video diffusion model, particularly Stable Video Diffusion, to relight a single image. We introduce GenLit, a framework that distills the ability of a graphics engine to perform light manipulation into a video-generation model, enabling users to directly insert and manipulate a point light in the 3D world within a given image, and generate results directly as a video sequence. We find that a model fine-tuned on only a small synthetic dataset generalizes to real-world scenes, enabling single-image relighting with plausible and convincing shadows. Our results highlight the ability of video foundation models to capture rich information about lighting, material, and, shape and our findings indicate that such models, with minimal training, can be used to perform relighting without explicit asset reconstruction or complex ray tracing. Project page: https://genlit.is.tue.mpg.de/.

View arXiv page View PDF Add to collection

Community

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment

Upvote

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2412.11224 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2412.11224 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2412.11224 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.