TC-Light: Temporally Coherent Generative Rendering for Realistic World Transfer

Institute of Automation, Chinese Academy of Sciences; University of Chinese Academy of Sciences


TC-Light Demo

This repository contains the official implementation of TC-Light, a one-shot model that manipulates the illumination distribution of a video to achieve realistic world transfer, presented in the paper TC-Light: Temporally Coherent Generative Rendering for Realistic World Transfer.

The code is available at: https://github.com/Linketic/TC-Light

It is especially suitable for highly dynamic videos, such as those with motion-rich actions and frequent switches between foreground and background objects. It is distinguished by:

  • πŸ”₯ Outstanding Temporal Consistency on Highly Dynamic Scenarios.
  • πŸ”₯ Superior Computational Efficiency that Enables Long Video Processing (it can process 300 frames at 1280x720 resolution on a 40 GB A100).

These features make it particularly valuable for sim2real and real2real augmentation for embodied agents, and for preparing video pairs to train stronger video relighting models. Star ⭐ us if you like it!

Abstract

Illumination and texture editing are critical dimensions for world-to-world transfer, which is valuable for applications such as sim2real and real2real visual data scaling for embodied AI. Existing techniques generatively re-render the input video to realize the transfer, e.g., video relighting models and conditioned world generation models. Nevertheless, these models are predominantly limited to the domain of their training data (e.g., portraits) or suffer from bottlenecks in temporal consistency and computational efficiency, especially when the input video involves complex dynamics and long durations. In this paper, we propose TC-Light, a novel generative renderer that overcomes these problems. Starting from a video preliminarily relighted by an inflated video relighting model, it optimizes an appearance embedding in the first stage to align global illumination. It then optimizes the proposed canonical video representation, the Unique Video Tensor (UVT), to align fine-grained texture and lighting in the second stage. To comprehensively evaluate performance, we also establish a long and highly dynamic video benchmark. Extensive experiments show that our method enables physically plausible re-rendering results with superior temporal coherence and low computation cost. The code and video demos are available at https://github.com/Linketic/TC-Light.

πŸ’‘ Method

TC-Light Pipeline

TC-Light overview. Given the source video and text prompt p, the model tokenizes the input latents in the xy plane and the yt plane separately. The predicted noises are combined for denoising. The output then undergoes two-stage optimization: the first stage aligns exposure by optimizing an appearance embedding; the second stage aligns detailed texture and illumination by optimizing the Unique Video Tensor, a compressed representation of the video. Please refer to the paper for more details.
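The snippet below is a minimal PyTorch sketch of this two-stage idea, not the official implementation: a per-frame gain/offset stands in for the appearance embedding, and a plain learnable tensor with a temporal-smoothness loss stands in for the compressed Unique Video Tensor. All shapes, loss terms, and hyperparameters are illustrative assumptions.

import torch
import torch.nn.functional as F

def two_stage_alignment(relit: torch.Tensor, steps: int = 200, lr: float = 1e-2):
    # relit: preliminarily relighted video of shape (T, C, H, W), values in [0, 1]
    T = relit.shape[0]

    # Stage 1: align global illumination by fitting a per-frame gain/offset
    # toward a temporally smoothed exposure target (appearance-embedding stand-in).
    means = relit.mean(dim=(1, 2, 3))                            # per-frame mean intensity
    target = F.avg_pool1d(means.view(1, 1, T), 5, 1, 2).view(T)  # smoothed over time
    gain = torch.ones(T, 1, 1, 1, requires_grad=True)
    bias = torch.zeros(T, 1, 1, 1, requires_grad=True)
    opt = torch.optim.Adam([gain, bias], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        adjusted = relit * gain + bias
        ((adjusted.mean(dim=(1, 2, 3)) - target) ** 2).mean().backward()
        opt.step()
    aligned = (relit * gain + bias).detach().clamp(0, 1)

    # Stage 2: optimize a canonical video tensor (UVT stand-in) that stays close
    # to the exposure-aligned frames while penalizing temporal flicker.
    uvt = aligned.clone().requires_grad_(True)
    opt2 = torch.optim.Adam([uvt], lr=lr)
    for _ in range(steps):
        opt2.zero_grad()
        data = (uvt - aligned).pow(2).mean()       # stay close to the relit frames
        tv = (uvt[1:] - uvt[:-1]).pow(2).mean()    # temporal-smoothness term
        (data + 0.1 * tv).backward()
        opt2.step()
    return uvt.detach().clamp(0, 1)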

πŸ’Ύ Preparation

Install the required environment as follows:

git clone https://github.com/Linketic/TC-Light.git
cd TC-Light
conda create -n tclight python=3.10
conda activate tclight
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt
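Optionally, you can run a quick Python sanity check (not part of the repo) to confirm that the CUDA-enabled PyTorch build installed correctly:

import torch

print(torch.__version__)                  # should report a +cu121 build
print(torch.cuda.is_available())          # True on a working GPU setup
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # e.g. an A100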

Then download the required model weights to ./models from the following links:

⚑ Quick Start

As a quick start, you can use:

# supports .mp4, .gif, .avi, and folders containing sequential images
# --multi_axis enables decayed multi-axis denoising, which enhances consistency but slows down the diffusion process
python run.py -i /path/to/your/video -p "your_prompt" \
              -n "your_negative_prompt" \  #  optional
              --multi_axis  # optional

By default, it relights the first 30 frames at a resolution of 960x720. The default negative prompt is adopted from Cosmos-Transfer1 and pushes the edited illumination to look as realistic as possible. On the first run for a given video, optical flow is generated and saved under the path to your video.

For fine-grained control, you can customize your own .yaml config file and run:

python run.py --config path/to/your_config.yaml

You can start from configs/tclight_custom.yaml, which documents the most frequently used parameters with detailed explanations.
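If you prefer generating configs programmatically, the following PyYAML sketch illustrates the idea; the key names are hypothetical placeholders, so copy the real ones from configs/tclight_custom.yaml:

import yaml  # PyYAML

# hypothetical keys for illustration only; see configs/tclight_custom.yaml
cfg = {
    "input": "/path/to/your/video.mp4",
    "prompt": "your_prompt",
    "negative_prompt": "your_negative_prompt",
    "num_frames": 30,   # quick-start default
    "width": 960,       # quick-start default
    "height": 720,
}
with open("configs/my_run.yaml", "w") as f:
    yaml.safe_dump(cfg, f)
# then run: python run.py --config configs/my_run.yaml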

Examples

relight the entire field of view

python run.py --config configs/examples/tclight_droid.yaml
python run.py --config configs/examples/tclight_navsim.yaml
python run.py --config configs/examples/tclight_scand.yaml

relight all three videos in parallel

bash scripts/relight.sh

relight foreground with static background condition

# we generate a compatible background image by using the foreground mode of IC-Light, then removing the foreground and inpainting the image with tools like sider.ai
# for satisfactory results, a consistent and complete foreground segmentation is preferred; we use BriaRMBG by default
python run.py --config configs/examples/tclight_bkgd_robotwin.yaml
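For reference, a foreground mask in the BriaRMBG style can be obtained with the transformers pipeline usage published on the briaai/RMBG-1.4 model card (a sketch; wiring the mask into the TC-Light config is not shown here):

from transformers import pipeline

# background-removal pipeline from the briaai/RMBG-1.4 model card
pipe = pipeline("image-segmentation", model="briaai/RMBG-1.4", trust_remote_code=True)
mask = pipe("path/to/frame.png", return_mask=True)  # returns a PIL mask of the foreground
mask.save("path/to/frame_mask.png")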

For evaluation, you can simply use:

python evaluate.py --output_dir path/to/your_output_dir --eval_cost

πŸ”Ž Behaviors

  1. Works better on videos with resolution above 512x512, the minimum resolution used to train IC-Light. Higher resolution improves the consistency of image intrinsic properties.
  2. Works better on realistic scenes than on synthetic scenes, in terms of both temporal consistency and physical plausibility.
  3. Struggles to drastically change the illumination of night scenes or hard shadows, as is also the case for IC-Light.

πŸ“ TODO List

  • Release the arXiv and the project page.
  • Release the code base.
  • Release the dataset.

πŸ€— Citation

If you find this repository useful for your research, please use the following BibTeX entry for citation.

@article{Liu2025TCLight,
  title={TC-Light: Temporally Coherent Generative Rendering for Realistic World Transfer},
  author={Liu, Yang and Luo, Chuanchen and Tang, Zimo and Li, Yingyan and Yang, Yuran and Ning, Yuanyong and Fan, Lue and Peng, Junran and Zhang, Zhaoxiang},
  journal={arXiv preprint arXiv:2506.18904},
  year={2025},
}

πŸ‘ Acknowledgements

This repo benefits from IC-Light, VidToMe, Slicedit, RAVE, and Cosmos. Thanks for their great work! The repo is still under development; pull requests and discussions are welcome!
