Papers
arxiv:2508.08978

TaoCache: Structure-Maintained Video Generation Acceleration

Published on Aug 12
Authors:
,
,

Abstract

TaoCache is a caching strategy for video diffusion models that improves visual quality by predicting noise outputs in late denoising stages, outperforming existing methods while maintaining speed.

AI-generated summary

Existing cache-based acceleration methods for video diffusion models primarily skip early or mid denoising steps, which often leads to structural discrepancies relative to full-timestep generation and can hinder instruction following and character consistency. We present TaoCache, a training-free, plug-and-play caching strategy that, instead of residual-based caching, adopts a fixed-point perspective to predict the model's noise output and is specifically effective in late denoising stages. By calibrating cosine similarities and norm ratios of consecutive noise deltas, TaoCache preserves high-resolution structure while enabling aggressive skipping. The approach is orthogonal to complementary accelerations such as Pyramid Attention Broadcast (PAB) and TeaCache, and it integrates seamlessly into DiT-based frameworks. Across Latte-1, OpenSora-Plan v110, and Wan2.1, TaoCache attains substantially higher visual quality (LPIPS, SSIM, PSNR) than prior caching methods under the same speedups.

Community

Sign up or log in to comment

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2508.08978 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2508.08978 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2508.08978 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.