arxiv:2506.10036

Token Perturbation Guidance for Diffusion Models

Published on Jun 10

Submitted by msadat97 on Jun 13

Authors:

Seyedmorteza Sadat ,

Abstract

AI-generated summary: Token Perturbation Guidance (TPG) enhances diffusion models with condition-agnostic, training-free guidance, similar to classifier-free guidance (CFG), without requiring architectural changes.

Classifier-free guidance (CFG) has become an essential component of modern diffusion models to enhance both generation quality and alignment with input conditions. However, CFG requires specific training procedures and is limited to conditional generation. To address these limitations, we propose Token Perturbation Guidance (TPG), a novel method that applies perturbation matrices directly to intermediate token representations within the diffusion network. TPG employs a norm-preserving shuffling operation to provide effective and stable guidance signals that improve generation quality without architectural changes. As a result, TPG is training-free and agnostic to input conditions, making it readily applicable to both conditional and unconditional generation. We further analyze the guidance term provided by TPG and show that its effect on sampling more closely resembles CFG compared to existing training-free guidance techniques. Extensive experiments on SDXL and Stable Diffusion 2.1 show that TPG achieves nearly a 2× improvement in FID for unconditional generation over the SDXL baseline, while closely matching CFG in prompt alignment. These results establish TPG as a general, condition-agnostic guidance method that brings CFG-like benefits to a broader class of diffusion models. The code is available at https://github.com/TaatiTeam/Token-Perturbation-Guidance
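To make the idea in the abstract concrete, here is a minimal pure-Python sketch of the two ingredients it mentions: a token shuffle (equivalent to multiplying the token matrix by a permutation matrix, which leaves its norm unchanged) and a guidance step. This is not the authors' implementation. The function names are hypothetical, and the combination rule in `guided_prediction` is an assumption borrowed from the usual CFG formula, with the perturbed prediction standing in for the unconditional one. The abstract does not specify which layers are perturbed or the exact guidance equation; see the linked repository for the actual method.

```python
import math
import random


def shuffle_tokens(tokens, rng):
    """Reorder the rows (tokens) of a token matrix with a random permutation.

    This is equivalent to left-multiplying by a permutation matrix, so every
    value is kept and only the token order changes.
    """
    perm = list(range(len(tokens)))
    rng.shuffle(perm)
    return [tokens[i] for i in perm]


def frobenius_norm(tokens):
    """Frobenius norm of a token matrix stored as a list of rows."""
    return math.sqrt(sum(v * v for row in tokens for v in row))


def guided_prediction(pred, pred_perturbed, scale):
    """CFG-style extrapolation away from the perturbed prediction.

    Assumed form (by analogy with CFG, not taken from the paper):
        guided = perturbed + scale * (normal - perturbed)
    With scale = 1 this returns the normal prediction unchanged.
    """
    return [pp + scale * (p - pp) for p, pp in zip(pred, pred_perturbed)]


rng = random.Random(0)

# Toy "intermediate representation": 4 tokens with 2 channels each.
tokens = [[float(i), float(i) + 0.5] for i in range(4)]
shuffled = shuffle_tokens(tokens, rng)

# In a real sampler these would be the network outputs computed with and
# without the shuffle applied inside the network; here they are toy numbers.
pred = [1.0, 2.0]
pred_perturbed = [0.5, 1.0]
guided = guided_prediction(pred, pred_perturbed, scale=3.0)
```

In this toy setup the shuffled matrix holds the same rows in a different order, so its Frobenius norm matches the original up to floating-point rounding, which is the sense in which a shuffle is norm-preserving.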

Community

msadat97 (paper author, paper submitter), about 21 hours ago

TLDR: This paper proposes Token Perturbation Guidance (TPG), a simple yet effective method based on token shuffling for extending the benefits of classifier-free guidance to broader settings, including unconditional generation. Compared to existing perturbation-based guidance methods, TPG achieves better quality and prompt alignment, making it a simple plug-and-play module to improve the quality of diffusion models.
