Papers
arxiv:2511.10629

One Small Step in Latent, One Giant Leap for Pixels: Fast Latent Upscale Adapter for Your Diffusion Models

Published on Nov 13
· Submitted by Aleksandr Razin on Nov 14
#1 Paper of the day
Authors:

Abstract

LUA is a lightweight module that performs super-resolution directly in the latent space of diffusion models, improving efficiency without compromising image quality.

AI-generated summary

Diffusion models struggle to scale beyond their training resolutions, as direct high-resolution sampling is slow and costly, while post-hoc image super-resolution (ISR) introduces artifacts and additional latency by operating after decoding. We present the Latent Upscaler Adapter (LUA), a lightweight module that performs super-resolution directly on the generator's latent code before the final VAE decoding step. LUA integrates as a drop-in component, requiring no modifications to the base model or additional diffusion stages, and enables high-resolution synthesis through a single feed-forward pass in latent space. A shared Swin-style backbone with scale-specific pixel-shuffle heads supports 2x and 4x factors and remains compatible with image-space SR baselines, achieving comparable perceptual quality with nearly 3x lower decoding and upscaling time (adding only +0.42 s for 1024 px generation from 512 px, compared to 1.87 s for pixel-space SR using the same SwinIR architecture). Furthermore, LUA shows strong generalization across the latent spaces of different VAEs, making it easy to deploy without retraining from scratch for each new decoder. Extensive experiments demonstrate that LUA closely matches the fidelity of native high-resolution generation while offering a practical and efficient path to scalable, high-fidelity image synthesis in modern diffusion pipelines.

Community

Paper author Paper submitter

We show that high-res synthesis can be done by upscaling in latent space instead of pixels, keeping a single final decode. This preserves quality while cutting latency and works across scales.

Is there a comfyui node available?

Paper author Paper submitter

Not yet, there isn’t a ComfyUI node available at the moment.
We’re planning to add supplementary material to the paper, release the code on GitHub and Hugging Face, and provide a ComfyUI node as soon as we can!
Thanks!

This is an automated message from the Librarian Bot. I found the following papers similar to this paper.

The following papers were recommended by the Semantic Scholar API

Please give a thumbs up to this comment if you found it helpful!

If you want recommendations for any Paper on Hugging Face checkout this Space

You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: @librarian-bot recommend

how about a comparision with seedvr2, seedvr2 is the top1 for now😄

·

This has nothing to do with "upscaling" as you would imagine, it's a method to resize in latent space, this won't add any details this is not the point here

Hurry up! We need comfyui nodes!

Sign up or log in to comment

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2511.10629 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2511.10629 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2511.10629 in a Space README.md to link it from this page.

Collections including this paper 4