# DC-AE-Lite
Decoding is often the speed bottleneck in few-step latent diffusion models. We release DC-AE-Lite to address this problem: it shares the encoder of DC-AE-f32c32-SANA-1.0 but uses a much smaller decoder. Because the latent space is unchanged, it can be applied, without any training, to diffusion models trained with DC-AE-f32c32-SANA-1.0.
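
Since DC-AE-Lite decodes from the same latent space, it can be swapped into an existing pipeline as a drop-in replacement for the original autoencoder. Below is a minimal sketch using the diffusers SanaPipeline; the SANA checkpoint id is illustrative, and any pipeline trained with DC-AE-f32c32-SANA-1.0 latents should work the same way:

```python
import torch
from diffusers import AutoencoderDC, SanaPipeline

device = torch.device("cuda")

# Illustrative SANA checkpoint id -- substitute the diffusion model you actually
# use, as long as it was trained with DC-AE-f32c32-SANA-1.0 latents.
pipe = SanaPipeline.from_pretrained(
    "Efficient-Large-Model/Sana_1600M_1024px_diffusers",
    torch_dtype=torch.float16,
).to(device)

# Drop-in decoder swap: no retraining needed, since the latent space is shared
# with DC-AE-f32c32-SANA-1.0.
pipe.vae = AutoencoderDC.from_pretrained(
    "dc-ai/dc-ae-lite-f32c32-diffusers",
    torch_dtype=torch.float16,
).to(device)

image = pipe(prompt="a photo of an astronaut riding a horse").images[0]
image.save("sana_with_dc_ae_lite.png")
```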

*Figure: DC-AE-Lite vs. DC-AE reconstruction visual quality.*

DC-AE-Lite achieves 1.8× faster decoding than DC-AE with similar reconstruction quality.

Example: encode and reconstruct an image with DC-AE-Lite through the diffusers AutoencoderDC API:
```python
from diffusers import AutoencoderDC
from PIL import Image
import torch
import torchvision.transforms as transforms
from torchvision.utils import save_image

device = torch.device("cuda")
dc_ae_lite = AutoencoderDC.from_pretrained("dc-ai/dc-ae-lite-f32c32-diffusers").to(device).eval()

transform = transforms.Compose([
    transforms.CenterCrop((1024, 1024)),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),  # map pixels to [-1, 1]
])
image = Image.open("assets/fig/girl.png").convert("RGB")  # drop alpha channel if present
x = transform(image)[None].to(device)

with torch.no_grad():
    # f32c32: 32x spatial downsampling, 32 latent channels,
    # so a 1024x1024 input yields a 1x32x32x32 latent
    latent = dc_ae_lite.encode(x).latent
    print(f"latent shape: {latent.shape}")
    y = dc_ae_lite.decode(latent).sample

save_image(y * 0.5 + 0.5, "demo_dc_ae_lite.png")  # map back to [0, 1] before saving
```
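
To verify the decoding speedup on your own hardware, you can time the decoder directly. A minimal sketch, assuming the original DC-AE is available as `mit-han-lab/dc-ae-f32c32-sana-1.0-diffusers` (that repo id is an assumption; substitute the checkpoint you use):

```python
import torch
from diffusers import AutoencoderDC

device = torch.device("cuda")
torch.manual_seed(0)

# Random latent matching DC-AE-f32c32's layout (32 channels, 32x downsampling):
# the latent a 1024x1024 image would produce.
latent = torch.randn(1, 32, 32, 32, device=device)

def time_decode(ae, latent, warmup=5, iters=20):
    """Average decode latency in milliseconds, measured with CUDA events."""
    with torch.no_grad():
        for _ in range(warmup):
            _ = ae.decode(latent).sample
        start = torch.cuda.Event(enable_timing=True)
        end = torch.cuda.Event(enable_timing=True)
        start.record()
        for _ in range(iters):
            _ = ae.decode(latent).sample
        end.record()
        torch.cuda.synchronize()
    return start.elapsed_time(end) / iters

for name in [
    "mit-han-lab/dc-ae-f32c32-sana-1.0-diffusers",  # assumed id for the original DC-AE
    "dc-ai/dc-ae-lite-f32c32-diffusers",
]:
    ae = AutoencoderDC.from_pretrained(name).to(device).eval()
    print(f"{name}: {time_decode(ae, latent):.1f} ms/decode")
```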