Difference between Hugging Face and GitHub weights

#6
by NeilW - opened

Even though both sets of weights appear to have been updated in June 2024, they seem to differ substantially. I noticed this while working on an inpainting algorithm. For that to work, you need to merge the result with the original so you don't degrade the image outside the area you are inpainting. The inpainting wasn't matching the brightness of the original, so I tried both versions of the weights. They are obviously different, because the inpainting algorithm is not compatible between the two, but that is just a fine-tuning issue once I pick which set of weights to use. Both show a change in brightness, but the Hugging Face weights are much worse. In the images below, the top row is the original with a masked area to inpaint. The middle image has the masked area replaced by a round trip through the VAE, vae.decoder(vae.encoder(image)), and the bottom image is just the round trip. Ideally you shouldn't be able to detect the inpainted areas in the middle image. Note that the inputs are correctly scaled to [0,1].

Screenshot 2025-09-09 at 11.31.16 AM.png

Screenshot 2025-09-09 at 11.26.02 AM.png
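For readers following along, the merge described above can be sketched roughly as follows. This is a minimal illustration, not the code behind the screenshots: the function name `composite_inpaint`, the toy arrays, and the simulated 0.1 brightness offset are all assumptions made for the example, and NumPy arrays stand in for real image tensors.

```python
import numpy as np

def composite_inpaint(original, roundtrip, mask):
    """Keep original pixels where mask == 0 and use round-trip pixels where mask == 1.

    All names here are hypothetical; `roundtrip` stands in for the output of
    vae.decoder(vae.encoder(image)) converted back to the same range as `original`.
    """
    mask = mask.astype(original.dtype)
    return original * (1.0 - mask) + roundtrip * mask

# Toy data: a flat grey image and a "round trip" that is uniformly brighter.
original = np.full((4, 4, 3), 0.5, dtype=np.float32)
roundtrip = original + 0.1  # simulated brightness shift, not a measured value
mask = np.zeros((4, 4, 1), dtype=np.float32)
mask[1:3, 1:3] = 1.0  # inpaint the central 2x2 block

merged = composite_inpaint(original, roundtrip, mask)

# Any brightness shift in the round trip shows up as a visible step at the mask edge.
seam = float(np.abs(merged - original).max())
print(f"largest deviation inside the mask: {seam:.3f}")
```

With a faithful round trip the deviation inside the mask would be close to zero, which is why a brightness shift makes the inpainted region easy to spot.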

Hmm, the weights themselves seem identical (notebook).

image.png
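A comparison of this kind can be sketched as below. This is not the code from the linked notebook: the helper `diff_state_dicts` and the parameter names are hypothetical, and plain NumPy arrays stand in for the loaded checkpoints so the example is self-contained.

```python
import numpy as np

def diff_state_dicts(a, b, atol=0.0):
    """Return {parameter_name: description} for every entry that differs.

    `a` and `b` are assumed to be dicts mapping parameter names to arrays,
    for example two checkpoints converted to NumPy after loading.
    """
    report = {}
    for name in sorted(set(a) | set(b)):
        if name not in a or name not in b:
            report[name] = "missing in one checkpoint"
        elif a[name].shape != b[name].shape:
            report[name] = "shape mismatch"
        else:
            max_diff = float(np.abs(a[name] - b[name]).max())
            if max_diff > atol:
                report[name] = f"max abs diff {max_diff:.3g}"
    return report

# Toy checkpoints with made-up parameter names.
hub = {"encoder.0.weight": np.ones((2, 2)), "decoder.0.bias": np.zeros(2)}
github = {name: value.copy() for name, value in hub.items()}

identical = diff_state_dicts(hub, github)  # empty dict: nothing differs

github["decoder.0.bias"][0] = 0.5          # perturb one value
changed = diff_state_dicts(hub, github)    # reports only the perturbed parameter
print(identical, changed)
```

An empty report means the two sets of weights match exactly, which points the investigation toward preprocessing rather than the checkpoints.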

Also, the modules seem to produce identical roundtrip output (notebook).

image.png

So, I'm not sure what's causing the issue in your screenshot. If you post an example notebook that reproduces the issue I can investigate further.

I was able to replicate your results locally and found the mistake in my code. I was also using OpenCLIP, which requires a [0,1] input range, and I accidentally used that range for the Hugging Face version of TAESD as well. When I changed the input to [-1,1], the results matched.
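In case it helps anyone who hits the same problem, here is a minimal sketch of the range conversion involved. The helper names (`unit_to_signed`, `signed_to_unit`, `check_range`) are made up for this example, and NumPy arrays stand in for image tensors; which range a given model expects should be checked against that model's own documentation.

```python
import numpy as np

def unit_to_signed(x):
    """Map values from [0, 1] to [-1, 1]."""
    return x * 2.0 - 1.0

def signed_to_unit(x):
    """Map values from [-1, 1] back to [0, 1]."""
    return (x + 1.0) / 2.0

def check_range(x, low, high, name="input"):
    """Raise early if data falls outside the range a model is assumed to expect."""
    lo, hi = float(x.min()), float(x.max())
    if lo < low or hi > high:
        raise ValueError(f"{name} spans [{lo:.3f}, {hi:.3f}], expected [{low}, {high}]")
    return x

image_unit = np.array([0.0, 0.25, 0.5, 1.0])  # image scaled to [0, 1]
image_signed = unit_to_signed(image_unit)      # same image scaled to [-1, 1]

check_range(image_unit, 0.0, 1.0, "unit-range input")
check_range(image_signed, -1.0, 1.0, "signed-range input")

restored = signed_to_unit(image_signed)        # round trip back to [0, 1]
```

Note that a range check alone would not have caught my mistake, because [0,1] data also lies inside [-1,1]. Keeping the conversion in one named function next to each model call makes the expected range explicit at the call site.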

Sorry about the inconvenience. It can sometimes be difficult to keep the input ranges straight when different libraries expect different ranges.

NeilW changed discussion status to closed

Ah, that makes sense! The different scaling factors (input ranges as well as the VAE latent scales) are definitely annoying to deal with (see e.g. https://github.com/NVlabs/edm2/issues/9 😅). Glad it's working now.
