Comparison

=== Metrics ===

SD15 VAE                   | MSE=2.732e-03 PSNR=28.10 LPIPS=0.147 Edge=0.206 KL=19.821 | Z[min/mean/max/std]=[-17.375, 0.072, 16.203, 0.900] | Skew[min/mean/max]=[-0.543, -0.126, 0.070] | Kurt[min/mean/max]=[-0.151, 1.228, 4.574]
SDXL VAE fp16 fix          | MSE=2.018e-03 PSNR=29.67 LPIPS=0.124 Edge=0.188 KL=32.222 | Z[min/mean/max/std]=[-4.066, -0.014, 4.301, 0.861] | Skew[min/mean/max]=[-0.017, 0.105, 0.165] | Kurt[min/mean/max]=[-0.380, -0.228, -0.107]
AiArtLab/sdxl_vae          | MSE=1.736e-03 PSNR=30.29 LPIPS=0.116 Edge=0.181 KL=32.222 | Z[min/mean/max/std]=[-4.066, -0.014, 4.301, 0.861] | Skew[min/mean/max]=[-0.017, 0.105, 0.165] | Kurt[min/mean/max]=[-0.380, -0.228, -0.107]
LTX-Video VAE              | MSE=1.202e-03 PSNR=31.84 LPIPS=0.141 Edge=0.168 KL=6.656 | Z[min/mean/max/std]=[-5.043, 0.011, 4.969, 0.272] | Skew[min/mean/max]=[-0.542, -0.018, 0.411] | Kurt[min/mean/max]=[-0.576, 0.741, 1.843]
Wan2.2-TI2V-5B             | MSE=7.782e-04 PSNR=34.25 LPIPS=0.052 Edge=0.121 KL=9.472 | Z[min/mean/max/std]=[-4.789, -0.012, 4.266, 0.375] | Skew[min/mean/max]=[-0.397, 0.022, 0.653] | Kurt[min/mean/max]=[-0.482, 0.006, 0.538]
AiArtLab/wan16x_vae        | MSE=7.275e-04 PSNR=34.51 LPIPS=0.051 Edge=0.118 KL=9.472 | Z[min/mean/max/std]=[-4.789, -0.012, 4.266, 0.375] | Skew[min/mean/max]=[-0.397, 0.022, 0.653] | Kurt[min/mean/max]=[-0.482, 0.006, 0.538]
Wan2.2-T2V-A14B            | MSE=7.073e-04 PSNR=34.59 LPIPS=0.048 Edge=0.115 KL=7.781 | Z[min/mean/max/std]=[-15.336, -0.159, 17.703, 2.563] | Skew[min/mean/max]=[-0.343, 0.006, 0.367] | Kurt[min/mean/max]=[-0.538, -0.071, 0.594]
QwenImage                  | MSE=6.549e-04 PSNR=35.21 LPIPS=0.047 Edge=0.110 KL=7.776 | Z[min/mean/max/std]=[-15.297, -0.158, 17.688, 2.561] | Skew[min/mean/max]=[-0.346, 0.005, 0.368] | Kurt[min/mean/max]=[-0.538, -0.072, 0.597]
AuraDiffusion/16ch-vae     | MSE=5.361e-04 PSNR=35.80 LPIPS=0.041 Edge=0.100 KL=4.421 | Z[min/mean/max/std]=[-1.373, -0.005, 1.621, 0.165] | Skew[min/mean/max]=[-0.331, 0.040, 0.413] | Kurt[min/mean/max]=[-0.170, 0.303, 0.670]
FLUX.1-schnell VAE         | MSE=4.594e-04 PSNR=35.87 LPIPS=0.035 Edge=0.088 KL=13.016 | Z[min/mean/max/std]=[-5.824, -0.076, 6.246, 0.945] | Skew[min/mean/max]=[-0.268, 0.048, 0.483] | Kurt[min/mean/max]=[-0.498, 0.037, 0.568]
AiArtLab/simplevae         | MSE=4.818e-04 PSNR=36.20 LPIPS=0.035 Edge=0.095 KL=4.032 | Z[min/mean/max/std]=[-7.762, -0.061, 9.914, 0.965] | Skew[min/mean/max]=[-0.320, 0.044, 0.411] | Kurt[min/mean/max]=[-0.045, 0.346, 0.696]

=== Percent ===

| Model                      |      PSNR |     LPIPS |      Edge |
|----------------------------|-----------|-----------|-----------|
| SD15 VAE                   |      100% |      100% |      100% |
| SDXL VAE fp16 fix          |    105.6% |    118.3% |    109.7% |
| AiArtLab/sdxl_vae          |    107.8% |    126.8% |    113.8% |
| LTX-Video VAE              |    113.3% |    103.8% |    122.5% |
| Wan2.2-TI2V-5B             |    121.9% |    280.8% |    170.8% |
| AiArtLab/wan16x_vae        |    122.8% |    287.3% |    174.2% |
| Wan2.2-T2V-A14B            |    123.1% |    303.2% |    179.4% |
| QwenImage                  |    125.3% |    308.8% |    188.0% |
| AuraDiffusion/16ch-vae     |    127.4% |    355.5% |    206.6% |
| FLUX.1-schnell VAE         |    127.6% |    424.4% |    234.8% |
| AiArtLab/simplevae         |    128.8% |    415.2% |    217.7% |
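
The percentages above appear to be computed relative to the SD15 VAE baseline: PSNR (higher is better) as value / baseline, and LPIPS and Edge (lower is better) as baseline / value. This formula is inferred from the numbers, not taken from the evaluation script, and the helper name `relative_percent` is hypothetical. A minimal sketch using the rounded metrics from the log:

```python
# Baseline metrics copied from the "SD15 VAE" row of the metrics log.
BASELINE = {"psnr": 28.10, "lpips": 0.147, "edge": 0.206}


def relative_percent(metrics, baseline=BASELINE):
    """Express metrics as a percentage of the baseline (assumed formula).

    PSNR is higher-is-better, so it is value / baseline.
    LPIPS and Edge are lower-is-better, so they are baseline / value.
    """
    return {
        "psnr": 100.0 * metrics["psnr"] / baseline["psnr"],
        "lpips": 100.0 * baseline["lpips"] / metrics["lpips"],
        "edge": 100.0 * baseline["edge"] / metrics["edge"],
    }


# Rounded values from the "AiArtLab/simplevae" row.
simplevae = {"psnr": 36.20, "lpips": 0.035, "edge": 0.095}
for name, value in relative_percent(simplevae).items():
    print(f"{name}: {value:.1f}%")
```

Results from the rounded log values differ slightly from the table (most visibly for LPIPS), presumably because the table was computed from unrounded metrics.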

Compare

https://imgsli.com/NDE1MzE0/5/2

Diffusers

from diffusers import AutoencoderKL

# Load the VAE weights, move them to the GPU, and cast to half precision
vae = AutoencoderKL.from_pretrained("AiArtLab/simplevae", subfolder="vae").cuda().half()
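
Building on the loading snippet above, here is a minimal sketch of an encode/decode round trip. It assumes the usual diffusers convention that the VAE expects images scaled to [-1, 1] and that height and width are multiples of the model's downscale factor; neither is stated in this card. The helper names `load_simplevae` and `roundtrip` are hypothetical.

```python
import torch


def load_simplevae(device="cuda", dtype=torch.float16):
    """Load the VAE as in the snippet above (downloads weights on first use)."""
    # Imported lazily so roundtrip() can be used without diffusers installed.
    from diffusers import AutoencoderKL

    vae = AutoencoderKL.from_pretrained("AiArtLab/simplevae", subfolder="vae")
    return vae.to(device=device, dtype=dtype).eval()


@torch.no_grad()
def roundtrip(vae, images):
    """Encode and decode a batch of images.

    images: float tensor of shape (B, 3, H, W) with values in [0, 1].
    Returns the reconstruction, also in [0, 1].
    """
    # Assumption: the model expects inputs in [-1, 1].
    x = images.to(device=vae.device, dtype=vae.dtype) * 2.0 - 1.0
    # Use the mode of the posterior for a deterministic reconstruction.
    latents = vae.encode(x).latent_dist.mode()
    recon = vae.decode(latents).sample
    return (recon / 2.0 + 0.5).clamp(0.0, 1.0)


# Example usage (requires a GPU and network access):
# vae = load_simplevae()
# recon = roundtrip(vae, torch.rand(1, 3, 512, 512))
```

Using `latent_dist.mode()` instead of `latent_dist.sample()` keeps the reconstruction deterministic, which is convenient when comparing VAEs.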

VAE Training Process

  • Initialized from AuraDiffusion/16ch-vae (not compatible with it); a mid block was added and the model was retrained
  • Dataset: 100,000 PNG images
  • Training Time: ~2 weeks
  • Hardware: Single RTX 5090
  • Resolution: 512px
  • Precision: FP32
  • Effective Batch Size: 16
  • Optimizer: AdamW (8-bit)
  • Balanced losses (LPIPS, MSE, MAE, Edge, KL)
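
One possible way to balance several losses of very different magnitudes is to divide each by a running median of its recent values and then weight it by a target share. The sketch below illustrates that idea only; it is not taken from train_vae.py, and the class name `LossBalancer`, the window size, and the share values are all hypothetical.

```python
from collections import deque
from statistics import median


class LossBalancer:
    """Combine losses so each contributes roughly its target share (sketch)."""

    def __init__(self, target_shares, window=100, eps=1e-8):
        total = sum(target_shares.values())
        # Normalize the shares so they sum to 1.
        self.shares = {k: v / total for k, v in target_shares.items()}
        # Keep a bounded history of recent magnitudes per loss.
        self.history = {k: deque(maxlen=window) for k in target_shares}
        self.eps = eps

    def combine(self, losses):
        total = 0.0
        for name, value in losses.items():
            # float() records the magnitude only; with tensors, the division
            # below still carries gradients through `value`.
            self.history[name].append(abs(float(value)))
            scale = median(self.history[name]) + self.eps
            total = total + self.shares[name] * value / scale
        return total


# Hypothetical shares and loss values, for illustration only.
balancer = LossBalancer({"lpips": 0.5, "mse": 0.2, "mae": 0.1, "edge": 0.1, "kl": 0.1})
step_losses = {"lpips": 0.035, "mse": 4.8e-4, "mae": 0.012, "edge": 0.095, "kl": 4.0}
print(balancer.combine(step_losses))
```

Because each loss is divided by its own typical magnitude, a large raw value such as KL does not dominate a small one such as MSE; the shares alone decide the relative weight.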

Source

https://huggingface.co/AiArtLab/simplevae/blob/main/train_vae.py

Acknowledgments

  • Stan: key investor. Thank you for believing in us when others called it madness.
  • Captainsaturnus
  • Love. Death. Transformers.
  • TOPAPEC

Donations

Please contact us if you can provide GPUs or funding for training.

DOGE: DEw2DR8C7BnF8GgcrfTzUjSnGkuMeJhg83

BTC: 3JHv9Hb8kEW8zMAccdgCdZGfrHeMhH1rpN

Contacts

recoilme

Test training

test train
