BAGEL-ReAlign (Paper Coming Soon)
A self-supervised training framework that aligns understanding and generation with modest compute, yielding large zero-shot gains in generation and editing capability.
This repository hosts the model weights for BAGEL-ReAlign. We fine-tuned BAGEL on six 80GB NVIDIA A800 GPUs for only 27 GPU hours. While the understanding capability remains unchanged, our ReAlign method brings zero-shot improvements of +3.6 points on GenEval, +1.26 on DPGBench, +0.37 on ImgEdit, and +0.33 on GEdit (overall score).
For installation, usage instructions, and further documentation, please visit BAGEL's original GitHub repository.
🧠 Method
Coming soon! Stay tuned~
📊 Benchmarks
1. Visual Understanding
Understanding performance remains unchanged relative to BAGEL.
2. Text-to-Image Generation
All results are evaluated at 1024x1024 resolution. Higher is better (↑).
Model | GenEval ↑ | DPGBench ↑ | WISE ↑ |
---|---|---|---|
BAGEL | 0.787 | 84.03 | 0.50 |
BAGEL-ReAlign | 0.824 | 85.29 | 0.52 |
3. Image Editing
Model | GEdit-Bench-EN (SC) ↑ | GEdit-Bench-EN (PQ) ↑ | GEdit-Bench-EN (O) ↑ | ImgEdit ↑ |
---|---|---|---|---|
BAGEL | 7.96 | 6.64 | 6.94 | 3.38 |
BAGEL-NHR | 8.04 | 6.87 | 7.08 | 3.48 |
BAGEL-ReAlign | 8.24 | 6.87 | 7.27 | 3.75 |
FLUX Kontext | 6.95 | 7.30 | 6.27 | 3.59 |
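The gains quoted in the introduction can be recomputed from the two benchmark tables. The sketch below is a minimal illustration, not part of the release: the dictionary names and the `deltas` helper are hypothetical, and the scores are the rounded values copied from the tables. Because it works from rounded values, it gives a GenEval difference of 0.037, whereas the introduction quotes +3.6 points; that small gap is presumably due to rounding of the published scores.

```python
# Scores copied from the benchmark tables (rounded as published).
# The names below are illustrative and not part of any BAGEL API.
baseline = {
    "GenEval": 0.787,
    "DPGBench": 84.03,
    "WISE": 0.50,
    "GEdit-Bench-EN (O)": 6.94,
    "ImgEdit": 3.38,
}
realign = {
    "GenEval": 0.824,
    "DPGBench": 85.29,
    "WISE": 0.52,
    "GEdit-Bench-EN (O)": 7.27,
    "ImgEdit": 3.75,
}


def deltas(base, new, ndigits=3):
    """Return the per-benchmark difference new - base, rounded to ndigits."""
    return {name: round(new[name] - base[name], ndigits) for name in base}


gains = deltas(baseline, realign)
for name, gain in gains.items():
    # Print each gain with an explicit sign.
    print(f"{name}: {gain:+.3f}")
```

All five benchmarks are higher-is-better, so a positive difference indicates an improvement over the BAGEL baseline.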
License
BAGEL-ReAlign is licensed under the Apache 2.0 license.
✍️ Citation
Coming soon!