BAGEL-ReAlign (Paper Coming Soon)

A self-supervised training framework that aligns understanding and generation with modest compute, yielding substantial zero-shot gains in generation and editing.

This repository hosts the model weights for BAGEL-ReAlign. We fine-tuned BAGEL on six 80GB NVIDIA A800 GPUs for only 27 GPU hours. Understanding capability remains unchanged, while our ReAlign method yields zero-shot improvements of +3.6 on GenEval, +1.26 on DPGBench, +0.37 on ImgEdit, and +0.33 on GEdit (overall score).

For installation, usage instructions, and further documentation, please visit BAGEL's original GitHub repository.
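
To fetch the weights hosted here before following BAGEL's instructions, a minimal sketch using the `huggingface_hub` package could look like the following. It assumes `huggingface_hub` is installed and that the repository id is `sanaka87/BAGEL-ReAlign`; the local directory name is a hypothetical placeholder, and the helper `download_weights` is illustrative rather than part of BAGEL's tooling.

```python
# Assumed Hub repository id for this model card.
REPO_ID = "sanaka87/BAGEL-ReAlign"
# Hypothetical local target directory; change it to suit your setup.
LOCAL_DIR = "models/BAGEL-ReAlign"


def download_weights(repo_id: str = REPO_ID, local_dir: str = LOCAL_DIR) -> str:
    """Download every file in the model repository and return the local path."""
    # Imported lazily so that defining the helper does not require the package.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, local_dir=local_dir)
```

Calling `download_weights()` returns the directory containing the downloaded files, which can then be passed to BAGEL's own loading scripts as the checkpoint path.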

🧠 Method

Coming soon! Stay tuned~

📊 Benchmarks

1. Visual Understanding

Remains unchanged relative to BAGEL.

2. Text-to-Image Generation

We evaluate at 1024x1024 resolution.

| Model | GenEval ↑ | DPGBench ↑ | WISE ↑ |
| --- | --- | --- | --- |
| BAGEL | 0.787 | 84.03 | 0.50 |
| BAGEL-ReAlign | 0.824 | 85.29 | 0.52 |

3. Image Editing

| Model | GEdit-Bench-EN (SC) ↑ | GEdit-Bench-EN (PQ) ↑ | GEdit-Bench-EN (O) ↑ | ImgEdit ↑ |
| --- | --- | --- | --- | --- |
| BAGEL | 7.96 | 6.64 | 6.94 | 3.38 |
| BAGEL-NHR | 8.04 | 6.87 | 7.08 | 3.48 |
| BAGEL-ReAlign | 8.24 | 6.87 | 7.27 | 3.75 |
| FLUX Kontext | 6.95 | 7.30 | 6.27 | 3.59 |
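
The gains over BAGEL can be recomputed from the rounded scores in the two benchmark tables. The sketch below is only illustrative arithmetic: the dictionary names and the `gains` helper are hypothetical, and the scores are copied from the BAGEL and BAGEL-ReAlign rows. GenEval is reported on a 0 to 1 scale, so its difference of 0.037 corresponds to roughly 3.7 points; the small gap to the +3.6 quoted in the introduction is presumably due to rounding of the tabulated values.

```python
# Scores copied from the BAGEL and BAGEL-ReAlign rows of the tables above.
BASELINE = {
    "GenEval": 0.787,
    "DPGBench": 84.03,
    "WISE": 0.50,
    "GEdit-Bench-EN (SC)": 7.96,
    "GEdit-Bench-EN (PQ)": 6.64,
    "GEdit-Bench-EN (O)": 6.94,
    "ImgEdit": 3.38,
}
REALIGN = {
    "GenEval": 0.824,
    "DPGBench": 85.29,
    "WISE": 0.52,
    "GEdit-Bench-EN (SC)": 8.24,
    "GEdit-Bench-EN (PQ)": 6.87,
    "GEdit-Bench-EN (O)": 7.27,
    "ImgEdit": 3.75,
}


def gains(baseline: dict, tuned: dict, ndigits: int = 3) -> dict:
    """Return tuned minus baseline for every benchmark, rounded to ndigits."""
    return {name: round(tuned[name] - baseline[name], ndigits) for name in baseline}


DELTAS = gains(BASELINE, REALIGN)
for name, delta in DELTAS.items():
    print(f"{name}: {delta:+.3f}")
```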


License

BAGEL-ReAlign is licensed under the Apache 2.0 license.

✍️ Citation

Coming soon!

