BAGEL-ReAlign (Paper Coming Soon)

A self-supervised training framework that aligns understanding and generation with modest compute, yielding substantial zero-shot gains in image generation and editing.

This repository hosts the model weights for BAGEL-ReAlign. We fine-tuned BAGEL on six 80 GB NVIDIA A800 GPUs for only 27 GPU hours. Understanding capability remains unchanged, while our ReAlign method brings zero-shot improvements of +3.6 points on GenEval (on a 0 to 100 scale), +1.26 on DPGBench, +0.37 on ImgEdit, and +0.33 on GEdit-Bench-EN (overall score).

For installation, usage instructions, and further documentation, please visit BAGEL's original GitHub repository.
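
Fetching the weights can be sketched as follows. This is a minimal example, not an official procedure: it assumes the `huggingface_hub` package is installed, that the repository ID is `sanaka87/BAGEL-ReAlign` (taken from this page's hosting path), and the local directory name is a hypothetical choice. Whether BAGEL's inference code can load this directory without further changes is not confirmed here; consult BAGEL's original repository for that.

```python
from pathlib import Path

# Assumption: repository ID inferred from the path of the hosting page.
REPO_ID = "sanaka87/BAGEL-ReAlign"


def download_weights(local_dir="models/BAGEL-ReAlign"):
    """Download the weight files into local_dir and return the local path.

    local_dir is a hypothetical location; choose any writable directory.
    """
    # Imported lazily so this module can be imported without the dependency.
    from huggingface_hub import snapshot_download

    target = Path(local_dir)
    target.mkdir(parents=True, exist_ok=True)
    # snapshot_download fetches every file in the repository snapshot.
    return snapshot_download(repo_id=REPO_ID, local_dir=str(target))
```

Calling `download_weights()` returns the path of the downloaded snapshot, which can then be passed to whatever loading code the BAGEL repository documents.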

🧠 Method

Coming soon! Stay tuned~

📊 Benchmarks

1. Visual Understanding

Understanding performance remains unchanged relative to BAGEL.

2. Text-to-Image Generation

All models are evaluated at 1024×1024 resolution.

Model	GenEval ↑	DPGBench ↑	WISE ↑
BAGEL	0.787	84.03	0.50
BAGEL-ReAlign	0.824	85.29	0.52

3. Image Editing

Model	GEdit-Bench-EN (SC) ↑	GEdit-Bench-EN (PQ) ↑	GEdit-Bench-EN (O) ↑	ImgEdit ↑
BAGEL	7.96	6.64	6.94	3.38
BAGEL-NHR	8.04	6.87	7.08	3.48
BAGEL-ReAlign	8.24	6.87	7.27	3.75
FLUX Kontext	6.95	7.30	6.27	3.59
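
The headline gains quoted in the introduction can be re-derived from the two benchmark tables. The short sketch below copies the BAGEL and BAGEL-ReAlign rows and prints the differences; the dictionary and function names are illustrative only. Note that the rounded GenEval scores differ by 0.037 (about 3.7 points on a 0 to 100 scale), slightly above the stated +3.6; this presumably reflects rounding of the reported scores, which is an assumption rather than something stated in this README.

```python
# Scores copied from the tables in this README.
baseline = {
    "GenEval": 0.787,
    "DPGBench": 84.03,
    "WISE": 0.50,
    "GEdit-Bench-EN (O)": 6.94,
    "ImgEdit": 3.38,
}
realign = {
    "GenEval": 0.824,
    "DPGBench": 85.29,
    "WISE": 0.52,
    "GEdit-Bench-EN (O)": 7.27,
    "ImgEdit": 3.75,
}


def gains(before, after, digits=3):
    """Return the per-benchmark difference (after - before), rounded."""
    return {name: round(after[name] - before[name], digits) for name in before}


deltas = gains(baseline, realign)
for name, delta in deltas.items():
    # Explicit sign makes improvements and regressions easy to tell apart.
    print(f"{name}: {delta:+.3f}")
```

The DPGBench, GEdit-Bench-EN (O), and ImgEdit differences computed this way match the +1.26, +0.33, and +0.37 quoted in the introduction.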

License

BAGEL-ReAlign is licensed under the Apache 2.0 license.

✍️ Citation

Coming soon!
