---
license: mit
language:
- en
metrics:
- T2I-Compbench
- GenEval
- PickScore
- AES
- ImageReward
- HPSV2
new_version: v0.1
pipeline_tag: text-to-image
library_name: diffusers
tags:
- inference-enhanced algorithm
- efficiency
- effectiveness
- generalization
- weak-to-strong guidance
---
# The Official Implementation of Our arXiv 2025 Paper:
> **[CoRe^2: _Collect, Reflect and Refine_ to Generate Better and Faster](https://arxiv.org/abs/2503.09662)** <br>
Authors:
>**<em>Shitong Shao, Zikai Zhou, Dian Xie, Yuetong Fang, Tian Ye, Lichen Bai</em> and <em>Zeke Xie*</em>** <br>
> xLeaf Lab, HKUST (GZ) <br>
> *: Corresponding author
## New
- [x] Release the inference code of SD3.5 and SDXL.
- [ ] Release the inference code of FLUX.
- [ ] Release the inference code of LlamaGen.
- [ ] Release the implementation of the Collect phase.
- [ ] Release the implementation of the Reflect phase.
## Overview
This guide explains how to use CoRe^2.
The released inference code currently supports ***Stable Diffusion XL*** and ***Stable Diffusion 3.5 Large***.
## Requirements
- `python version == 3.8`
- `pytorch with cuda version`
- `diffusers`
- `PIL` (installed via the `Pillow` package)
- `bitsandbytes`
- `numpy`
- `timm`
- `argparse` (part of the Python standard library)
- `einops`
## Installation🚀️
Make sure you have set up a `python` environment and installed a CUDA-enabled build of `pytorch`. Before running the script, install the remaining required packages:
```bash
pip install diffusers Pillow bitsandbytes numpy timm einops
```
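As a quick sanity check after installation, the sketch below reports which of the packages listed under Requirements cannot be imported. The helper name `missing_packages` is hypothetical and not part of this repository. The import names (`torch` for PyTorch, `PIL` for Pillow) are the usual ones for those packages.

```python
import importlib.util

# Import names for the packages listed under Requirements.
# Note: an import name can differ from the pip name (Pillow is imported as PIL).
REQUIRED = ["torch", "diffusers", "PIL", "bitsandbytes", "numpy", "timm", "einops"]


def missing_packages(names):
    """Return the subset of `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

This only checks that the packages can be found. It does not verify versions or that PyTorch was built with CUDA support.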
## Usage👀️
To use the CoRe^2 pipeline, you need to run the `sample_img.py` script with appropriate command-line arguments. Below are the available options:
### Command-Line Arguments
- `--pipeline`: Select the model pipeline (`sdxl`, `sd35`). Default is `sdxl`.
- `--prompt`: The textual prompt based on which the image will be generated. Default is "Mickey Mouse painting by Frank Frazetta."
- `--inference-step`: Number of inference steps for the diffusion process. Default is 50.
- `--cfg`: Classifier-free guidance scale. Default is 5.5.
- `--pretrained-path`: Path to the pretrained model weights. Defaults to a path set in the script.
- `--size`: The size (height and width) of the generated image. Default is 1024.
- `--method`: Select the inference method (`standard`, `core`, `zigzag`, `z-core`).
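The options above can be sketched as an `argparse` parser. This is an illustrative reconstruction from the list, not the actual code in `sample_img.py`. The function name `build_parser`, the `None` placeholder for `--pretrained-path`, and the `standard` default for `--method` are assumptions, since this README does not state those values.

```python
import argparse


def build_parser():
    # Hypothetical reconstruction of the documented CLI; the real script may differ.
    parser = argparse.ArgumentParser(description="CoRe^2 sampling (illustrative sketch)")
    parser.add_argument("--pipeline", choices=["sdxl", "sd35"], default="sdxl")
    parser.add_argument("--prompt", type=str,
                        default="Mickey Mouse painting by Frank Frazetta.")
    parser.add_argument("--inference-step", type=int, default=50)
    parser.add_argument("--cfg", type=float, default=5.5)
    # The script's real default path is not given here; None is a placeholder.
    parser.add_argument("--pretrained-path", type=str, default=None)
    parser.add_argument("--size", type=int, default=1024)
    # No default is documented for --method; "standard" is an assumption.
    parser.add_argument("--method",
                        choices=["standard", "core", "zigzag", "z-core"],
                        default="standard")
    return parser


# argparse exposes "--inference-step" as the attribute "inference_step".
args = build_parser().parse_args(["--pipeline", "sd35", "--method", "z-core"])
print(args.pipeline, args.inference_step, args.cfg, args.method)
# prints: sd35 50 5.5 z-core
```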
### Running the Script
Run the script from the command line by navigating to the directory containing `sample_img.py` and executing:
```bash
python sample_img.py --pipeline sdxl --prompt "A banana on the left of an apple." --size 1024
```
This command will generate an image based on the prompt using the Stable Diffusion XL model with an image size of 1024x1024 pixels.
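To compare the four inference methods on the same prompt, one option is to generate one command line per `--method` value. The sketch below only builds and prints the commands. The helper `build_commands` is hypothetical, and it uses only the flags documented above. Since the output filename is not documented here, check whether successive runs overwrite the saved image before running them back to back.

```python
import shlex

METHODS = ["standard", "core", "zigzag", "z-core"]


def build_commands(prompt, pipeline="sdxl", size=1024):
    """Build one sample_img.py command line per inference method."""
    commands = []
    for method in METHODS:
        argv = [
            "python", "sample_img.py",
            "--pipeline", pipeline,
            "--prompt", prompt,
            "--size", str(size),
            "--method", method,
        ]
        # shlex.join (Python 3.8+) quotes arguments that contain spaces.
        commands.append(shlex.join(argv))
    return commands


for command in build_commands("A banana on the left of an apple."):
    print(command)
```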
### Output🎉️
The script will save one image.