---
license: mit
language:
- en
metrics:
- T2I-Compbench
- GenEval
- PickScore
- AES
- ImageReward
- HPSV2
new_version: v0.1
pipeline_tag: text-to-image
library_name: diffusers
tags:
- inference-enhanced algorithm
- efficiency
- effectiveness
- generalization
- weak-to-strong guidance
---
|
|
|
|
|
# The Official Implementation of Our arXiv 2025 Paper:
|
|
|
|
|
> **[CoRe^2: _Collect, Reflect and Refine_ to Generate Better and Faster](https://arxiv.org/abs/2503.09662)** <br> |
|
|
|
|
|
Authors: |
|
|
|
|
|
>**<em>Shitong Shao, Zikai Zhou, Dian Xie, Yuetong Fang, Tian Ye, Lichen Bai</em> and <em>Zeke Xie*</em>** <br> |
|
|
> xLeaf Lab, HKUST (GZ) <br> |
|
|
> *: Corresponding author |
|
|
|
|
|
## To-Do
|
|
|
|
|
- [x] Release the inference code for SD3.5 and SDXL.
- [ ] Release the inference code for FLUX.
- [ ] Release the inference code for LlamaGen.
- [ ] Release the implementation of the Collect phase.
- [ ] Release the implementation of the Reflect phase.
|
|
|
|
|
|
|
|
## Overview |
|
|
|
|
|
This guide provides instructions on how to use CoRe^2.
|
|
|
|
|
Here we provide the inference code, which currently supports ***Stable Diffusion XL*** and ***Stable Diffusion 3.5 Large***.
|
|
|
|
|
## Requirements |
|
|
|
|
|
- `python == 3.8`
- `pytorch` (a CUDA-enabled build)
- `diffusers`
- `Pillow` (provides `PIL`)
- `bitsandbytes`
- `numpy`
- `timm`
- `argparse` (Python standard library)
- `einops`
|
|
|
|
|
## Installation🚀️ |
|
|
|
|
|
Make sure you have set up a `python` environment and installed a CUDA-enabled build of `pytorch`. Before running the script, ensure you have all the required packages installed. You can install them using:
|
|
|
|
|
```bash |
|
|
pip install diffusers Pillow numpy timm einops bitsandbytes
|
|
``` |
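
Because the script requires a CUDA-enabled PyTorch build, it is worth verifying the environment before sampling. The snippet below is only an illustrative sanity check, not part of the repository:

```python
# Illustrative environment check (not part of the repository).
import torch

print(torch.__version__)          # expect a CUDA build, e.g. "2.x.x+cuXXX"
print(torch.cuda.is_available())  # should print True on a CUDA machine
```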
|
|
|
|
|
## Usage👀️ |
|
|
|
|
|
To use the CoRe^2 pipeline, you need to run the `sample_img.py` script with appropriate command-line arguments. Below are the available options: |
|
|
|
|
|
### Command-Line Arguments |
|
|
|
|
|
- `--pipeline`: Select the model pipeline (`sdxl` or `sd35`). Default is `sdxl`.
|
|
- `--prompt`: The textual prompt based on which the image will be generated. Default is "Mickey Mouse painting by Frank Frazetta." |
|
|
- `--inference-step`: Number of inference steps for the diffusion process. Default is 50. |
|
|
- `--cfg`: Classifier-free guidance scale. Default is 5.5. |
|
|
- `--pretrained-path`: Path to the pretrained model weights. Defaults to the path specified in the script.
|
|
- `--size`: The size (height and width) of the generated image. Default is 1024. |
|
|
- `--method`: Select the inference method (`standard`, `core`, `zigzag`, or `z-core`). A sketch of how these flags map onto a parser is shown after this list.
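
For orientation, here is a minimal sketch of how these arguments might be declared with `argparse`; the actual parser in `sample_img.py` may differ, and the `--pretrained-path` default and the `--method` default below are placeholders:

```python
# Illustrative parser mirroring the documented flags; sample_img.py's
# actual definitions may differ.
import argparse

parser = argparse.ArgumentParser(description="CoRe^2 sampling script")
parser.add_argument("--pipeline", choices=["sdxl", "sd35"], default="sdxl")
parser.add_argument("--prompt", type=str,
                    default="Mickey Mouse painting by Frank Frazetta")
parser.add_argument("--inference-step", type=int, default=50)
parser.add_argument("--cfg", type=float, default=5.5)
parser.add_argument("--pretrained-path", type=str,
                    default="/path/to/weights")       # placeholder
parser.add_argument("--size", type=int, default=1024)
parser.add_argument("--method", default="standard",  # assumed default
                    choices=["standard", "core", "zigzag", "z-core"])
args = parser.parse_args()
```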
|
|
|
|
|
### Running the Script |
|
|
|
|
|
Run the script from the command line by navigating to the directory containing `sample_img.py` and executing: |
|
|
|
|
|
```bash
|
|
python sample_img.py --pipeline sdxl --prompt "A banana on the left of an apple." --size 1024 |
|
|
``` |
|
|
|
|
|
This command will generate an image based on the prompt using the Stable Diffusion XL model with an image size of 1024x1024 pixels. |
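
For comparison, the `standard` method corresponds to plain diffusers sampling. The sketch below shows that baseline only; CoRe^2's weak-to-strong guidance is implemented in the repository's own pipelines, and the SDXL model ID here is an assumption:

```python
# Baseline SDXL sampling with vanilla diffusers (no CoRe^2 guidance).
# The model ID is an assumption; substitute your local weights path.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

image = pipe(
    "A banana on the left of an apple.",
    num_inference_steps=50,   # --inference-step
    guidance_scale=5.5,       # --cfg
    height=1024, width=1024,  # --size
).images[0]
image.save("sample.png")
```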
|
|
|
|
|
### Output🎉️ |
|
|
|
|
|
The script saves a single generated image to disk.
|
|
|
|
|
|