FastFit: Accelerating Multi-Reference Virtual Try-On via Cacheable Diffusion Models

FastFit is a diffusion-based framework optimized for high-speed, multi-reference virtual try-on. It enables simultaneous try-on of multiple fashion items—such as tops, bottoms, dresses, shoes, and bags—on a single person. The framework leverages reference KV caching during inference to significantly accelerate generation.
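
The idea behind reference KV caching can be illustrated with a toy example. The sketch below is not FastFit's implementation: it is a pure-Python illustration that assumes the reference (garment) tokens do not change between denoising steps, so their attention keys and values can be projected once and reused. All names (`ReferenceKVCache`, `denoise_step`, the toy weights and tokens) are hypothetical.

```python
import math


def project(tokens, weight):
    """Apply a linear projection (tokens @ weight) to a list of token vectors."""
    return [
        [sum(t[i] * weight[i][j] for i in range(len(t))) for j in range(len(weight[0]))]
        for t in tokens
    ]


def attention(q, k, v):
    """Plain scaled dot-product attention over lists of vectors."""
    out = []
    for qi in q:
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(len(qi)) for kj in k]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([
            sum(w / total * vj[d] for w, vj in zip(weights, v))
            for d in range(len(v[0]))
        ])
    return out


class ReferenceKVCache:
    """Hypothetical cache: projects reference tokens to K/V on first use only."""

    def __init__(self, w_k, w_v):
        self.w_k, self.w_v = w_k, w_v
        self.cached = None
        self.projections = 0  # counts how often reference K/V were computed

    def get(self, ref_tokens):
        if self.cached is None:
            self.cached = (project(ref_tokens, self.w_k), project(ref_tokens, self.w_v))
            self.projections += 1
        return self.cached


def denoise_step(x_tokens, ref_tokens, w_q, cache):
    """One toy step: person tokens attend to themselves plus cached reference K/V."""
    ref_k, ref_v = cache.get(ref_tokens)
    q = project(x_tokens, w_q)
    k = project(x_tokens, cache.w_k) + ref_k  # list concatenation along tokens
    v = project(x_tokens, cache.w_v) + ref_v
    return attention(q, k, v)


identity = [[1.0, 0.0], [0.0, 1.0]]
cache = ReferenceKVCache(identity, identity)
reference = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy garment tokens
latent = [[0.5, 0.5], [0.2, 0.8]]                  # toy person tokens
for _ in range(4):
    latent = denoise_step(latent, reference, identity, cache)
print(cache.projections)  # 1: reference K/V were projected once for 4 steps
```

In this toy setup the per-step cost of projecting the reference tokens disappears after the first step; only the person tokens are re-projected each time.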

Updates

2025/08/06: ⚙️ We release the code for inference and evaluation on the DressCode-MR, DressCode, and VITON-HD test datasets.
2025/08/05: 🧩 We release the ComfyUI workflow for FastFit!
2025/08/04: 🚀 Our Gradio demo is online with Chinese & English support! The demo code is also released in app.py.
2025/07/03: 🎉 We release the weights of the FastFit-MR and FastFit-SR models on Hugging Face!
2025/06/24: 👕 We release the DressCode-MR dataset with 28K+ multi-reference virtual try-on samples on Hugging Face!

DressCode-MR Dataset

DressCode-MR is built on the DressCode dataset and contains 28K+ multi-reference virtual try-on samples.

Multi-reference Samples: Each sample comprises a person's image paired with a set of compatible clothing and accessory items: tops, bottoms, dresses, shoes, and bags.
Large Scale: Contains a total of 28,179 high-quality multi-reference samples with 25,779 for training and 2,400 for testing.

DressCode-MR is released under the same license as the original DressCode dataset. Therefore, before requesting access to the DressCode-MR dataset, you must complete the following steps:

Apply and be granted a license to use the DressCode dataset.
Use your educational/academic email address (e.g., one ending in .edu, .ac, etc.) to request access to DressCode-MR on Hugging Face. (Any requests from non-academic email addresses will be rejected.)

Installation

conda create -n fastfit python=3.10
conda activate fastfit
pip install -r requirements.txt
pip install easy-dwpose --no-dependencies # to resolve the version conflict

# if an error occurs for av, try:
conda install -c conda-forge av

ComfyUI Workflow

Clone the FastFit repository into your ComfyUI/custom_nodes/ directory.

cd Your_ComfyUI_Dir/custom_nodes
git clone https://github.com/Zheng-Chong/FastFit.git

Install the required dependencies.

cd FastFit
pip install -r requirements.txt
pip install easy-dwpose --no-dependencies # to resolve the version conflict

# if an error occurs for av, try:
conda install -c conda-forge av

Install rgthree-comfy for the image comparer.

cd Your_ComfyUI_Dir/custom_nodes
git clone https://github.com/rgthree/rgthree-comfy.git
cd rgthree-comfy
pip install -r requirements.txt

Restart ComfyUI.
Drag and drop the fastfit_workflow.json file onto the ComfyUI web interface.

Gradio Demo

The model weights will be automatically downloaded from Hugging Face when you run the demo.

python app.py

Inference & Evaluation on Datasets

To perform inference on the DressCode-MR, DressCode, or VITON-HD test datasets, use the infer_datasets.py script, for example:

python infer_datasets.py \
    --dataset <dataset_name> \
    --data_dir </path/to/your/dataset> \
    --batch_size 4 \
    --num_inference_steps 50 \
    --guidance_scale 2.5 \
    --mixed_precision bf16 \
    --paired

--dataset: Specify the target dataset. Choose from dresscode-mr, dresscode, or viton-hd.
--data_dir: The root directory path for the specified dataset.
--paired: Include this flag to run inference in the paired setting. Omit this flag for the unpaired setting.

By default, inference results will be saved to the results/ directory at the project root.
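
To run both the paired and unpaired settings, or several datasets in a row, the command above can be assembled programmatically. The following is a minimal sketch that uses only the flags shown in this README; the helper name `build_infer_command` and the path `/data/viton-hd` are hypothetical and chosen for illustration.

```python
import shlex

# Dataset names listed for --dataset in this README.
DATASETS = ("dresscode-mr", "dresscode", "viton-hd")


def build_infer_command(dataset, data_dir, paired, batch_size=4,
                        num_inference_steps=50, guidance_scale=2.5,
                        mixed_precision="bf16"):
    """Return the infer_datasets.py invocation as an argument list."""
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset {dataset!r}; expected one of {DATASETS}")
    cmd = [
        "python", "infer_datasets.py",
        "--dataset", dataset,
        "--data_dir", str(data_dir),
        "--batch_size", str(batch_size),
        "--num_inference_steps", str(num_inference_steps),
        "--guidance_scale", str(guidance_scale),
        "--mixed_precision", mixed_precision,
    ]
    if paired:
        cmd.append("--paired")  # omit the flag for the unpaired setting
    return cmd


# Print the paired and unpaired commands for one dataset (hypothetical path).
for paired in (True, False):
    print(shlex.join(build_infer_command("viton-hd", "/data/viton-hd", paired)))
```

To execute a command rather than print it, the returned list can be passed to `subprocess.run(cmd, check=True)` from the project root.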

After inference, use the eval.py script to calculate the evaluation metrics:

python eval.py \
    --gt_folder </path/to/ground_truth_folder> \
    --pred_folder </path/to/prediction_folder> \
    --paired \
    --batch_size 16 \
    --num_workers 4

--gt_folder: The directory path containing the ground truth images.
--pred_folder: The directory path containing the generated (predicted) images from the inference step.
--paired: Include this flag to evaluate results from the paired setting. Omit this flag for the unpaired setting.
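
Before running the evaluation, it can help to confirm that the two folders line up. This README does not state how eval.py pairs files, so the sketch below rests on an assumption: that each predicted image shares its file name (ignoring the extension) with a ground-truth image. The helper names and the temporary demo folders are hypothetical.

```python
import tempfile
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def image_stems(folder):
    """Collect file names without extension for the images directly in a folder."""
    return {p.stem for p in Path(folder).iterdir() if p.suffix.lower() in IMAGE_SUFFIXES}


def compare_folders(gt_folder, pred_folder):
    """Return (missing predictions, predictions with no ground truth), both sorted."""
    gt, pred = image_stems(gt_folder), image_stems(pred_folder)
    return sorted(gt - pred), sorted(pred - gt)


# Demo on throwaway folders: two ground-truth images, one prediction.
with tempfile.TemporaryDirectory() as root:
    gt_dir, pred_dir = Path(root, "gt"), Path(root, "pred")
    gt_dir.mkdir()
    pred_dir.mkdir()
    for name in ("a.jpg", "b.jpg"):
        (gt_dir / name).touch()
    (pred_dir / "a.png").touch()
    missing, extra = compare_folders(gt_dir, pred_dir)

print(missing, extra)  # ['b'] []
```

If eval.py matches files differently (for example by full name including the extension), the comparison key in `image_stems` would need to change accordingly.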

Acknowledgement

Our code is built on Diffusers. We adopt Stable Diffusion v1.5 inpainting as the base model and use a modified AutoMasker to automatically generate masks in our Gradio app and ComfyUI workflow. Thanks to all the contributors!

License

All weights, parameters, and code related to FastFit are governed by the FastFit Non-Commercial License. For commercial collaboration, please contact LavieAI or LoomlyAI.
