SVGDreamer: Text Guided SVG Generation with Diffusion Model
This repository contains the official implementation of our CVPR 2024 paper, "SVGDreamer: Text-Guided SVG Generation with Diffusion Model." The method leverages a diffusion-based approach to produce high-quality SVGs guided by text prompts.
:new: Latest Update
- [11/2024] We released SVGDreamer++, offering stronger visual representation and improved editing capabilities.
- [03/2024] We released the code for SVGDreamer.
- [02/2024] SVGDreamer was accepted to CVPR 2024.
- [12/2023] We released the SVGDreamer paper. SVGDreamer is a novel text-guided vector graphics synthesis method that considers both the editability of vector graphics and the quality of the synthesis.
Installation Guide
Step 1: Set Up the Environment
To quickly get started with SVGDreamer, follow the steps below.
These instructions will help you run quick inference locally.
Option 1: Standard Installation
Run the following commands in the top-level directory:
chmod +x script/install.sh
bash script/install.sh
Option 2: Using Docker
chmod +x script/run_svgdreamer_docker.sh
sudo bash script/run_svgdreamer_docker.sh
Step 2: Download Pretrained Stable Diffusion Model
SVGDreamer requires a pretrained Stable Diffusion (SD) model. You can download it automatically or manually.
Option 1: Auto-Download (Recommended)
Set `diffuser.download=True` in `/conf/config.yaml` before running SVGDreamer. Alternatively, append `diffuser.download=True` to the execution script.
Option 2: Manual Download
If you prefer manual setup, download the model from Hugging Face:
Model Link: Stable Diffusion 2.1 Base
By default, the model is stored at:
Default Path: /home/user/.cache/huggingface/hub/models--stabilityai--stable-diffusion-2-1-base
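To check whether the model is already in place before running, the default path above can be reconstructed from the repository id. The following is a minimal sketch, assuming the standard Hugging Face hub cache layout (`models--<org>--<name>` under `~/.cache/huggingface/hub`); the helper name `expected_cache_dir` is hypothetical and not part of SVGDreamer.

```python
from pathlib import Path
from typing import Optional


def expected_cache_dir(repo_id: str, hub_root: Optional[Path] = None) -> Path:
    """Return the directory where the Hugging Face hub cache keeps a model repo."""
    if hub_root is None:
        # Default hub cache location when no cache-related environment variable is set.
        hub_root = Path.home() / ".cache" / "huggingface" / "hub"
    # Cache folders are named "models--<org>--<name>".
    return hub_root / ("models--" + repo_id.replace("/", "--"))


# Repo id inferred from the default path shown above.
model_dir = expected_cache_dir("stabilityai/stable-diffusion-2-1-base")
print(model_dir, "(found)" if model_dir.is_dir() else "(not downloaded yet)")
```

If the cache location has been changed through environment variables, pass the custom root as `hub_root` instead of relying on the default.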
Quickstart: synthesize 6 SVGs at once
SIVE + VPSD
Prompt: an image of Batman. full body action pose, complete detailed body, white background, high quality, 4K, ultra realistic
Preview:
Particle 1 | Particle 2 | Particle 3 | Particle 4 | Particle 5 | Particle 6 |
---|---|---|---|---|---|
init p1 | init p2 | init p3 | init p4 | init p5 | init p6 |
final p1 | final p2 | final p3 | final p4 | final p5 | final p6 |
Script:
python svgdreamer.py x=iconography skip_sive=False "prompt='an image of Batman. full body action pose, complete detailed body. white background. empty background, high quality, 4K, ultra realistic'" token_ind=4 x.vpsd.t_schedule='randint' result_path='./logs/batman' multirun=True
Parameters:
- `x=iconography` (str): style config
- `skip_sive` (bool): whether to skip the SIVE stage; `False` enables it
- `token_ind` (int): index of the token in the text prompt, counted from 1
- `result_path` (str): path where results are saved
- `multirun` (bool): run the script multiple times with different random seeds
- `mv` (bool): save the intermediate results of the run and record a video (this increases the run time)

More parameters are listed in `./conf/x/style.yaml`; you can override them from the command line. For example, append `x.vpsd.n_particle=4` to the end of the script.
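When launching many runs, it can be convenient to assemble these `key=value` overrides programmatically. The sketch below is one possible way to do that, not part of the repository: `format_override` and `build_command` are hypothetical helpers, and the quoting rule (wrap string values containing spaces or commas in single quotes) is an assumption modeled on the quickstart script above.

```python
import shlex


def format_override(key, value):
    """Render one key=value override for the command line."""
    # Assumption: string values with spaces or commas are wrapped in single
    # quotes, as in the quickstart script. Values that themselves contain a
    # single quote are not handled by this sketch.
    if isinstance(value, str) and any(ch in value for ch in " ,"):
        return f"{key}='{value}'"
    return f"{key}={value}"


def build_command(overrides, script="svgdreamer.py"):
    """Build an argv list: python <script> key=value ..."""
    return ["python", script] + [format_override(k, v) for k, v in overrides.items()]


cmd = build_command({
    "x": "iconography",
    "skip_sive": False,
    "prompt": "an image of Batman. full body action pose",
    "token_ind": 4,
    "result_path": "./logs/batman",
    "x.vpsd.n_particle": 4,  # extra override appended at the end
})

# Shell-safe rendering for copy-pasting; pass `cmd` to subprocess.run to execute.
print(shlex.join(cmd))
```

Passing the list directly to `subprocess.run(cmd)` avoids a second layer of shell quoting, since each override is already a single argument.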
SIVE
Prompt: an astronaut walking across a desert, planet mars in the background, floating beside planets, space art
Preview:
attn-map | bg init | fg init | bg final | fg final | final |
---|---|---|---|---|---|
Script:
python svgdreamer.py x=iconography-s1 skip_sive=False "prompt='an astronaut walking across a desert, planet mars in the background, floating beside planets, space art'" token_ind=5 result_path='./logs/astronaut_sive' seed=116740
VPSD
Iconography style
Prompt: Sydney opera house. oil painting. by Van Gogh
Preview:
Particle 1 | Particle 2 | Particle 3 | Particle 4 | Particle 5 | Particle 6 |
---|---|---|---|---|---|
init p1 | init p2 | init p3 | init p4 | init p5 | init p6 |
final p1 | final p2 | final p3 | final p4 | final p5 | final p6 |
Script:
python svgdreamer.py x=iconography "prompt='Sydney opera house. oil painting. by Van Gogh'" result_path='./logs/SydneyOperaHouse-OilPainting' state.mprec='fp16'
Painting style
Prompt: Abstract Vincent van Gogh Oil Painting Elephant, featuring earthy tones of green and brown
Preview:
Particle 1 | Particle 2 | Particle 3 | Particle 4 | Particle 5 | Particle 6 |
---|---|---|---|---|---|
init p1 | init p2 | init p3 | init p4 | init p5 | init p6 |
final p1 | final p2 | final p3 | final p4 | final p5 | final p6 |
Script:
python svgdreamer.py x=painting "prompt='Abstract Vincent van Gogh Oil Painting Elephant, featuring earthy tones of green and brown.'" x.num_paths=256 result_path='./logs/Elephant-OilPainting'
Pixel-Art style
Prompt: Darth vader with lightsaber
Preview:
Particle 1 | Particle 2 | Particle 3 | Particle 4 | Particle 5 | Particle 6 |
---|---|---|---|---|---|
init p1 | init p2 | init p3 | init p4 | init p5 | init p6 |
final p1 | final p2 | final p3 | final p4 | final p5 | final p6 |
Script:
python svgdreamer.py x=pixelart "prompt='Darth vader with lightsaber.'" result_path='./logs/DarthVader'
Low-poly style
Prompt: A picture of a bald eagle. low-ploy. polygon. minimal flat 2d vector
Preview:
Particle 1 | Particle 2 | Particle 3 | Particle 4 | Particle 5 | Particle 6 |
---|---|---|---|---|---|
init p1 | init p2 | init p3 | init p4 | init p5 | init p6 |
final p1 | final p2 | final p3 | final p4 | final p5 | final p6 |
Script:
python svgdreamer.py x=lowpoly "prompt='A picture of a bald eagle. low-ploy. polygon. minimal flat 2d vector'" neg_prompt='' result_path='./logs/BaldEagle'
Sketch style
Prompt: A free-hand drawing of A speeding Lamborghini. black and white drawing.
Preview:
Particle 1 | Particle 2 | Particle 3 | Particle 4 | Particle 5 | Particle 6 |
---|---|---|---|---|---|
init p1 | init p2 | init p3 | init p4 | init p5 | init p6 |
final p1 | final p2 | final p3 | final p4 | final p5 | final p6 |
Script:
python svgdreamer.py x=sketch "prompt='A free-hand drawing of A speeding Lamborghini. black and white drawing.'" neg_prompt='' result_path='./logs/Lamborghini'
Ink and Wash style
Prompt: Big Wild Goose Pagoda. ink style. Minimalist abstract art grayscale watercolor. empty background
Preview:
Particle 1 | Particle 2 | Particle 3 | Particle 4 | Particle 5 | Particle 6 |
---|---|---|---|---|---|
init p1 | init p2 | init p3 | init p4 | init p5 | init p6 |
final p1 | final p2 | final p3 | final p4 | final p5 | final p6 |
Script:
python svgdreamer.py x=ink "prompt='Big Wild Goose Pagoda. ink style. Minimalist abstract art grayscale watercolor. empty background'" neg_prompt='' result_path='./logs/BigWildGoosePagoda'
Supported Styles
For more examples, visit Examples.md.
Tips for Best Results
- I highly recommend enabling xformers with `enable_xformers=True` to speed up optimization.
- `x.vpsd.t_schedule` greatly affects the style of the result, so try several settings.
- `neg_prompt`: negative prompts affect the quality of the results.
- Setting `state.mprec='fp16'` can significantly reduce GPU memory usage.
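As a rough illustration of why half precision saves memory: a 16-bit float occupies 2 bytes versus 4 bytes for a 32-bit float, so any tensor stored in fp16 takes half the space. The sketch below uses only the standard library. The value count is a hypothetical placeholder, not a measurement of SVGDreamer or Stable Diffusion, and actual savings depend on which tensors are kept in fp16.

```python
import struct

# Bytes per IEEE 754 value: "e" is half precision, "f" is single precision.
BYTES_FP16 = struct.calcsize("e")
BYTES_FP32 = struct.calcsize("f")


def tensor_megabytes(num_values: int, bytes_per_value: int) -> float:
    """Memory needed to store num_values numbers, in MiB."""
    return num_values * bytes_per_value / (1024 ** 2)


# Hypothetical number of stored values, chosen only for illustration.
num_values = 100_000_000

fp32_mb = tensor_megabytes(num_values, BYTES_FP32)
fp16_mb = tensor_megabytes(num_values, BYTES_FP16)
print(f"fp32: {fp32_mb:.1f} MiB, fp16: {fp16_mb:.1f} MiB")
```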
TODO
- Release the code.
- Add docker image.
- Support fp16 optimization.
:books: Acknowledgement
This project builds on the following repositories:
- BachiLi/diffvg
- huggingface/diffusers
- ximinng/DiffSketcher
- THUDM/ImageReward
- ximinng/PyTorch-SVGRender
We thank the authors for their excellent work.
:paperclip: Citation
If you use this code for your research, please cite the following work:
@InProceedings{svgdreamer_xing_2023,
author = {Xing, Ximing and Zhou, Haitao and Wang, Chuang and Zhang, Jing and Xu, Dong and Yu, Qian},
title = {SVGDreamer: Text Guided SVG Generation with Diffusion Model},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2024},
pages = {4546-4555}
}
:copyright: License
This work is licensed under the MIT License.