# SVGDreamer: Text Guided SVG Generation with Diffusion Model

[![CVPR 2024](https://img.shields.io/badge/CVPR%202024-Paper-4169E1?style=for-the-badge&logo=arxiv&logoColor=white)](https://arxiv.org/abs/2312.16476)
[![arXiv](https://img.shields.io/badge/arXiv-2312.16476-8A2BE2?style=for-the-badge&logo=arxiv&logoColor=white)](https://arxiv.org/abs/2312.16476)
[![Project Website](https://img.shields.io/badge/Website-Project%20Page-4682B4?style=for-the-badge&logo=github&logoColor=white)](https://ximinng.github.io/SVGDreamer-project/)
[![English Blog](https://img.shields.io/badge/Blog-English-00CED1?style=for-the-badge&logo=huggingface&logoColor=white)](https://huggingface.co/blog/xingxm/svgdreamer)
[![中文博客](https://img.shields.io/badge/博客-中文-1E90FF?style=for-the-badge&logo=zhihu&logoColor=white)](https://zhuanlan.zhihu.com/p/687525994)

This repository contains the official implementation of our CVPR 2024 paper, "SVGDreamer: Text-Guided SVG Generation with Diffusion Model." The method uses a diffusion-based approach to produce high-quality SVGs guided by text prompts.

![title](./assets/illustrate.png)

![title](./assets/teaser_svg_asset.png)

## :new: Latest Update

- [11/2024] 🔥 **We released [SVGDreamer++](https://arxiv.org/abs/2411.17832), offering stronger visual representation and improved editing capabilities.**
- [03/2024] 🔥 We released the **code** for [SVGDreamer](https://ximinng.github.io/SVGDreamer-project/).
- [02/2024] 🎉 SVGDreamer was accepted to CVPR 2024. 🎉
- [12/2023] 🔥 We released the **[SVGDreamer Paper](https://arxiv.org/abs/2312.16476)**.

SVGDreamer is a novel text-guided vector graphics synthesis method that addresses both the editability of the generated vector graphics and the quality of the synthesis.

## 📌 Installation Guide

### 🛠️ Step 1: Set Up the Environment

To quickly get started with **SVGDreamer**, follow the steps below. These instructions will help you run **quick inference locally**.

#### 🚀 **Option 1: Standard Installation**

Run the following commands in the **top-level directory**:

```shell
chmod +x script/install.sh
bash script/install.sh
```

#### 🐳 Option 2: Using Docker

```shell
chmod +x script/run_svgdreamer_docker.sh
sudo bash script/run_svgdreamer_docker.sh
```

### 🛠️ Step 2: Download Pretrained Stable Diffusion Model

SVGDreamer requires a pretrained Stable Diffusion (SD) model. You can download it automatically or manually.

#### 🔄 Option 1: Auto-Download (Recommended)

Set `diffuser.download=True` in `/conf/config.yaml` before running SVGDreamer. Alternatively, append `diffuser.download=True` to the run command.

#### ⬇️ Option 2: Manual Download

If you prefer manual setup, download the model from Hugging Face:

🔗 Model Link: [Stable Diffusion 2.1 Base](https://huggingface.co/stabilityai/stable-diffusion-2-1-base)

The model will be stored at:

📁 Default Path: `/home/user/.cache/huggingface/hub/models--stabilityai--stable-diffusion-2-1-base`

## 🔥 Quickstart: synthesize **6** SVGs at once

### SIVE + VPSD

**Prompt:** an image of Batman. full body action pose, complete detailed body, white background, high quality, 4K, ultra realistic

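
The script for this example passes `token_ind=4`, which the parameter list describes as an index into the text prompt, counted from 1. As a rough illustration only, the sketch below treats whitespace-separated words as tokens and reports which word sits at a given position. This is an assumption: the model's actual tokenizer may split the prompt differently, and the helper name `word_at` is hypothetical and not part of SVGDreamer.

```python
def word_at(prompt: str, token_ind: int) -> str:
    """Return the word at a 1-based position in a prompt.

    Assumption: whitespace-separated words approximate tokens. The text
    encoder's real tokenizer may split words differently, so treat the
    result as a rough guide only.
    """
    words = prompt.split()
    if not 1 <= token_ind <= len(words):
        raise IndexError(f"token_ind must be in 1..{len(words)}")
    # Strip surrounding punctuation so "Batman." is reported as "Batman".
    return words[token_ind - 1].strip(".,")


# Prompt string as passed to the script in this example.
prompt = (
    "an image of Batman. full body action pose, complete detailed body. "
    "white background. empty background, high quality, 4K, ultra realistic"
)
print(word_at(prompt, 4))  # Batman
```

Under this approximation, position 4 of the Batman prompt is the subject word, which may help when choosing `token_ind` for a new prompt.
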
**Preview:**

| Particle 1 | Particle 2 | Particle 3 | Particle 4 | Particle 5 | Particle 6 |
|:----------:|:----------:|:----------:|:----------:|:----------:|:----------:|
| init p1 | init p2 | init p3 | init p4 | init p5 | init p6 |

**Script:**

```shell
python svgdreamer.py x=iconography skip_sive=False "prompt='an image of Batman. full body action pose, complete detailed body. white background. empty background, high quality, 4K, ultra realistic'" token_ind=4 x.vpsd.t_schedule='randint' result_path='./logs/batman' multirun=True
```

🔹 Parameters:

- `x=iconography` (str): style config
- `skip_sive` (bool): whether to skip the SIVE stage; set it to `False` to enable SIVE
- `token_ind` (int): the index of the token in the text prompt, counting from 1
- `result_path` (str): the path where results are saved
- `multirun` (bool): run the script multiple times with different random seeds
- `mv` (bool): save the intermediate results of the run and record a video (this increases the run time)

More parameters are available in `./conf/x/style.yaml`, and you can override them from the command line. For example, append `x.vpsd.n_particle=4` to the end of the command.

### SIVE

**Prompt:** an astronaut walking across a desert, planet mars in the background, floating beside planets, space art

**Preview:**

| attn-map | bg init | fg init | bg final | fg final | final |
|:--------:|:-------:|:-------:|:--------:|:--------:|:-----:|

**Script:**

```shell
python svgdreamer.py x=iconography-s1 skip_sive=False "prompt='an astronaut walking across a desert, planet mars in the background, floating beside planets, space art'" token_ind=5 result_path='./logs/astronaut_sive' seed=116740
```

### VPSD

#### ✍️ Iconography style

**Prompt:** Sydney opera house. oil painting. by Van Gogh

**Preview:**

| Particle 1 | Particle 2 | Particle 3 | Particle 4 | Particle 5 | Particle 6 |
|:----------:|:----------:|:----------:|:----------:|:----------:|:----------:|
| init p1 | init p2 | init p3 | init p4 | init p5 | init p6 |

**Script:**

```shell
python svgdreamer.py x=ink "prompt='Big Wild Goose Pagoda. ink style. Minimalist abstract art grayscale watercolor. empty background'" neg_prompt='' result_path='./logs/BigWildGoosePagoda'
```

#### 🎨 Supported Styles

**For more examples, visit [Examples.md](https://github.com/ximinng/DiffSketcher/blob/main/Examples.md)**.

## 🔑 Tips for Best Results

- We highly recommend enabling xformers with `enable_xformers=True` to speed up optimization.
- `x.vpsd.t_schedule` greatly affects the style of the result, so try different settings.
- `neg_prompt` sets the negative prompt, which affects the quality of the results.
- Setting `state.mprec='fp16'` can significantly reduce GPU memory usage.

## 📋 TODO

- [x] Release the code.
- [x] Add docker image.
- [x] Support fp16 optimization.

## :books: Acknowledgement

This project builds on the following repositories:

- [BachiLi/diffvg](https://github.com/BachiLi/diffvg)
- [huggingface/diffusers](https://github.com/huggingface/diffusers)
- [ximinng/DiffSketcher](https://github.com/ximinng/DiffSketcher)
- [THUDM/ImageReward](https://github.com/THUDM/ImageReward)
- [ximinng/PyTorch-SVGRender](https://github.com/ximinng/PyTorch-SVGRender)

We gratefully thank the authors for their wonderful work.

## :paperclip: Citation

If you use this code for your research, please cite the following work:

```
@InProceedings{svgdreamer_xing_2023,
    author    = {Xing, Ximing and Zhou, Haitao and Wang, Chuang and Zhang, Jing and Xu, Dong and Yu, Qian},
    title     = {SVGDreamer: Text Guided SVG Generation with Diffusion Model},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {4546-4555}
}
```

## :copyright: Licence

This work is licensed under the MIT License.