SVGDreamer: Text-Guided SVG Generation with Diffusion Model

SVGDreamer is a text-to-SVG generation model that produces vector graphics through multi-particle optimization. It optimizes several SVG variants at once, so a single prompt yields a set of diverse outputs.

Model Description

SVGDreamer leverages Stable Diffusion to guide the generation of vector graphics through a novel multi-particle system. The model optimizes multiple SVG representations in parallel, enabling exploration of different artistic interpretations of the same text prompt.

Key Features

Multi-Particle Generation: Creates multiple SVG variants simultaneously
Style Control: Supports different artistic styles (iconography, pixel art, sketch, painting)
High Quality: Produces detailed and aesthetically pleasing vector graphics
Flexible Parameters: Particle count, iteration count, guidance scale, canvas size, seed, and style are all configurable

Usage

Direct API Call

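import requests

API_URL = "https://api-inference.huggingface.co/models/jree423/svgdreamer"
# Replace YOUR_HF_TOKEN with a Hugging Face access token
headers = {"Authorization": "Bearer YOUR_HF_TOKEN"}

def query(payload):
    """POST the prompt and parameters; return the decoded JSON response."""
    response = requests.post(API_URL, headers=headers, json=payload)
    # On failure the body is typically a JSON object with an "error" key
    return response.json()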

output = query({
    "inputs": "a majestic eagle soaring through clouds",
    "parameters": {
        "n_particle": 6,
        "num_iter": 1000,
        "guidance_scale": 7.5,
        "style": "iconography",
        "width": 224,
        "height": 224,
        "seed": 42
    }
})
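
Hosted inference endpoints can answer with an error object instead of results, for example while the model is still loading. The sketch below wraps any request function, such as `query` from the example above, in a simple retry loop. It assumes the error shape is a JSON object with an "error" key and, while loading, an "estimated_time" value in seconds; that shape is an assumption about this endpoint, and `query_with_retry` is a hypothetical helper, not part of any library.

```python
import time
from typing import Any, Callable, Dict


def query_with_retry(
    post: Callable[[Dict[str, Any]], Any],
    payload: Dict[str, Any],
    max_attempts: int = 5,
    max_wait: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call post(payload) until it returns something other than a loading error."""
    for attempt in range(1, max_attempts + 1):
        result = post(payload)
        # Assumed error shape: {"error": "...", "estimated_time": seconds}
        is_error = isinstance(result, dict) and "error" in result
        if not is_error:
            return result
        if "estimated_time" not in result:
            # Not a loading message, so retrying is unlikely to help
            raise RuntimeError(f"Inference request failed: {result['error']}")
        if attempt < max_attempts:
            # Wait for the reported time, capped at max_wait seconds
            sleep(min(float(result["estimated_time"]), max_wait))
    raise TimeoutError(f"Model still loading after {max_attempts} attempts")
```

A call such as `query_with_retry(query, {"inputs": "a vintage bicycle"})` would then return the particle list once the model responds. The `sleep` argument is injectable so the loop can be exercised without real delays.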

Using the Inference Client

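import json

from huggingface_hub import InferenceClient

# Pass token="YOUR_HF_TOKEN" if the endpoint requires authentication
client = InferenceClient("jree423/svgdreamer")
# post() returns the raw response body as bytes. It is deprecated in recent
# huggingface_hub releases, so this assumes a version that still provides it.
raw = client.post(
    json={
        "inputs": "a cyberpunk cityscape at night",
        "parameters": {
            "n_particle": 4,
            "style": "pixel_art",
            "guidance_scale": 8.0
        }
    }
)
result = json.loads(raw)  # list with one JSON object per particle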

Parameters

n_particle (int, default: 6): Number of SVG particles to generate. Each particle represents a different interpretation of the prompt.
num_iter (int, default: 1000): Number of optimization iterations. More iterations improve quality but take longer.
guidance_scale (float, default: 7.5): Controls how closely the generation follows the text prompt. Higher values adhere to the prompt more strictly.
width (int, default: 224): Output SVG width in pixels.
height (int, default: 224): Output SVG height in pixels.
seed (int, default: 42): Random seed for reproducible results.
style (string, default: "iconography"): Style of the generated SVG. Options: "iconography", "pixel_art", "sketch", "painting".
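
As a minimal sketch, the defaults listed above can be collected in one place and merged with per-request overrides before sending. `build_payload`, `DEFAULTS`, and `STYLES` are hypothetical names introduced for illustration, and the positive-integer checks are an assumption, since the valid ranges are not stated here.

```python
from typing import Any, Dict

# Defaults and style options copied from the parameter list above
DEFAULTS: Dict[str, Any] = {
    "n_particle": 6,
    "num_iter": 1000,
    "guidance_scale": 7.5,
    "width": 224,
    "height": 224,
    "seed": 42,
    "style": "iconography",
}
STYLES = ("iconography", "pixel_art", "sketch", "painting")


def build_payload(prompt: str, **overrides: Any) -> Dict[str, Any]:
    """Merge overrides into the documented defaults and validate the result."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    params = {**DEFAULTS, **overrides}
    if params["style"] not in STYLES:
        raise ValueError(f"style must be one of {STYLES}")
    # Assumption: counts and sizes must be positive integers
    for name in ("n_particle", "num_iter", "width", "height"):
        if not isinstance(params[name], int) or params[name] < 1:
            raise ValueError(f"{name} must be a positive integer")
    return {"inputs": prompt, "parameters": params}
```

For example, `build_payload("a vintage bicycle", style="sketch", n_particle=4)` yields a request body in the same shape as the usage examples, with the remaining parameters left at their defaults.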

Output Format

The model returns a list of JSON objects, one for each particle, containing:

particle_id: Unique identifier for the particle
svg: The generated SVG content as a string
svg_base64: Base64-encoded SVG for easy transmission
prompt: The input text prompt
style: The style used for generation
parameters: The parameters used for generation
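
A response in this format could be written to disk roughly as follows. This is a sketch: `save_particles` is a hypothetical helper, and it assumes that `svg_base64` is standard Base64 over UTF-8 text and that `particle_id` is safe to use in a file name.

```python
import base64
from pathlib import Path
from typing import Any, Dict, List


def save_particles(results: List[Dict[str, Any]], out_dir: str) -> List[Path]:
    """Write each particle's SVG to out_dir/particle_<id>.svg and return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths: List[Path] = []
    for item in results:
        svg_text = item.get("svg")
        if svg_text is None:
            # Fall back to the Base64 field (assumed to encode UTF-8 text)
            svg_text = base64.b64decode(item["svg_base64"]).decode("utf-8")
        path = out / f"particle_{item['particle_id']}.svg"
        path.write_text(svg_text, encoding="utf-8")
        paths.append(path)
    return paths
```

With the output of either usage example bound to `output`, `save_particles(output, "svgs")` would produce one `.svg` file per particle.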

Styles

Iconography

Clean, minimalist vector graphics suitable for icons and logos.

Example: "a simple house icon"

Pixel Art

Retro-style graphics with pixelated aesthetics.

Example: "a pixel art character"

Sketch

Hand-drawn style with organic lines and artistic flair.

Example: "a sketch of a mountain landscape"

Painting

Rich, painterly style with complex color gradients.

Example: "an oil painting of a sunset"

Examples

Nature Scenes

"a forest with tall pine trees"
"ocean waves crashing on rocks"
"a field of sunflowers under blue sky"

Characters and Objects

"a friendly robot character"
"a vintage bicycle"
"a magical wizard casting spells"

Abstract Art

"geometric patterns in bright colors"
"flowing organic shapes"
"mandala design with intricate details"

Technical Details

Base Model: Stable Diffusion 2.1
Framework: PyTorch + Diffusers
Vector Rendering: DiffVG (differentiable vector graphics)
Optimization: Multi-particle VPSD (Vectorized Particle-based Score Distillation)
Parallel Processing: Simultaneous optimization of multiple SVG representations

Citation

@inproceedings{xing2024svgdreamer,
  title={SVGDreamer: Text Guided SVG Generation with Diffusion Model},
  author={Xing, XiMing and others},
  booktitle={CVPR},
  year={2024}
}

License

This model is released under the MIT License.