Model Card for Unit 1 of the Diffusion Models Class 🧨

This model is a diffusion model for unconditional image generation of beer images 🍺.

Image 1 Image 2 Image 3

Model Description

This model follows the DDPM (Denoising Diffusion Probabilistic Models) approach and was trained specifically to generate images of beer. Its denoising network is a UNet with self-attention in the middle layers.

Model Architecture

  • Type: UNet2DModel with self-attention
  • Input Channels: 3 (RGB)
  • Output Channels: 3
  • Image Resolution: 32x32 pixels
  • Layers per Block: 2
  • Channel Dimensions: (64, 128, 128, 256)
  • Attention Layers: Present in middle down and up blocks
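The bullets above can be sketched as a configuration for `UNet2DModel`. This is a hedged reconstruction, not the author's training code: the card does not list the exact block types, so the `down_block_types` and `up_block_types` below are an assumption about where the attention blocks sit, and `UNET_CONFIG` and `build_unet` are hypothetical names used only for illustration.

```python
# Hypothetical reconstruction of the UNet configuration from the bullets above.
UNET_CONFIG = {
    "sample_size": 32,                 # 32x32 pixel images
    "in_channels": 3,                  # RGB input
    "out_channels": 3,
    "layers_per_block": 2,
    "block_out_channels": (64, 128, 128, 256),
    # Assumption: the card only says attention is present in the "middle"
    # down and up blocks; this particular placement is a guess.
    "down_block_types": (
        "DownBlock2D",
        "DownBlock2D",
        "AttnDownBlock2D",
        "AttnDownBlock2D",
    ),
    "up_block_types": (
        "AttnUpBlock2D",
        "AttnUpBlock2D",
        "UpBlock2D",
        "UpBlock2D",
    ),
}


def build_unet(config=UNET_CONFIG):
    """Instantiate the model; diffusers is imported lazily so the config
    can be inspected without the library installed."""
    from diffusers import UNet2DModel

    return UNet2DModel(**config)
```

Calling `build_unet()` requires `diffusers` and `torch`. Each entry in `block_out_channels` pairs with one down block and one up block, so the three tuples must have the same length.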

Usage

from diffusers import DDPMPipeline

# Load the model
pipeline = DDPMPipeline.from_pretrained('ffjefckds/sd-class-beer-32')

# Generate an image
image = pipeline().images[0]
image.save("generated_beer.png")

Training

Training Data

The model was trained on a synthetic dataset of beer images, which is available at ffjefckds/small-beer-images.

Training Procedure

  • Optimizer: AdamW
  • Learning Rate: 4e-4
  • Epochs: 500
  • Noise Scheduler: DDPM with 1000 timesteps
  • Beta Schedule: "squaredcos_cap_v2"
  • Batch Size: Determined by the dataloader
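To make the noise schedule above concrete, the following is a minimal pure-Python sketch of a squared-cosine beta schedule over 1000 timesteps. It reflects my understanding of what `squaredcos_cap_v2` computes; the offset constants (0.008, 1.008) and the 0.999 cap are assumptions taken from the commonly used cosine schedule, and `cosine_betas` is a hypothetical helper name, not a Diffusers function.

```python
import math


def cosine_betas(num_train_timesteps=1000, max_beta=0.999):
    """Return per-step betas derived from a squared-cosine alpha_bar curve."""

    def alpha_bar(t):
        # Fraction of signal variance remaining at normalized time t in [0, 1].
        return math.cos((t + 0.008) / 1.008 * math.pi / 2) ** 2

    betas = []
    for i in range(num_train_timesteps):
        t1 = i / num_train_timesteps
        t2 = (i + 1) / num_train_timesteps
        # beta_i = 1 - alpha_bar(t2) / alpha_bar(t1), capped to avoid
        # a singularity at the final step.
        betas.append(min(1 - alpha_bar(t2) / alpha_bar(t1), max_beta))
    return betas


betas = cosine_betas()

# The running product of (1 - beta) gives the cumulative signal fraction
# used when noising an image directly to timestep t.
alphas_cumprod = []
running = 1.0
for beta in betas:
    running *= 1.0 - beta
    alphas_cumprod.append(running)
```

With this schedule, noise is added slowly at early timesteps and quickly near the end, and only the final step reaches the cap. In Diffusers the equivalent scheduler would typically be created with `DDPMScheduler(num_train_timesteps=1000, beta_schedule="squaredcos_cap_v2")`.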

Training Infrastructure

  • Framework: PyTorch
  • Library: 🧨 Diffusers

Limitations and Bias

  • The model was trained at a low resolution (32x32 pixels)
  • Synthetic training data might lead to limited diversity
  • The generated images are too small for most practical applications due to the low resolution
  • The model may exhibit biases present in the training data

License

MIT License
