Diffusion Models: Complete DDPM Implementation

A comprehensive PyTorch implementation of Denoising Diffusion Probabilistic Models (DDPM) with detailed mathematical foundations and educational content.

Model Description

This repository contains a complete implementation of a Denoising Diffusion Probabilistic Model (DDPM) trained on 2D synthetic datasets. The model learns to generate new data points by iteratively removing noise through a learned reverse diffusion process. This implementation serves as both a working model and an educational resource for understanding the mathematics and implementation of diffusion models.

Architecture Details

  • Model Type: Denoising Diffusion Probabilistic Model (DDPM)
  • Framework: PyTorch
  • Input: 2D point coordinates
  • Diffusion Steps: 1000 timesteps
  • Hidden Dimensions: 256 units with SiLU activations
  • Time Embedding: 64-dimensional timestep embeddings
  • Total Parameters: ~130K
  • Model Size: 1.8MB

Key Components

  1. Noise Predictor Network: Neural network that predicts noise ε_θ(x_t, t)
  2. Forward Diffusion Process: Gradually adds Gaussian noise over T steps
  3. Reverse Diffusion Process: Iteratively removes noise to generate samples
  4. Time Embedding Module: Converts timesteps to feature representations (a minimal sketch follows this list)
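
The notebook holds the exact layers; a minimal sketch consistent with the numbers above (2D input, 256 hidden units with SiLU, 64-dimensional sinusoidal time embedding) might look like this. The class names here are illustrative, not the notebook's, and the parameter count will differ from the ~130K figure:

import math
import torch
import torch.nn as nn

class SinusoidalTimeEmbedding(nn.Module):
    # Maps integer timesteps to 64-dim sin/cos features (transformer-style)
    def __init__(self, dim=64):
        super().__init__()
        self.dim = dim

    def forward(self, t):
        half = self.dim // 2
        freqs = torch.exp(-math.log(10000.0) * torch.arange(half, device=t.device) / half)
        angles = t.float()[:, None] * freqs[None, :]
        return torch.cat([torch.sin(angles), torch.cos(angles)], dim=-1)

class TinyNoisePredictor(nn.Module):
    # MLP that predicts the noise added to a 2D point at timestep t
    def __init__(self, data_dim=2, hidden_dim=256, time_embed_dim=64):
        super().__init__()
        self.time_embed = SinusoidalTimeEmbedding(time_embed_dim)
        self.net = nn.Sequential(
            nn.Linear(data_dim + time_embed_dim, hidden_dim),
            nn.SiLU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.SiLU(),
            nn.Linear(hidden_dim, data_dim),
        )

    def forward(self, x, t):
        # Condition on time by concatenating the embedding to the input point
        return self.net(torch.cat([x, self.time_embed(t)], dim=-1))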

Training Details

  • Dataset: Synthetic 2D point clusters
  • Diffusion Steps: 1000
  • Beta Schedule: Linear (0.0001 to 0.02)
  • Optimizer: AdamW with cosine annealing (setup sketched after this list)
  • Learning Rate: 0.001
  • Training Epochs: 2000
  • Batch Processing: Dynamic batching for efficient training
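
A sketch of the optimizer setup these hyperparameters imply (NoisePredictor is the notebook's network; the per-epoch data pass is elided here and covered by the training-step sketch further below):

import torch

model = NoisePredictor()  # noise network defined in the notebook
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
# Cosine annealing stretched over the full 2000-epoch run
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=2000)

for epoch in range(2000):
    ...  # one pass over the 2D dataset (see the training-step sketch below)
    scheduler.step()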

Mathematical Foundation

Forward Process

The forward process adds noise according to:

q(x_t | x_{t-1}) = N(x_t; √(1-β_t) x_{t-1}, β_t I)

With ᾱ_t = ∏_{s=1}^{t} (1 − β_s), the marginal q(x_t | x_0) admits direct sampling:

x_t = √ᾱ_t x_0 + √(1 − ᾱ_t) ε,   where ε ~ N(0, I)
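
In code, this closed form jumps from x_0 to x_t in one step instead of simulating the chain. A sketch using the linear beta schedule from the training details (tensor names are illustrative):

import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)        # linear beta schedule
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)    # ᾱ_t = ∏_{s≤t} (1 − β_s)

def q_sample(x0, t, noise=None):
    # Sample x_t ~ q(x_t | x_0) directly via the closed form
    if noise is None:
        noise = torch.randn_like(x0)
    ab = alpha_bars[t].unsqueeze(-1)         # broadcast over the data dimension
    return ab.sqrt() * x0 + (1.0 - ab).sqrt() * noise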

Reverse Process

The model learns to reverse the noising process through the parameterized transition:

p_θ(x_{t-1} | x_t) = N(x_{t-1}; μ_θ(x_t, t), Σ_θ(x_t, t))
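
In the standard DDPM convention, Σ_θ is fixed to σ_t² I (here σ_t² = β_t) and the mean is derived from the predicted noise: μ_θ(x_t, t) = (x_t − β_t/√(1 − ᾱ_t) ε_θ(x_t, t)) / √α_t. A sketch of one reverse step under that convention, reusing the schedule tensors from the forward-process sketch:

@torch.no_grad()
def p_sample_step(model, x_t, t):
    # One reverse step x_t -> x_{t-1} with the fixed-variance choice Σ = β_t I
    t_batch = torch.full((x_t.shape[0],), t, dtype=torch.long)
    eps = model(x_t, t_batch)
    alpha_t, ab_t, beta_t = alphas[t], alpha_bars[t], betas[t]
    mean = (x_t - beta_t / (1.0 - ab_t).sqrt() * eps) / alpha_t.sqrt()
    if t == 0:
        return mean                          # no noise is added on the final step
    return mean + beta_t.sqrt() * torch.randn_like(x_t)

Starting from x_T ~ N(0, I) and applying this step for t = T−1 down to 0 (zero-based indexing) produces new samples.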

Loss Function

Trained by minimizing noise prediction error:

L = E[||ε - ε_θ(x_t, t)||²]
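
One training step under this objective, reusing q_sample and the schedule tensors from above (a sketch; see the notebook for the full loop):

def train_step(model, optimizer, x0):
    # Minimize E ||ε − ε_θ(x_t, t)||² on one batch of clean points x0
    t = torch.randint(0, T, (x0.shape[0],))  # a random timestep per sample
    noise = torch.randn_like(x0)
    x_t = q_sample(x0, t, noise)
    loss = ((noise - model(x_t, t)) ** 2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()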

Model Performance

Training Metrics

  • Final Training Loss: Converged to a stable low plateau (full curves in training_metrics.png and diffusion_logs/)
  • Training Time: ~30 minutes on GPU
  • Memory Usage: <500MB GPU memory
  • Convergence: Stable training without mode collapse

Capabilities

  • ✅ High-quality 2D point generation
  • ✅ Smooth interpolation in data space
  • ✅ Stable training without adversarial dynamics
  • ✅ Mathematically grounded approach
  • ✅ Excellent sample diversity

Usage

Quick Start

import torch
import torch.nn as nn
import matplotlib.pyplot as plt

# Load the model components (full implementation in notebook)
class NoisePredictor(nn.Module):
    def __init__(self, data_dim=2, hidden_dim=256, time_embed_dim=64):
        super(NoisePredictor, self).__init__()
        # ... (complete implementation in notebook)
    
    def forward(self, x, t):
        # ... (complete implementation in notebook)
        return noise_prediction

class DiffusionModel:
    def __init__(self, T=1000, beta_start=0.0001, beta_end=0.02):
        self.T = T  # keeps the skeleton syntactically valid
        # ... (complete implementation in notebook)
    
    def sample(self, n_samples=100):
        # Generate new samples from pure noise
        # ... (complete implementation in notebook)
        return generated_samples

# Load trained model
model = DiffusionModel()
# Load weights: model.model.load_state_dict(torch.load('diffusion_model_complete.pth'))

# Generate new samples
samples = model.sample(n_samples=100)
plt.scatter(samples[:, 0], samples[:, 1])
plt.title("Generated 2D Points")
plt.show()

Advanced Usage

# Visualize the diffusion process
model.visualize_diffusion_process()

# Monitor training progress
model.plot_training_curves()

# Sample with different parameters
high_quality_samples = model.sample(n_samples=500, guidance_scale=1.0)

Visualizations Available

  1. Diffusion Process: Step-by-step noise addition and removal
  2. Training Curves: Loss evolution and learning dynamics
  3. Generated Samples: Comparison with original data distribution
  4. Sampling Process: Real-time generation visualization
  5. Parameter Analysis: Beta schedule and noise analysis

Files and Outputs

  • Diffusion Models.ipynb: Complete implementation with educational content
  • diffusion_model_complete.pth: Trained model weights
  • diffusion_process.png: Visualization of forward and reverse processes
  • diffusion_results.png: Generated samples and quality assessment
  • training_metrics.png: Comprehensive training analytics
  • diffusion_logs/: Detailed training and sampling logs

Applications

This diffusion model implementation can be adapted for:

  • Image Generation: Extend to pixel-based image synthesis
  • Audio Synthesis: Apply to waveform or spectrogram generation
  • 3D Point Clouds: Generate 3D shapes and objects
  • Time Series: Financial data, sensor readings, weather patterns
  • Scientific Data: Molecular structures, particle physics
  • Data Augmentation: Synthetic training data creation

Educational Value

This implementation is designed as a learning resource featuring:

  • Complete Mathematical Derivations: From first principles to implementation
  • Step-by-Step Explanations: Every component explained in detail
  • Visual Learning: Rich plots and animations for understanding
  • Progressive Complexity: Build understanding gradually
  • Practical Implementation: Real working code with best practices

Research Applications

The model demonstrates key concepts in:

  • Generative Modeling: Alternative to GANs and VAEs
  • Probability Theory: Markov chains and stochastic processes
  • Neural Network Architecture: Time conditioning and embeddings
  • Optimization: Stable training of generative models
  • Sampling Methods: DDPM and potential DDIM extensions

Comparison with Other Generative Models

Advantages over GANs

  • ✅ Stable training (no adversarial dynamics)
  • ✅ No mode collapse
  • ✅ Mathematical foundation
  • ✅ High-quality samples

Advantages over VAEs

  • ✅ Higher sample quality
  • ✅ No posterior collapse
  • ✅ Better likelihood estimates
  • ✅ Flexible architectures

Trade-offs

  • ⚠️ Slower sampling (requires multiple steps)
  • ⚠️ More computationally intensive
  • ⚠️ Memory requirements for long sequences

Citation

If you use this implementation in your research or projects, please cite:

@misc{ddpm_implementation_2024,
  title={Complete DDPM Implementation: Educational Diffusion Models},
  author={Gruhesh Kurra},
  year={2024},
  url={https://huggingface.co/karthik-2905/DiffusionModels}
}

Future Extensions

Planned improvements and extensions:

  • 🔄 DDIM Implementation: Faster sampling with deterministic steps
  • 🎨 Conditional Generation: Text-guided or class-conditional generation
  • 📊 Alternative Schedules: Cosine and sigmoid beta schedules
  • 🖼️ Image Diffusion: Extension to CIFAR-10 and other image datasets
  • 🎵 Audio Applications: Waveform and spectrogram generation
  • 🧬 Scientific Applications: Molecular and protein structure generation

License

This project is licensed under the MIT License - see the LICENSE file for details.

Additional Resources

  • GitHub Repository: DiffusionModels
  • Detailed Notebook: Complete implementation with educational content
  • Training Logs: Comprehensive metrics and analysis

Model Card Authors

Gruhesh Kurra - Implementation, documentation, and educational content


Tags: diffusion-models, generative-ai, pytorch, ddpm, deep-learning, denoising

Model Card Last Updated: December 2024
