Update README.md

---
tags:
- pytorch_model_hub_mixin
---

# Model Card: Time-Conditioned U-Net for MNIST

## Model Details

- **Architecture**: Time-Conditioned U-Net
- **Dataset**: [Comic Faces Paired Synthetic](https://www.kaggle.com/datasets/defileroff/comic-faces-paired-synthetic)
- **Batch Size**: 256
- **Image Size**: 28x28
- **Loss Function**: Mean Squared Error (MSE)
- **Optimizer**: Adam (learning rate = 1e-4)

## Model Architecture

This model is a U-Net-based neural network that incorporates time conditioning through sinusoidal embeddings refined by an MLP. The architecture is designed for small grayscale images (e.g., 28x28 MNIST digits) and consists of:

### Encoder (Contracting Path)

- **Downsampling** through three `DoubleConv` blocks with 32, 64, and 128 output channels, respectively (a sketch of `DoubleConv` follows the U-Net code below).
- The time embedding is added inside each convolution block.
- **Max pooling** reduces the spatial dimensions between blocks.

### Decoder (Expanding Path)

- **Upsampling** via bilinear interpolation.
- Skip connections from encoder layers to the corresponding decoder layers.
- Two `DoubleConv` blocks taking the concatenated 128+64 and 64+32 input channels and producing 64 and 32 output channels, respectively.
- A final `1x1` convolution maps to the output channels.

### Time Embedding

- A sinusoidal positional encoding represents the timestep.
- An MLP refines the embedding before it is passed to the convolutional blocks.
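
For reference, the encoding follows the standard transformer-style form, matching the implementation shown below: with embedding dimension $d$ and $h = d/2$ frequencies,

$$
\omega_i = 10000^{-i/(h-1)}, \qquad
\mathrm{emb}(t) = \big[\sin(t\,\omega_0), \ldots, \sin(t\,\omega_{h-1}),\ \cos(t\,\omega_0), \ldots, \cos(t\,\omega_{h-1})\big]
$$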

## Implementation

### Generator (U-Net)

```python
import torch
import torch.nn as nn
from huggingface_hub import PyTorchModelHubMixin


class UNet(nn.Module, PyTorchModelHubMixin):
    def __init__(self, in_channels=1, out_channels=1, time_embedding_dim=32):
        super().__init__()

        # Time embedding layer
        self.time_embedding = TimeEmbedding(time_embedding_dim)

        # Encoder
        self.down_conv1 = DoubleConv(in_channels, 32, time_embedding_dim)
        self.down_conv2 = DoubleConv(32, 64, time_embedding_dim)
        self.down_conv3 = DoubleConv(64, 128, time_embedding_dim)

        self.maxpool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.upsample = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=True)

        # Decoder (skip connections concatenate encoder features,
        # hence the summed input channels)
        self.up_conv2 = DoubleConv(128 + 64, 64, time_embedding_dim)
        self.up_conv1 = DoubleConv(64 + 32, 32, time_embedding_dim)
        self.final_conv = nn.Conv2d(32, out_channels, kernel_size=1)

    def forward(self, x, timesteps):
        t = self.time_embedding(timesteps)

        # Contracting path
        x1 = self.down_conv1(x, t)
        x2 = self.down_conv2(self.maxpool(x1), t)
        x3 = self.down_conv3(self.maxpool(x2), t)

        # Expanding path with skip connections
        x = self.upsample(x3)
        x = torch.cat([x2, x], dim=1)
        x = self.up_conv2(x, t)

        x = self.upsample(x)
        x = torch.cat([x1, x], dim=1)
        x = self.up_conv1(x, t)

        return self.final_conv(x)
```
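
The `DoubleConv` block is referenced throughout but not shown in this card. Below is a minimal sketch consistent with the description above (two 3x3 convolutions with an additive time-embedding projection); the exact normalization and layer order are assumptions:

```python
class DoubleConv(nn.Module):
    """Hypothetical reconstruction of the DoubleConv block; the actual
    block used for this checkpoint may differ in detail."""

    def __init__(self, in_channels, out_channels, time_embedding_dim):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1)
        self.act = nn.SiLU()
        # Projects the time embedding to match the feature channels
        self.time_proj = nn.Linear(time_embedding_dim, out_channels)

    def forward(self, x, t):
        x = self.act(self.conv1(x))
        # Broadcast the projected time embedding over the spatial dimensions
        x = x + self.time_proj(t)[:, :, None, None]
        return self.act(self.conv2(x))
```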

### Time Embedding

```python
class TimeEmbedding(nn.Module):
    def __init__(self, embedding_dim):
        super().__init__()
        self.embedding_dim = embedding_dim  # stored for use in forward()
        self.mlp = nn.Sequential(
            nn.SiLU(),
            nn.Linear(embedding_dim, embedding_dim),
        )

    def forward(self, t):
        # Standard sinusoidal encoding: half the dimensions are sines,
        # half are cosines, over geometrically spaced frequencies.
        half_dim = self.embedding_dim // 2
        embeddings = torch.exp(
            torch.arange(half_dim, device=t.device)
            * -(torch.log(torch.tensor(10000.0)) / (half_dim - 1))
        )
        embeddings = t[:, None] * embeddings[None, :]
        embeddings = torch.cat((embeddings.sin(), embeddings.cos()), dim=-1)
        return self.mlp(embeddings)
```
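
As a quick sanity check, the model can be run on a dummy batch matching the 28x28 single-channel configuration above (the timestep range here is an assumption):

```python
model = UNet(in_channels=1, out_channels=1, time_embedding_dim=32)

x = torch.randn(8, 1, 28, 28)             # batch of 8 grayscale 28x28 images
timesteps = torch.randint(0, 1000, (8,))  # one timestep per sample (range assumed)

out = model(x, timesteps)
print(out.shape)  # torch.Size([8, 1, 28, 28])
```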

## Training Configuration

- Batch Size: 256
- Image Size: 28x28
- Loss Function: Mean Squared Error (MSE)
- Optimizer: Adam (learning rate = 1e-4)
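
The training loop itself is not included in this card. A minimal sketch of a single optimization step under the configuration above; what `target` contains depends on the training objective, which the card does not specify:

```python
import torch.nn.functional as F

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_step(x, timesteps, target):
    # `target` is whatever the MSE objective regresses to
    # (e.g., the added noise in a diffusion-style setup).
    optimizer.zero_grad()
    pred = model(x, timesteps)
    loss = F.mse_loss(pred, target)
    loss.backward()
    optimizer.step()
    return loss.item()
```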

This model has been pushed to the Hub using the [PyTorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration:

- Library: [More Information Needed]
- Docs: [More Information Needed]
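
Because `UNet` inherits from `PyTorchModelHubMixin`, the checkpoint can be loaded directly from the Hub (the repo id below is a placeholder):

```python
# Placeholder repo id; substitute the actual repository for this model.
model = UNet.from_pretrained("your-username/time-conditioned-unet-mnist")
```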