metadata

language: py
license: mit
tags:
  - image-reconstruction
  - autoencoder
  - pytorch
  - generative-model

CosAE Convolutional Harmonic Autoencoder

This is a pretrained Convolutional Harmonic Autoencoder (CosAE). It encodes an image into amplitude and phase harmonics and reconstructs the RGB image from them.

Usage

import torch
from transformers import AutoModel

# Load the model; trust_remote_code=True runs the custom model code shipped in the repository
model = AutoModel.from_pretrained(
    "vedant-jumle/cosae",
    trust_remote_code=True,
)
model.eval()  # inference mode: disables dropout and freezes batch-norm statistics

# Example input: tensor of shape [B, 9, H, W] (RGB + FFT) or [B, 3, H, W]
x = torch.randn(1, 9, 256, 256)
with torch.no_grad():
    recon = model(x)
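
The 9-channel input combines RGB with 6 FFT channels, but this card does not say how those channels are computed or ordered. The sketch below shows one plausible way to build such a tensor, assuming the 6 extra channels are the real and imaginary parts of a per-channel 2D FFT. The helper name `add_fft_channels`, the channel order, and the `"ortho"` normalization are all assumptions for illustration; check the repository's preprocessing code before relying on them.

```python
import torch


def add_fft_channels(rgb: torch.Tensor) -> torch.Tensor:
    """Append 2D-FFT real and imaginary parts to an RGB batch.

    rgb: float tensor of shape [B, 3, H, W].
    Returns a tensor of shape [B, 9, H, W].
    ASSUMED layout (not confirmed by the model card):
    channels 0-2 RGB, 3-5 FFT real part, 6-8 FFT imaginary part.
    """
    if rgb.ndim != 4 or rgb.shape[1] != 3:
        raise ValueError(f"expected [B, 3, H, W], got {tuple(rgb.shape)}")
    # fft2 transforms the last two dimensions (H, W) of each channel.
    spectrum = torch.fft.fft2(rgb, norm="ortho")
    return torch.cat([rgb, spectrum.real, spectrum.imag], dim=1)


x = add_fft_channels(torch.rand(1, 3, 256, 256))
print(x.shape)  # torch.Size([1, 9, 256, 256])
```

If the checkpoint was trained with a different FFT layout (for example amplitude and phase instead of real and imaginary parts), only the last line of the helper would need to change.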

Model Details

Architecture: Convolutional encoder (ResBlocks + optional attention), Harmonic Construction Module, upsampling decoder
Input channels: 9 (3 RGB + 6 FFT) or 3
Image size: 256×256 (configurable)
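
To make the role of the Harmonic Construction Module more concrete, here is a minimal sketch of how low-resolution amplitude and phase maps could be expanded into cosine harmonics. It is an illustration of the general idea only, not the code in this repository: the function name `build_harmonics`, the tensor layouts, the frequency table `freqs`, and the period `T` are all hypothetical.

```python
import math

import torch


def build_harmonics(amplitude, phase, freqs, T):
    """Expand amplitude/phase maps into cosine patches (illustrative sketch).

    amplitude, phase: [B, K, h, w], one value per harmonic and latent pixel.
    freqs: [K, 2] float tensor of (u, v) frequencies, one pair per harmonic.
    T: side length of the patch each latent pixel expands into.
    Returns [B, K, h*T, w*T] with amplitude * cos(2*pi*(u*x + v*y)/T - phase).
    """
    B, K, h, w = amplitude.shape
    coords = torch.arange(T, dtype=amplitude.dtype)
    yy, xx = torch.meshgrid(coords, coords, indexing="ij")  # each [T, T]
    u = freqs[:, 0].view(K, 1, 1)
    v = freqs[:, 1].view(K, 1, 1)
    angle = 2 * math.pi * (u * xx + v * yy) / T  # [K, T, T]
    # Broadcast [B, K, h, 1, w, 1] against [1, K, 1, T, 1, T].
    a = amplitude[:, :, :, None, :, None]
    p = phase[:, :, :, None, :, None]
    ang = angle[None, :, None, :, None, :]
    harmonics = a * torch.cos(ang - p)  # [B, K, h, T, w, T]
    return harmonics.reshape(B, K, h * T, w * T)


amp = torch.ones(1, 2, 32, 32)
phi = torch.zeros(1, 2, 32, 32)
freqs = torch.tensor([[0.0, 0.0], [1.0, 0.0]])
out = build_harmonics(amp, phi, freqs, T=8)
print(out.shape)  # torch.Size([1, 2, 256, 256])
```

In a full model, a decoder would then map these harmonic feature maps back to an RGB image; here the upsampling comes entirely from evaluating the cosine basis on a finer grid.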

References

Liu, S., De Mello, S., and Kautz, J. (2024). CosAE: Learnable Fourier Series for Image Restoration. NVIDIA AMRI. https://research.nvidia.com/labs/amri/publication/sifei2024cosae/

License

This model is released under the MIT License. See the repository LICENSE for details.