RQUNet-DPC — Dense Predictive Coding + U-Net for Time-Series Satellite Segmentation

Dense Predictive Coding (DPC) paired with U-Net for spatiotemporal segmentation on satellite image time series, plus baseline 3D-UNet, ConvLSTM, and ConvGRU checkpoints.

Repository: trile83/RQUNet-DPC (GitHub)
Maintainer (HF org): NASA CISTO Data Science Group
License: Apache-2.0 (this repo) (Hugging Face)


Model summary

RQUNet-DPC is an open framework for segmenting satellite image time series using:

  • DPC + U-Net (primary model): A dense predictive coding front-end driving a U-Net segmentation head (3D convolutional decoder), designed for spatiotemporal feature learning. (GitHub)
  • Baselines: 3D-UNet, ConvLSTM, and ConvGRU for comparison and ablation. (GitHub)

The framework expects time-series cubes and provides training and sliding-window inference utilities. Commands and data conventions are documented in the GitHub README. (GitHub)


Model Index (artifacts in this repo)

| File | Model | Bands | Loss / Seg. head (from filename) | Epoch | Source |
|---|---|---|---|---|---|
| dpc-unet-2024-12-05-crossentropy_conv3d_std_None_200_0.038_0.0binary_10band_epoch108.pth | DPC-UNet | 10 | CrossEntropy, Conv3D head, std=None | 108 | Hugging Face |
| dpc-unet-2025-04-15-crossentropy_conv3d_std_None_200_0.174_0.4binary_4band_epoch30.pth | DPC-UNet | 4 | CrossEntropy, Conv3D head, std=None | 30 | Hugging Face |
| 3d-unet_2024-11-25_10band_0.109_epoch_13.pth | 3D-UNet | 10 | 3D-UNet baseline | 13 | Hugging Face |
| convlstm_2024-12-02_10band_0.036_epoch_119.pth | ConvLSTM | 10 | ConvLSTM baseline | 119 | Hugging Face |
| convlstm_2025-05-06_4band_0.081_epoch_32.pth | ConvLSTM | 4 | ConvLSTM baseline | 32 | Hugging Face |
| convgru_2024-12-02_10band_0.02_epoch_72.pth | ConvGRU | 10 | ConvGRU baseline | 72 | Hugging Face |

Note on filenames: Many training hyperparameters are encoded in each filename (e.g., crossentropy, conv3d, std_None, 10band/4band, and epochN). See training commands below for the canonical setup. (GitHub)
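Because the band count and epoch are encoded in the filename, they can be recovered programmatically before choosing --channels. The snippet below is a minimal sketch, not part of the repo: parse_checkpoint_name is a hypothetical helper, and its regular expressions only assume the naming patterns visible in the table above (Nband, epochN or epoch_N).

```python
import re

def parse_checkpoint_name(filename: str) -> dict:
    """Extract model family, band count, and epoch from a checkpoint filename.

    Hypothetical helper: the patterns are inferred from the filenames listed
    in this card, not from any documented naming scheme.
    """
    family = re.match(r"(dpc-unet|3d-unet|convlstm|convgru)", filename)
    bands = re.search(r"(\d+)band", filename)
    epoch = re.search(r"epoch_?(\d+)", filename)
    return {
        "model": family.group(1) if family else None,
        "bands": int(bands.group(1)) if bands else None,
        "epoch": int(epoch.group(1)) if epoch else None,
    }

info = parse_checkpoint_name("convlstm_2025-05-06_4band_0.081_epoch_32.pth")
print(info)  # {'model': 'convlstm', 'bands': 4, 'epoch': 32}
```

The returned band count is the value to pass as --channels for that checkpoint.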


Intended use & limitations

  • Use cases: Land-cover / land-use segmentation and similar tasks on multi-temporal satellite stacks (e.g., HLS). The pipeline supports tiling and sliding-window inference for large rasters. (GitHub)
  • Inputs: Time-series arrays organized as T × C × H × W (time, channels, height, width) during preprocessing; the segmenter uses a Conv3D head. (GitHub)
  • Limitations: Performance depends on dataset domain, temporal sampling, band selection, and pre-/post-processing choices (standardization, tiling strategy). Checkpoints were trained on specific splits; transfer to other geographies/years may require finetuning.
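To make the input layout concrete, the sketch below rearranges one T × C × H × W chip into the batch-first, channel-first (B, C, T, H, W) layout that 3D convolutions in PyTorch expect. It uses NumPy only for illustration; the sizes (16, 10, 64) are taken from the example commands in this card, and whether the repo's scripts perform this rearrangement internally is not confirmed here.

```python
import numpy as np

T, C, H, W = 16, 10, 64, 64  # ts_length, channels, img_dim from the example commands

# One preprocessed chip in the T x C x H x W layout; each (t, c) plane is
# filled with a distinct value so the axis swap can be checked afterwards.
cube = np.zeros((T, C, H, W), dtype=np.float32)
for t in range(T):
    for c in range(C):
        cube[t, c] = 100 * t + c

# Swap the time and channel axes, then add a leading batch axis.
batch = np.transpose(cube, (1, 0, 2, 3))[np.newaxis, ...]

print(batch.shape)  # (1, 10, 16, 64, 64)
```

After the swap, batch[0, c, t] holds the same plane as cube[t, c], so no pixel values are changed, only their indexing order.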

How to use

1) Environment & data prep

Follow the repo’s environment and preprocessing steps:

# Create environment (from repo)
conda env create -f environment.yml
conda activate env  # replace "env" with the name defined in environment.yml if it differs

# Convert images to time-series datacubes (HDF5 / PT format)
python RQUNet-DPC/models/create_timeseries.py

See the README for data format expectations and utilities. (GitHub)

2) Inference (sliding-window) with provided scripts

Use the provided prediction script for tiled or sliding-window inference:

# Non-overlap small-tile prediction (example)
python RQUNet-DPC/models/predict_nonoverlap.py \
  --img_dim 64 \
  --model dpc-unet \
  --segment_model conv3d \
  --ts_length 16 \
  --dataset PEV \
  --net unet \
  --channels 10 \
  --standardization None \
  --rescale None \
  --saveproba False \
  --addindices False

# Sliding-window prediction on large rasters (example)
python RQUNet-DPC/models/predict_nonoverlap.py \
  --img_dim 64 \
  --model 3d-unet \
  --ts_length 16 \
  --dataset PEV_large_2019 \
  --channels 10 \
  --standardization None \
  --rescale None

These commands mirror the project’s documented usage. Adjust --model, --channels, and --ts_length to match the checkpoint you download here. (GitHub)
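For readers unfamiliar with sliding-window inference, the sketch below shows one common way to choose tile positions so that 64 × 64 windows cover a raster of arbitrary size. It illustrates the general technique only and is not the repo's implementation; window_origins and tile_grid are hypothetical names, and the edge handling (shifting the last window back so it ends flush with the raster) is an assumption.

```python
def window_origins(size: int, tile: int, stride: int) -> list:
    """Start offsets along one axis so that tiles of length `tile` cover [0, size)."""
    if size <= tile:
        return [0]  # raster smaller than one tile: a single (padded) window
    origins = list(range(0, size - tile + 1, stride))
    if origins[-1] != size - tile:
        origins.append(size - tile)  # last window ends flush with the raster edge
    return origins

def tile_grid(height: int, width: int, tile: int = 64, stride: int = 64) -> list:
    """(row, col) upper-left corners of every window over a height x width raster."""
    return [
        (r, c)
        for r in window_origins(height, tile, stride)
        for c in window_origins(width, tile, stride)
    ]

print(window_origins(200, 64, 64))       # [0, 64, 128, 136]
print(len(tile_grid(128, 128)))          # 4
```

Setting stride equal to tile gives non-overlapping tiles; a smaller stride produces overlapping windows whose predictions then need to be merged, for example by averaging class probabilities.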

3) Programmatic loading (PyTorch)

The state dicts correspond to architectures defined in the GitHub repo; import the matching model class before loading.

import torch

# Example: build your model to match the checkpoint architecture/hparams
# (Replace with actual constructors from the repo modules.)
from RQUNet_DPC_like import build_dpc_unet  # placeholder import; see repo

ckpt_path = "dpc-unet-2024-12-05-crossentropy_conv3d_std_None_200_0.038_0.0binary_10band_epoch108.pth"
model = build_dpc_unet(in_channels=10, ts_length=16, segment_model="conv3d")  # match bands & T
state = torch.load(ckpt_path, map_location="cpu")
model.load_state_dict(state)
model.eval()

# Input should align with training: e.g., (B, C, T, H, W) for 3D backbones or
# repo’s internal reshaping of (T, C, H, W). See training/prediction scripts.
x = torch.randn(1, 10, 16, 64, 64)
with torch.no_grad():
    logits = model(x)  # segmentation logits

For out-of-the-box usage, prefer the repo’s predict_nonoverlap.py CLI to ensure pre/post-processing matches the training setup. (GitHub)
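If load_state_dict reports missing or unexpected keys, the file may hold a wrapper dictionary rather than a bare state dict, or its parameter names may carry the "module." prefix that torch.nn.DataParallel adds. Whether either applies to these checkpoints is not confirmed; the sketch below is a defensive, hypothetical helper, and the wrapper key names it tries are common conventions rather than names taken from the repo.

```python
from collections import OrderedDict

def extract_state_dict(checkpoint, wrapper_keys=("state_dict", "model_state_dict")):
    """Return a plain name -> tensor mapping from whatever torch.load() produced.

    Hypothetical helper: the wrapper keys are assumptions, not repo conventions.
    """
    state = checkpoint
    for key in wrapper_keys:
        if isinstance(checkpoint, dict) and key in checkpoint:
            state = checkpoint[key]  # unwrap {"state_dict": {...}, ...}
            break
    prefix = "module."  # added to parameter names by nn.DataParallel
    return OrderedDict(
        (name[len(prefix):] if name.startswith(prefix) else name, value)
        for name, value in state.items()
    )

# Stand-in values instead of tensors, to show the key handling only.
wrapped = {"epoch": 108, "state_dict": {"module.conv.weight": 1, "module.conv.bias": 2}}
print(list(extract_state_dict(wrapped)))  # ['conv.weight', 'conv.bias']
```

The result can be passed to model.load_state_dict in place of the raw object returned by torch.load.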


Training

The repo provides training entry points (examples):

# Train DPC + U-Net (Conv3D segment head)
python RQUNet-DPC/models/train_dpc_seg_nonoverlap.py \
  --img_dim 64 --epochs 150 --standardization None \
  --segment_model conv3d --ts_length 16 --net unet --channels 10 \
  --loss dice \
  --noncrop_pct 0.7 --noncrop_thresh 0.7 --crop_thresh 0.2 \
  --num_chips 50 --rescale None --num_val 10 --addindices False

# Baselines: 3D-UNet, ConvLSTM, ConvGRU
python RQUNet-DPC/models/train_benchmodel.py --model 3d-unet --img_dim 64 --epochs 120 \
  --standardization None --noncrop_pct 0.1 --noncrop_thresh 0.3 --crop_thresh 0.5 --num_chips 50

python RQUNet-DPC/models/train_benchmodel.py --model convlstm --img_dim 64 --epochs 120 \
  --standardization None --noncrop_pct 0.1 --noncrop_thresh 0.3 --crop_thresh 0.5 --num_chips 50

python RQUNet-DPC/models/train_benchmodel.py --model convgru --img_dim 64 --epochs 120 \
  --standardization None --noncrop_pct 0.1 --noncrop_thresh 0.3 --crop_thresh 0.5 --num_chips 50

These reflect the instructions in the repo README; consult it for the latest flags and dataset setup (time-series chips, standardization, rescaling, indices). (GitHub)


Data

The framework is demonstrated on HLS (Harmonized Landsat-Sentinel) time series in the documentation, with examples of tiling and temporal stacking. You’ll need to prepare your own data in the expected T×C×H×W format via the provided preprocessing script(s). (GitHub)


Evaluation

  • Primary metrics depend on the task configuration; the training objective is either cross-entropy or Dice loss, selected at training time.
  • Use consistent tiling/standardization settings between train and inference to avoid distribution shift.
  • Sliding-window inference is provided for large rasters. (GitHub)
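As a reference point for evaluation, the sketch below computes the Dice coefficient and IoU for a binary mask from true-positive, false-positive, and false-negative counts. IoU is included as a commonly used companion metric and is not named in this card; dice_and_iou is a hypothetical helper, and returning 1.0 when both masks are empty is a convention chosen here.

```python
def dice_and_iou(pred, target, positive=1):
    """Dice coefficient and IoU for one class over flattened label sequences."""
    tp = fp = fn = 0
    for p, t in zip(pred, target):
        if p == positive and t == positive:
            tp += 1          # predicted positive, truly positive
        elif p == positive:
            fp += 1          # predicted positive, truly negative
        elif t == positive:
            fn += 1          # missed positive
    dice_den = 2 * tp + fp + fn
    iou_den = tp + fp + fn
    # Convention (an assumption): both metrics are 1.0 when neither mask has positives.
    dice = 2 * tp / dice_den if dice_den else 1.0
    iou = tp / iou_den if iou_den else 1.0
    return dice, iou

dice, iou = dice_and_iou([1, 1, 0, 0], [1, 0, 1, 0])
print(dice)  # 0.5
```

In that example there is one true positive, one false positive, and one false negative, so Dice is 2/4 and IoU is 1/3.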

Limitations & ethical considerations

  • Models trained on specific geographies/periods may not generalize; assess shift sensitivity before operational use.
  • Satellite data can encode sensitive land-use signals; ensure compliance with local data-use policies and avoid misuse.

Citation

If you use these models, please cite the GitHub repository and this model card:

  • RQUNet-DPC repository: Dense Predictive Coding and UNet framework for satellite images time series segmentation. (GitHub)
  • Model artifacts: Hugging Face: nasa-cisto-data-science-group/RQUNet-DPC. (Hugging Face)

Acknowledgements

  • Original code and training scripts: trile83/RQUNet-DPC. (GitHub)
  • Hosting: NASA CISTO Data Science Group HF org. (Hugging Face)

Changelog

  • 2025-09-04: Initial model card reflecting checkpoints currently in the HF repository (3D-UNet, ConvLSTM/ConvGRU baselines, and DPC-UNet in 4-band and 10-band variants). (Hugging Face)
