Model Card: downstream-satvision-toa-3dclouds
Model Overview
- Model Name: downstream-satvision-toa-3dclouds
- Base Model: SatVision-TOA (Giant, 3B parameters)
- Architecture: SwinV2 Transformer (hierarchical vision transformer backbone)
- Pretraining Objective: Masked Image Modeling (MIM)
- Pretraining Dataset: 100M globally-distributed MODIS TOA image chips across 14 bands
- Resolution: 128×128 px at ~1 km
- Pretraining Conditions: All-sky (cloud, aerosol, ocean, land)
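
To make the Masked Image Modeling objective concrete, here is a minimal sketch of how a random patch mask could be drawn for one 128×128 chip. The patch size (4 px), mask ratio (0.6), and the function name `random_patch_mask` are illustrative assumptions, not values taken from the SatVision-TOA training configuration. The sketch also assumes one mask is shared across all 14 bands.

```python
import random


def random_patch_mask(chip_size=128, patch_size=4, mask_ratio=0.6, seed=0):
    """Return a 2D grid of booleans; True marks a patch hidden during pretraining.

    All defaults except chip_size are hypothetical, chosen only for illustration.
    """
    if chip_size % patch_size != 0:
        raise ValueError("chip_size must be divisible by patch_size")
    grid = chip_size // patch_size          # patches per side
    n_patches = grid * grid                 # total patches in the chip
    n_masked = int(round(mask_ratio * n_patches))
    rng = random.Random(seed)               # seeded for reproducibility
    masked = set(rng.sample(range(n_patches), n_masked))
    # Row-major layout: patch index = row * grid + col
    return [[(r * grid + c) in masked for c in range(grid)] for r in range(grid)]


mask = random_patch_mask()
n_masked = sum(sum(row) for row in mask)
print(len(mask), len(mask[0]), n_masked)  # 32 32 614
```

During pretraining, the model would be asked to reconstruct the pixel values of the masked patches from the visible ones; the reconstruction loss itself is not shown here.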
Intended Use
- Task: 3D cloud vertical reconstruction from satellite TOA imagery
- Downstream Data: GOES-ABI chips paired with CloudSat/CALIPSO cloud curtain observations
- Output: Per-pixel cloud vertical class (e.g., cloud top/base detection, multilayer structure)
Strengths
- Learns spatial-spectral relationships across diverse global conditions
- Generalizes well across sensors (MODIS → GOES-ABI)
- Outperforms baseline on thin, multilayer, and obscured clouds
- Pretraining improves sample efficiency for fine-tuning
Limitations
- Temporal bias: Terra-MODIS sampling (~9 AM local) may limit temporal generalization
- Resolution: Only supports ~1 km scale chips; sub-km cloud structures not resolved
- Sensor adaptation: While GOES-ABI is supported, optimal results may require minor domain tuning
Fine-Tuning & Usage
- Decoder: Lightweight FCN head on frozen SatVision-TOA encoder
- Training Data: ~7,000 labeled GOES-ABI chips aligned with CloudSat/CALIPSO
- Validation Set: 1,300 chips
- Typical Inference Output: 2D maps of vertical cloud structure per chip
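
The setup above (a lightweight FCN head trained on top of a frozen encoder) can be sketched in PyTorch as follows. This is a minimal illustration under stated assumptions, not the released training code: `StandInEncoder` is a hypothetical placeholder for the SatVision-TOA encoder, and the embedding width, number of output classes, learning rate, and 14-band input are all assumed values.

```python
import torch
from torch import nn


class StandInEncoder(nn.Module):
    """Hypothetical placeholder for the SatVision-TOA encoder (NOT the real model)."""

    def __init__(self, in_bands: int = 14, embed_dim: int = 64):
        super().__init__()
        # A single strided conv maps 128x128 inputs to a 32x32 feature map.
        self.proj = nn.Conv2d(in_bands, embed_dim, kernel_size=4, stride=4)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.proj(x)


class FCNHead(nn.Module):
    """Small fully convolutional decoder: 3x3 conv, ReLU, 1x1 conv, upsample."""

    def __init__(self, embed_dim: int, num_classes: int, out_size: int = 128):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(embed_dim, embed_dim, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(embed_dim, num_classes, kernel_size=1),
        )
        self.out_size = out_size

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        logits = self.body(feats)
        # Upsample class logits back to the chip resolution.
        return nn.functional.interpolate(
            logits, size=(self.out_size, self.out_size),
            mode="bilinear", align_corners=False,
        )


num_classes = 3  # assumed number of vertical cloud classes
encoder = StandInEncoder()
for p in encoder.parameters():
    p.requires_grad = False  # freeze the encoder
encoder.eval()

head = FCNHead(embed_dim=64, num_classes=num_classes)
optimizer = torch.optim.AdamW(head.parameters(), lr=1e-3)  # only the head is optimized

# One training step on random stand-in data (2 chips, 14 bands, 128x128).
chips = torch.randn(2, 14, 128, 128)
labels = torch.randint(0, num_classes, (2, 128, 128))
with torch.no_grad():
    feats = encoder(chips)  # no gradients flow into the frozen encoder
logits = head(feats)
loss = nn.functional.cross_entropy(logits, labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()

pred = logits.argmax(dim=1)  # per-pixel class map, shape (2, 128, 128)
```

Freezing the encoder keeps the number of trainable parameters small, which suits a labeled set of only a few thousand chips.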
Adaptation Ideas
- Extend to aerosol, water vapor, or ice phase classification
- Fine-tune on nighttime or different orbital sensors (e.g., VIIRS, Himawari)
- Use as encoder backbone for multitask satellite cloud analysis
Citation
If you use this model, please cite:
@article{satvision2024,
  title={SatVision-TOA: A Geospatial Foundation Model for Coarse-Resolution All-Sky Remote Sensing Imagery},
  author={Zhu, Le and Caraballo-Vega, Jordan and Gentine, Pierre and Tao, Wenzhong and others},
  journal={arXiv preprint arXiv:2406.06561},
  year={2024}
}
Resources
Summary:
This model fine-tunes a foundation transformer pretrained on MODIS TOA data to reconstruct 3D cloud structure from GOES-ABI imagery. It is a step toward operational cloud analysis from geostationary satellites using foundation models.