MORPH 1.0
MORPH: Shape-agnostic PDE Foundation Models. Paper, GitHub Repo
Highlights:
- We introduce MORPH, a PDE foundation model designed to accommodate heterogeneous data across diverse physical phenomena.
- MORPH is shape-agnostic (1D/2D/3D, varying resolutions, fields with scalar/vector components), with physics-aware channel handling of PDE datasets.
- MORPH employs a larger transformer architecture, with one cross-attention module and four axial-attention modules that attend over a multi-fold larger set of spatiotemporal patches (i.e., a larger context window).
- We pretrain and fine-tune on a broad, heterogeneous suite spanning three benchmarks, including multi-physics datasets such as magnetohydrodynamics (MHD), turbulent self-gravitating flows with cooling (TGC), high-resolution 2D compressible and incompressible Navier–Stokes, and large-scale 3D datasets.
- It is an autoregressive, flexible, and powerful backbone for scalable and data-efficient scientific machine learning.
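In the autoregressive setting, the model predicts the next state and feeds that prediction back in to roll out a trajectory. Below is a minimal sketch of such a rollout loop, assuming a generic next-step predictor and a (batch, time, channels, ...) tensor layout; it is an illustration only, not the MORPH repository API.

```python
import torch

@torch.no_grad()
def rollout(model: torch.nn.Module, history: torch.Tensor, n_steps: int) -> torch.Tensor:
    """Predict n_steps future frames by feeding each prediction back as input.

    history: (batch, time, channels, *spatial) tensor of past states.
    Assumes `model` maps a window of past states to the next state
    of shape (batch, channels, *spatial).
    """
    states = history
    preds = []
    for _ in range(n_steps):
        next_state = model(states)                                   # predict one step ahead
        preds.append(next_state)
        # Slide the temporal window: drop the oldest frame, append the prediction.
        states = torch.cat([states[:, 1:], next_state.unsqueeze(1)], dim=1)
    return torch.stack(preds, dim=1)                                  # (batch, n_steps, channels, *spatial)
```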
What's new:
- Architecture built for physics: An autoregressive vision transformer backbone with local convolutions, inter-field cross-attention, and efficient 4D axial attention for global space-time context.
- One model, many shapes: works across 1D/2D/3D, mixed scalar and vector fields, and varying resolutions without re-architecting, from simple time series to complex turbulent flows.
- Strong results, lean tuning: beats from-scratch baselines and matches or surpasses recent PDE foundation models. LoRA retains most gains with far fewer trainable parameters.
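Since LoRA fine-tuning is highlighted above, here is a minimal, generic sketch of a LoRA-wrapped linear layer in PyTorch. It only illustrates the low-rank-adapter idea; it is not the adapter code used in the MORPH repository.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Freeze a pretrained linear layer and add a trainable low-rank update."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():        # freeze the pretrained weights
            p.requires_grad = False
        # Low-rank factors: A maps in_features -> rank, B maps rank -> out_features.
        self.lora_a = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen path plus a scaled, trainable low-rank correction.
        return self.base(x) + self.scale * (x @ self.lora_a.T @ self.lora_b.T)
```

Because only the small A/B factors receive gradients, the trainable parameter count stays a tiny fraction of the full model, which is where the "far fewer trainable parameters" claim above comes from.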
Applications:
- General-purpose (task-agnostic) foundation model for PDEs.
- One model, multiple downstream tasks.
- Performs well in data- and compute-scarce scenarios.
Model Variants:
- MORPH-FM-Ti: ~7M parameters, with four levels of fine-tuning including LoRA.
- MORPH-FM-S: ~30M parameters, with four levels of fine-tuning including LoRA.
- MORPH-FM-M: ~120M parameters, with four levels of fine-tuning including LoRA.
- MORPH-FM-L: ~500M parameters, with four levels of fine-tuning including LoRA.
- MORPH-FM-XL: ~1.2B parameters (coming soon).
- MORPH-SS-Ti: ~7M-parameter standalone models, one for each of the 13 datasets.
- MORPH-SS-S: ~30M-parameter standalone models, one for each of the 13 datasets.
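When comparing variants or fine-tuning levels, it helps to check how many parameters are actually trainable, e.g., after freezing the backbone or applying LoRA. A minimal, generic PyTorch helper (not part of the MORPH codebase):

```python
import torch.nn as nn

def count_params(model: nn.Module) -> tuple[int, int]:
    """Return (total, trainable) parameter counts for any nn.Module."""
    total = sum(p.numel() for p in model.parameters())
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    return total, trainable
```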
Architecture:
- MORPH is built on a convolutional vision transformer backbone that seamlessly handles heterogeneous spatiotemporal datasets of varying data dimensionality (1D–3D), different resolutions, and multiple fields with mixed scalar and vector components.
- The architecture combines
- (i) component-wise convolution, which jointly processes scalar and vector channels to capture local interactions,
- (ii) inter-field cross-attention, which models and selectively propagates information between different physical fields,
- (iii) axial attention, which factorizes full spatiotemporal self-attention along individual spatial and temporal axes, reducing computational burden while retaining expressivity.
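To make point (iii) concrete, here is a minimal, generic sketch of axial attention over a 4D (time x space) patch grid in PyTorch. The tensor layout and module structure are assumptions for illustration, not the MORPH implementation.

```python
import torch
import torch.nn as nn

class AxialAttention4D(nn.Module):
    """Self-attention applied independently along each of the T, X, Y, Z axes.

    Input tokens have shape (batch, T, X, Y, Z, dim). Attending along one axis
    at a time keeps sequence lengths short, so the per-token cost scales roughly
    with T + X + Y + Z instead of T * X * Y * Z for full spatiotemporal attention.
    """

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.attns = nn.ModuleList(
            [nn.MultiheadAttention(dim, num_heads, batch_first=True) for _ in range(4)]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Axes 1..4 are (T, X, Y, Z); axis 0 is batch, the last axis is the channel dim.
        for axis, attn in zip((1, 2, 3, 4), self.attns):
            moved = x.movedim(axis, -2)                            # chosen axis becomes the sequence dim
            seq = moved.reshape(-1, moved.shape[-2], moved.shape[-1])
            out, _ = attn(seq, seq, seq)                           # attention along a single axis
            x = x + out.reshape(moved.shape).movedim(-2, axis)     # residual connection
        return x

# Example: 2 samples, 4 time steps, an 8x8x8 patch grid, 64-dim tokens.
tokens = torch.randn(2, 4, 8, 8, 8, 64)
mixed = AxialAttention4D(dim=64)(tokens)
print(mixed.shape)  # torch.Size([2, 4, 8, 8, 8, 64])
```

Each axis is attended in turn with a residual connection, so global space-time context is built up while every individual attention call stays cheap.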
If you use MORPH in your research, please cite:
@misc{rautela2025morphshapeagnosticpdefoundation,
  title={{MORPH}: Shape-agnostic {PDE} Foundation Models},
  author={Mahindra Singh Rautela and Alexander Most and Siddharth Mansingh and Bradley C. Love and Ayan Biswas and Diane Oyen and Earl Lawrence},
  year={2025},
  eprint={2509.21670},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2509.21670}
}