+---
+license: apache-2.0
+---
+# SatDINO: A Deep Dive into Self-Supervised Pretraining for Remote Sensing
+These are the official weights for "SatDINO: A Deep Dive into Self-Supervised Pretraining for Remote Sensing", a self-supervised learning framework tailored to satellite imagery. SatDINO builds on the **[DINO](https://github.com/facebookresearch/dino)** framework and adapts it to the unique characteristics of remote sensing data.
+[ **[Paper](https://arxiv.org/abs/2508.21402v1)** ], [ **[GitHub](https://github.com/strakaj/SatDINO)** ]
+## Pretrained models
+The models are pretrained on the RGB variant of the fMoW dataset and evaluated across multiple standard remote sensing benchmarks.
+| arch      | patch size | params (M) | GFLOPs | linear (%) | Hugging Face                                                                          | weights                                                                                           | weights-finetune                                                                                           |
+|-----------|------------|------------|--------|------------|---------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------|
+| ViT-S | 16         | 21.59   | 8.54   | 72.75  | [strakajk/satdino-vit_small-16](https://huggingface.co/strakajk/satdino-vit_small-16) | [ckp](https://huggingface.co/strakajk/satdino-vit_small-16/resolve/main/satdino-vit_small-16.pth) | [ckp](https://huggingface.co/strakajk/satdino-vit_small-16/resolve/main/satdino-vit_small-16-finetune.pth) |
+| ViT-S | 8          | 21.37   | 33.56  | 73.53  | [strakajk/satdino-vit_small-8](https://huggingface.co/strakajk/satdino-vit_small-8)   | [ckp](https://huggingface.co/strakajk/satdino-vit_small-8/resolve/main/satdino-vit_small-8.pth)   | [ckp](https://huggingface.co/strakajk/satdino-vit_small-8/resolve/main/satdino-vit_small-8-finetune.pth)   |
+| ViT-B  | 16         | 85.65   | 33.90  | 73.52  | [strakajk/satdino-vit_base-16](https://huggingface.co/strakajk/satdino-vit_base-16)   | [ckp](https://huggingface.co/strakajk/satdino-vit_base-16/resolve/main/satdino-vit_base-16.pth)   | [ckp](https://huggingface.co/strakajk/satdino-vit_base-16/resolve/main/satdino-vit_base-16-finetune.pth)   |
+### Create from HF
+You can load the model through Hugging Face `transformers` or build it from the official **[GitHub](https://github.com/strakaj/SatDINO)** repository.
+```python
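+import torch
+from transformers import AutoModel
+# trust_remote_code=True lets transformers run the custom model code from the repository
+model = AutoModel.from_pretrained("strakajk/satdino-vit_small-8", trust_remote_code=True)
+model.eval()  # inference mode: disables dropout and similar training-only behavior
+# predict on a dummy batch: 1 RGB image, 224x224 pixels
+x = torch.randn(1, 3, 224, 224)
+with torch.no_grad():  # no gradients are needed for inference
+    y = model(x)   # out: torch.Size([1, 384])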
+```
+## Results
+| Dataset   | **SatDINO<sub>8</sub>** | **SatDINO<sub>16</sub>** | **Scale-MAE** | **SatMAE** |
+|-----------|-----------------|--------------------|---------------|------------|
+| EuroSAT   | **87.72**       | 85.96              | 85.42         | 81.43      |
+| RESISC45  | **85.29**       | 82.32              | 79.96         | 65.96      |
+| UC Merced | **94.82**       | 93.21              | 84.58         | 78.45      |
+| WHU-RS19  | **98.18**       | 97.82              | 89.32         | 86.41      |
+| RS-C11    | **96.91**       | 96.61              | 93.03         | 83.96      |
+| SIRI-WHU  | **91.82**       | 87.19              | 84.84         | 77.76      |
+Average kNN classification accuracy (%) across multiple scales (12.5%, 25%, 50%, and 100%).
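For readers unfamiliar with kNN evaluation, the idea can be sketched in a few lines: embeddings from the frozen backbone are compared against labeled training embeddings, and each test sample takes the majority label of its nearest neighbors. The sketch below is only an illustration in plain Python. The cosine similarity, the unweighted majority vote, `k=3`, the function names, and the toy 2-D vectors are all assumptions; the evaluation protocol used for the table above may differ.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def knn_predict(query, train_feats, train_labels, k=3):
    """Majority vote over the k training embeddings most similar to `query`."""
    ranked = sorted(
        zip(train_feats, train_labels),
        key=lambda pair: cosine(query, pair[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def knn_accuracy(test_feats, test_labels, train_feats, train_labels, k=3):
    """Share of correctly classified test embeddings, in percent."""
    correct = sum(
        knn_predict(f, train_feats, train_labels, k) == y
        for f, y in zip(test_feats, test_labels)
    )
    return 100.0 * correct / len(test_labels)

# Toy 2-D vectors standing in for model embeddings (hypothetical data).
train_feats = [[1.0, 0.1], [0.9, 0.2], [0.8, 0.0], [0.1, 1.0], [0.2, 0.9], [0.0, 0.8]]
train_labels = ["forest", "forest", "forest", "river", "river", "river"]
test_feats = [[1.0, 0.0], [0.1, 0.9]]
test_labels = ["forest", "river"]
acc = knn_accuracy(test_feats, test_labels, train_feats, train_labels, k=3)
```

In practice the feature lists would hold the 384-dimensional outputs of the model rather than hand-written vectors, and a vectorized implementation would be preferable for datasets of realistic size.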
+---
+| **Dataset** | **Small<sub>16</sub>** | **Small<sub>8</sub>** | **Base**      |
+|-------------|------------------|---------------|---------------|
+| EuroSAT     | 98.69            | 98.76         | **98.83**     |
+| RESISC45    | 95.68            | 95.16         | **96.05**     |
+| UC Merced   | 98.33            | **98.81**     | 98.57         |
+| WHU-RS19    | **98.54**        | 98.06         | 97.57         |
+| RS-C11      | **98.01**        | 96.81         | 96.02         |
+| SIRI-WHU    | **98.54**        | 97.08         | 97.08         |
+SatDINO fine-tuning classification accuracy (%).
+---
+| **Model** | **Backbone**     | **Potsdam 224<sup>2</sup>** | **Potsdam 512<sup>2</sup>** | **Vaihingen 224<sup>2</sup>** | **Vaihingen 512<sup>2</sup>** | **LoveDA 224<sup>2</sup>** | **LoveDA 512<sup>2</sup>** |
+|-----------|------------------|---------------------|---------------------|-----------------------|-----------------------|--------------------|--------------------|
+| SatMAE    | ViT-Large        | 67.88               | 70.39               | 64.81                 | 69.13                 | 46.28              | 52.28              |
+| Scale-MAE | ViT-Large        | 69.74               | **72.21**           | 67.97                 | **71.65**             | **49.37**          | **53.70**          |
+| SatDINO   | ViT-Small<sub>16</sub> | 67.93               | 71.80               | 63.38                 | 68.32                 | 44.77              | 49.65              |
+| SatDINO   | ViT-Small<sub>8</sub>    | **70.71**           | 71.45               | **68.69**             | 67.71                 | 47.53              | 50.20              |
+| SatDINO   | ViT-Base         | 67.65               | 71.63               | 64.85                 | 69.37                 | 44.25              | 50.08              |
+Semantic segmentation performance across multiple datasets and image scales. All results are reported in terms of mean Intersection over Union (mIoU).
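As a reminder of what the metric measures, mIoU averages, over classes, the overlap between predicted and ground-truth pixels divided by their union. The following is a minimal sketch in plain Python, not the evaluation code behind the table. The function name `mean_iou`, the choice to skip classes that appear in neither map, and the toy label maps are assumptions; benchmark implementations typically accumulate counts over the whole dataset and may treat absent or ignored classes differently.

```python
def mean_iou(pred, target, num_classes):
    """Mean Intersection over Union (in percent) over flat sequences of class ids.

    Classes absent from both prediction and target are skipped
    (an assumption; some benchmarks handle this case differently).
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union:
            ious.append(inter / union)
    return 100.0 * sum(ious) / len(ious) if ious else 0.0

# Hypothetical 2x3 label maps, flattened row by row.
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
```

Here the per-class IoUs are 1/3, 2/3, and 1/2, so the mean lands at about 50%.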
+## License
+This repository is released under the Apache 2.0 license as found in the LICENSE file.
+## Citation
+If you find this repository useful, please consider citing it:
+```bibtex
+@misc{straka2025satdinodeepdiveselfsupervised,
+      title={SatDINO: A Deep Dive into Self-Supervised Pretraining for Remote Sensing},
+      author={Jakub Straka and Ivan Gruber},
+      year={2025},
+      eprint={2508.21402},
+      archivePrefix={arXiv},
+      primaryClass={cs.CV},
+      url={https://arxiv.org/abs/2508.21402},
+}
+```