Update README.md
README.md
CHANGED
@@ -1,3 +1,94 @@
|
|
1 |
-
---
|
2 |
-
license: apache-2.0
|
3 |
-
---
|
1 |
+
---
|
2 |
+
license: apache-2.0
|
3 |
+
---
|
4 |
+
|
5 |
+
# SatDINO: A Deep Dive into Self-Supervised Pretraining for Remote Sensing
|
6 |
+
|
7 |
+
|
8 |
+
These are the official weights for "SatDINO: A Deep Dive into Self-Supervised Pretraining for Remote Sensing", a self-supervised learning framework tailored to satellite imagery. SatDINO builds on the **[DINO](https://github.com/facebookresearch/dino)** framework and adapts it to the specific characteristics of remote sensing data.
|
9 |
+
|
10 |
+
[ **[Paper](https://arxiv.org/abs/2508.21402v1)** ], [ **[GitHub](https://github.com/strakaj/SatDINO)** ]
|
11 |
+
|
12 |
+
|
13 |
+
## Pretrained models
|
14 |
+
|
15 |
+
The models are pretrained on the RGB variant of the fMoW dataset and evaluated across multiple standard remote sensing benchmarks.
|
16 |
+
|
17 |
+
| arch      | patch size | params (M) | GFLOPs | linear | hugging face                                                                          | weights                                                                                           | weights-finetune                                                                                           |
|
18 |
+
|-----------|------------|---------|--------|--------|---------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------|
|
19 |
+
| ViT-S | 16 | 21.59 | 8.54 | 72.75 | [strakajk/satdino-vit_small-16](https://huggingface.co/strakajk/satdino-vit_small-16) | [ckp](https://huggingface.co/strakajk/satdino-vit_small-16/resolve/main/satdino-vit_small-16.pth) | [ckp](https://huggingface.co/strakajk/satdino-vit_small-16/resolve/main/satdino-vit_small-16-finetune.pth) |
|
20 |
+
| ViT-S | 8 | 21.37 | 33.56 | 73.53 | [strakajk/satdino-vit_small-8](https://huggingface.co/strakajk/satdino-vit_small-8) | [ckp](https://huggingface.co/strakajk/satdino-vit_small-8/resolve/main/satdino-vit_small-8.pth) | [ckp](https://huggingface.co/strakajk/satdino-vit_small-8/resolve/main/satdino-vit_small-8-finetune.pth) |
|
21 |
+
| ViT-B | 16 | 85.65 | 33.90 | 73.52 | [strakajk/satdino-vit_base-16](https://huggingface.co/strakajk/satdino-vit_base-16) | [ckp](https://huggingface.co/strakajk/satdino-vit_base-16/resolve/main/satdino-vit_base-16.pth) | [ckp](https://huggingface.co/strakajk/satdino-vit_base-16/resolve/main/satdino-vit_base-16-finetune.pth) |
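
The checkpoint links in the table above appear to share one naming pattern, so they can be built programmatically. The sketch below reproduces that pattern for the three listed models only; the helper name `checkpoint_url` and the `(size, patch_size)` arguments are illustrative assumptions, not part of the SatDINO code base.

```python
# Assumption: URL pattern inferred from the links in the table above.
HF_USER = "strakajk"

# Only the (size, patch size) combinations listed in the table.
AVAILABLE = {("small", 16), ("small", 8), ("base", 16)}


def checkpoint_url(size, patch_size, finetune=False):
    """Return the download URL of a SatDINO checkpoint listed in the table.

    `checkpoint_url` is a hypothetical helper written for this README.
    """
    if (size, patch_size) not in AVAILABLE:
        raise ValueError(f"no listed checkpoint for ViT-{size}/{patch_size}")
    name = f"satdino-vit_{size}-{patch_size}"
    suffix = "-finetune" if finetune else ""
    return f"https://huggingface.co/{HF_USER}/{name}/resolve/main/{name}{suffix}.pth"


if __name__ == "__main__":
    print(checkpoint_url("small", 8))
    print(checkpoint_url("base", 16, finetune=True))
```

The returned URL can then be passed to any downloader; how the resulting state dict maps onto a model class is defined in the GitHub repository and is not covered here.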
|
22 |
+
|
23 |
+
|
24 |
+
### Create from HF
|
25 |
+
You can load a model through Hugging Face `transformers` or build it from the official **[GitHub](https://github.com/strakaj/SatDINO)** repository.
|
26 |
+
|
27 |
+
```python
|
28 |
+
import torch
|
29 |
+
from transformers import AutoModel
|
30 |
+
|
31 |
+
model = AutoModel.from_pretrained("strakajk/satdino-vit_small-8", trust_remote_code=True)
|
32 |
+
model.eval()
|
33 |
+
|
34 |
+
# run a forward pass on a dummy batch (1 RGB image, 224x224)
|
35 |
+
x = torch.randn(1, 3, 224, 224)
|
36 |
+
y = model(x)  # output shape: torch.Size([1, 384])
|
37 |
+
```
|
38 |
+
|
39 |
+
|
40 |
+
## Results
|
41 |
+
| Dataset | **SatDINO<sub>8</sub>** | **SatDINO<sub>16</sub>** | **Scale-MAE** | **SatMAE** |
|
42 |
+
|-----------|-----------------|--------------------|---------------|------------|
|
43 |
+
| EuroSAT | **87.72** | 85.96 | 85.42 | 81.43 |
|
44 |
+
| RESISC45 | **85.29** | 82.32 | 79.96 | 65.96 |
|
45 |
+
| UC Merced | **94.82** | 93.21 | 84.58 | 78.45 |
|
46 |
+
| WHU-RS19 | **98.18** | 97.82 | 89.32 | 86.41 |
|
47 |
+
| RS-C11 | **96.91** | 96.61 | 93.03 | 83.96 |
|
48 |
+
| SIRI-WHU | **91.82** | 87.19 | 84.84 | 77.76 |
|
49 |
+
|
50 |
+
Average kNN classification accuracy across multiple scales (12.5%, 25%, 50%, and 100%).
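
For readers unfamiliar with kNN evaluation, the idea is to classify each test embedding by the labels of its nearest training embeddings, with no additional training of the backbone. The following is a minimal sketch using cosine similarity and a plain majority vote; it is a simplification for illustration and not necessarily the exact protocol behind the numbers above. The function names and the toy 2-D features are assumptions.

```python
import math
from collections import Counter


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def knn_predict(train_feats, train_labels, query, k=5):
    """Predict a label for `query` by majority vote over its k most similar
    training features (cosine similarity). Hypothetical helper."""
    q = l2_normalize(query)
    scored = []
    for feat, label in zip(train_feats, train_labels):
        f = l2_normalize(feat)
        similarity = sum(a * b for a, b in zip(q, f))
        scored.append((similarity, label))
    # Most similar neighbours first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy 2-D "embeddings"; real SatDINO ViT-S features have 384 dimensions.
    feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    labels = ["forest", "forest", "water", "water"]
    print(knn_predict(feats, labels, [1.0, 0.05], k=3))
```

In practice the features would come from the frozen backbone, for example the `model(x)` output shown in the loading example.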
|
51 |
+
|
52 |
+
---
|
53 |
+
|
54 |
+
| **Dataset** | **Small<sub>16</sub>** | **Small<sub>8</sub>** | **Base** |
|
55 |
+
|-------------|------------------|---------------|---------------|
|
56 |
+
| EuroSAT | 98.69 | 98.76 | **98.83** |
|
57 |
+
| RESISC45 | 95.68 | 95.16 | **96.05** |
|
58 |
+
| UC Merced | 98.33 | **98.81** | 98.57 |
|
59 |
+
| WHU-RS19 | **98.54** | 98.06 | 97.57 |
|
60 |
+
| RS-C11 | **98.01** | 96.81 | 96.02 |
|
61 |
+
| SIRI-WHU | **98.54** | 97.08 | 97.08 |
|
62 |
+
|
63 |
+
SatDINO fine-tuning classification accuracy.
|
64 |
+
|
65 |
+
---
|
66 |
+
|
67 |
+
| **Model** | **Backbone** | **Potsdam 224<sup>2</sup>** | **Potsdam 512<sup>2</sup>** | **Vaihingen 224<sup>2</sup>** | **Vaihingen 512<sup>2</sup>** | **LoveDA 224<sup>2</sup>** | **LoveDA 512<sup>2</sup>** |
|
68 |
+
|-----------|------------------|---------------------|---------------------|-----------------------|-----------------------|--------------------|--------------------|
|
69 |
+
| SatMAE    | ViT-Large        | 67.88               | 70.39               | 64.81                 | 69.13                 | 46.28              | 52.28              |
|
70 |
+
| Scale-MAE | ViT-Large | 69.74 | **72.21** | 67.97 | **71.65** | **49.37** | **53.70** |
|
71 |
+
| SatDINO | ViT-Small<sub>16</sub> | 67.93 | 71.80 | 63.38 | 68.32 | 44.77 | 49.65 |
|
72 |
+
| SatDINO | ViT-Small<sub>8</sub> | **70.71** | 71.45 | **68.69** | 67.71 | 47.53 | 50.20 |
|
73 |
+
| SatDINO | ViT-Base | 67.65 | 71.63 | 64.85 | 69.37 | 44.25 | 50.08 |
|
74 |
+
|
75 |
+
Semantic segmentation performance across multiple datasets and image scales. All results are reported in terms of mean Intersection over Union (mIoU).
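
As a reminder of how the metric is defined, mIoU averages the per-class ratio of intersection to union between predicted and ground-truth pixels. The sketch below computes it for flat sequences of class ids; it is an illustrative assumption of one common convention (classes absent from both prediction and target are skipped), not the evaluation code used for the table. The name `mean_iou` is hypothetical.

```python
def mean_iou(pred, target, num_classes):
    """Mean Intersection over Union for two equal-length sequences of
    integer class ids. Classes with an empty union are ignored."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes
    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:
            ious.append(intersection[c] / union)
    return sum(ious) / len(ious) if ious else float("nan")


if __name__ == "__main__":
    # Four "pixels", two classes: IoU is 1/2 for class 0 and 2/3 for class 1.
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```

Benchmark implementations usually accumulate the intersection and union counts over the whole dataset before dividing, rather than averaging per-image scores.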
|
76 |
+
|
77 |
+
|
78 |
+
## License
|
79 |
+
This repository is released under the Apache 2.0 license as found in the LICENSE file.
|
80 |
+
|
81 |
+
|
82 |
+
## Citation
|
83 |
+
If you find this repository useful, please consider citing it:
|
84 |
+
```bibtex
|
85 |
+
@misc{straka2025satdinodeepdiveselfsupervised,
|
86 |
+
title={SatDINO: A Deep Dive into Self-Supervised Pretraining for Remote Sensing},
|
87 |
+
author={Jakub Straka and Ivan Gruber},
|
88 |
+
year={2025},
|
89 |
+
eprint={2508.21402},
|
90 |
+
archivePrefix={arXiv},
|
91 |
+
primaryClass={cs.CV},
|
92 |
+
url={https://arxiv.org/abs/2508.21402},
|
93 |
+
}
|
94 |
+
```
|