strakajk committed on
Commit
f628a41
·
verified ·
1 Parent(s): 0a55d3c

Update README.md

Files changed (1)
  1. README.md +94 -3
README.md CHANGED
@@ -1,3 +1,94 @@
1
- ---
2
- license: apache-2.0
3
- ---
1
+ ---
2
+ license: apache-2.0
3
+ ---
4
+
5
+ # SatDINO: A Deep Dive into Self-Supervised Pretraining for Remote Sensing
6
+
7
+
8
+ These are the official weights for "SatDINO: A Deep Dive into Self-Supervised Pretraining for Remote Sensing", a self-supervised learning framework tailored for satellite imagery. SatDINO builds upon the **[DINO](https://github.com/facebookresearch/dino)** framework and adapts it to the unique characteristics of remote sensing data.
9
+
10
+ [ **[Paper](https://arxiv.org/abs/2508.21402v1)** ], [ **[GitHub](https://github.com/strakaj/SatDINO)** ]
11
+
12
+
13
+ ## Pretrained models
14
+
15
+ The models are pretrained on the RGB variant of the fMoW dataset and evaluated across multiple standard remote sensing benchmarks.
16
+
17
+ | arch      | patch size | params. (M) | GFLOPs | linear | hugging face                                                                          | weights                                                                                           | weights-finetune                                                                                           |
18
+ |-----------|------------|---------|--------|--------|---------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------|
19
+ | ViT-S | 16 | 21.59 | 8.54 | 72.75 | [strakajk/satdino-vit_small-16](https://huggingface.co/strakajk/satdino-vit_small-16) | [ckp](https://huggingface.co/strakajk/satdino-vit_small-16/resolve/main/satdino-vit_small-16.pth) | [ckp](https://huggingface.co/strakajk/satdino-vit_small-16/resolve/main/satdino-vit_small-16-finetune.pth) |
20
+ | ViT-S | 8 | 21.37 | 33.56 | 73.53 | [strakajk/satdino-vit_small-8](https://huggingface.co/strakajk/satdino-vit_small-8) | [ckp](https://huggingface.co/strakajk/satdino-vit_small-8/resolve/main/satdino-vit_small-8.pth) | [ckp](https://huggingface.co/strakajk/satdino-vit_small-8/resolve/main/satdino-vit_small-8-finetune.pth) |
21
+ | ViT-B | 16 | 85.65 | 33.90 | 73.52 | [strakajk/satdino-vit_base-16](https://huggingface.co/strakajk/satdino-vit_base-16) | [ckp](https://huggingface.co/strakajk/satdino-vit_base-16/resolve/main/satdino-vit_base-16.pth) | [ckp](https://huggingface.co/strakajk/satdino-vit_base-16/resolve/main/satdino-vit_base-16-finetune.pth) |
22
+
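The checkpoint links in the table above appear to follow a regular naming pattern. The sketch below rebuilds them from the architecture name and patch size. The helper names `satdino_repo_id` and `satdino_checkpoint_url` are hypothetical (they are not part of the SatDINO code base), and the pattern is inferred only from the three rows listed here, so it may not hold for other models.

```python
HF_BASE = "https://huggingface.co"
HF_USER = "strakajk"  # account name taken from the links in the table


def satdino_repo_id(arch: str, patch_size: int) -> str:
    """Return the Hugging Face repository id, e.g. 'strakajk/satdino-vit_small-16'."""
    return f"{HF_USER}/satdino-{arch}-{patch_size}"


def satdino_checkpoint_url(arch: str, patch_size: int, finetune: bool = False) -> str:
    """Return the direct download URL of a checkpoint listed in the table.

    The '-finetune' suffix selects the file from the 'weights-finetune' column.
    """
    name = f"satdino-{arch}-{patch_size}"
    suffix = "-finetune" if finetune else ""
    return f"{HF_BASE}/{HF_USER}/{name}/resolve/main/{name}{suffix}.pth"


if __name__ == "__main__":
    # Print both checkpoint URLs for every row of the table.
    for arch, patch_size in [("vit_small", 16), ("vit_small", 8), ("vit_base", 16)]:
        print(satdino_repo_id(arch, patch_size))
        print("  ", satdino_checkpoint_url(arch, patch_size))
        print("  ", satdino_checkpoint_url(arch, patch_size, finetune=True))
```

The resulting URLs can be passed to any download tool. How a downloaded `.pth` file is loaded depends on the model definition in the GitHub repository.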
23
+
24
+ ### Create from HF
25
+ You can load a model through the Hugging Face `transformers` library or build it from the official **[GitHub](https://github.com/strakaj/SatDINO)** repository.
26
+
27
+ ```python
28
+ import torch
29
+ from transformers import AutoModel
30
+
31
+ model = AutoModel.from_pretrained("strakajk/satdino-vit_small-8", trust_remote_code=True)
32
+ model.eval()
33
+
34
+ # predict
35
+ x = torch.randn(1, 3, 224, 224)
36
+ y = model(x) # out: torch.Size([1, 384])
37
+ ```
38
+
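The dummy input in the snippet above is 224 x 224 pixels. A ViT splits the image into non-overlapping square patches, so the patch size determines how many tokens the transformer processes. The following is a minimal sketch of that arithmetic. The function name `num_patch_tokens` is illustrative, and the count excludes any extra class token.

```python
def num_patch_tokens(image_size: int, patch_size: int) -> int:
    """Number of non-overlapping square patches in a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size
    return patches_per_side * patches_per_side


if __name__ == "__main__":
    for patch_size in (16, 8):
        tokens = num_patch_tokens(224, patch_size)
        print(f"patch size {patch_size}: {tokens} patch tokens")
```

Halving the patch size from 16 to 8 quadruples the token count (196 versus 784 for a 224 x 224 input), which is in line with the roughly fourfold GFLOPs difference between the two ViT-S rows in the table.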
39
+
40
+ ## Results
41
+ | Dataset | **SatDINO<sub>8</sub>** | **SatDINO<sub>16</sub>** | **Scale-MAE** | **SatMAE** |
42
+ |-----------|-----------------|--------------------|---------------|------------|
43
+ | EuroSAT | **87.72** | 85.96 | 85.42 | 81.43 |
44
+ | RESISC45 | **85.29** | 82.32 | 79.96 | 65.96 |
45
+ | UC Merced | **94.82** | 93.21 | 84.58 | 78.45 |
46
+ | WHU-RS19 | **98.18** | 97.82 | 89.32 | 86.41 |
47
+ | RS-C11 | **96.91** | 96.61 | 93.03 | 83.96 |
48
+ | SIRI-WHU | **91.82** | 87.19 | 84.84 | 77.76 |
49
+
50
+ Average kNN classification accuracy (%) across multiple image scales (12.5%, 25%, 50%, and 100%).
51
+
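For readers unfamiliar with kNN evaluation of frozen features, here is a minimal sketch of the general idea: each test embedding is assigned the majority label of its `k` most similar training embeddings. This is a plain majority-vote variant with cosine similarity, written in pure Python on toy vectors. It is not the evaluation code behind the table, and the value of `k`, the similarity measure, and the voting scheme used there are not stated in this README.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def knn_predict(query, train_features, train_labels, k=3):
    """Majority vote among the k training features most similar to the query."""
    scored = sorted(
        ((cosine_similarity(query, feat), label)
         for feat, label in zip(train_features, train_labels)),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


def knn_accuracy(test_features, test_labels, train_features, train_labels, k=3):
    """Fraction of test samples whose predicted label matches the true label."""
    correct = sum(
        knn_predict(feat, train_features, train_labels, k) == label
        for feat, label in zip(test_features, test_labels)
    )
    return correct / len(test_labels)


if __name__ == "__main__":
    # Toy 2-D "embeddings"; real features would come from the frozen backbone.
    train_x = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    train_y = ["a", "a", "b", "b"]
    test_x = [[0.8, 0.2], [0.2, 0.8]]
    test_y = ["a", "b"]
    print(knn_accuracy(test_x, test_y, train_x, train_y, k=3))
```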
52
+ ---
53
+
54
+ | **Dataset** | **Small<sub>16</sub>** | **Small<sub>8</sub>** | **Base** |
55
+ |-------------|------------------|---------------|---------------|
56
+ | EuroSAT | 98.69 | 98.76 | **98.83** |
57
+ | RESISC45 | 95.68 | 95.16 | **96.05** |
58
+ | UC Merced | 98.33 | **98.81** | 98.57 |
59
+ | WHU-RS19 | **98.54** | 98.06 | 97.57 |
60
+ | RS-C11 | **98.01** | 96.81 | 96.02 |
61
+ | SIRI-WHU | **98.54** | 97.08 | 97.08 |
62
+
63
+ Classification accuracy (%) of SatDINO models after fine-tuning.
64
+
65
+ ---
66
+
67
+ | **Model** | **Backbone** | **Potsdam 224<sup>2</sup>** | **Potsdam 512<sup>2</sup>** | **Vaihingen 224<sup>2</sup>** | **Vaihingen 512<sup>2</sup>** | **LoveDA 224<sup>2</sup>** | **LoveDA 512<sup>2</sup>** |
68
+ |-----------|------------------|---------------------|---------------------|-----------------------|-----------------------|--------------------|--------------------|
69
+ | SatMAE    | ViT-Large        | 67.88               | 70.39               | 64.81                 | 69.13                 | 46.28              | 52.28              |
70
+ | Scale-MAE | ViT-Large | 69.74 | **72.21** | 67.97 | **71.65** | **49.37** | **53.70** |
71
+ | SatDINO | ViT-Small<sub>16</sub> | 67.93 | 71.80 | 63.38 | 68.32 | 44.77 | 49.65 |
72
+ | SatDINO | ViT-Small<sub>8</sub> | **70.71** | 71.45 | **68.69** | 67.71 | 47.53 | 50.20 |
73
+ | SatDINO | ViT-Base | 67.65 | 71.63 | 64.85 | 69.37 | 44.25 | 50.08 |
74
+
75
+ Semantic segmentation performance across multiple datasets and image scales. All results are reported in terms of mean Intersection over Union (mIoU).
76
+
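As a reminder of what the metric measures, the sketch below computes mean Intersection over Union for one pair of flattened label maps. It is a simplified illustration under stated assumptions: classes that appear in neither the prediction nor the target are skipped, and no ignore label is handled. Benchmark implementations typically accumulate intersections and unions over the whole dataset, so the exact protocol behind the table may differ.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes for two equally long sequences of class indices."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    ious = []
    for cls in range(num_classes):
        intersection = sum(1 for p, t in zip(pred, target) if p == cls and t == cls)
        union = sum(1 for p, t in zip(pred, target) if p == cls or t == cls)
        if union == 0:
            continue  # class absent from both maps; skip it (an assumption)
        ious.append(intersection / union)
    if not ious:
        raise ValueError("no class is present in pred or target")
    return sum(ious) / len(ious)


if __name__ == "__main__":
    # Four pixels, two classes: class 0 has IoU 1/2, class 1 has IoU 2/3.
    pred = [0, 0, 1, 1]
    target = [0, 1, 1, 1]
    print(f"mIoU = {mean_iou(pred, target, num_classes=2):.4f}")
```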
77
+
78
+ ## License
79
+ This repository is released under the Apache 2.0 license as found in the LICENSE file.
80
+
81
+
82
+ ## Citation
83
+ If you find this repository useful, please consider citing it:
84
+ ```bibtex
85
+ @misc{straka2025satdinodeepdiveselfsupervised,
86
+ title={SatDINO: A Deep Dive into Self-Supervised Pretraining for Remote Sensing},
87
+ author={Jakub Straka and Ivan Gruber},
88
+ year={2025},
89
+ eprint={2508.21402},
90
+ archivePrefix={arXiv},
91
+ primaryClass={cs.CV},
92
+ url={https://arxiv.org/abs/2508.21402},
93
+ }
94
+ ```