prithivMLmods committed on
Commit
adb71c3
·
verified ·
1 Parent(s): bb66a2d

Update README.md

  1. README.md +147 -1
README.md CHANGED
@@ -2,7 +2,22 @@
2
  license: apache-2.0
3
  datasets:
4
  - jonathan-roberts1/NWPU-RESISC45
5
  ---
6
 
7
  ```py
8
  Classification Report:
@@ -57,4 +72,135 @@ thermal power station 0.9482 0.9671 0.9576 700
57
  accuracy 0.9532 31500
58
  macro avg 0.9538 0.9532 0.9532 31500
59
  weighted avg 0.9538 0.9532 0.9532 31500
60
- ```
2
  license: apache-2.0
3
  datasets:
4
  - jonathan-roberts1/NWPU-RESISC45
5
+ language:
6
+ - en
7
+ base_model:
8
+ - google/siglip2-base-patch16-224
9
+ pipeline_tag: image-classification
10
+ library_name: transformers
11
+ tags:
12
+ - RESISC45
13
+ - SigLIP2
14
  ---
15
+
16
+ ![1.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/_VFyr_efAG3NA1_GlHa87.png)
17
+
18
+ # **RESISC45-SigLIP2**
19
+
20
+ > **RESISC45-SigLIP2** is a vision-language encoder model fine-tuned from **google/siglip2-base-patch16-224** for **multi-label** image classification. It is specifically trained to recognize and tag multiple land use and land cover scene categories from the **RESISC45** dataset using the **SiglipForImageClassification** architecture.
21
 
22
  ```py
23
  Classification Report:
 
72
  accuracy 0.9532 31500
73
  macro avg 0.9538 0.9532 0.9532 31500
74
  weighted avg 0.9538 0.9532 0.9532 31500
75
+ ```
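
In the report above, the macro and weighted averages are identical. That is what one would expect if every class has the same support: the one class row visible in this diff has 700 samples and the total is 31500, which equals 45 × 700. The sketch below illustrates the arithmetic with hypothetical per-class F1 values (not taken from the report); the helper names `macro_average` and `weighted_average` are made up for illustration.

```python
def macro_average(values):
    """Unweighted mean: every class counts equally."""
    return sum(values) / len(values)


def weighted_average(values, supports):
    """Mean weighted by the number of samples (support) in each class."""
    total = sum(supports)
    return sum(v * n for v, n in zip(values, supports)) / total


# Hypothetical per-class F1 scores, for illustration only.
f1 = [0.95, 0.90, 0.99]

# Equal support per class (assumed here, as the 45 x 700 = 31500 total suggests).
equal_support = [700, 700, 700]
# A skewed split, to show when the two averages diverge.
skewed_support = [700, 100, 100]

macro = macro_average(f1)
weighted_equal = weighted_average(f1, equal_support)
weighted_skewed = weighted_average(f1, skewed_support)

print(f"macro:            {macro:.4f}")
print(f"weighted (equal): {weighted_equal:.4f}")
print(f"weighted (skew):  {weighted_skewed:.4f}")
```

With equal supports the weights cancel and the weighted average reduces to the plain mean; the two only differ on an imbalanced evaluation set.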
76
+
77
+ ---
78
+
79
+ ## **Label Space: 45 Scene Categories**
80
+
81
+ The model predicts the presence of one or more of the following **45 scene categories**:
82
+
83
+ ```
84
+ Class 0: "airplane"
85
+ Class 1: "airport"
86
+ Class 2: "baseball diamond"
87
+ Class 3: "basketball court"
88
+ Class 4: "beach"
89
+ Class 5: "bridge"
90
+ Class 6: "chaparral"
91
+ Class 7: "church"
92
+ Class 8: "circular farmland"
93
+ Class 9: "cloud"
94
+ Class 10: "commercial area"
95
+ Class 11: "dense residential"
96
+ Class 12: "desert"
97
+ Class 13: "forest"
98
+ Class 14: "freeway"
99
+ Class 15: "golf course"
100
+ Class 16: "ground track field"
101
+ Class 17: "harbor"
102
+ Class 18: "industrial area"
103
+ Class 19: "intersection"
104
+ Class 20: "island"
105
+ Class 21: "lake"
106
+ Class 22: "meadow"
107
+ Class 23: "medium residential"
108
+ Class 24: "mobile home park"
109
+ Class 25: "mountain"
110
+ Class 26: "overpass"
111
+ Class 27: "palace"
112
+ Class 28: "parking lot"
113
+ Class 29: "railway"
114
+ Class 30: "railway station"
115
+ Class 31: "rectangular farmland"
116
+ Class 32: "river"
117
+ Class 33: "roundabout"
118
+ Class 34: "runway"
119
+ Class 35: "sea ice"
120
+ Class 36: "ship"
121
+ Class 37: "snowberg"
122
+ Class 38: "sparse residential"
123
+ Class 39: "stadium"
124
+ Class 40: "storage tank"
125
+ Class 41: "tennis court"
126
+ Class 42: "terrace"
127
+ Class 43: "thermal power station"
128
+ Class 44: "wetland"
129
+ ```
130
+
131
+ ---
132
+
133
+ ## **Install dependencies**
134
+
135
+ ```bash
136
+ pip install -q transformers torch pillow gradio
137
+ ```
138
+
139
+ ---
140
+
141
+ ## **Inference Code**
142
+
143
+ ```python
144
+ import gradio as gr
145
+ from transformers import AutoImageProcessor, SiglipForImageClassification
146
+ from PIL import Image
147
+ import torch
148
+
149
+ # Load model and processor
150
+ model_name = "prithivMLmods/RESISC45-SigLIP2" # Update to your actual Hugging Face model path
151
+ model = SiglipForImageClassification.from_pretrained(model_name)
152
+ processor = AutoImageProcessor.from_pretrained(model_name)
153
+
154
+ # Label map
155
+ id2label = {
156
+     "0": "airplane", "1": "airport", "2": "baseball diamond", "3": "basketball court", "4": "beach",
157
+     "5": "bridge", "6": "chaparral", "7": "church", "8": "circular farmland", "9": "cloud",
158
+     "10": "commercial area", "11": "dense residential", "12": "desert", "13": "forest", "14": "freeway",
159
+     "15": "golf course", "16": "ground track field", "17": "harbor", "18": "industrial area", "19": "intersection",
160
+     "20": "island", "21": "lake", "22": "meadow", "23": "medium residential", "24": "mobile home park",
161
+     "25": "mountain", "26": "overpass", "27": "palace", "28": "parking lot", "29": "railway",
162
+     "30": "railway station", "31": "rectangular farmland", "32": "river", "33": "roundabout", "34": "runway",
163
+     "35": "sea ice", "36": "ship", "37": "snowberg", "38": "sparse residential", "39": "stadium",
164
+     "40": "storage tank", "41": "tennis court", "42": "terrace", "43": "thermal power station", "44": "wetland"
165
+ }
166
+
167
+ def classify_resisc_image(image):
168
+     image = Image.fromarray(image).convert("RGB")
169
+     inputs = processor(images=image, return_tensors="pt")
170
+
171
+     with torch.no_grad():
172
+         outputs = model(**inputs)
173
+         logits = outputs.logits
174
+         probs = torch.sigmoid(logits).squeeze().tolist()
175
+
176
+     threshold = 0.5
177
+     predictions = {
178
+         id2label[str(i)]: round(probs[i], 3)
179
+         for i in range(len(probs)) if probs[i] >= threshold
180
+     }
181
+
182
+     return predictions or {"None Detected": 0.0}
183
+
184
+ # Gradio Interface
185
+ iface = gr.Interface(
186
+     fn=classify_resisc_image,
187
+     inputs=gr.Image(type="numpy"),
188
+     outputs=gr.Label(label="Predicted Scene Categories"),
189
+     title="RESISC45-SigLIP2",
190
+     description="Upload a satellite image to detect multiple land use and land cover categories (e.g., airport, forest, mountain)."
191
+ )
192
+
193
+ if __name__ == "__main__":
194
+     iface.launch()
195
+ ```
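
The thresholding step inside `classify_resisc_image` can be tried without loading the model or Gradio. The following is a minimal sketch in plain Python, assuming the same sigmoid-then-threshold logic as the README code; the `sigmoid` and `select_labels` helpers, the three-entry label map, and the logit values are all hypothetical and exist only for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function, written to avoid overflow for large negative x."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def select_labels(logits, id2label, threshold=0.5):
    """Keep every label whose sigmoid score reaches the threshold."""
    scores = [sigmoid(z) for z in logits]
    picked = {
        id2label[str(i)]: round(score, 3)
        for i, score in enumerate(scores)
        if score >= threshold
    }
    # Same fallback as the README code when no score clears the threshold.
    return picked or {"None Detected": 0.0}


# Hypothetical logits for a three-class subset of the label map.
demo_labels = {"0": "airplane", "1": "airport", "2": "baseball diamond"}
demo_logits = [2.0, -1.0, 0.0]

result = select_labels(demo_logits, demo_labels)
print(result)
```

Because each score is computed independently, zero, one, or several labels can pass the 0.5 cutoff for a single image; raising or lowering `threshold` trades missed labels against spurious ones.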
196
+
197
+ ---
198
+
199
+ ## **Intended Use**
200
+
201
+ The **RESISC45-SigLIP2** model is intended for multi-label classification of remote sensing imagery. Use cases include:
202
+
203
+ - **Remote Sensing Analysis** – Label elements in aerial/satellite images.
204
+ - **Urban Planning** – Identify urban structures and landscape features.
205
+ - **Geospatial Intelligence** – Aid in automated image interpretation pipelines.
206
+ - **Environmental Monitoring** – Track natural landforms and changes.