---
license: apache-2.0
datasets:
- jonathan-roberts1/RSI-CB256
language:
- en
base_model:
- google/siglip2-base-patch16-224
pipeline_tag: image-classification
library_name: transformers
tags:
- Location
- RSI
- Remote Sensing Instruments
---

# **RSI-CB256-35**
> **RSI-CB256-35** is an image classification model fine-tuned from the vision-language encoder **google/siglip2-base-patch16-224** for **multi-class remote sensing image classification**. Built on the **SiglipForImageClassification** architecture, it is designed to accurately categorize overhead imagery into 35 distinct land-use and land-cover categories.
```
Classification Report:

                          precision    recall  f1-score   support

             parking lot     0.9978    0.9872    0.9925       467
                  avenue     0.9927    1.0000    0.9963       544
                 highway     0.9283    0.9865    0.9565       223
                  bridge     0.9283    0.9659    0.9467       469
                  marina     0.9946    1.0000    0.9973       366
              crossroads     0.9909    0.9801    0.9855       553
          airport runway     0.9956    0.9926    0.9941       678
                pipeline     0.9900    1.0000    0.9950       198
                    town     0.9970    1.0000    0.9985       335
                airplane     0.9915    0.9915    0.9915       351
                  forest     0.9972    0.9945    0.9958      1082
                mangrove     1.0000    1.0000    1.0000      1049
    artificial grassland     0.9821    0.9717    0.9769       283
 river protection forest     1.0000    1.0000    1.0000       524
               shrubwood     1.0000    1.0000    1.0000      1331
                 sapling     0.9955    1.0000    0.9977       879
           sparse forest     1.0000    1.0000    1.0000      1110
               lakeshore     1.0000    1.0000    1.0000       438
                   river     0.9680    0.9555    0.9617       539
                  stream     1.0000    0.9971    0.9985       688
               coastline     0.9913    0.9978    0.9946       459
                   hirst     0.9890    1.0000    0.9945       628
                     dam     0.9868    0.9259    0.9554       324
                     sea     0.9971    0.9864    0.9917      1028
           snow mountain     1.0000    1.0000    1.0000      1153
               sandbeach     0.9944    0.9907    0.9925       536
                mountain     0.9926    0.9938    0.9932       812
                  desert     0.9757    0.9927    0.9841      1092
                dry farm     1.0000    0.9992    0.9996      1309
          green farmland     0.9984    0.9969    0.9977       644
               bare land     0.9870    0.9630    0.9748       864
           city building     0.9785    0.9892    0.9838      1014
               residents     0.9926    0.9877    0.9901       810
               container     0.9970    0.9955    0.9962       660
            storage room     0.9985    1.0000    0.9992      1307

                accuracy                         0.9919     24747
               macro avg     0.9894    0.9897    0.9895     24747
            weighted avg     0.9920    0.9919    0.9919     24747
```
---
## **Label Space: 35 Remote Sensing Classes**
The model classifies satellite or aerial images into the following 35 classes:
```
Class 0: "parking lot"
Class 1: "avenue"
Class 2: "highway"
Class 3: "bridge"
Class 4: "marina"
Class 5: "crossroads"
Class 6: "airport runway"
Class 7: "pipeline"
Class 8: "town"
Class 9: "airplane"
Class 10: "forest"
Class 11: "mangrove"
Class 12: "artificial grassland"
Class 13: "river protection forest"
Class 14: "shrubwood"
Class 15: "sapling"
Class 16: "sparse forest"
Class 17: "lakeshore"
Class 18: "river"
Class 19: "stream"
Class 20: "coastline"
Class 21: "hirst"
Class 22: "dam"
Class 23: "sea"
Class 24: "snow mountain"
Class 25: "sandbeach"
Class 26: "mountain"
Class 27: "desert"
Class 28: "dry farm"
Class 29: "green farmland"
Class 30: "bare land"
Class 31: "city building"
Class 32: "residents"
Class 33: "container"
Class 34: "storage room"
```
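For programmatic use, the same index-to-label mapping can be rebuilt from the list above (once the model is loaded, it is also available as `model.config.id2label`). A minimal sketch:

```python
# Illustrative sketch: rebuild the id2label mapping from the class list above.
# Order matters and must match the class indices exactly.
CLASS_NAMES = [
    "parking lot", "avenue", "highway", "bridge", "marina", "crossroads",
    "airport runway", "pipeline", "town", "airplane", "forest", "mangrove",
    "artificial grassland", "river protection forest", "shrubwood", "sapling",
    "sparse forest", "lakeshore", "river", "stream", "coastline", "hirst",
    "dam", "sea", "snow mountain", "sandbeach", "mountain", "desert",
    "dry farm", "green farmland", "bare land", "city building", "residents",
    "container", "storage room",
]

id2label = {i: name for i, name in enumerate(CLASS_NAMES)}
assert len(id2label) == 35  # sanity check against the label space
```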
---
## **Install Dependencies**
```bash
pip install -q transformers torch pillow gradio
```
---
## **Inference Code**
```python
import gradio as gr
from transformers import AutoImageProcessor, SiglipForImageClassification
from PIL import Image
import torch

# Load the fine-tuned model and its paired image processor
model_name = "prithivMLmods/RSI-CB256-35"
model = SiglipForImageClassification.from_pretrained(model_name)
processor = AutoImageProcessor.from_pretrained(model_name)

# ID-to-label mapping for the 35 classes
id2label = {
    "0": "parking lot",
    "1": "avenue",
    "2": "highway",
    "3": "bridge",
    "4": "marina",
    "5": "crossroads",
    "6": "airport runway",
    "7": "pipeline",
    "8": "town",
    "9": "airplane",
    "10": "forest",
    "11": "mangrove",
    "12": "artificial grassland",
    "13": "river protection forest",
    "14": "shrubwood",
    "15": "sapling",
    "16": "sparse forest",
    "17": "lakeshore",
    "18": "river",
    "19": "stream",
    "20": "coastline",
    "21": "hirst",
    "22": "dam",
    "23": "sea",
    "24": "snow mountain",
    "25": "sandbeach",
    "26": "mountain",
    "27": "desert",
    "28": "dry farm",
    "29": "green farmland",
    "30": "bare land",
    "31": "city building",
    "32": "residents",
    "33": "container",
    "34": "storage room"
}

def classify_rsi_image(image):
    """Classify a remote sensing image into one of the 35 land-use categories."""
    image = Image.fromarray(image).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")

    with torch.no_grad():
        outputs = model(**inputs)
        logits = outputs.logits
        probs = torch.nn.functional.softmax(logits, dim=1).squeeze().tolist()

    prediction = {
        id2label[str(i)]: round(probs[i], 3) for i in range(len(probs))
    }
    return prediction

# Gradio interface
iface = gr.Interface(
    fn=classify_rsi_image,
    inputs=gr.Image(type="numpy"),
    outputs=gr.Label(num_top_classes=5, label="Top-5 Predicted Categories"),
    title="RSI-CB256-35",
    description="Remote sensing image classification using SigLIP2. Upload an aerial or satellite image to classify its land-use category."
)

if __name__ == "__main__":
    iface.launch()
```
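If you only need the top few classes without the Gradio UI, the ranking can be done directly on the raw logit values (e.g. `outputs.logits.squeeze().tolist()`). A minimal sketch with a numerically stable softmax; `topk_from_logits` is an illustrative helper, not part of the model's API:

```python
import math

def topk_from_logits(logits, id2label, k=5):
    """Convert a flat list of raw logits into top-k (label, probability) pairs."""
    # Numerically stable softmax: subtract the max before exponentiating
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Rank class indices by probability, highest first
    ranked = sorted(enumerate(probs), key=lambda t: t[1], reverse=True)
    return [(id2label[str(i)], round(p, 3)) for i, p in ranked[:k]]
```

This mirrors what `gr.Label(num_top_classes=5)` displays, but returns plain Python pairs suitable for batch pipelines or logging.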
---
## **Intended Use**
* **Land-Use Mapping and Planning**
* **Environmental Monitoring**
* **Infrastructure Identification**
* **Remote Sensing Analytics**
* **Agricultural and Forest Area Classification**