--- license: apache-2.0 datasets: - jonathan-roberts1/RSI-CB256 language: - en base_model: - google/siglip2-base-patch16-224 pipeline_tag: image-classification library_name: transformers tags: - Location - RSI - Remote Sensing Instruments --- ![3.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/QJkZkY5d6EwB4P5xOrgfY.png) # **RSI-CB256-35** > **RSI-CB256-35** is a vision-language encoder model fine-tuned from **google/siglip2-base-patch16-224** for **multi-class remote sensing image classification**. Built using the **SiglipForImageClassification** architecture, it is designed to accurately categorize overhead imagery into 35 distinct land-use and land-cover categories. ```py Classification Report: precision recall f1-score support parking lot 0.9978 0.9872 0.9925 467 avenue 0.9927 1.0000 0.9963 544 highway 0.9283 0.9865 0.9565 223 bridge 0.9283 0.9659 0.9467 469 marina 0.9946 1.0000 0.9973 366 crossroads 0.9909 0.9801 0.9855 553 airport runway 0.9956 0.9926 0.9941 678 pipeline 0.9900 1.0000 0.9950 198 town 0.9970 1.0000 0.9985 335 airplane 0.9915 0.9915 0.9915 351 forest 0.9972 0.9945 0.9958 1082 mangrove 1.0000 1.0000 1.0000 1049 artificial grassland 0.9821 0.9717 0.9769 283 river protection forest 1.0000 1.0000 1.0000 524 shrubwood 1.0000 1.0000 1.0000 1331 sapling 0.9955 1.0000 0.9977 879 sparse forest 1.0000 1.0000 1.0000 1110 lakeshore 1.0000 1.0000 1.0000 438 river 0.9680 0.9555 0.9617 539 stream 1.0000 0.9971 0.9985 688 coastline 0.9913 0.9978 0.9946 459 hirst 0.9890 1.0000 0.9945 628 dam 0.9868 0.9259 0.9554 324 sea 0.9971 0.9864 0.9917 1028 snow mountain 1.0000 1.0000 1.0000 1153 sandbeach 0.9944 0.9907 0.9925 536 mountain 0.9926 0.9938 0.9932 812 desert 0.9757 0.9927 0.9841 1092 dry farm 1.0000 0.9992 0.9996 1309 green farmland 0.9984 0.9969 0.9977 644 bare land 0.9870 0.9630 0.9748 864 city building 0.9785 0.9892 0.9838 1014 residents 0.9926 0.9877 0.9901 810 container 0.9970 0.9955 0.9962 660 storage room 0.9985 1.0000 0.9992 1307 accuracy 0.9919 24747 macro avg 0.9894 0.9897 0.9895 24747 weighted avg 0.9920 0.9919 0.9919 24747 ``` --- ## **Label Space: 35 Remote Sensing Classes** This model supports the classification of satellite or aerial images into the following classes: ``` Class 0: "parking lot" Class 1: "avenue" Class 2: "highway" Class 3: "bridge" Class 4: "marina" Class 5: "crossroads" Class 6: "airport runway" Class 7: "pipeline" Class 8: "town" Class 9: "airplane" Class 10: "forest" Class 11: "mangrove" Class 12: "artificial grassland" Class 13: "river protection forest" Class 14: "shrubwood" Class 15: "sapling" Class 16: "sparse forest" Class 17: "lakeshore" Class 18: "river" Class 19: "stream" Class 20: "coastline" Class 21: "hirst" Class 22: "dam" Class 23: "sea" Class 24: "snow mountain" Class 25: "sandbeach" Class 26: "mountain" Class 27: "desert" Class 28: "dry farm" Class 29: "green farmland" Class 30: "bare land" Class 31: "city building" Class 32: "residents" Class 33: "container" Class 34: "storage room" ``` --- ## **Install Dependencies** ```bash pip install -q transformers torch pillow gradio ``` --- ## **Inference Code** ```python import gradio as gr from transformers import AutoImageProcessor, SiglipForImageClassification from PIL import Image import torch # Load model and processor model_name = "prithivMLmods/RSI-CB256-35" model = SiglipForImageClassification.from_pretrained(model_name) processor = AutoImageProcessor.from_pretrained(model_name) # ID to label mapping id2label = { "0": "parking lot", "1": "avenue", "2": "highway", "3": "bridge", "4": "marina", "5": "crossroads", "6": "airport runway", "7": "pipeline", "8": "town", "9": "airplane", "10": "forest", "11": "mangrove", "12": "artificial grassland", "13": "river protection forest", "14": "shrubwood", "15": "sapling", "16": "sparse forest", "17": "lakeshore", "18": "river", "19": "stream", "20": "coastline", "21": "hirst", "22": "dam", "23": "sea", "24": "snow mountain", "25": "sandbeach", "26": "mountain", "27": "desert", "28": "dry farm", "29": "green farmland", "30": "bare land", "31": "city building", "32": "residents", "33": "container", "34": "storage room" } def classify_rsi_image(image): image = Image.fromarray(image).convert("RGB") inputs = processor(images=image, return_tensors="pt") with torch.no_grad(): outputs = model(**inputs) logits = outputs.logits probs = torch.nn.functional.softmax(logits, dim=1).squeeze().tolist() prediction = { id2label[str(i)]: round(probs[i], 3) for i in range(len(probs)) } return prediction # Gradio Interface iface = gr.Interface( fn=classify_rsi_image, inputs=gr.Image(type="numpy"), outputs=gr.Label(num_top_classes=5, label="Top-5 Predicted Categories"), title="RSI-CB256-35", description="Remote sensing image classification using SigLIP2. Upload an aerial or satellite image to classify its land-use category." ) if __name__ == "__main__": iface.launch() ``` --- ## **Intended Use** * **Land-Use Mapping and Planning** * **Environmental Monitoring** * **Infrastructure Identification** * **Remote Sensing Analytics** * **Agricultural and Forest Area Classification**