
@@ -2,7 +2,22 @@
 license: apache-2.0
 datasets:
 - jonathan-roberts1/NWPU-RESISC45
 ---
 ```py
 Classification Report:
@@ -57,4 +72,135 @@ thermal power station     0.9482    0.9671    0.9576       700
              accuracy                         0.9532     31500
             macro avg     0.9538    0.9532    0.9532     31500
          weighted avg     0.9538    0.9532    0.9532     31500
-```

 license: apache-2.0
 datasets:
 - jonathan-roberts1/NWPU-RESISC45
+language:
+- en
+base_model:
+- google/siglip2-base-patch16-224
+pipeline_tag: image-classification
+library_name: transformers
+tags:
+- RESISC45
+- SigLIP2
 ---
+![1.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/_VFyr_efAG3NA1_GlHa87.png)
+# **RESISC45-SigLIP2**
+> **RESISC45-SigLIP2** is a vision-language encoder model fine-tuned from **google/siglip2-base-patch16-224** for **multi-label** image classification. It is trained on the **RESISC45** dataset to recognize and tag land-use and land-cover scene categories, using the **SiglipForImageClassification** architecture.
 ```py
 Classification Report:
              accuracy                         0.9532     31500
             macro avg     0.9538    0.9532    0.9532     31500
          weighted avg     0.9538    0.9532    0.9532     31500
+```
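The rows above follow a precision / recall / F1 / support layout (the same layout scikit-learn's `classification_report` prints, though the card does not say which tool produced it). The total support of 31,500 is consistent with 45 classes at 700 images each, and with equal support per class the macro and weighted averages coincide, as they do in the report. A minimal sketch of how such per-class figures and both averages can be computed, using hypothetical toy labels rather than the model's real predictions:

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return {label: (precision, recall, f1, support)} for single-label predictions."""
    true_count = Counter(y_true)   # support per class
    pred_count = Counter(y_pred)   # how often each class was predicted
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        precision = true_pos[label] / pred_count[label] if pred_count[label] else 0.0
        recall = true_pos[label] / true_count[label] if true_count[label] else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = (precision, recall, f1, true_count[label])
    return metrics


def macro_and_weighted_f1(metrics):
    """Macro average weights classes equally; weighted average weights by support."""
    total_support = sum(m[3] for m in metrics.values())
    macro = sum(m[2] for m in metrics.values()) / len(metrics)
    weighted = sum(m[2] * m[3] for m in metrics.values()) / total_support
    return macro, weighted


# Hypothetical toy data: three classes with equal support (2 samples each)
y_true = ["beach", "beach", "forest", "forest", "lake", "lake"]
y_pred = ["beach", "lake", "forest", "forest", "lake", "lake"]

metrics = per_class_metrics(y_true, y_pred)
macro_f1, weighted_f1 = macro_and_weighted_f1(metrics)

for label, (p, r, f1, support) in metrics.items():
    print(f"{label:>10} {p:.4f} {r:.4f} {f1:.4f} {support}")
print(f"macro F1 = {macro_f1:.4f}, weighted F1 = {weighted_f1:.4f}")
```

Because every toy class has the same support, the two averages agree here as well.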
+---
+## **Label Space: 45 Scene Categories**
+The model predicts the presence of one or more of the following **45 scene categories**:
+```
+Class 0: "airplane"
+Class 1: "airport"
+Class 2: "baseball diamond"
+Class 3: "basketball court"
+Class 4: "beach"
+Class 5: "bridge"
+Class 6: "chaparral"
+Class 7: "church"
+Class 8: "circular farmland"
+Class 9: "cloud"
+Class 10: "commercial area"
+Class 11: "dense residential"
+Class 12: "desert"
+Class 13: "forest"
+Class 14: "freeway"
+Class 15: "golf course"
+Class 16: "ground track field"
+Class 17: "harbor"
+Class 18: "industrial area"
+Class 19: "intersection"
+Class 20: "island"
+Class 21: "lake"
+Class 22: "meadow"
+Class 23: "medium residential"
+Class 24: "mobile home park"
+Class 25: "mountain"
+Class 26: "overpass"
+Class 27: "palace"
+Class 28: "parking lot"
+Class 29: "railway"
+Class 30: "railway station"
+Class 31: "rectangular farmland"
+Class 32: "river"
+Class 33: "roundabout"
+Class 34: "runway"
+Class 35: "sea ice"
+Class 36: "ship"
+Class 37: "snowberg"
+Class 38: "sparse residential"
+Class 39: "stadium"
+Class 40: "storage tank"
+Class 41: "tennis court"
+Class 42: "terrace"
+Class 43: "thermal power station"
+Class 44: "wetland"
+```
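The class indices above map one-to-one to label names. A minimal sketch that builds both lookup directions from the ordered list, so the names do not have to be typed twice; `CLASS_NAMES`, `id2label`, and `label2id` are illustrative local names. Checkpoints on the Hub usually also carry this mapping in `model.config.id2label`, but that is an assumption about this repository, not something the card states:

```python
# Class names in index order, copied from the list above
CLASS_NAMES = [
    "airplane", "airport", "baseball diamond", "basketball court", "beach",
    "bridge", "chaparral", "church", "circular farmland", "cloud",
    "commercial area", "dense residential", "desert", "forest", "freeway",
    "golf course", "ground track field", "harbor", "industrial area", "intersection",
    "island", "lake", "meadow", "medium residential", "mobile home park",
    "mountain", "overpass", "palace", "parking lot", "railway",
    "railway station", "rectangular farmland", "river", "roundabout", "runway",
    "sea ice", "ship", "snowberg", "sparse residential", "stadium",
    "storage tank", "tennis court", "terrace", "thermal power station", "wetland",
]

# Index -> name and name -> index lookups
id2label = dict(enumerate(CLASS_NAMES))
label2id = {name: idx for idx, name in id2label.items()}

# Sanity checks: 45 classes, no duplicated names
assert len(id2label) == 45
assert len(label2id) == 45

print(id2label[13])          # name for class index 13
print(label2id["wetland"])   # index for the "wetland" class
```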
+---
+## **Install dependencies**
+```bash
+pip install -q transformers torch pillow gradio
+```
+---
+## **Inference Code**
+```python
+import gradio as gr
+from transformers import AutoImageProcessor, SiglipForImageClassification
+from PIL import Image
+import torch
+# Load model and processor
+model_name = "prithivMLmods/RESISC45-SigLIP2"  # Hugging Face Hub model ID
+model = SiglipForImageClassification.from_pretrained(model_name)
+processor = AutoImageProcessor.from_pretrained(model_name)
+# Label map
+id2label = {
+    "0": "airplane", "1": "airport", "2": "baseball diamond", "3": "basketball court", "4": "beach",
+    "5": "bridge", "6": "chaparral", "7": "church", "8": "circular farmland", "9": "cloud",
+    "10": "commercial area", "11": "dense residential", "12": "desert", "13": "forest", "14": "freeway",
+    "15": "golf course", "16": "ground track field", "17": "harbor", "18": "industrial area", "19": "intersection",
+    "20": "island", "21": "lake", "22": "meadow", "23": "medium residential", "24": "mobile home park",
+    "25": "mountain", "26": "overpass", "27": "palace", "28": "parking lot", "29": "railway",
+    "30": "railway station", "31": "rectangular farmland", "32": "river", "33": "roundabout", "34": "runway",
+    "35": "sea ice", "36": "ship", "37": "snowberg", "38": "sparse residential", "39": "stadium",
+    "40": "storage tank", "41": "tennis court", "42": "terrace", "43": "thermal power station", "44": "wetland"
+}
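+def classify_resisc_image(image):
+    """Return {label: probability} for every class scoring at or above the threshold."""
+    # Gradio passes None when no image has been uploaded
+    if image is None:
+        return {"None Detected": 0.0}
+    image = Image.fromarray(image).convert("RGB")
+    inputs = processor(images=image, return_tensors="pt")
+    with torch.no_grad():
+        logits = model(**inputs).logits
+        # Independent per-class probabilities (sigmoid), not a softmax distribution
+        probs = torch.sigmoid(logits).squeeze(0).tolist()
+    threshold = 0.5  # minimum probability for a class to be reported
+    predictions = {
+        id2label[str(i)]: round(prob, 3)
+        for i, prob in enumerate(probs) if prob >= threshold
+    }
+    return predictions or {"None Detected": 0.0}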
+# Gradio Interface
+iface = gr.Interface(
+    fn=classify_resisc_image,
+    inputs=gr.Image(type="numpy"),
+    outputs=gr.Label(label="Predicted Scene Categories"),
+    title="RESISC45-SigLIP2",
+    description="Upload a satellite image to detect multiple land use and land cover categories (e.g., airport, forest, mountain)."
+)
+if __name__ == "__main__":
+    iface.launch()
+```
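The inference code scores each class independently with a sigmoid and keeps those at or above 0.5, so an image can receive several labels or none. RESISC45 itself assigns one scene label per image, and the card does not state which loss the classification head was trained with, so a softmax ranking of the same logits may also be worth inspecting. A minimal sketch contrasting the two on hypothetical toy logits; `sigmoid_threshold` and `softmax_top_k` are illustrative helper names, and only the standard library is used so the arithmetic is easy to follow:

```python
import math


def sigmoid_threshold(logits, labels, threshold=0.5):
    """Keep every label whose independent sigmoid probability meets the threshold."""
    selected = {}
    for label, logit in zip(labels, logits):
        prob = 1.0 / (1.0 + math.exp(-logit))
        if prob >= threshold:
            selected[label] = prob
    return selected


def softmax_top_k(logits, labels, k=3):
    """Rank labels by softmax probability (probabilities sum to 1) and keep the top k."""
    max_logit = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - max_logit) for x in logits]
    total = sum(exps)
    ranked = sorted(
        ((label, e / total) for label, e in zip(labels, exps)),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:k]


# Hypothetical logits for three of the 45 classes
labels = ["airport", "forest", "runway"]
logits = [2.0, -1.0, 0.5]

multi_label = sigmoid_threshold(logits, labels)
ranking = softmax_top_k(logits, labels, k=3)

print("sigmoid >= 0.5:", sorted(multi_label))
print("softmax ranking:", [(name, round(p, 3)) for name, p in ranking])
```

With these toy values the sigmoid rule keeps two labels, while the softmax ranking places a single label first; which reading is appropriate depends on how the head was trained.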
+---
+## **Intended Use**
+The **RESISC45-SigLIP2** model is intended for multi-label classification of remote sensing imagery. Use cases include:
+- **Remote Sensing Analysis** – Label elements in aerial/satellite images.
+- **Urban Planning** – Identify urban structures and landscape features.
+- **Geospatial Intelligence** – Aid in automated image interpretation pipelines.
+- **Environmental Monitoring** – Track natural landforms and changes.