imagenet-50-subset

imagenet-50-subset is a vision-language encoder model fine-tuned from google/siglip2-base-patch16-224 for multi-class image classification. It uses the SiglipForImageClassification architecture to assign each image to one of 50 categories drawn from the ImageNet dataset.

Classification Report:
                          precision    recall  f1-score   support

                   tench     0.9878    0.9911    0.9895       900
                goldfish     0.9945    0.9956    0.9950       900
       great white shark     0.9339    0.8944    0.9137       900
             tiger shark     0.8957    0.8967    0.8962       900
              hammerhead     0.9300    0.9589    0.9442       900
            electric ray     0.8788    0.8622    0.8704       900
                stingray     0.8689    0.8911    0.8799       900
                    cock     0.9000    0.9200    0.9099       900
                     hen     0.9162    0.8867    0.9012       900
                 ostrich     0.9945    0.9989    0.9967       900
               brambling     0.9671    0.9478    0.9574       900
               goldfinch     0.9867    0.9911    0.9889       900
             house finch     0.9629    0.9811    0.9719       900
                   junco     0.9583    0.9700    0.9641       900
          indigo bunting     0.9933    0.9911    0.9922       900
                   robin     0.9888    0.9811    0.9849       900
                  bulbul     0.9735    0.9811    0.9773       900
                     jay     0.9855    0.9789    0.9822       900
                  magpie     0.9776    0.9700    0.9738       900
               chickadee     0.9834    0.9844    0.9839       900
             water ouzel     0.9680    0.9744    0.9712       900
                    kite     0.9512    0.9522    0.9517       900
              bald eagle     0.9843    0.9722    0.9782       900
                 vulture     0.9562    0.9700    0.9630       900
          great grey owl     0.9989    0.9944    0.9967       900
european fire salamander     0.9330    0.9278    0.9304       900
             common newt     0.7969    0.7933    0.7951       900
                     eft     0.9162    0.8989    0.9075       900
      spotted salamander     0.9249    0.9300    0.9274       900
                 axolotl     0.9888    0.9767    0.9827       900
                bullfrog     0.9116    0.9167    0.9141       900
               tree frog     0.9108    0.9533    0.9316       900
             tailed frog     0.8658    0.8100    0.8370       900
              loggerhead     0.8657    0.8956    0.8804       900
      leatherback turtle     0.9038    0.8667    0.8849       900
              mud turtle     0.7980    0.7111    0.7521       900
                terrapin     0.7039    0.7844    0.7420       900
              box turtle     0.8576    0.8633    0.8605       900
            banded gecko     0.9255    0.9111    0.9183       900
           common iguana     0.9033    0.9133    0.9083       900
      american chameleon     0.6577    0.7622    0.7061       900
                whiptail     0.8351    0.8722    0.8533       900
                   agama     0.9010    0.8900    0.8955       900
          frilled lizard     0.9674    0.9233    0.9449       900
        alligator lizard     0.8862    0.8822    0.8842       900
            gila monster     0.9821    0.9733    0.9777       900
            green lizard     0.6574    0.5756    0.6137       900
       african chameleon     0.9573    0.9711    0.9641       900
           komodo dragon     0.9693    0.9811    0.9752       900
       african crocodile     0.9769    0.9878    0.9823       900

                accuracy                         0.9181     45000
               macro avg     0.9186    0.9181    0.9181     45000
            weighted avg     0.9186    0.9181    0.9181     45000
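
As a reading aid, the sketch below shows how the columns of the report relate to one another. It assumes the standard definition of F1 as the harmonic mean of precision and recall, and uses three rows copied from the table. The helper names `f1_score` and `weighted_recall` are illustrative and do not come from any library. Because the inputs are already rounded to four decimals, the recomputed values can differ from the table in the last digit.

```python
# Rows copied from the report above: label -> (precision, recall, f1, support)
rows = {
    "tench": (0.9878, 0.9911, 0.9895, 900),
    "green lizard": (0.6574, 0.5756, 0.6137, 900),
    "terrapin": (0.7039, 0.7844, 0.7420, 900),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def weighted_recall(rows):
    """Support-weighted mean recall over the given rows."""
    total = sum(support for _, _, _, support in rows.values())
    weighted = sum(recall * support for _, recall, _, support in rows.values())
    return weighted / total


for label, (p, r, f1, _) in rows.items():
    print(f"{label}: reported f1={f1:.4f}, recomputed f1={f1_score(p, r):.4f}")
```

Every class has the same support (900), so the support-weighted recall equals the plain mean of the per-class recalls. This is why the accuracy, the macro average recall, and the weighted average recall all read 0.9181 in the report.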

Label Space: 50 Classes

The model classifies each image into one of the following categories:

0: tench
1: goldfish
2: great white shark
3: tiger shark
4: hammerhead
5: electric ray
6: stingray
7: cock
8: hen
9: ostrich
10: brambling
11: goldfinch
12: house finch
13: junco
14: indigo bunting
15: robin
16: bulbul
17: jay
18: magpie
19: chickadee
20: water ouzel
21: kite
22: bald eagle
23: vulture
24: great grey owl
25: european fire salamander
26: common newt
27: eft
28: spotted salamander
29: axolotl
30: bullfrog
31: tree frog
32: tailed frog
33: loggerhead
34: leatherback turtle
35: mud turtle
36: terrapin
37: box turtle
38: banded gecko
39: common iguana
40: american chameleon
41: whiptail
42: agama
43: frilled lizard
44: alligator lizard
45: gila monster
46: green lizard
47: african chameleon
48: komodo dragon
49: african crocodile

Install Dependencies

pip install -q transformers torch pillow gradio

Inference Code

import gradio as gr
from transformers import AutoImageProcessor, SiglipForImageClassification
from PIL import Image
import torch

# Load model and processor
model_name = "prithivMLmods/imagenet-50-subset"  # Replace if different
model = SiglipForImageClassification.from_pretrained(model_name)
processor = AutoImageProcessor.from_pretrained(model_name)

# Label mapping
id2label = {
    "0": "tench",
    "1": "goldfish",
    "2": "great white shark",
    "3": "tiger shark",
    "4": "hammerhead",
    "5": "electric ray",
    "6": "stingray",
    "7": "cock",
    "8": "hen",
    "9": "ostrich",
    "10": "brambling",
    "11": "goldfinch",
    "12": "house finch",
    "13": "junco",
    "14": "indigo bunting",
    "15": "robin",
    "16": "bulbul",
    "17": "jay",
    "18": "magpie",
    "19": "chickadee",
    "20": "water ouzel",
    "21": "kite",
    "22": "bald eagle",
    "23": "vulture",
    "24": "great grey owl",
    "25": "european fire salamander",
    "26": "common newt",
    "27": "eft",
    "28": "spotted salamander",
    "29": "axolotl",
    "30": "bullfrog",
    "31": "tree frog",
    "32": "tailed frog",
    "33": "loggerhead",
    "34": "leatherback turtle",
    "35": "mud turtle",
    "36": "terrapin",
    "37": "box turtle",
    "38": "banded gecko",
    "39": "common iguana",
    "40": "american chameleon",
    "41": "whiptail",
    "42": "agama",
    "43": "frilled lizard",
    "44": "alligator lizard",
    "45": "gila monster",
    "46": "green lizard",
    "47": "african chameleon",
    "48": "komodo dragon",
    "49": "african crocodile"
}

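def classify_imagenet_50(image):
    """Return a {label: probability} dict for a NumPy image array from Gradio."""
    image = Image.fromarray(image).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")

    # Run inference without tracking gradients
    with torch.no_grad():
        outputs = model(**inputs)
        logits = outputs.logits
        # Softmax over the class dimension gives one probability per label
        probs = torch.nn.functional.softmax(logits, dim=1).squeeze().tolist()

    # Map each class index to its label name and rounded probability
    prediction = {
        id2label[str(i)]: round(probs[i], 3) for i in range(len(probs))
    }

    return prediction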

# Gradio Interface
iface = gr.Interface(
    fn=classify_imagenet_50,
    inputs=gr.Image(type="numpy"),
    outputs=gr.Label(num_top_classes=5, label="ImageNet-50 Classification"),
    title="imagenet-50-subset",
    description="Upload an image to classify it into one of 50 selected ImageNet categories."
)

if __name__ == "__main__":
    iface.launch()
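
The function above returns a probability for every class, and `gr.Label(num_top_classes=5)` displays only the five highest. For use outside Gradio, the same selection can be sketched in plain Python. `top_k` is a hypothetical helper, and the probabilities in the example are made up for illustration rather than taken from the model.

```python
def top_k(prediction, k=5):
    """Return the k (label, probability) pairs with the highest probability."""
    ranked = sorted(prediction.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Made-up probabilities in the same {label: probability} shape that
# classify_imagenet_50 returns
example = {
    "tench": 0.01,
    "goldfish": 0.9,
    "hen": 0.03,
    "axolotl": 0.05,
    "kite": 0.01,
}

for label, prob in top_k(example, k=3):
    print(f"{label}: {prob:.3f}")
```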

Intended Use

imagenet-50-subset can be used for:

  • Benchmarking Lightweight Vision Models: Quick testing on a meaningful subset of ImageNet classes.
  • Educational Demos: Teaching about classification tasks with a simpler label space.
  • Prototype Deployment: Use in applications where full ImageNet coverage is unnecessary.
  • Dataset Analysis: Classification-based filtering of visual content into known object classes.
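
For the dataset-analysis use case, one possible approach is to keep only images whose top prediction falls in a set of wanted classes and clears a confidence threshold. The sketch below assumes predictions are already available as `{label: probability}` dicts. The function name `filter_by_class`, the file names, the probabilities, and the 0.8 threshold are all hypothetical.

```python
def filter_by_class(predictions, wanted, threshold=0.8):
    """Keep images whose most likely label is in `wanted` with enough confidence."""
    kept = []
    for name, probs in predictions.items():
        label = max(probs, key=probs.get)  # label with the highest probability
        if label in wanted and probs[label] >= threshold:
            kept.append((name, label))
    return kept


# Hypothetical per-image predictions, not real model output
predictions = {
    "img_001.jpg": {"goldfish": 0.95, "tench": 0.05},
    "img_002.jpg": {"mud turtle": 0.55, "terrapin": 0.45},
    "img_003.jpg": {"bald eagle": 0.90, "kite": 0.10},
}

print(filter_by_class(predictions, wanted={"goldfish", "mud turtle"}))
```

Classes with lower F1 in the classification report, such as green lizard or terrapin, may call for a stricter threshold than well-separated classes like goldfish.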

Model size: 92.9M params (Safetensors, tensor type F32)