---
license: apache-2.0
datasets:
- cj-mills/hagrid-classification-512p-no-gesture-150k
language:
- en
base_model:
- google/siglip2-so400m-patch14-384
pipeline_tag: image-classification
library_name: transformers
tags:
- Gesture
- Classification
- SigLIP2
- 19:Styles
- Vision-Encoder
---

![15.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/JBoqEwRBoOQwik0aRYeGw.png)

# **Hand-Gesture-19**

> **Hand-Gesture-19** is a vision-language encoder model fine-tuned from **google/siglip2-base-patch16-224** for single-label image classification. It classifies hand-gesture images into nineteen categories using the **SiglipForImageClassification** architecture.

```py
Classification Report:
                 precision    recall  f1-score   support

           call     0.9889    0.9739    0.9813      6939
        dislike     0.9892    0.9863    0.9877      7028
           fist     0.9956    0.9923    0.9940      6882
           four     0.9632    0.9653    0.9643      7183
           like     0.9668    0.9855    0.9760      6823
           mute     0.9848    0.9976    0.9912      7139
     no_gesture     0.9960    0.9957    0.9958     27823
             ok     0.9872    0.9831    0.9852      6924
            one     0.9817    0.9854    0.9835      7062
           palm     0.9793    0.9848    0.9820      7050
          peace     0.9723    0.9635    0.9679      6965
 peace_inverted     0.9806    0.9836    0.9821      6876
           rock     0.9853    0.9865    0.9859      6883
           stop     0.9614    0.9901    0.9756      6893
  stop_inverted     0.9933    0.9712    0.9821      7142
          three     0.9712    0.9478    0.9594      6940
         three2     0.9785    0.9799    0.9792      6870
         two_up     0.9848    0.9863    0.9855      7346
two_up_inverted     0.9855    0.9871    0.9863      6967

       accuracy                         0.9833    153735
      macro avg     0.9813    0.9814    0.9813    153735
   weighted avg     0.9833    0.9833    0.9833    153735
```

![download (2).png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/BhwQi6V5Qzl3g33OvRsWz.png)

The model categorizes images into nineteen hand gestures:

- **Class 0:** "call"
- **Class 1:** "dislike"
- **Class 2:** "fist"
- **Class 3:** "four"
- **Class 4:** "like"
- **Class 5:** "mute"
- **Class 6:** "no_gesture"
- **Class 7:** "ok"
- **Class 8:** "one"
- **Class 9:** "palm"
- **Class 10:** "peace"
- **Class 11:** "peace_inverted"
- **Class 12:** "rock"
- **Class 13:** "stop"
- **Class 14:** "stop_inverted"
- **Class 15:** "three"
- **Class 16:** "three2"
- **Class 17:** "two_up"
- **Class 18:** "two_up_inverted"

# **Run with Transformers🤗**

```python
!pip install -q transformers torch pillow gradio
```

```python
import gradio as gr
import torch
from PIL import Image
from transformers import AutoImageProcessor, SiglipForImageClassification

# Load model and processor
model_name = "prithivMLmods/Hand-Gesture-19"
model = SiglipForImageClassification.from_pretrained(model_name)
processor = AutoImageProcessor.from_pretrained(model_name)

def hand_gesture_classification(image):
    """Predicts the hand gesture category from an image."""
    image = Image.fromarray(image).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")

    with torch.no_grad():
        outputs = model(**inputs)
        logits = outputs.logits
        probs = torch.nn.functional.softmax(logits, dim=1).squeeze().tolist()

    labels = {
        "0": "call", "1": "dislike", "2": "fist", "3": "four", "4": "like",
        "5": "mute", "6": "no_gesture", "7": "ok", "8": "one", "9": "palm",
        "10": "peace", "11": "peace_inverted", "12": "rock", "13": "stop",
        "14": "stop_inverted", "15": "three", "16": "three2",
        "17": "two_up", "18": "two_up_inverted"
    }
    predictions = {labels[str(i)]: round(probs[i], 3) for i in range(len(probs))}

    return predictions

# Create Gradio interface
iface = gr.Interface(
    fn=hand_gesture_classification,
    inputs=gr.Image(type="numpy"),
    outputs=gr.Label(label="Prediction Scores"),
    title="Hand Gesture Classification",
    description="Upload an image to classify the hand gesture."
)

# Launch the app
if __name__ == "__main__":
    iface.launch()
```
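For script or batch use without the Gradio UI, the snippet below is a minimal inference sketch with the same checkpoint. The image path `hand.jpg` is a placeholder, and it assumes the class mapping listed above is also available from `model.config.id2label`.

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, SiglipForImageClassification

model_name = "prithivMLmods/Hand-Gesture-19"
model = SiglipForImageClassification.from_pretrained(model_name)
processor = AutoImageProcessor.from_pretrained(model_name)

# "hand.jpg" is a placeholder path; use any RGB image of a hand
image = Image.open("hand.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Top-1 prediction, assuming the class mapping is stored in the config
probs = logits.softmax(dim=-1)[0]
pred_id = int(probs.argmax())
print(f"{model.config.id2label[pred_id]}: {probs[pred_id].item():.3f}")
```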
# **Intended Use:**

The **Hand-Gesture-19** model is designed to classify hand gesture images into different categories. Potential use cases include:

- **Human-Computer Interaction:** Enabling gesture-based controls for devices (see the sketch after this list).
- **Sign Language Interpretation:** Assisting in recognizing sign language gestures.
- **Gaming & VR:** Enhancing immersive experiences with hand gesture recognition.
- **Robotics:** Facilitating gesture-based robotic control.
- **Security & Surveillance:** Identifying gestures for access control and safety monitoring.
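To make the gesture-control use case concrete, here is a rough sketch that maps a few predicted labels to hypothetical device actions and ignores low-confidence frames. The `GESTURE_ACTIONS` table, the `gesture_to_action` helper, and the 0.8 threshold are illustrative assumptions, not part of the model.

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, SiglipForImageClassification

model_name = "prithivMLmods/Hand-Gesture-19"
model = SiglipForImageClassification.from_pretrained(model_name)
processor = AutoImageProcessor.from_pretrained(model_name)

# Hypothetical gesture-to-action table: keys are model class labels,
# values are made-up device commands used only for illustration.
GESTURE_ACTIONS = {
    "mute": "mute_audio",
    "stop": "pause_playback",
    "like": "volume_up",
    "dislike": "volume_down",
    "no_gesture": None,  # ignore frames with no recognizable gesture
}

def gesture_to_action(image, threshold=0.8):
    """Return a device action for the image, or None if confidence is too low."""
    inputs = processor(images=image.convert("RGB"), return_tensors="pt")
    with torch.no_grad():
        probs = model(**inputs).logits.softmax(dim=-1)[0]
    conf, pred_id = probs.max(dim=-1)
    if conf.item() < threshold:
        return None
    label = model.config.id2label[int(pred_id)]
    return GESTURE_ACTIONS.get(label)

# Example: action = gesture_to_action(Image.open("frame.jpg"))
```

In a real application the frames would come from a camera stream rather than a file, and the action table would be defined by the target device.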