---
license: apache-2.0
datasets:
- cj-mills/hagrid-classification-512p-no-gesture-150k
language:
- en
base_model:
- google/siglip2-so400m-patch14-384
pipeline_tag: image-classification
library_name: transformers
tags:
- Gesture
- Classification
- SigLIP2
- 19:Styles
- Vision-Encoder
---
Classification Report:

                     precision    recall  f1-score   support

               call     0.9889    0.9739    0.9813      6939
            dislike     0.9892    0.9863    0.9877      7028
               fist     0.9956    0.9923    0.9940      6882
               four     0.9632    0.9653    0.9643      7183
               like     0.9668    0.9855    0.9760      6823
               mute     0.9848    0.9976    0.9912      7139
         no_gesture     0.9960    0.9957    0.9958     27823
                 ok     0.9872    0.9831    0.9852      6924
                one     0.9817    0.9854    0.9835      7062
               palm     0.9793    0.9848    0.9820      7050
              peace     0.9723    0.9635    0.9679      6965
     peace_inverted     0.9806    0.9836    0.9821      6876
               rock     0.9853    0.9865    0.9859      6883
               stop     0.9614    0.9901    0.9756      6893
      stop_inverted     0.9933    0.9712    0.9821      7142
              three     0.9712    0.9478    0.9594      6940
             three2     0.9785    0.9799    0.9792      6870
             two_up     0.9848    0.9863    0.9855      7346
    two_up_inverted     0.9855    0.9871    0.9863      6967

           accuracy                         0.9833    153735
          macro avg     0.9813    0.9814    0.9813    153735
       weighted avg     0.9833    0.9833    0.9833    153735
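
The summary rows of the report can be rechecked from the per-class rows. The sketch below is only an illustration, not the evaluation script used for this model: the `ROWS` list and the helper names `macro_avg` and `weighted_avg` are assumptions introduced here, and the numbers are copied from the report with four-decimal rounding, so the results agree with the published averages only up to that rounding. The macro average is the plain mean over the 19 classes, while the weighted average weights each class by its support, which matters mainly for `no_gesture` since it has roughly four times the samples of any other class.

```python
# Per-class rows copied from the classification report above:
# (label, precision, recall, f1, support)
ROWS = [
    ("call", 0.9889, 0.9739, 0.9813, 6939),
    ("dislike", 0.9892, 0.9863, 0.9877, 7028),
    ("fist", 0.9956, 0.9923, 0.9940, 6882),
    ("four", 0.9632, 0.9653, 0.9643, 7183),
    ("like", 0.9668, 0.9855, 0.9760, 6823),
    ("mute", 0.9848, 0.9976, 0.9912, 7139),
    ("no_gesture", 0.9960, 0.9957, 0.9958, 27823),
    ("ok", 0.9872, 0.9831, 0.9852, 6924),
    ("one", 0.9817, 0.9854, 0.9835, 7062),
    ("palm", 0.9793, 0.9848, 0.9820, 7050),
    ("peace", 0.9723, 0.9635, 0.9679, 6965),
    ("peace_inverted", 0.9806, 0.9836, 0.9821, 6876),
    ("rock", 0.9853, 0.9865, 0.9859, 6883),
    ("stop", 0.9614, 0.9901, 0.9756, 6893),
    ("stop_inverted", 0.9933, 0.9712, 0.9821, 7142),
    ("three", 0.9712, 0.9478, 0.9594, 6940),
    ("three2", 0.9785, 0.9799, 0.9792, 6870),
    ("two_up", 0.9848, 0.9863, 0.9855, 7346),
    ("two_up_inverted", 0.9855, 0.9871, 0.9863, 6967),
]

# Column indices inside each row tuple.
COLUMNS = {"precision": 1, "recall": 2, "f1": 3}
SUPPORT = 4


def macro_avg(rows, col):
    """Unweighted mean of one metric over all classes."""
    return sum(row[col] for row in rows) / len(rows)


def weighted_avg(rows, col):
    """Mean of one metric with each class weighted by its support."""
    total = sum(row[SUPPORT] for row in rows)
    return sum(row[col] * row[SUPPORT] for row in rows) / total


total_support = sum(row[SUPPORT] for row in ROWS)
macro = {name: macro_avg(ROWS, col) for name, col in COLUMNS.items()}
weighted = {name: weighted_avg(ROWS, col) for name, col in COLUMNS.items()}

if __name__ == "__main__":
    print(f"total support: {total_support}")
    for name in COLUMNS:
        print(f"{name:>9}  macro={macro[name]:.4f}  weighted={weighted[name]:.4f}")
```

Support-weighted recall is, by definition, the fraction of all samples classified correctly, which is why the weighted recall and the accuracy row in the report coincide.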