ProtoViT Model - deit_small_patch16_224 (PINS)

This is a fine-tuned deit_small_patch16_224 model trained on the Pinterest (PINS) Face Recognition Dataset, following the approach from the paper "Interpretable Image Classification with Adaptive Prototype-based Vision Transformers" (ProtoViT).

Model Details

  • Base architecture: deit_small_patch16_224
  • Dataset: Pinterest Face Recognition Dataset
  • Number of classes: 155
  • Fine-tuned checkpoint: 14finetuned0.8042
  • Accuracy: 80.42%
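
As a rough illustration of the base architecture, the DeiT-Small backbone can be instantiated with timm. The prototype layer that ProtoViT adds on top is defined in the authors' repository, so this is only a sketch of the feature extractor, not the full model.

# Sketch: loading the DeiT-Small backbone that ProtoViT builds on.
# The prototype layer and classification head are not included here;
# see the authors' repository for the full architecture.
import timm

backbone = timm.create_model("deit_small_patch16_224", pretrained=True)
backbone.eval()
# DeiT-Small produces 384-dimensional token embeddings for 14x14 = 196 patches,
# which ProtoViT compares against its learned prototypes.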

Training Details

  • Number of prototypes: 1550 (10 per class)
  • Prototype size: 1×1
  • Training process: Warm up → Joint training → Push → Last layer fine-tuning
  • Weight coefficients (combined into the total loss as in the sketch after this list):
    • Cross entropy: 1.0
    • Clustering: -0.8
    • Separation: 0.1
    • L1: 0.01
    • Orthogonal: 0.001
    • Coherence: 0.003
  • Batch size: 128
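
The following is a minimal sketch of how these coefficients are typically combined into a single objective in the ProtoPNet/ProtoViT family. The function and argument names are placeholders, not the authors' code; the actual loss terms are computed in the training script of the ProtoViT repository.

import torch

def total_loss(ce, clst, sep, l1, ortho, coh):
    """Weighted sum of the loss terms, using the coefficients listed above.

    Each argument is a scalar tensor for one loss term; the signs follow
    the coefficients given in this card.
    """
    return (
        1.0 * ce          # cross entropy
        + (-0.8) * clst   # clustering
        + 0.1 * sep       # separation
        + 0.01 * l1       # L1 sparsity on the last layer
        + 0.001 * ortho   # orthogonality between prototypes
        + 0.003 * coh     # coherence
    )

# Example with dummy scalar terms:
loss = total_loss(*[torch.tensor(1.0) for _ in range(6)])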

Dataset Description

A face recognition dataset collected from Pinterest, containing 155 different identity classes. Dataset link: https://www.kaggle.com/datasets/hereisburak/pins-face-recognition
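
To load the data locally, a standard torchvision ImageFolder is enough, assuming the Kaggle download is unpacked as one folder per identity; the directory path below is a placeholder.

# Sketch: loading the PINS dataset with torchvision, assuming the usual
# one-folder-per-identity layout of the Kaggle download (path is a placeholder).
from torchvision import datasets, transforms

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),   # DeiT expects 224x224 inputs
    transforms.ToTensor(),
])

dataset = datasets.ImageFolder("path/to/pins-face-recognition", transform=preprocess)
print(len(dataset.classes))  # should report 155 identity classes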

Usage

from transformers import AutoImageProcessor, AutoModelForImageClassification
from PIL import Image
import torch

# Load model and processor
model = AutoModelForImageClassification.from_pretrained("Ayushnangia/protovit-deit_small_patch16_224-pins")
processor = AutoImageProcessor.from_pretrained("Ayushnangia/protovit-deit_small_patch16_224-pins")
model.eval()

# Prepare image
image = Image.open("path_to_your_image.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

# Make prediction
with torch.no_grad():
    outputs = model(**inputs)
predicted_label = outputs.logits.argmax(-1).item()
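
Continuing from the snippet above, you can also inspect the highest-scoring classes. Mapping an index back to an identity name depends on the label order used during training, which this card does not specify, so only indices and scores are printed here.

# Continue from the snippet above: inspect the five highest-scoring classes.
probs = outputs.logits.softmax(dim=-1)
top5 = torch.topk(probs, k=5, dim=-1)
for score, idx in zip(top5.values[0].tolist(), top5.indices[0].tolist()):
    print(f"class index {idx}: {score:.3f}")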

Additional Information

GitHub repository by the authors of the paper: https://github.com/Henrymachiyu/ProtoViT

For more details about the implementation and training process, please see my fork of the ProtoViT GitHub repository.
