Upload folder using huggingface_hub
- README.md +66 -0
- config.json +9 -0
- pytorch_model.bin +3 -0
README.md
ADDED
@@ -0,0 +1,66 @@
---
language: en
tags:
- image-classification
- vision-transformer
- protovit
- pins
license: mit
---

# ProtoViT Model - deit_small_patch16_224 (PINS)

This is a ProtoViT model with a fine-tuned deit_small_patch16_224 backbone, trained on the Pinterest Face Recognition Dataset (PINS). ProtoViT is introduced in the paper ["Interpretable Image Classification with Adaptive Prototype-based Vision Transformers"](https://arxiv.org/abs/2410.20722).

## Model Details

- Base architecture: deit_small_patch16_224
- Dataset: Pinterest Face Recognition Dataset
- Number of classes: 155
- Fine-tuned checkpoint: `14finetuned0.8042`
- Accuracy: 80.42%

## Training Details

- Number of prototypes: 2000
- Prototype size: 1×1
- Training process: Warm up → Joint training → Push → Last layer fine-tuning
- Weight coefficients:
  - Cross entropy: 1.0
  - Clustering: -0.8
  - Separation: 0.1
  - L1: 0.01
  - Orthogonal: 0.001
  - Coherence: 0.003
- Training set size: 70420
- Push set size: 13979
- Test set size: 3555
- Batch size: 128
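
The weight coefficients listed above read as multipliers on individual loss terms. A minimal sketch of how such terms could be combined, assuming a plain weighted sum; the `total_loss` helper, the dictionary keys, and the example values are hypothetical and are not taken from the training code:

```python
# Coefficients copied from the "Weight coefficients" list above.
LOSS_WEIGHTS = {
    "cross_entropy": 1.0,
    "clustering": -0.8,
    "separation": 0.1,
    "l1": 0.01,
    "orthogonal": 0.001,
    "coherence": 0.003,
}


def total_loss(terms):
    """Weighted sum of named loss terms (assumed combination rule)."""
    missing = set(LOSS_WEIGHTS) - set(terms)
    if missing:
        raise KeyError(f"missing loss terms: {sorted(missing)}")
    return sum(LOSS_WEIGHTS[name] * terms[name] for name in LOSS_WEIGHTS)


# Made-up per-term values, for illustration only.
example_terms = {
    "cross_entropy": 2.0,
    "clustering": 0.5,
    "separation": 0.25,
    "l1": 10.0,
    "orthogonal": 1.0,
    "coherence": 1.0,
}
print(total_loss(example_terms))
```

Under this reading, the negative clustering coefficient means that a larger clustering term lowers the total, so that term is being maximized rather than minimized.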

## Dataset Description

A face recognition dataset collected from Pinterest, with 155 identity classes.

Dataset link: https://www.kaggle.com/datasets/hereisburak/pins-face-recognition

## Usage

```python
import torch
from transformers import AutoImageProcessor, AutoModelForImageClassification
from PIL import Image

# Load model and processor
model = AutoModelForImageClassification.from_pretrained("Ayushnangia/protovit-deit_small_patch16_224-pins")
processor = AutoImageProcessor.from_pretrained("Ayushnangia/protovit-deit_small_patch16_224-pins")

# Prepare image (convert to RGB so grayscale or RGBA files are handled)
image = Image.open("path_to_your_image.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

# Make prediction without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)
predicted_label = outputs.logits.argmax(-1).item()
```
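
The predicted label is an integer class index. To see how confident the prediction is, the logits can be converted to probabilities and ranked. A minimal sketch in plain Python; `top_k_probabilities` is a hypothetical helper, not part of `transformers`:

```python
import math


def top_k_probabilities(logits, k=5):
    """Softmax over a flat list of logits; return the k most likely (index, probability) pairs."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    ranked = sorted(enumerate(exps), key=lambda pair: pair[1], reverse=True)
    return [(index, value / total) for index, value in ranked[:k]]


# With the snippet above, the input would be: outputs.logits[0].tolist()
for index, probability in top_k_probabilities([0.0, 2.0, 1.0], k=2):
    print(index, round(probability, 3))
```

The indices refer to the 155 identity classes; turning them into names requires a class-name mapping, which is assumed to come from the dataset's folder names.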

## Additional Information

For more details about the implementation and training process, please visit the [GitHub repository](https://github.com/ayushnangia/ProtoViT).
config.json
ADDED
@@ -0,0 +1,9 @@
{
  "architectures": [
    "DeiTForImageClassification"
  ],
  "model_type": "deit",
  "num_labels": 155,
  "image_size": 224,
  "patch_size": 16
}
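
The `image_size` and `patch_size` values determine how many patches each image is split into. A short sketch of that arithmetic, with the config above inlined as a string; the variable names are illustrative:

```python
import json

# The config.json contents shown above, inlined for a self-contained example.
CONFIG_TEXT = """
{
  "architectures": ["DeiTForImageClassification"],
  "model_type": "deit",
  "num_labels": 155,
  "image_size": 224,
  "patch_size": 16
}
"""

config = json.loads(CONFIG_TEXT)

# The image side must divide evenly into patches.
assert config["image_size"] % config["patch_size"] == 0

patches_per_side = config["image_size"] // config["patch_size"]
num_patches = patches_per_side ** 2
print(patches_per_side, num_patches)  # 14 196
```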
pytorch_model.bin
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a31471e77940894f8e44d401e356c1115e70d811e21771c9dc086042560adb40
size 100553453
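
The three lines above are a Git LFS pointer: the `oid` is the SHA-256 digest of the real weights file and `size` is its length in bytes. A minimal sketch for checking a downloaded file against the pointer; `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers, and the local path in the usage comment is an assumption:

```python
import hashlib
import os

POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:a31471e77940894f8e44d401e356c1115e70d811e21771c9dc086042560adb40
size 100553453
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and digest the pointer records."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as handle:
        # Hash in chunks so large files are not read into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
# Usage, assuming the weights were downloaded to the working directory:
# matches_pointer("pytorch_model.bin", pointer)
```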