Venkata Pydipalli committed on
Commit
243c72f
·
1 Parent(s): 1475aa5

Added Adversarial model.

Browse files
Files changed (4)
  1. README.md +118 -0
  2. best_enhanced_pcam_model.pt +3 -0
  3. config.json +20 -0
  4. results.json +13 -0
README.md ADDED
@@ -0,0 +1,118 @@
---
tags:
- vision
- clip
- fine-tuned
- PatchCamelyon
- medical-imaging
license: apache-2.0
library_name: transformers
model_type: clip_vision_model
datasets:
- 1aurent/PatchCamelyon
- lens-ai/adversarial_pcam
---

# CLIP ViT Base Patch32 Fine-Tuned on PatchCamelyon (PCAM)

## Overview
This repository contains an adversarially trained version of the [CLIP ViT Base Patch32 fine-tuned](https://huggingface.co/lens-ai/clip-vit-base-patch32_pcam_finetuned) model. It was trained on the [PatchCamelyon (PCAM)](https://huggingface.co/datasets/1aurent/PatchCamelyon) dataset together with the [PatchCamelyon Adversarial (PCAM)](https://huggingface.co/datasets/lens-ai/adversarial_pcam) dataset, and is optimized for histopathological image classification.
19
+
20
+ ## Model Description
21
+
22
+ - **Model Type:** CLIP Vision Transformer (ViT-B/32) with classification head
23
+ - **Task:** Binary classification of histopathological images
24
+ - **Training Data:** PatchCamelyon dataset
25
+ - **Input:** RGB images of size 224x224 pixels
26
+ - **Output:** Binary classification (cancer/non-cancer)
27
+
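To make the input/output contract concrete, here is a minimal, illustrative shape sketch; the `model` call is commented out because the classifier itself is defined in the Usage section further below.

```python
import torch

# Input convention: a float batch of RGB patches shaped (batch, 3, 224, 224).
pixel_values = torch.randn(4, 3, 224, 224)  # dummy batch of 4 patches
assert pixel_values.shape[1:] == (3, 224, 224)

# With the classifier from the Usage section, the head returns two logits per image:
# logits = model(pixel_values)  # -> shape (4, 2)
```
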
## Base Model Details
- **Base Model**: `lens-ai/clip-vit-base-patch32_pcam_finetuned` (itself a fine-tune of `openai/clip-vit-base-patch32`)
- **Fine-tuned for**: Medical image classification (tumor vs. non-tumor)

- **Base Model Evaluation Results Summary**:
  - **Clean Accuracy:** 86.30%
  - **PGD:** Success Rate 50.10%, Average L2 Distance 12.0844
  - **FGSM:** Success Rate 44.14%, Average L2 Distance 12.0957
  - **DeepFool:** Success Rate 81.64%, Average L2 Distance 224.6645

- **Adversarial Model Evaluation Results** (after 5 epochs of adversarial training; see also `results.json`; a sketch of how such attack metrics can be computed follows below):
  - **Clean Accuracy:** 86.72%
  - **PGD:** Success Rate 17.87%, Average L2 Distance 12.0932
  - **FGSM:** Success Rate 17.38%, Average L2 Distance 12.0962
  - **DeepFool:** Success Rate 35.62%, Average L2 Distance 234.1276

- **Hardware**: Trained on an NVIDIA A100 GPU

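One common way to compute such attack metrics is to count, among the patches the model classifies correctly before the attack, how many are flipped by the attack, and to average the L2 norm between clean and adversarial inputs. The sketch below does this for FGSM only and is purely illustrative: `model` is the `PCamClassifier` from the Usage section, `loader` is any DataLoader yielding `(pixel_values, labels)` batches, and `epsilon` is an arbitrary placeholder, not the setting used to produce the numbers above.

```python
import torch
import torch.nn.functional as F


def fgsm_eval(model, loader, epsilon=0.03, device=None):
    """Return (success_rate_%, avg_l2_distance) for a one-step FGSM attack,
    measured only over samples the model classifies correctly before the attack."""
    device = device or ("cuda" if torch.cuda.is_available() else "cpu")
    model.to(device).eval()
    flipped_total, correct_total, l2_sum = 0, 0, 0.0

    for pixel_values, labels in loader:
        pixel_values = pixel_values.to(device).requires_grad_(True)
        labels = labels.to(device)

        logits = model(pixel_values)
        clean_correct = logits.argmax(dim=-1) == labels  # attack only these samples

        # One-step sign-of-gradient perturbation (FGSM)
        loss = F.cross_entropy(logits, labels)
        model.zero_grad()
        loss.backward()
        adv = pixel_values + epsilon * pixel_values.grad.sign()

        with torch.no_grad():
            adv_pred = model(adv).argmax(dim=-1)

        flipped_total += (clean_correct & (adv_pred != labels)).sum().item()
        correct_total += clean_correct.sum().item()
        l2_sum += (adv - pixel_values).flatten(1).norm(dim=1)[clean_correct].sum().item()

    return 100.0 * flipped_total / max(correct_total, 1), l2_sum / max(correct_total, 1)
```

PGD and DeepFool evaluations can be structured the same way, for example with an off-the-shelf attack library.
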
## Usage

### Installation
Ensure you have `transformers`, `torch`, and `safetensors` installed:
```bash
pip install transformers torch safetensors
```

### Load the Model

```python
import torch
from torch import nn
from transformers import CLIPVisionConfig, CLIPVisionModel


class PCamClassifier(nn.Module):
    """CLIP ViT-B/32 vision encoder with a 2-class linear head."""

    def __init__(self, config_dict):
        super().__init__()
        self.config = CLIPVisionConfig(**config_dict)
        self.vision_model = CLIPVisionModel(self.config)
        self.classifier = nn.Linear(self.config.hidden_size, 2)

    def forward(self, pixel_values):
        outputs = self.vision_model(pixel_values)
        return self.classifier(outputs.pooler_output)


# CLIP ViT-B/32 vision-encoder configuration
config_dict = {
    "_name_or_path": "openai/clip-vit-base-patch32",
    "architectures": ["CLIPVisionModel"],
    "attention_dropout": 0.0,
    "dropout": 0.0,
    "hidden_act": "quick_gelu",
    "hidden_size": 768,
    "image_size": 224,
    "initializer_factor": 1.0,
    "initializer_range": 0.02,
    "intermediate_size": 3072,
    "layer_norm_eps": 1e-05,
    "model_type": "clip_vision_model",
    "num_attention_heads": 12,
    "num_channels": 3,
    "num_hidden_layers": 12,
    "patch_size": 32,
    "projection_dim": 512,
    "torch_dtype": "float32"
}

# Initialize the model and load the fine-tuned weights
model = PCamClassifier(config_dict)
model.load_state_dict(torch.load("best_enhanced_pcam_model.pt", map_location="cpu"))
model.eval()
```

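Once the weights are loaded, inference follows the standard CLIP preprocessing pipeline. The snippet below is an illustrative sketch: it assumes Pillow is installed, uses a placeholder image path `patch.png`, and assumes class index 1 corresponds to "tumor", so verify this against your label mapping.

```python
import torch
from PIL import Image
from transformers import CLIPImageProcessor

# Standard CLIP preprocessing: resize to 224x224 and normalize.
processor = CLIPImageProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("patch.png").convert("RGB")  # placeholder path
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    logits = model(inputs["pixel_values"])
    probs = logits.softmax(dim=-1)

# Class index 1 is assumed to correspond to "tumor"; check your label mapping.
print(f"P(tumor) = {probs[0, 1].item():.4f}")
```
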
## Evaluation
We plan to release additional metrics, including further robustness evaluations against adversarial attacks, in future updates.

## License
This model is released under the Apache 2.0 License.

## Contact
For any questions, please reach out to **Venkata Tej** at [LensAI](https://huggingface.co/lens-ai).

best_enhanced_pcam_model.pt ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1996c94be4f041dd20cc8aa2684fb53b33925656b24192478a57a82ec59084d1
size 1049756882
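
The weight file is stored with Git LFS. One way to fetch it programmatically is via `huggingface_hub`; the repository id below is an assumption based on the organization and file names in this commit and may need to be adjusted.

```python
from huggingface_hub import hf_hub_download

# Hypothetical repo id -- adjust to the actual repository hosting this commit.
weights_path = hf_hub_download(
    repo_id="lens-ai/clip-vit-base-patch32_pcam_adversarial",  # assumed name
    filename="best_enhanced_pcam_model.pt",
)
print(weights_path)
```
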
config.json ADDED
@@ -0,0 +1,20 @@
{
  "_name_or_path": "lens-ai/clip-vit-base-patch32_pcam_finetuned",
  "architectures": ["CLIPVisionModel"],
  "attention_dropout": 0.0,
  "dropout": 0.0,
  "hidden_act": "quick_gelu",
  "hidden_size": 768,
  "image_size": 224,
  "initializer_factor": 1.0,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-05,
  "model_type": "clip_vision_model",
  "num_attention_heads": 12,
  "num_channels": 3,
  "num_hidden_layers": 12,
  "patch_size": 32,
  "projection_dim": 512,
  "torch_dtype": "float32"
}
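
For completeness, the inline `config_dict` in the README can be replaced by loading this file directly; a small sketch, assuming the `PCamClassifier` class defined in the README's Usage section is in scope:

```python
import json

# Read the architecture parameters shipped in config.json
with open("config.json") as f:
    config_dict = json.load(f)

# PCamClassifier is the wrapper class defined in the README's Usage section
model = PCamClassifier(config_dict)
```
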
results.json ADDED
@@ -0,0 +1,13 @@
{
  "clean_accuracy": 86.7218017578125,
  "attacks": {
    "PGD": {
      "success_rate": 17.87109375,
      "avg_l2_dist": 12.093187361955643
    },
    "FGSM": {
      "success_rate": 17.3828125,
      "avg_l2_dist": 12.09616070985794
    }
  }
}