---
tags:
- vision
- clip
- fine-tuned
- PatchCamelyon
- medical-imaging
license: apache-2.0
library_name: transformers
model_type: clip_vision_model
datasets:
- 1aurent/PatchCamelyon
- lens-ai/adversarial_pcam
---

# ![LensAI Logo](https://static.wixstatic.com/media/a8a410_27dc826bddd34fb8a464a8434c53ab87~mv2.png/v1/fill/w_350,h_100,al_c,q_85,usm_0.66_1.00_0.01,enc_avif,quality_auto/logolai.png)

# Adversarial CLIP ViT Base Patch32 Fine-Tuned on PatchCamelyon (PCAM)

## Overview

This repository contains an adversarially trained version of the [CLIP ViT Base Patch32 fine-tuned](https://huggingface.co/lens-ai/clip-vit-base-patch32_pcam_finetuned) model, trained on both the [PatchCamelyon (PCAM)](https://huggingface.co/datasets/1aurent/PatchCamelyon) dataset and the [Adversarial PCAM](https://huggingface.co/datasets/lens-ai/adversarial_pcam) dataset. The model is optimized for histopathological image classification.

## 📌 Model Highlights

- **Model Type:** CLIP Vision Transformer (ViT-B/32) with classification head
- **Task:** Binary classification of histopathological images (cancer vs. non-cancer)
- **Base Model:** `openai/clip-vit-base-patch32`
- **Training Data:** PatchCamelyon (PCAM) and Adversarial PCAM datasets
- **Input:** RGB images (224x224 pixels)
- **Output:** Binary classification (cancer/non-cancer)

## 🚀 Key Results

### ✅ Clean Evaluation Metrics

- **Clean Accuracy:** 86.72%

### ⚔️ Adversarial Robustness (Fine-tuned Model)

- **PGD Attack:**
  - Success Rate: 17.87%
  - Average L2 Distance: 12.09
- **FGSM Attack:**
  - Success Rate: 17.38%
  - Average L2 Distance: 12.10
- **DeepFool Attack:**
  - Success Rate: 35.62%
  - Average L2 Distance: 234.13

### 📊 Base Model Comparison

- **Clean Accuracy:** 86.30%
- **PGD:** 50.10% Success Rate | Avg L2 Distance: 12.08
- **FGSM:** 44.14% Success Rate | Avg L2 Distance: 12.10
- **DeepFool:** 81.64% Success Rate | Avg L2 Distance: 224.66

**Hardware:** Trained on NVIDIA A100 GPU (5 epochs)

---

## 🔧 Usage

### Installation

```bash
pip install transformers torch safetensors
```

### Inference Example

```python
import numpy as np
import torch
from torch import nn
from torch.utils.data import Dataset
from transformers import CLIPVisionConfig, CLIPVisionModel


class PCamClassifier(nn.Module):
    """CLIP ViT-B/32 vision encoder with a binary classification head."""

    def __init__(self, config_dict):
        super().__init__()
        self.config = CLIPVisionConfig(**config_dict)
        self.vision_model = CLIPVisionModel(self.config)
        self.classifier = nn.Linear(self.config.hidden_size, 2)

    def forward(self, pixel_values):
        outputs = self.vision_model(pixel_values)
        return self.classifier(outputs.pooler_output)


# Vision encoder configuration (matches openai/clip-vit-base-patch32)
config_dict = {
    "_name_or_path": "openai/clip-vit-base-patch32",
    "architectures": ["CLIPVisionModel"],
    "attention_dropout": 0.0,
    "dropout": 0.0,
    "hidden_act": "quick_gelu",
    "hidden_size": 768,
    "image_size": 224,
    "initializer_factor": 1.0,
    "initializer_range": 0.02,
    "intermediate_size": 3072,
    "layer_norm_eps": 1e-05,
    "model_type": "clip_vision_model",
    "num_attention_heads": 12,
    "num_channels": 3,
    "num_hidden_layers": 12,
    "patch_size": 32,
    "projection_dim": 512,
    "torch_dtype": "float32",
}

# Initialize the model and load the fine-tuned weights
model = PCamClassifier(config_dict)
model.load_state_dict(torch.load("best_enhanced_pcam_model.pt", map_location="cpu"))
model.eval()


class PCamDataset(Dataset):
    """Wraps a PCAM split and converts images into model-ready arrays."""

    def __init__(self, dataset):
        self.dataset = dataset

    def __len__(self):
        return len(self.dataset)

    def __getitem__(self, idx):
        example = self.dataset[idx]
        # The model expects 224x224 RGB inputs (see `image_size` in the config)
        image = example["image"].convert("RGB").resize((224, 224))
        image_array = np.array(image) / 255.0
        # HWC -> CHW, float32, scaled to [0, 1]
        image_array = image_array.transpose(2, 0, 1).astype(np.float32)
        return {"pixel_values": image_array, "labels": example["label"]}
```
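As a quick sanity check, the sketch below shows one way to run the classifier over a PCAM split using the `PCamDataset` wrapper defined above. The split name, the `label` field, and the batch size are assumptions based on the dataset card and the snippet above, not a fixed evaluation protocol; it also requires the `datasets` library (`pip install datasets`).

```python
from datasets import load_dataset
from torch.utils.data import DataLoader

# Load the PCAM test split (split name assumed; adjust to match the dataset card)
pcam_test = load_dataset("1aurent/PatchCamelyon", split="test")
loader = DataLoader(PCamDataset(pcam_test), batch_size=32)

correct, total = 0, 0
with torch.no_grad():
    for batch in loader:
        logits = model(batch["pixel_values"])  # (batch, 2) class logits
        preds = logits.argmax(dim=-1)
        labels = batch["labels"].long()        # cast in case labels are booleans
        correct += (preds == labels).sum().item()
        total += labels.numel()

print(f"Accuracy: {correct / total:.4f}")
```

Note that this preprocessing only rescales pixels to [0, 1]; to reproduce the reported numbers exactly, match the preprocessing used during fine-tuning.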
---

## 📊 Future Work

We plan to release:
- Enhanced robustness metrics
- Expanded adversarial attack evaluations

## 📜 License

Released under the Apache-2.0 License.

## 📬 Contact

For inquiries, please reach out to **Venkata Tej** at [LensAI](https://www.lensai.tech).