---
license: apache-2.0
tags:
- vision-transformer
- image-classification
- pytorch
- timm
- vit
- gravitational-lensing
- strong-lensing
- astronomy
- astrophysics
datasets:
- C21
metrics:
- accuracy
- auc
- f1
model-index:
- name: ViT-a2
  results:
  - task:
      type: image-classification
      name: Strong Gravitational Lens Discovery
    dataset:
      type: common-test-sample
      name: Common Test Sample (More et al. 2024)
    metrics:
    - type: accuracy
      value: 0.8205
      name: Average Accuracy
    - type: auc
      value: 0.8511
      name: Average AUC-ROC
    - type: f1
      value: 0.5319
      name: Average F1-Score
---

# 🌌 vit-gravit-a2

🔭 This model is part of **GraViT**: Transfer Learning with Vision Transformers and MLP-Mixer for Strong Gravitational Lens Discovery

🔗 **GitHub Repository**: [https://github.com/parlange/gravit](https://github.com/parlange/gravit)

## 🛰️ Model Details

- **🤖 Model Type**: ViT
- **🧪 Experiment**: A2 - C21-half
- **🌌 Dataset**: C21
- **🪐 Fine-tuning Strategy**: half



## 💻 Quick Start

```python
import torch
import timm

# Load the model directly from the Hub
model = timm.create_model(
    'hf-hub:parlange/vit-gravit-a2',
    pretrained=True
)
model.eval()

# Example inference
dummy_input = torch.randn(1, 3, 224, 224)
with torch.no_grad():
    output = model(dummy_input)
    predictions = torch.softmax(output, dim=1)
print(f"Lens probability: {predictions[0][1]:.4f}")
```
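
To score a batch instead of a single tensor, the softmax step above can be wrapped in a small helper. This is only a minimal sketch: `lens_probabilities` and the 0.5 threshold are illustrative choices that do not come from the GraViT code, and it assumes class index 1 is the lens class, as the print statement above suggests. Random logits stand in for `model(batch)` so the snippet runs without downloading weights.

```python
import torch


def lens_probabilities(logits: torch.Tensor) -> torch.Tensor:
    """Return P(lens) for each row of a (batch, 2) logit tensor."""
    # Assumption: class index 1 is the lens class, as in the Quick Start snippet.
    return torch.softmax(logits, dim=1)[:, 1]


# Random logits stand in for model(batch); replace with real model outputs.
logits = torch.randn(4, 2)
probs = lens_probabilities(logits)

# Hypothetical decision threshold; tune it on validation data for real use.
is_lens = probs > 0.5
print(probs.shape, is_lens.dtype)
```

In practice the threshold would likely be chosen from the ROC curve rather than fixed at 0.5.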

## ⚑️ Training Configuration

**Training Dataset:** C21 (CaΓ±ameras et al. 2021)  
**Fine-tuning Strategy:** half
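
This card does not define the `half` strategy. One plausible reading is that the first half of the transformer blocks stays frozen while the remaining blocks and the classification head are trained. The sketch below illustrates that reading only and should not be taken as the authors' implementation. `TinyViT` is a hypothetical stand-in that mirrors the `blocks` and `head` attribute names of a timm Vision Transformer, and `freeze_first_half` is an illustrative helper.

```python
import torch
from torch import nn


class TinyViT(nn.Module):
    """Stand-in exposing `blocks` and `head`, like a timm Vision Transformer."""

    def __init__(self, depth: int = 12, dim: int = 16, num_classes: int = 2):
        super().__init__()
        self.patch_embed = nn.Linear(dim, dim)
        self.blocks = nn.Sequential(*[nn.Linear(dim, dim) for _ in range(depth)])
        self.head = nn.Linear(dim, num_classes)


def freeze_first_half(model: nn.Module) -> None:
    """Train only the last half of the blocks and the head (assumed meaning of 'half')."""
    # Start from a fully frozen model.
    for p in model.parameters():
        p.requires_grad = False
    # Unfreeze the second half of the blocks.
    n_frozen = len(model.blocks) // 2
    for block in list(model.blocks)[n_frozen:]:
        for p in block.parameters():
            p.requires_grad = True
    # The classification head is always trained.
    for p in model.head.parameters():
        p.requires_grad = True


model = TinyViT()
freeze_first_half(model)
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable parameters: {trainable} / {total}")
```

The same helper should apply to a real timm ViT, since it only relies on the `blocks` and `head` attributes.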


| 🔧 Parameter | 📝 Value |
|--------------|----------|
| Batch Size | 192 |
| Learning Rate | Adaptive (reduced on plateau by ReduceLROnPlateau) |
| Epochs | 100 |
| Patience | 10 |
| Optimizer | AdamW |
| Scheduler | ReduceLROnPlateau |
| Image Size | 224x224 |
| Fine Tune Mode | half |
| Stochastic Depth Probability | 0.1 |
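
The optimizer and scheduler rows above could be wired together roughly as follows. This is a sketch under stated assumptions: the initial learning rate (`1e-4`) and the reduction `factor` (`0.5`) are placeholders because the card does not list them, and applying the patience of 10 to the scheduler (rather than to early stopping) is also an assumption. A small `nn.Linear` stands in for the ViT.

```python
import torch
from torch import nn

model = nn.Linear(8, 2)  # stand-in for the ViT

# lr and factor are placeholders; the card does not state them.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.5, patience=10
)

lr_history = []
for epoch in range(12):
    val_loss = 1.0  # simulated validation loss that never improves
    scheduler.step(val_loss)
    lr_history.append(optimizer.param_groups[0]["lr"])

# The rate is unchanged after 10 stalled epochs and halved on the next one.
print(lr_history[10], lr_history[-1])
```

With `patience=10`, the scheduler tolerates ten consecutive epochs without improvement and cuts the rate on the eleventh.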


## 📈 Training Curves

![Combined Training Metrics](https://huggingface.co/parlange/vit-gravit-a2/resolve/main/training_curves/ViT_combined_metrics.png)


## 🏁 Final Epoch Training Metrics

| Metric | Training | Validation |
|:---------:|:-----------:|:-------------:|
| 📉 Loss | 0.0159 | 0.0354 |
| 🎯 Accuracy | 0.9939 | 0.9870 |
| 📊 AUC-ROC | 0.9998 | 0.9986 |
| ⚖️ F1 Score | 0.9939 | 0.9870 |


## ☑️ Evaluation Results

### ROC Curves and Confusion Matrices

Performance across all test datasets (a through l) in the Common Test Sample (More et al. 2024):

![ROC + Confusion Matrix - Dataset A](https://huggingface.co/parlange/vit-gravit-a2/resolve/main/roc_confusion_matrix/ViT_roc_confusion_matrix_a.png)
![ROC + Confusion Matrix - Dataset B](https://huggingface.co/parlange/vit-gravit-a2/resolve/main/roc_confusion_matrix/ViT_roc_confusion_matrix_b.png)
![ROC + Confusion Matrix - Dataset C](https://huggingface.co/parlange/vit-gravit-a2/resolve/main/roc_confusion_matrix/ViT_roc_confusion_matrix_c.png)
![ROC + Confusion Matrix - Dataset D](https://huggingface.co/parlange/vit-gravit-a2/resolve/main/roc_confusion_matrix/ViT_roc_confusion_matrix_d.png)
![ROC + Confusion Matrix - Dataset E](https://huggingface.co/parlange/vit-gravit-a2/resolve/main/roc_confusion_matrix/ViT_roc_confusion_matrix_e.png)
![ROC + Confusion Matrix - Dataset F](https://huggingface.co/parlange/vit-gravit-a2/resolve/main/roc_confusion_matrix/ViT_roc_confusion_matrix_f.png)
![ROC + Confusion Matrix - Dataset G](https://huggingface.co/parlange/vit-gravit-a2/resolve/main/roc_confusion_matrix/ViT_roc_confusion_matrix_g.png)
![ROC + Confusion Matrix - Dataset H](https://huggingface.co/parlange/vit-gravit-a2/resolve/main/roc_confusion_matrix/ViT_roc_confusion_matrix_h.png)
![ROC + Confusion Matrix - Dataset I](https://huggingface.co/parlange/vit-gravit-a2/resolve/main/roc_confusion_matrix/ViT_roc_confusion_matrix_i.png)
![ROC + Confusion Matrix - Dataset J](https://huggingface.co/parlange/vit-gravit-a2/resolve/main/roc_confusion_matrix/ViT_roc_confusion_matrix_j.png)
![ROC + Confusion Matrix - Dataset K](https://huggingface.co/parlange/vit-gravit-a2/resolve/main/roc_confusion_matrix/ViT_roc_confusion_matrix_k.png)
![ROC + Confusion Matrix - Dataset L](https://huggingface.co/parlange/vit-gravit-a2/resolve/main/roc_confusion_matrix/ViT_roc_confusion_matrix_l.png)

### 📋 Performance Summary

Average performance across 12 test datasets from the Common Test Sample (More et al. 2024):

| Metric | Value |
|-----------|----------|
| 🎯 Average Accuracy | 0.8205 |
| 📈 Average AUC-ROC | 0.8511 |
| ⚖️ Average F1-Score | 0.5319 |
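
The card does not say whether these averages are weighted by dataset size. The sketch below assumes a plain unweighted mean over the test datasets. `average_metrics` is an illustrative helper, and the per-dataset numbers are placeholders, not this model's actual results.

```python
from statistics import mean


def average_metrics(per_dataset: dict[str, dict[str, float]]) -> dict[str, float]:
    """Unweighted mean of each metric over all test datasets."""
    metric_names = next(iter(per_dataset.values())).keys()
    return {m: mean(d[m] for d in per_dataset.values()) for m in metric_names}


# Placeholder values for illustration only; NOT the model's per-dataset results.
per_dataset = {
    "a": {"accuracy": 0.90, "auc": 0.95, "f1": 0.60},
    "b": {"accuracy": 0.80, "auc": 0.85, "f1": 0.50},
    "c": {"accuracy": 0.70, "auc": 0.75, "f1": 0.40},
}

summary = average_metrics(per_dataset)
print({name: round(value, 4) for name, value in summary.items()})
```

An unweighted mean gives each test dataset equal influence regardless of how many images it contains.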


## 📘 Citation

If you use this model in your research, please cite:

```bibtex
@misc{parlange2025gravit,
      title={GraViT: Transfer Learning with Vision Transformers and MLP-Mixer for Strong Gravitational Lens Discovery}, 
      author={René Parlange and Juan C. Cuevas-Tello and Octavio Valenzuela and Omar de J. Cabrera-Rosas and Tomás Verdugo and Anupreeta More and Anton T. Jaelani},
      year={2025},
      eprint={2509.00226},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2509.00226}, 
}
```

---


## Model Card Contact

For questions about this model, please contact the author through: https://github.com/parlange/