samuellimabraz committed · Commit 54a2135 · verified · 1 Parent(s): ed7950a

Initial upload of Conditional-DETR signature detection model

This view is limited to 50 files because it contains too many changes. See the raw diff for the full change set.
Files changed (50)
  1. .gitattributes +26 -0
  2. README.md +354 -0
  3. best_checkpoint/config.json +61 -0
  4. best_checkpoint/model.safetensors +3 -0
  5. best_checkpoint/optimizer.pt +3 -0
  6. best_checkpoint/preprocessor_config.json +26 -0
  7. best_checkpoint/rng_state.pth +3 -0
  8. best_checkpoint/scheduler.pt +3 -0
  9. best_checkpoint/trainer_state.json +0 -0
  10. best_checkpoint/training_args.bin +3 -0
  11. config.json +61 -0
  12. eval/cpu/confusion_matrix.png +0 -0
  13. eval/cpu/inference_grid_0.png +3 -0
  14. eval/cpu/inference_grid_1.png +3 -0
  15. eval/cpu/inference_grid_10.png +3 -0
  16. eval/cpu/inference_grid_11.png +3 -0
  17. eval/cpu/inference_grid_12.png +0 -0
  18. eval/cpu/inference_grid_13.png +0 -0
  19. eval/cpu/inference_grid_14.png +0 -0
  20. eval/cpu/inference_grid_15.png +0 -0
  21. eval/cpu/inference_grid_16.png +3 -0
  22. eval/cpu/inference_grid_17.png +0 -0
  23. eval/cpu/inference_grid_18.png +0 -0
  24. eval/cpu/inference_grid_19.png +3 -0
  25. eval/cpu/inference_grid_2.png +3 -0
  26. eval/cpu/inference_grid_20.png +3 -0
  27. eval/cpu/inference_grid_21.png +0 -0
  28. eval/cpu/inference_grid_22.png +3 -0
  29. eval/cpu/inference_grid_23.png +3 -0
  30. eval/cpu/inference_grid_24.png +0 -0
  31. eval/cpu/inference_grid_3.png +0 -0
  32. eval/cpu/inference_grid_4.png +0 -0
  33. eval/cpu/inference_grid_5.png +3 -0
  34. eval/cpu/inference_grid_6.png +0 -0
  35. eval/cpu/inference_grid_7.png +0 -0
  36. eval/cpu/inference_grid_8.png +3 -0
  37. eval/cpu/inference_grid_9.png +3 -0
  38. eval/gpu/confusion_matrix.png +0 -0
  39. eval/gpu/inference_grid_0.png +3 -0
  40. eval/gpu/inference_grid_1.png +3 -0
  41. eval/gpu/inference_grid_10.png +3 -0
  42. eval/gpu/inference_grid_11.png +3 -0
  43. eval/gpu/inference_grid_12.png +0 -0
  44. eval/gpu/inference_grid_13.png +0 -0
  45. eval/gpu/inference_grid_14.png +0 -0
  46. eval/gpu/inference_grid_15.png +0 -0
  47. eval/gpu/inference_grid_16.png +3 -0
  48. eval/gpu/inference_grid_17.png +0 -0
  49. eval/gpu/inference_grid_18.png +0 -0
  50. eval/gpu/inference_grid_19.png +3 -0
.gitattributes CHANGED
@@ -33,3 +33,29 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ eval/cpu/inference_grid_0.png filter=lfs diff=lfs merge=lfs -text
+ eval/cpu/inference_grid_1.png filter=lfs diff=lfs merge=lfs -text
+ eval/cpu/inference_grid_10.png filter=lfs diff=lfs merge=lfs -text
+ eval/cpu/inference_grid_11.png filter=lfs diff=lfs merge=lfs -text
+ eval/cpu/inference_grid_16.png filter=lfs diff=lfs merge=lfs -text
+ eval/cpu/inference_grid_19.png filter=lfs diff=lfs merge=lfs -text
+ eval/cpu/inference_grid_2.png filter=lfs diff=lfs merge=lfs -text
+ eval/cpu/inference_grid_20.png filter=lfs diff=lfs merge=lfs -text
+ eval/cpu/inference_grid_22.png filter=lfs diff=lfs merge=lfs -text
+ eval/cpu/inference_grid_23.png filter=lfs diff=lfs merge=lfs -text
+ eval/cpu/inference_grid_5.png filter=lfs diff=lfs merge=lfs -text
+ eval/cpu/inference_grid_8.png filter=lfs diff=lfs merge=lfs -text
+ eval/cpu/inference_grid_9.png filter=lfs diff=lfs merge=lfs -text
+ eval/gpu/inference_grid_0.png filter=lfs diff=lfs merge=lfs -text
+ eval/gpu/inference_grid_1.png filter=lfs diff=lfs merge=lfs -text
+ eval/gpu/inference_grid_10.png filter=lfs diff=lfs merge=lfs -text
+ eval/gpu/inference_grid_11.png filter=lfs diff=lfs merge=lfs -text
+ eval/gpu/inference_grid_16.png filter=lfs diff=lfs merge=lfs -text
+ eval/gpu/inference_grid_19.png filter=lfs diff=lfs merge=lfs -text
+ eval/gpu/inference_grid_2.png filter=lfs diff=lfs merge=lfs -text
+ eval/gpu/inference_grid_20.png filter=lfs diff=lfs merge=lfs -text
+ eval/gpu/inference_grid_22.png filter=lfs diff=lfs merge=lfs -text
+ eval/gpu/inference_grid_23.png filter=lfs diff=lfs merge=lfs -text
+ eval/gpu/inference_grid_5.png filter=lfs diff=lfs merge=lfs -text
+ eval/gpu/inference_grid_8.png filter=lfs diff=lfs merge=lfs -text
+ eval/gpu/inference_grid_9.png filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,354 @@
---
license: apache-2.0
base_model:
- microsoft/conditional-detr-resnet-50
pipeline_tag: object-detection
datasets:
- tech4humans/signature-detection
metrics:
- f1
- precision
- recall
library_name: transformers
inference: false
tags:
- object-detection
- signature-detection
- detr
- conditional-detr
- pytorch
model-index:
- name: tech4humans/conditional-detr-50-signature-detector
  results:
  - task:
      type: object-detection
    dataset:
      type: tech4humans/signature-detection
      name: tech4humans/signature-detection
      split: test
    metrics:
    - type: precision
      value: 0.936524
      name: [email protected]
    - type: precision
      value: 0.653321
      name: [email protected]:0.95
---

# **Conditional-DETR ResNet-50 - Handwritten Signature Detection**

This repository presents a Conditional-DETR model with a ResNet-50 backbone, fine-tuned to detect handwritten signatures in document images. The model achieved the **highest [email protected] (93.65%)** among all architectures tested in our evaluation.

| Resource | Links / Badges | Details |
|----------|----------------|---------|
| **Article** | [![Paper page](https://huggingface.co/datasets/huggingface/badges/resolve/main/paper-page-md.svg)](https://huggingface.co/blog/samuellimabraz/signature-detection-model) | A detailed community article covering the full development process of the project |
| **Model Files (YOLOv8s)** | [![HF Model](https://huggingface.co/datasets/huggingface/badges/resolve/main/model-on-hf-md.svg)](https://huggingface.co/tech4humans/yolov8s-signature-detector) | **Available formats:** [![PyTorch](https://img.shields.io/badge/PyTorch-%23EE4C2C.svg?style=flat&logo=PyTorch&logoColor=white)](https://pytorch.org/) [![ONNX](https://img.shields.io/badge/ONNX-005CED.svg?style=flat&logo=ONNX&logoColor=white)](https://onnx.ai/) [![TensorRT](https://img.shields.io/badge/TensorRT-76B900.svg?style=flat&logo=NVIDIA&logoColor=white)](https://developer.nvidia.com/tensorrt) |
| **Dataset – Original** | [![Roboflow](https://app.roboflow.com/images/download-dataset-badge.svg)](https://universe.roboflow.com/tech-ysdkk/signature-detection-hlx8j) | 2,819 document images annotated with signature coordinates |
| **Dataset – Processed** | [![HF Dataset](https://huggingface.co/datasets/huggingface/badges/resolve/main/dataset-on-hf-md.svg)](https://huggingface.co/datasets/tech4humans/signature-detection) | Augmented and pre-processed version (640px) for model training |
| **Notebooks – Model Experiments** | [![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1wSySw_zwyuv6XSaGmkngI4dwbj-hR4ix) [![W&B Training](https://img.shields.io/badge/W%26B_Training-FFBE00?style=flat&logo=WeightsAndBiases&logoColor=white)](https://api.wandb.ai/links/samuel-lima-tech4humans/30cmrkp8) | Complete training and evaluation pipeline with selection among different architectures (YOLO, DETR, RT-DETR, Conditional-DETR, YOLOS) |
| **Notebooks – HP Tuning** | [![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1wSySw_zwyuv6XSaGmkngI4dwbj-hR4ix) [![W&B HP Tuning](https://img.shields.io/badge/W%26B_HP_Tuning-FFBE00?style=flat&logo=WeightsAndBiases&logoColor=white)](https://api.wandb.ai/links/samuel-lima-tech4humans/31a6zhb1) | Optuna trials for optimizing the precision/recall balance |
| **Inference Server** | [![GitHub](https://img.shields.io/badge/Deploy-ffffff?style=for-the-badge&logo=github&logoColor=black)](https://github.com/tech4ai/t4ai-signature-detect-server) | Complete deployment and inference pipeline with Triton Inference Server<br> [![OpenVINO](https://img.shields.io/badge/OpenVINO-00c7fd?style=flat&logo=intel&logoColor=white)](https://docs.openvino.ai/2025/index.html) [![Docker](https://img.shields.io/badge/Docker-2496ED?logo=docker&logoColor=fff)](https://www.docker.com/) [![Triton](https://img.shields.io/badge/Triton-Inference%20Server-76B900?labelColor=black&logo=nvidia)](https://developer.nvidia.com/triton-inference-server) |
| **Live Demo** | [![HF Space](https://huggingface.co/datasets/huggingface/badges/resolve/main/open-in-hf-spaces-md.svg)](https://huggingface.co/spaces/tech4humans/signature-detection) | Graphical interface with real-time inference<br> [![Gradio](https://img.shields.io/badge/Gradio-FF5722?style=flat&logo=Gradio&logoColor=white)](https://www.gradio.app/) [![Plotly](https://img.shields.io/badge/Plotly-000000?style=flat&logo=plotly&logoColor=white)](https://plotly.com/python/) |

---

## **Dataset**

<table>
  <tr>
    <td style="text-align: center; padding: 10px;">
      <a href="https://universe.roboflow.com/tech-ysdkk/signature-detection-hlx8j">
        <img src="https://app.roboflow.com/images/download-dataset-badge.svg">
      </a>
    </td>
    <td style="text-align: center; padding: 10px;">
      <a href="https://huggingface.co/datasets/tech4humans/signature-detection">
        <img src="https://huggingface.co/datasets/huggingface/badges/resolve/main/dataset-on-hf-md-dark.svg" alt="Dataset on HF">
      </a>
    </td>
  </tr>
</table>

The training utilized a dataset built from two public datasets, [Tobacco800](https://paperswithcode.com/dataset/tobacco-800) and [signatures-xc8up](https://universe.roboflow.com/roboflow-100/signatures-xc8up), unified and processed in [Roboflow](https://roboflow.com/).

**Dataset Summary:**
- Training: 1,980 images (70%)
- Validation: 420 images (15%)
- Testing: 419 images (15%)
- Format: COCO JSON
- Resolution: 640x640 pixels

![Roboflow Dataset](./assets/roboflow_ds.png)
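
For quick experimentation, the processed dataset can be pulled directly from the Hub. Below is a minimal sketch using the `datasets` library; the exact annotation field names follow the common Hub object-detection layout and should be verified against the dataset viewer.

```python
from datasets import load_dataset

# Load the augmented, pre-processed 640px dataset from the Hub
ds = load_dataset("tech4humans/signature-detection")
print(ds)  # expected splits: train / validation / test

sample = ds["train"][0]
image = sample["image"]  # PIL image
# Annotations are assumed to live in an "objects"-style field;
# check the dataset card for the actual schema.
print({k: type(v) for k, v in sample.items()})
```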

---

## **Training Process**

The training process involved the following steps:

### 1. **Model Selection**

Various object detection models were evaluated to identify the best balance between precision, recall, and inference time.

| **Metric** | [rtdetr-l](https://github.com/ultralytics/assets/releases/download/v8.2.0/rtdetr-l.pt) | [yolos-base](https://huggingface.co/hustvl/yolos-base) | [yolos-tiny](https://huggingface.co/hustvl/yolos-tiny) | [conditional-detr-resnet-50](https://huggingface.co/microsoft/conditional-detr-resnet-50) | [detr-resnet-50](https://huggingface.co/facebook/detr-resnet-50) | [yolov8x](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8x.pt) | [yolov8l](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8l.pt) | [yolov8m](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8m.pt) | [yolov8s](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8s.pt) | [yolov8n](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8n.pt) | [yolo11x](https://github.com/ultralytics/assets/releases/download/v8.3.0/yolo11x.pt) | [yolo11l](https://github.com/ultralytics/assets/releases/download/v8.3.0/yolo11l.pt) | [yolo11m](https://github.com/ultralytics/assets/releases/download/v8.3.0/yolo11m.pt) | [yolo11s](https://github.com/ultralytics/assets/releases/download/v8.3.0/yolo11s.pt) | [yolo11n](https://github.com/ultralytics/assets/releases/download/v8.3.0/yolo11n.pt) | [yolov10x](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov10x.pt) | [yolov10l](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov10l.pt) | [yolov10b](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov10b.pt) | [yolov10m](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov10m.pt) | [yolov10s](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov10s.pt) | [yolov10n](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov10n.pt) |
|:---------------------|---------:|-----------:|-----------:|---------------------------:|---------------:|--------:|--------:|--------:|--------:|--------:|--------:|--------:|--------:|--------:|--------:|---------:|---------:|---------:|---------:|---------:|---------:|
| **Inference Time - CPU (ms)** | 583.608 | 1706.49 | 265.346 | 476.831 | 425.649 | 1259.47 | 871.329 | 401.183 | 216.6 | 110.442 | 1016.68 | 518.147 | 381.652 | 179.792 | 106.656 | 821.183 | 580.767 | 473.109 | 320.12 | 150.076 | **73.8596** |
| **mAP50** | 0.92709 | 0.901154 | 0.869814 | **0.936524** | 0.88885 | 0.794237 | 0.800312 | 0.875322 | 0.874721 | 0.816089 | 0.667074 | 0.707409 | 0.809557 | 0.835605 | 0.813799 | 0.681023 | 0.726802 | 0.789835 | 0.787688 | 0.663877 | 0.734332 |
| **mAP50-95** | 0.622364 | 0.583569 | 0.469064 | 0.653321 | 0.579428 | 0.552919 | 0.593976 | **0.665495** | 0.65457 | 0.623963 | 0.482289 | 0.499126 | 0.600797 | 0.638849 | 0.617496 | 0.474535 | 0.522654 | 0.578874 | 0.581259 | 0.473857 | 0.552704 |

![Model Selection](./assets/model_selection.png)

#### Highlights:
- **Best mAP50:** `conditional-detr-resnet-50` (**0.936524**)
- **Best mAP50-95:** `yolov8m` (**0.665495**)
- **Fastest Inference Time:** `yolov10n` (**73.8596 ms**)

Detailed experiments are available on [**Weights & Biases**](https://api.wandb.ai/links/samuel-lima-tech4humans/30cmrkp8).

### 2. **Hyperparameter Tuning**

The YOLOv8s model, which demonstrated a good balance of inference time, precision, and recall, was selected for hyperparameter tuning.

[Optuna](https://optuna.org/) was used to run 20 optimization trials over the following search space (a study sketch follows the code block):

```python
dropout = trial.suggest_float("dropout", 0.0, 0.5, step=0.1)
lr0 = trial.suggest_float("lr0", 1e-5, 1e-1, log=True)
box = trial.suggest_float("box", 3.0, 7.0, step=1.0)
cls = trial.suggest_float("cls", 0.5, 1.5, step=0.2)
opt = trial.suggest_categorical("optimizer", ["AdamW", "RMSProp"])
```
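
The snippet above defines the search space inside an Optuna objective. Below is a minimal sketch of how such a study might be wired up with the Ultralytics training API; the `data.yaml` path, the epoch budget, and the use of `metrics.box.map` (mAP50-95) as the objective value are illustrative assumptions, not the project's exact setup.

```python
import optuna
from ultralytics import YOLO

def objective(trial: optuna.Trial) -> float:
    # Search space from the snippet above
    dropout = trial.suggest_float("dropout", 0.0, 0.5, step=0.1)
    lr0 = trial.suggest_float("lr0", 1e-5, 1e-1, log=True)
    box = trial.suggest_float("box", 3.0, 7.0, step=1.0)
    cls = trial.suggest_float("cls", 0.5, 1.5, step=0.2)
    opt = trial.suggest_categorical("optimizer", ["AdamW", "RMSProp"])

    model = YOLO("yolov8s.pt")
    metrics = model.train(
        data="data.yaml",   # hypothetical dataset config path
        epochs=50,          # illustrative training budget
        dropout=dropout,
        lr0=lr0,
        box=box,
        cls=cls,
        optimizer=opt,
    )
    # Assumes the returned metrics object exposes mAP50-95 as box.map
    return metrics.box.map

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=20)  # 20 trials, as in the project
print(study.best_params)
```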

Results can be visualized in the [**Hyperparameter Tuning Experiment**](https://api.wandb.ai/links/samuel-lima-tech4humans/31a6zhb1) on Weights & Biases.

![Hyperparameter Tuning Sweep](./assets/sweep.png)

### 3. **Evaluation**

The models were evaluated on the test set at the end of training, in ONNX (CPU) and TensorRT (GPU - T4) formats. Performance metrics included precision, recall, mAP50, and mAP50-95.

![Trials](./assets/trials.png)

#### Results Comparison:

| Metric | Base Model | Best Trial (#10) | Difference |
|-----------|------------|------------------|------------|
| mAP50 | 87.47% | **95.75%** | +8.28% |
| mAP50-95 | 65.46% | **66.26%** | +0.81% |
| Precision | **97.23%** | 95.61% | -1.63% |
| Recall | 76.16% | **91.21%** | +15.05% |
| F1-score | 85.42% | **93.36%** | +7.94% |
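
As a quick sanity check, the F1 column follows directly from the precision/recall pairs via F1 = 2PR / (P + R):

```python
# Recompute F1 from the table's (rounded) precision/recall values
for name, p, r in [("base model", 0.9723, 0.7616), ("best trial", 0.9561, 0.9121)]:
    f1 = 2 * p * r / (p + r)
    print(f"{name}: F1 = {f1:.4f}")
# ~0.8541 and ~0.9336, matching the table to within rounding of the inputs
```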

---

## **Results**

After hyperparameter tuning of the YOLOv8s model, the best model achieved the following results on the test set:

- **Precision:** 94.74%
- **Recall:** 89.72%
- **mAP@50:** 94.50%
- **mAP@50-95:** 67.35%
- **Inference Time:**
  - **ONNX Runtime (CPU):** 171.56 ms
  - **TensorRT (GPU - T4):** 7.657 ms

---

## **How to Use**

### **Installation**

```bash
pip install transformers torch torchvision pillow
```

### **Inference**

```python
from transformers import AutoImageProcessor, AutoModelForObjectDetection
from PIL import Image
import torch

# Load model and processor
model_name = "tech4humans/conditional-detr-50-signature-detector"
processor = AutoImageProcessor.from_pretrained(model_name)
model = AutoModelForObjectDetection.from_pretrained(model_name)

# Load and process image
image = Image.open("path/to/your/document.jpg")
inputs = processor(images=image, return_tensors="pt")

# Run inference
with torch.no_grad():
    outputs = model(**inputs)

# Post-process results
target_sizes = torch.tensor([image.size[::-1]])
results = processor.post_process_object_detection(
    outputs, target_sizes=target_sizes, threshold=0.5
)[0]

# Extract detections
for score, label, box in zip(results["scores"], results["labels"], results["boxes"]):
    box = [round(i, 2) for i in box.tolist()]
    print(f"Detected signature with confidence {round(score.item(), 3)} at location {box}")
```
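
If a CUDA GPU is available, the same pipeline can optionally run on it. A small sketch, reusing the variable names from the snippet above:

```python
import torch

# Move model and inputs to the GPU when one is available
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
inputs = {k: v.to(device) for k, v in inputs.items()}

with torch.no_grad():
    outputs = model(**inputs)
```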

### **Visualization**

```python
import matplotlib.pyplot as plt
import matplotlib.patches as patches
from PIL import Image

def visualize_predictions(image_path, results, threshold=0.5):
    image = Image.open(image_path)
    fig, ax = plt.subplots(1, figsize=(12, 9))
    ax.imshow(image)

    for score, label, box in zip(results["scores"], results["labels"], results["boxes"]):
        if score > threshold:
            x, y, x2, y2 = box.tolist()
            width, height = x2 - x, y2 - y

            rect = patches.Rectangle(
                (x, y), width, height,
                linewidth=2, edgecolor='red', facecolor='none'
            )
            ax.add_patch(rect)
            ax.text(x, y - 10, f'Signature: {score:.3f}',
                    bbox=dict(boxstyle="round,pad=0.3", facecolor="yellow", alpha=0.7))

    ax.set_title("Signature Detection Results")
    plt.axis('off')
    plt.show()

# Use the visualization
visualize_predictions("path/to/your/document.jpg", results)
```

---

## **Demo**

You can explore the model and test real-time inference in the Hugging Face Spaces demo, built with Gradio and ONNX Runtime.

[![Open in Spaces](https://huggingface.co/datasets/huggingface/badges/resolve/main/open-in-hf-spaces-md.svg)](https://huggingface.co/spaces/tech4humans/signature-detection)

---

## 🔗 **Inference with Triton Server**

If you want to deploy this signature detection model in a production environment, check out our inference server repository, based on the NVIDIA Triton Inference Server. A minimal client sketch follows the badge table below.

<table>
  <tr>
    <td>
      <a href="https://github.com/triton-inference-server/server"><img src="https://img.shields.io/badge/Triton-Inference%20Server-76B900?style=for-the-badge&labelColor=black&logo=nvidia" alt="Triton Badge" /></a>
    </td>
    <td>
      <a href="https://github.com/tech4ai/t4ai-signature-detect-server"><img src="https://img.shields.io/badge/github-%23121011.svg?style=for-the-badge&logo=github&logoColor=white" alt="GitHub Badge" /></a>
    </td>
  </tr>
</table>
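
For programmatic access once the server is running, here is a minimal client sketch with `tritonclient`; the endpoint, model name, and tensor names are hypothetical and must be matched to the model repository's `config.pbtxt`.

```python
import numpy as np
import tritonclient.http as httpclient

# Hypothetical endpoint and names; check the server's config.pbtxt
client = httpclient.InferenceServerClient(url="localhost:8000")

batch = np.zeros((1, 3, 640, 640), dtype=np.float32)  # preprocessed image
inp = httpclient.InferInput("input", batch.shape, "FP32")
inp.set_data_from_numpy(batch)

response = client.infer(model_name="signature-detection", inputs=[inp])
detections = response.as_numpy("output")  # hypothetical output tensor name
print(detections.shape)
```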

---

## **Infrastructure**

### Software

The model was trained and tuned using a Jupyter Notebook environment.

- **Operating System:** Ubuntu 22.04
- **Python:** 3.10.12
- **PyTorch:** 2.5.1+cu121
- **Ultralytics:** 8.3.58
- **Roboflow:** 1.1.50
- **Optuna:** 4.1.0
- **ONNX Runtime:** 1.20.1
- **TensorRT:** 10.7.0

### Hardware

Training was performed on a Google Cloud Platform n1-standard-8 instance with the following specifications:

- **CPU:** 8 vCPUs
- **GPU:** NVIDIA Tesla T4

---

## **License**

### Model Weights, Code, and Training Materials – **Apache 2.0**

- **License:** Apache License 2.0
- **Usage:** All training scripts, deployment code, and usage instructions are licensed under the Apache 2.0 license.

---

## **Citation**

If you use this model in your research, please cite:

```bibtex
@misc{lima2024conditional-detr-signature-detection,
  title={Conditional-DETR for Handwritten Signature Detection},
  author={Lima, Samuel and Tech4Humans Team},
  year={2024},
  publisher={Hugging Face},
  url={https://huggingface.co/tech4humans/conditional-detr-50-signature-detector}
}
```

---

## **Contact and Information**

For further information, questions, or contributions, contact us at **[email protected]**.

<div align="center">
  <p>
    📧 <b>Email:</b> <a href="mailto:[email protected]">[email protected]</a><br>
    🌐 <b>Website:</b> <a href="https://www.tech4.ai/">www.tech4.ai</a><br>
    💼 <b>LinkedIn:</b> <a href="https://www.linkedin.com/company/tech4humans-hyperautomation/">Tech4Humans</a>
  </p>
</div>

## **Author**

<div align="center">
  <table>
    <tr>
      <td align="center" width="140">
        <a href="https://huggingface.co/samuellimabraz">
          <img src="https://avatars.githubusercontent.com/u/115582014?s=400&u=c149baf46c51fdee45ad5344cf1b360236d90d09&v=4" width="120" alt="Samuel Lima"/>
          <h3>Samuel Lima</h3>
        </a>
        <p><i>AI Research Engineer</i></p>
        <p>
          <a href="https://huggingface.co/samuellimabraz">
            <img src="https://img.shields.io/badge/🤗_HuggingFace-samuellimabraz-orange" alt="HuggingFace"/>
          </a>
        </p>
      </td>
      <td width="500">
        <h4>Responsibilities in this Project</h4>
        <ul>
          <li>🔬 Model development and training</li>
          <li>📊 Dataset analysis and processing</li>
          <li>⚙️ Architecture selection and performance evaluation</li>
          <li>📝 Technical documentation and model card</li>
        </ul>
      </td>
    </tr>
  </table>
</div>

---

<div align="center">
  <p>Developed with 💜 by <a href="https://www.tech4.ai/">Tech4Humans</a></p>
</div>
best_checkpoint/config.json ADDED
@@ -0,0 +1,61 @@
{
  "_name_or_path": "microsoft/conditional-detr-resnet-50",
  "activation_dropout": 0.0,
  "activation_function": "relu",
  "architectures": [
    "ConditionalDetrForObjectDetection"
  ],
  "attention_dropout": 0.0,
  "auxiliary_loss": false,
  "backbone": "resnet50",
  "backbone_config": null,
  "backbone_kwargs": {
    "in_chans": 3,
    "out_indices": [
      1,
      2,
      3,
      4
    ]
  },
  "bbox_cost": 5,
  "bbox_loss_coefficient": 5,
  "class_cost": 2,
  "cls_loss_coefficient": 2,
  "d_model": 256,
  "decoder_attention_heads": 8,
  "decoder_ffn_dim": 2048,
  "decoder_layerdrop": 0.0,
  "decoder_layers": 6,
  "dice_loss_coefficient": 1,
  "dilation": false,
  "dropout": 0.1,
  "encoder_attention_heads": 8,
  "encoder_ffn_dim": 2048,
  "encoder_layerdrop": 0.0,
  "encoder_layers": 6,
  "focal_alpha": 0.25,
  "giou_cost": 2,
  "giou_loss_coefficient": 2,
  "id2label": {
    "0": "signature"
  },
  "init_std": 0.02,
  "init_xavier_std": 1.0,
  "is_encoder_decoder": true,
  "label2id": {
    "signature": 0
  },
  "mask_loss_coefficient": 1,
  "max_position_embeddings": 1024,
  "model_type": "conditional_detr",
  "num_channels": 3,
  "num_hidden_layers": 6,
  "num_queries": 300,
  "position_embedding_type": "sine",
  "scale_embedding": false,
  "torch_dtype": "float32",
  "transformers_version": "4.46.3",
  "use_pretrained_backbone": true,
  "use_timm_backbone": true
}
best_checkpoint/model.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1b804b3797a81dbaa7f803c93ddff884acb321b10f3ad2520861b378e72cb3ef
size 174075684
best_checkpoint/optimizer.pt ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:60667f62d23d0156209d0db0cd48fc1bf1aaaabf2f564a2cf22aa304543eecd0
size 345689625
best_checkpoint/preprocessor_config.json ADDED
@@ -0,0 +1,26 @@
{
  "do_convert_annotations": true,
  "do_normalize": true,
  "do_pad": true,
  "do_rescale": true,
  "do_resize": true,
  "format": "coco_detection",
  "image_mean": [
    0.485,
    0.456,
    0.406
  ],
  "image_processor_type": "ConditionalDetrImageProcessor",
  "image_std": [
    0.229,
    0.224,
    0.225
  ],
  "pad_size": null,
  "resample": 2,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "height": 640,
    "width": 640
  }
}
best_checkpoint/rng_state.pth ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:672f61b91e1dc0ec0cfc7cc6bea9c0630fa1b53fe3a606869eead6061469864c
size 14244
best_checkpoint/scheduler.pt ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:73201c99891272e8d20ef63730f93b9b956d012d7aefe414a361a43f9b574909
size 1064
best_checkpoint/trainer_state.json ADDED
The diff for this file is too large to render. See raw diff
 
best_checkpoint/training_args.bin ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3706f9f79f5744209c871ccf9fbee60fa5a8e284a17427199064284853941395
size 5496
config.json ADDED
@@ -0,0 +1,61 @@
{
  "_name_or_path": "microsoft/conditional-detr-resnet-50",
  "activation_dropout": 0.0,
  "activation_function": "relu",
  "architectures": [
    "ConditionalDetrForObjectDetection"
  ],
  "attention_dropout": 0.0,
  "auxiliary_loss": false,
  "backbone": "resnet50",
  "backbone_config": null,
  "backbone_kwargs": {
    "in_chans": 3,
    "out_indices": [
      1,
      2,
      3,
      4
    ]
  },
  "bbox_cost": 5,
  "bbox_loss_coefficient": 5,
  "class_cost": 2,
  "cls_loss_coefficient": 2,
  "d_model": 256,
  "decoder_attention_heads": 8,
  "decoder_ffn_dim": 2048,
  "decoder_layerdrop": 0.0,
  "decoder_layers": 6,
  "dice_loss_coefficient": 1,
  "dilation": false,
  "dropout": 0.1,
  "encoder_attention_heads": 8,
  "encoder_ffn_dim": 2048,
  "encoder_layerdrop": 0.0,
  "encoder_layers": 6,
  "focal_alpha": 0.25,
  "giou_cost": 2,
  "giou_loss_coefficient": 2,
  "id2label": {
    "0": "signature"
  },
  "init_std": 0.02,
  "init_xavier_std": 1.0,
  "is_encoder_decoder": true,
  "label2id": {
    "signature": 0
  },
  "mask_loss_coefficient": 1,
  "max_position_embeddings": 1024,
  "model_type": "conditional_detr",
  "num_channels": 3,
  "num_hidden_layers": 6,
  "num_queries": 300,
  "position_embedding_type": "sine",
  "scale_embedding": false,
  "torch_dtype": "float32",
  "transformers_version": "4.46.3",
  "use_pretrained_backbone": true,
  "use_timm_backbone": true
}
eval/cpu/confusion_matrix.png ADDED
eval/cpu/inference_grid_0.png ADDED

Git LFS Details

  • SHA256: 531b5f201a53888a78883489ebe2b4abcedb73829aca2838a925d4c003917e33
  • Pointer size: 131 Bytes
  • Size of remote file: 116 kB
eval/cpu/inference_grid_1.png ADDED

Git LFS Details

  • SHA256: 75a74c8f9e0be541121074e1146d26e64ff84b46b76fa673a2f23d5358babb65
  • Pointer size: 131 Bytes
  • Size of remote file: 130 kB
eval/cpu/inference_grid_10.png ADDED

Git LFS Details

  • SHA256: 39e12511e3892731bfe17542eacd66cd59fe55860d88b0f5eeb027eb14b50fd9
  • Pointer size: 131 Bytes
  • Size of remote file: 111 kB
eval/cpu/inference_grid_11.png ADDED

Git LFS Details

  • SHA256: 10023be3aacc95f06ba3ad7eea15f11762dde10ce82728f23d76cc6b83df34b0
  • Pointer size: 131 Bytes
  • Size of remote file: 140 kB
eval/cpu/inference_grid_12.png ADDED
eval/cpu/inference_grid_13.png ADDED
eval/cpu/inference_grid_14.png ADDED
eval/cpu/inference_grid_15.png ADDED
eval/cpu/inference_grid_16.png ADDED

Git LFS Details

  • SHA256: 1c16506561145145a7c3d67d326c8fbecaa9c3db93b052e544a100a4a6f77289
  • Pointer size: 131 Bytes
  • Size of remote file: 144 kB
eval/cpu/inference_grid_17.png ADDED
eval/cpu/inference_grid_18.png ADDED
eval/cpu/inference_grid_19.png ADDED

Git LFS Details

  • SHA256: 8db0a3330ad0555c48b7fa9e5653d96a6068cee7a61b803a85fe6baba0b887e3
  • Pointer size: 131 Bytes
  • Size of remote file: 113 kB
eval/cpu/inference_grid_2.png ADDED

Git LFS Details

  • SHA256: 08c04041c23c4290b35614cd463e5f3d94fbd1f4130bf1beec2e87a7a136cf38
  • Pointer size: 131 Bytes
  • Size of remote file: 101 kB
eval/cpu/inference_grid_20.png ADDED

Git LFS Details

  • SHA256: 9045dc42b8df8df426cd0544df2f6a287ad62fffa5d43fd47931419a57f6004e
  • Pointer size: 131 Bytes
  • Size of remote file: 158 kB
eval/cpu/inference_grid_21.png ADDED
eval/cpu/inference_grid_22.png ADDED

Git LFS Details

  • SHA256: 21de7020d60fd497f31ba954ff35fcced26fb75cf2224ad10049b1046bee202f
  • Pointer size: 131 Bytes
  • Size of remote file: 116 kB
eval/cpu/inference_grid_23.png ADDED

Git LFS Details

  • SHA256: 84b4b553806a31b7fe66a330087d3a2dc1fb23a5037f15e4c35c30cbb15acdba
  • Pointer size: 131 Bytes
  • Size of remote file: 145 kB
eval/cpu/inference_grid_24.png ADDED
eval/cpu/inference_grid_3.png ADDED
eval/cpu/inference_grid_4.png ADDED
eval/cpu/inference_grid_5.png ADDED

Git LFS Details

  • SHA256: 8263af9871e524c9745f2160e83e9eb8009349c8847c05b45f8a62f7e267b999
  • Pointer size: 131 Bytes
  • Size of remote file: 110 kB
eval/cpu/inference_grid_6.png ADDED
eval/cpu/inference_grid_7.png ADDED
eval/cpu/inference_grid_8.png ADDED

Git LFS Details

  • SHA256: 29b7bd53402eb2c8fdfae4bea951cb590cef2b16349f072024966417f83b55f7
  • Pointer size: 131 Bytes
  • Size of remote file: 115 kB
eval/cpu/inference_grid_9.png ADDED

Git LFS Details

  • SHA256: f1fc49bd0f4cc40091408d4c03a317ec602e89bc72d614abb17cc44971d99973
  • Pointer size: 131 Bytes
  • Size of remote file: 145 kB
eval/gpu/confusion_matrix.png ADDED
eval/gpu/inference_grid_0.png ADDED

Git LFS Details

  • SHA256: 531b5f201a53888a78883489ebe2b4abcedb73829aca2838a925d4c003917e33
  • Pointer size: 131 Bytes
  • Size of remote file: 116 kB
eval/gpu/inference_grid_1.png ADDED

Git LFS Details

  • SHA256: 75a74c8f9e0be541121074e1146d26e64ff84b46b76fa673a2f23d5358babb65
  • Pointer size: 131 Bytes
  • Size of remote file: 130 kB
eval/gpu/inference_grid_10.png ADDED

Git LFS Details

  • SHA256: 39e12511e3892731bfe17542eacd66cd59fe55860d88b0f5eeb027eb14b50fd9
  • Pointer size: 131 Bytes
  • Size of remote file: 111 kB
eval/gpu/inference_grid_11.png ADDED

Git LFS Details

  • SHA256: 10023be3aacc95f06ba3ad7eea15f11762dde10ce82728f23d76cc6b83df34b0
  • Pointer size: 131 Bytes
  • Size of remote file: 140 kB
eval/gpu/inference_grid_12.png ADDED
eval/gpu/inference_grid_13.png ADDED
eval/gpu/inference_grid_14.png ADDED
eval/gpu/inference_grid_15.png ADDED
eval/gpu/inference_grid_16.png ADDED

Git LFS Details

  • SHA256: 1c16506561145145a7c3d67d326c8fbecaa9c3db93b052e544a100a4a6f77289
  • Pointer size: 131 Bytes
  • Size of remote file: 144 kB
eval/gpu/inference_grid_17.png ADDED
eval/gpu/inference_grid_18.png ADDED
eval/gpu/inference_grid_19.png ADDED

Git LFS Details

  • SHA256: 8db0a3330ad0555c48b7fa9e5653d96a6068cee7a61b803a85fe6baba0b887e3
  • Pointer size: 131 Bytes
  • Size of remote file: 113 kB