colorFrom: pink
colorTo: indigo
sdk: docker
pinned: false
license: mit
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# Door & Window Detection using YOLOv8

A custom-trained YOLOv8 model for detecting doors and windows in construction blueprint-style images, deployed as a FastAPI service with dual response modes.

## πŸš€ Demo

**Live API**: [https://huggingface.co/spaces/kurakula-Prashanth2004/door-window-detection](https://huggingface.co/spaces/kurakula-Prashanth2004/door-window-detection)

**GitHub Repository**: [https://github.com/kurakula-prashanth/door-window-detection](https://github.com/kurakula-prashanth/door-window-detection)

## πŸ“‹ Project Overview

This project implements a complete machine learning pipeline for detecting doors and windows in architectural blueprints:

1. **Manual Data Labeling** - Created a custom dataset with bounding box annotations
2. **Model Training** - Trained a YOLOv8 model using only the custom-labeled data
3. **API Development** - Built a FastAPI service with dual response modes (JSON + annotated images)
4. **Deployment** - Deployed to Hugging Face Spaces with Docker

## 🎯 Classes Detected

- `door` - Door symbols in blueprints
- `window` - Window symbols in blueprints

## ✨ Key Features

- **Dual Response Modes**: Get JSON data or annotated images
- **Interactive Swagger UI**: Built-in API documentation at `/docs`
- **Smart Image Processing**: Automatic resizing for large images (max 1280px)
- **GPU Acceleration**: CUDA support with FP16 precision
- **Async Processing**: Non-blocking inference with a ThreadPoolExecutor
- **Dynamic Color Coding**: Consistent colors for each detection class
- **Confidence Filtering**: Configurable confidence threshold (default: 0.5)

## πŸ› οΈ Setup & Installation

### Local Development

1. **Clone the repository**
```bash
git clone https://github.com/kurakula-prashanth/door-window-detection.git
cd door-window-detection
```

2. **Create virtual environment**
```bash
python3.12 -m venv yolo8_custom
source yolo8_custom/bin/activate  # On Windows: yolo8_custom\Scripts\activate
```

3. **Install dependencies**
```bash
pip install -r requirements.txt
```

4. **Run the API locally**
```bash
uvicorn app:app --host 0.0.0.0 --port 8000 --reload
```

5. **Access the API**
   - **Interactive Documentation**: http://localhost:8000/docs
   - **API Endpoint**: http://localhost:8000/predict

## πŸ“Š Training Process

### Step 1: Data Labeling
- Used **LabelImg** for manual annotation
- Labeled 15-20 construction blueprint images
- Created bounding boxes for doors and windows only
- Generated YOLO-format labels (`.txt` files; the format is shown below)
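Each labeled image gets a matching `.txt` file: one line per box, with a class index followed by the box center, width, and height, all normalized to the image size. An illustrative line (the values are made up, and the class index depends on the order in `classes.txt`):

```txt
0 0.456250 0.312500 0.093750 0.187500
```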

![Labeling image using labelImg - 1](https://github.com/user-attachments/assets/609fd6ee-fcc7-4c6a-973b-6c539e8515c5)

![Labeling image using labelImg - 2](https://github.com/user-attachments/assets/3666c451-8bc4-4d57-9ffa-48611deca6d3)

![Labeling image using labelImg - 3](https://github.com/user-attachments/assets/2f5f23fb-1086-412f-82a1-f1ff5e24dd75)

![Labeling image using labelImg - 4](https://github.com/user-attachments/assets/8bccf20e-d5dc-4d1b-923b-7f603d64f5d2)

### Step 2: Model Training
```bash
yolo task=detect mode=train epochs=100 data=data_custom.yaml model=yolov8m.pt imgsz=640
```
**Training Configuration:**
- Base Model: YOLOv8 Medium (yolov8m.pt)
- Epochs: 100
- Image Size: 640x640
- Classes: 2 (door, window)
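The `data_custom.yaml` passed to the trainer is not reproduced here; a minimal sketch for this two-class setup might look like the following (the dataset paths are illustrative and should match your local layout):

```yaml
# data_custom.yaml (illustrative paths)
train: datasets/images   # training images; labels are resolved from datasets/labels
val: datasets/images     # validation images
nc: 2                    # number of classes
names: ["door", "window"]
```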

![Training_img 1](https://github.com/user-attachments/assets/91d56bd7-ad51-412a-ac1d-f6519f4fb192)

![Training_img 2](https://github.com/user-attachments/assets/2c7e39c2-62ff-42ed-8f36-8246a1ef6754)

![Training_img 3](https://github.com/user-attachments/assets/334426cf-1189-45cc-a8a0-1fa5e17b7054)

![Training_img 4](https://github.com/user-attachments/assets/2a6b04e2-e7c3-476f-9490-f60725312eb4)

### Step 3: Model Testing
```bash
yolo task=detect mode=predict model=best.pt show=true conf=0.5 source=12.png line_thickness=1
```
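The same spot-check can be run through the Ultralytics Python API; a minimal sketch, assuming `best.pt` and the test image sit in the working directory:

```python
from ultralytics import YOLO

# Load the custom-trained weights and run a single prediction.
model = YOLO("best.pt")
results = model.predict("12.png", conf=0.5, show=True, line_width=1)
print(f"{len(results[0].boxes)} detections")
```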
![Testing_img 1](https://github.com/user-attachments/assets/3be7fed0-f8a0-4844-b203-d649fe93144a)

![Testing_img 2](https://github.com/user-attachments/assets/d1069eac-8e16-47c4-88a1-0d9707b81b75)

## πŸ”Œ API Usage

### Main Endpoint
```
POST /predict
```

### Parameters
- **file** (required): Upload a PNG or JPG image (max 10MB)
- **response_type** (required): Choose between `json` or `image`

### Response Modes

#### 1. JSON Response (`response_type=json`)
Returns detection data in JSON format:

```json
{
  "detections": [
    {
      "label": "door",
      "confidence": 0.91,
      "bbox": [x, y, width, height]
    },
    {
      "label": "window",
      "confidence": 0.84,
      "bbox": [x, y, width, height]
    }
  ]
}
```
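For reference, a response of this shape can be assembled from Ultralytics results in a few lines. This is a sketch of the idea, not the service's exact code; the key step is converting YOLO's corner-format boxes (`xyxy`) into `[x, y, width, height]`:

```python
def results_to_json(results, names):
    # `results` is what Ultralytics' model.predict() returns;
    # `names` maps class indices to labels (e.g. model.names).
    detections = []
    for box in results[0].boxes:
        x1, y1, x2, y2 = box.xyxy[0].tolist()
        detections.append({
            "label": names[int(box.cls)],
            "confidence": round(float(box.conf), 2),
            "bbox": [x1, y1, x2 - x1, y2 - y1],  # x, y, width, height
        })
    return {"detections": detections}
```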

#### 2. Image Response (`response_type=image`)
Returns an annotated PNG image with:
- Bounding boxes around detected objects
- Labels with confidence scores
- Color-coded detection classes
- Detection count in the response headers
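In FastAPI, attaching the detection count as response metadata might look like this minimal sketch (the header name `X-Detection-Count` is illustrative, not confirmed from the app code):

```python
from fastapi.responses import Response

def image_response(png_bytes: bytes, num_detections: int) -> Response:
    # Return the annotated PNG and expose the count as a header.
    return Response(
        content=png_bytes,
        media_type="image/png",
        headers={"X-Detection-Count": str(num_detections)},
    )
```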

![12](https://github.com/user-attachments/assets/d17d8988-72fc-4b8d-a254-d16ece3359da)

![17](https://github.com/user-attachments/assets/f63c5263-5cdf-4f52-b0c0-4f86bc07ffff)

![22](https://github.com/user-attachments/assets/362db0f0-cac8-451e-a54f-462e6bbb2c88)

### Usage Examples

The examples below target the Space's direct API domain (`*.hf.space`); the `huggingface.co/spaces/...` URL serves the web page, not the endpoint. If the subdomain differs, the exact URL can be copied from the Space's "Embed" option.

#### cURL - JSON Response
```bash
curl -X POST "https://kurakula-prashanth2004-door-window-detection.hf.space/predict" \
  -F "file=@your_blueprint.png" \
  -F "response_type=json"
```

#### cURL - Image Response
```bash
curl -X POST "https://kurakula-prashanth2004-door-window-detection.hf.space/predict" \
  -F "file=@your_blueprint.png" \
  -F "response_type=image" \
  --output detected_result.png
```

#### Python - JSON Response
```python
import requests

url = "https://kurakula-prashanth2004-door-window-detection.hf.space/predict"
with open("blueprint.png", "rb") as f:
    response = requests.post(url, files={"file": f}, data={"response_type": "json"})

detections = response.json()["detections"]
print(f"Found {len(detections)} objects")
```

#### Python - Image Response
```python
import requests

url = "https://kurakula-prashanth2004-door-window-detection.hf.space/predict"
with open("blueprint.png", "rb") as f:
    response = requests.post(url, files={"file": f}, data={"response_type": "image"})

with open("annotated_result.png", "wb") as f:
    f.write(response.content)
```

## 🐳 Docker Deployment

The application is containerized using Docker:

```dockerfile
FROM python:3.10-slim

ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1

WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y \
    libglib2.0-0 libgl1-mesa-glx \
    && rm -rf /var/lib/apt/lists/*

# Install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860"]
```
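To build and run the container locally, the usual Docker workflow applies (the image tag is arbitrary):

```bash
docker build -t door-window-detection .
docker run -p 7860:7860 door-window-detection
```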

## πŸ“¦ Dependencies

```txt
fastapi
uvicorn
ultralytics
opencv-python-headless
pillow
torch
numpy
python-multipart
```

## ⚑ Performance Features

- **GPU Acceleration**: Automatically uses CUDA with FP16 precision when available
- **Model Warmup**: Dummy inference on startup for a faster first request
- **Async Processing**: Non-blocking image processing with a ThreadPoolExecutor (2 workers); see the sketch below
- **Smart Resizing**: Large images automatically resized to max 1280px
- **Memory Efficient**: Optimized for production deployment
- **Confidence Thresholding**: Filters low-confidence detections (β‰₯0.5)
- **IoU Filtering**: Non-maximum suppression with a 0.45 threshold
- **Color Consistency**: Hash-based color generation for detection labels
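The warmup and async patterns above might be wired up as in the following sketch (an assumed shape, not the app's exact code; it presumes the `yolov8m_custom.pt` weights listed in the project structure below):

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

import numpy as np
from ultralytics import YOLO

model = YOLO("yolov8m_custom.pt")
# Warmup: one dummy inference so the first real request is fast.
model.predict(np.zeros((640, 640, 3), dtype=np.uint8))

executor = ThreadPoolExecutor(max_workers=2)  # matches the 2-worker setup above

def run_inference(image):
    # Blocking YOLO call, executed inside a worker thread.
    return model.predict(image, conf=0.5, iou=0.45, max_det=100)

async def predict_async(image):
    # A FastAPI handler can await this without blocking the event loop.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(executor, run_inference, image)
```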

## πŸ“ Project Structure

```
door-window-detection/
β”œβ”€β”€ app.py                 # FastAPI application
β”œβ”€β”€ requirements.txt       # Python dependencies
β”œβ”€β”€ Dockerfile             # Container configuration
β”œβ”€β”€ yolov8m_custom.pt      # Trained model weights
β”œβ”€β”€ data_custom.yaml       # Training configuration
β”œβ”€β”€ classes.txt            # Class names
β”œβ”€β”€ datasets/              # Training data
β”‚   β”œβ”€β”€ images/
β”‚   └── labels/
└── README.md              # This file
```

## πŸ” Model Configuration

- **Architecture**: YOLOv8 Medium (yolov8m_custom.pt)
- **Input Processing**: Auto-resize to max 1280px, maintaining aspect ratio
- **Inference Settings**:
  - Confidence Threshold: 0.5
  - IoU Threshold: 0.45
  - Max Detections: 100
  - Half Precision: Enabled on GPU
- **Classes**: 2 (door, window)
- **Training Data**: Custom-labeled blueprint images
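The aspect-preserving resize described above takes only a few lines with Pillow; a sketch, assuming the uploaded image has already been decoded:

```python
from PIL import Image

MAX_SIDE = 1280

def resize_if_needed(img: Image.Image) -> Image.Image:
    # Downscale so the longest side is at most 1280 px; smaller images pass through.
    scale = MAX_SIDE / max(img.size)
    if scale < 1.0:
        img = img.resize(
            (round(img.width * scale), round(img.height * scale)),
            Image.LANCZOS,
        )
    return img
```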

## 🎨 Visual Features

- **Dynamic Bounding Boxes**: Color-coded by detection class
- **Confidence Labels**: Shows class name and confidence score
- **Hash-based Colors**: Consistent colors for each label type
- **High-Quality Output**: PNG format with preserved image quality
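Hash-based coloring derives the color from the label text itself, so the same class always gets the same color; a minimal sketch of the idea:

```python
import hashlib

def label_color(label: str) -> tuple[int, int, int]:
    # The first three digest bytes become a stable (B, G, R) color for this label.
    digest = hashlib.md5(label.encode("utf-8")).digest()
    return (digest[0], digest[1], digest[2])

assert label_color("door") == label_color("door")  # consistent across calls
```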

## πŸ”§ API Configuration

- **File Size Limit**: 10MB maximum
- **Supported Formats**: JPG, PNG
- **Concurrent Processing**: 2 worker threads
- **Response Headers**: Include detection count metadata
- **Error Handling**: Comprehensive validation and error messages
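Enforcing the size and format limits in FastAPI might look like this sketch (status codes and messages are illustrative):

```python
from fastapi import HTTPException, UploadFile

MAX_BYTES = 10 * 1024 * 1024  # the 10MB limit noted above
ALLOWED_TYPES = {"image/jpeg", "image/png"}

async def validate_upload(file: UploadFile) -> bytes:
    # Reject unsupported formats before reading the body.
    if file.content_type not in ALLOWED_TYPES:
        raise HTTPException(status_code=400, detail="Only JPG and PNG are supported")
    data = await file.read()
    if len(data) > MAX_BYTES:
        raise HTTPException(status_code=413, detail="File exceeds the 10MB limit")
    return data
```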

## πŸ“ˆ Results & Screenshots

### Training Progress
- Loss curves and training metrics
- Model performance on the validation set
- Convergence after 100 epochs

#### Confusion Matrix

![confusion_matrix](https://github.com/user-attachments/assets/f75fa379-66d9-4552-b68b-357c96970ffc)

#### Normalized Confusion Matrix

![confusion_matrix_normalized](https://github.com/user-attachments/assets/3fe6b1d6-7aa9-4a68-907f-7f3df99846dd)

#### F1 Curve

![F1_curve](https://github.com/user-attachments/assets/ec55acfc-88ca-4358-b901-81055a3dc85d)

#### Label Distribution

![labels](https://github.com/user-attachments/assets/5f291a1d-5c4f-427b-ab90-f95cf674d18b)

#### Precision Curve

![P_curve](https://github.com/user-attachments/assets/a659451c-8804-4985-b99a-9747df198183)

#### Precision-Recall Curve

![PR_curve](https://github.com/user-attachments/assets/59fcdca6-5e04-43b3-bd3a-0499fef27279)

#### Recall Curve

![R_curve](https://github.com/user-attachments/assets/1c57de54-c492-465f-a43f-536f87a91add)

#### Results

![results](https://github.com/user-attachments/assets/c2c2b540-0ea9-4c7d-8208-ade4f0e7c7de)

### API Responses

- JSON detection data examples

![JSON Response](https://github.com/user-attachments/assets/68c5f208-0f5a-4d4a-8f1a-1203d8a2f72f)

- Annotated image outputs

![Image Response](https://github.com/user-attachments/assets/1edc83e8-1f4e-4c62-b8fc-78b3c0091e8c)

- Performance benchmarks

### Interactive Documentation
- Swagger UI at `/docs`
- Parameter descriptions
- Live API testing interface

![API Interface](https://github.com/user-attachments/assets/0ea0ada7-b072-4586-b26b-e6e7f2d9b333)

## 🀝 Contributing

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request

## πŸ“„ License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## πŸ™ Acknowledgments

- YOLOv8 by Ultralytics
- FastAPI framework
- Hugging Face Spaces for deployment
- LabelImg annotation tool