---
title: Door Window Detection
emoji: 😻
colorFrom: pink
colorTo: indigo
sdk: docker
pinned: false
license: mit
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# Door & Window Detection using YOLOv8

A custom-trained YOLOv8 model for detecting doors and windows in construction blueprint-style images, deployed as a FastAPI service with dual response modes.

## 🚀 Demo

**Live API:** https://huggingface.co/spaces/kurakula-Prashanth2004/door-window-detection

**GitHub Repository:** https://github.com/kurakula-prashanth/door-window-detection

## 📋 Project Overview

This project implements a complete machine learning pipeline for detecting doors and windows in architectural blueprints:

1. **Manual Data Labeling** - Created a custom dataset with bounding-box annotations
2. **Model Training** - Fine-tuned a YOLOv8 model on the custom-labeled data
3. **API Development** - Built a FastAPI service with dual response modes (JSON + annotated images)
4. **Deployment** - Deployed to Hugging Face Spaces with Docker

## 🎯 Classes Detected

- `door` - Door symbols in blueprints
- `window` - Window symbols in blueprints

## ✨ Key Features

- **Dual Response Modes**: Get JSON data or annotated images
- **Interactive Swagger UI**: Built-in API documentation at `/docs`
- **Smart Image Processing**: Automatic resizing for large images (max 1280px)
- **GPU Acceleration**: CUDA support with FP16 precision
- **Async Processing**: Non-blocking inference with a ThreadPoolExecutor
- **Dynamic Color Coding**: Consistent colors for each detection class
- **Confidence Filtering**: Configurable confidence threshold (default: 0.5)

## 🛠️ Setup & Installation

### Local Development

1. Clone the repository

   ```bash
   git clone https://github.com/kurakula-prashanth/door-window-detection.git
   cd door-window-detection
   ```

2. Create a virtual environment

   ```bash
   python3.12 -m venv yolo8_custom
   source yolo8_custom/bin/activate  # On Windows: yolo8_custom\Scripts\activate
   ```

3. Install dependencies

   ```bash
   pip install -r requirements.txt
   ```

4. Run the API locally

   ```bash
   uvicorn app:app --host 0.0.0.0 --port 8000 --reload
   ```

5. Access the interactive docs at http://localhost:8000/docs

## 📊 Training Process

### Step 1: Data Labeling

- Used LabelImg for manual annotation
- Labeled 15-20 construction blueprint images
- Created bounding boxes for doors and windows only
- Generated YOLO-format labels (`.txt` files)
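LabelImg's YOLO export writes one `.txt` file per image, with one line per box in the form `<class_id> <x_center> <y_center> <width> <height>`, all coordinates normalized to the 0-1 range. A hypothetical two-box label file (the values are illustrative, assuming class `0` = door and `1` = window per `classes.txt`):

```text
0 0.4531 0.6172 0.0625 0.0984
1 0.7266 0.3340 0.0938 0.0547
```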

*(Screenshots: labeling images in LabelImg)*

### Step 2: Model Training

```bash
yolo task=detect mode=train epochs=100 data=data_custom.yaml model=yolov8m.pt imgsz=640
```

**Training Configuration:**

- Base Model: YOLOv8 Medium (`yolov8m.pt`)
- Epochs: 100
- Image Size: 640x640
- Classes: 2 (door, window)
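The `data_custom.yaml` referenced above follows Ultralytics' standard dataset schema. The paths below are assumptions based on the project tree (`datasets/images`, `datasets/labels`), not taken from the actual repo file:

```yaml
# data_custom.yaml - illustrative sketch; paths are assumed
train: datasets/images   # training images (labels resolved from datasets/labels)
val: datasets/images     # validation images (path assumed)
nc: 2                    # number of classes
names: ["door", "window"]
```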

*(Screenshots: training runs)*

### Step 3: Model Testing

```bash
yolo task=detect mode=predict model=best.pt show=true conf=0.5 source=12.png line_thickness=1
```

*(Screenshots: test predictions)*

## 🔌 API Usage

### Main Endpoint

`POST /predict`

### Parameters

- `file` (required): PNG or JPG image upload (max 10MB)
- `response_type` (required): `json` or `image`

### Response Modes

#### 1. JSON Response (`response_type=json`)

Returns detection data in JSON format:

```json
{
  "detections": [
    {
      "label": "door",
      "confidence": 0.91,
      "bbox": [x, y, width, height]
    },
    {
      "label": "window",
      "confidence": 0.84,
      "bbox": [x, y, width, height]
    }
  ]
}
```
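Ultralytics models return corner-format boxes (`x1, y1, x2, y2`) by default, while the response above reports `[x, y, width, height]`. A minimal sketch of that conversion, assuming `(x, y)` is the top-left corner (the schema above does not specify this):

```python
def xyxy_to_xywh(box):
    """Convert a (x1, y1, x2, y2) corner box to (x, y, width, height),
    where (x, y) is taken as the top-left corner."""
    x1, y1, x2, y2 = box
    return [x1, y1, x2 - x1, y2 - y1]

print(xyxy_to_xywh([120, 80, 200, 260]))  # → [120, 80, 80, 180]
```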

#### 2. Image Response (`response_type=image`)

Returns an annotated PNG image with:

- Bounding boxes around detected objects
- Labels with confidence scores
- Color-coded detection classes
- Detection count in the response headers


### Usage Examples

#### cURL - JSON Response

```bash
curl -X POST "https://huggingface.co/spaces/kurakula-Prashanth2004/door-window-detection/predict" \
     -F "file=@your_blueprint.png" \
     -F "response_type=json"
```

#### cURL - Image Response

```bash
curl -X POST "https://huggingface.co/spaces/kurakula-Prashanth2004/door-window-detection/predict" \
     -F "file=@your_blueprint.png" \
     -F "response_type=image" \
     --output detected_result.png
```

#### Python - JSON Response

```python
import requests

url = "https://huggingface.co/spaces/kurakula-Prashanth2004/door-window-detection/predict"

with open("blueprint.png", "rb") as f:
    response = requests.post(url, files={"file": f}, data={"response_type": "json"})
response.raise_for_status()

detections = response.json()["detections"]
print(f"Found {len(detections)} objects")
```

#### Python - Image Response

```python
import requests

url = "https://huggingface.co/spaces/kurakula-Prashanth2004/door-window-detection/predict"

with open("blueprint.png", "rb") as f:
    response = requests.post(url, files={"file": f}, data={"response_type": "image"})
response.raise_for_status()

with open("annotated_result.png", "wb") as f:
    f.write(response.content)
```

## 🐳 Docker Deployment

The application is containerized with Docker:

```dockerfile
FROM python:3.10-slim

ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1

WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y \
    libglib2.0-0 libgl1-mesa-glx \
 && rm -rf /var/lib/apt/lists/*

# Install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860"]
```

## 📦 Dependencies

```text
fastapi
uvicorn
ultralytics
opencv-python-headless
pillow
torch
numpy
python-multipart
```

## ⚡ Performance Features

- **GPU Acceleration**: Automatically uses CUDA with FP16 precision when available
- **Model Warmup**: Dummy inference on startup for a faster first request
- **Async Processing**: Non-blocking image processing with a ThreadPoolExecutor (2 workers)
- **Smart Resizing**: Large images automatically resized to a maximum side of 1280px
- **Memory Efficient**: Optimized for production deployment
- **Confidence Thresholding**: Filters out detections below 0.5 confidence
- **IoU Filtering**: Non-maximum suppression with a 0.45 threshold
- **Color Consistency**: Hash-based color generation for detection labels
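The async-processing pattern can be sketched with the standard library alone. This is not the app's actual code: `run_inference` is a hypothetical stand-in for the blocking YOLO call, but the executor hand-off mirrors the 2-worker setup described above.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

# Two workers, matching the configuration described above.
executor = ThreadPoolExecutor(max_workers=2)

def run_inference(image_bytes: bytes) -> dict:
    # Placeholder for the blocking model call (e.g. model.predict(...)).
    return {"detections": [], "size": len(image_bytes)}

async def predict(image_bytes: bytes) -> dict:
    # Off-load the blocking call to the pool so the event loop
    # keeps serving other requests while inference runs.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(executor, run_inference, image_bytes)

result = asyncio.run(predict(b"\x89PNG"))
print(result["size"])  # → 4
```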

## 📁 Project Structure

```text
door-window-detection/
├── app.py                # FastAPI application
├── requirements.txt      # Python dependencies
├── Dockerfile            # Container configuration
├── yolov8m_custom.pt     # Trained model weights
├── data_custom.yaml      # Training configuration
├── classes.txt           # Class names
├── datasets/             # Training data
│   ├── images/
│   └── labels/
└── README.md             # This file
```

## 🔍 Model Configuration

- **Architecture**: YOLOv8 Medium (`yolov8m_custom.pt`)
- **Input Processing**: Auto-resize to a maximum side of 1280px, preserving aspect ratio
- **Inference Settings**:
  - Confidence Threshold: 0.5
  - IoU Threshold: 0.45
  - Max Detections: 100
  - Half Precision: Enabled on GPU
- **Classes**: 2 (door, window)
- **Training Data**: Custom-labeled blueprint images
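The aspect-ratio-preserving resize can be expressed as a small pure function. This is a sketch of the arithmetic, not the app's actual implementation (which may round differently):

```python
def resized_dims(width: int, height: int, max_side: int = 1280) -> tuple[int, int]:
    """Scale dimensions down so the longest side is at most max_side,
    preserving aspect ratio; smaller images are left unchanged."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    return round(width * scale), round(height * scale)

print(resized_dims(2560, 1440))  # → (1280, 720)
print(resized_dims(640, 480))   # → (640, 480)
```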

## 🎨 Visual Features

- **Dynamic Bounding Boxes**: Color-coded by detection class
- **Confidence Labels**: Show class name and confidence score
- **Hash-based Colors**: Consistent colors for each label type
- **High-Quality Output**: PNG format with preserved image quality
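Hash-based coloring means a label's color is derived from the label string itself, so `door` and `window` boxes look the same on every request. The exact scheme isn't shown in this README; one possible sketch uses an MD5 digest:

```python
import hashlib

def label_color(label: str) -> tuple[int, int, int]:
    """Derive a stable RGB color from a label name. MD5 (rather than
    Python's built-in hash) keeps the result identical across runs."""
    digest = hashlib.md5(label.encode("utf-8")).digest()
    return digest[0], digest[1], digest[2]

print(label_color("door") == label_color("door"))  # → True
```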

## 🔧 API Configuration

- **File Size Limit**: 10MB maximum
- **Supported Formats**: JPG, PNG
- **Concurrent Processing**: 2 worker threads
- **Response Headers**: Include detection-count metadata
- **Error Handling**: Validation with descriptive error messages

## 📈 Results & Screenshots

### Training Progress

- Loss curves and training metrics
- Model performance on the validation set
- Convergence after 100 epochs

### Evaluation Plots

- Confusion matrix (raw and normalized)
- F1, precision, recall, and precision-recall curves
- Label distribution plots
- Overall training results

### API Responses

- JSON detection data examples
- Annotated image outputs
- Performance benchmarks

### Interactive Documentation

- Swagger UI at `/docs`
- Parameter descriptions
- Live API testing interface

## 🤝 Contributing

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request

## 📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

## 🙏 Acknowledgments

- YOLOv8 by Ultralytics
- FastAPI framework
- Hugging Face Spaces for deployment
- LabelImg annotation tool