---
title: Door Window Detection
emoji: 😻
colorFrom: pink
colorTo: indigo
sdk: docker
pinned: false
license: mit
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# Door & Window Detection using YOLOv8

A custom-trained YOLOv8 model for detecting doors and windows in construction blueprint-style images, deployed as a FastAPI service with dual response modes.

## 🚀 Demo

**Live API:** https://huggingface.co/spaces/kurakula-Prashanth2004/door-window-detection

**GitHub Repository:** https://github.com/kurakula-prashanth/door-window-detection

## 📋 Project Overview

This project implements a complete machine learning pipeline for detecting doors and windows in architectural blueprints:

1. **Manual Data Labeling** - Created a custom dataset with bounding-box annotations
2. **Model Training** - Fine-tuned a YOLOv8 model on the custom-labeled data
3. **API Development** - Built a FastAPI service with dual response modes (JSON + annotated images)
4. **Deployment** - Deployed to Hugging Face Spaces with Docker

## 🎯 Classes Detected

- `door` - Door symbols in blueprints
- `window` - Window symbols in blueprints

## ✨ Key Features

- **Dual Response Modes**: Get JSON data or annotated images
- **Interactive Swagger UI**: Built-in API documentation at `/docs`
- **Smart Image Processing**: Automatic resizing for large images (max 1280px)
- **GPU Acceleration**: CUDA support with FP16 precision
- **Async Processing**: Non-blocking inference with a ThreadPoolExecutor
- **Dynamic Color Coding**: Consistent colors for each detection class
- **Confidence Filtering**: Configurable confidence threshold (default: 0.5)

## 🛠️ Setup & Installation

### Local Development

1. Clone the repository

   ```bash
   git clone https://github.com/kurakula-prashanth/door-window-detection.git
   cd door-window-detection
   ```

2. Create a virtual environment

   ```bash
   python3.12 -m venv yolo8_custom
   source yolo8_custom/bin/activate  # On Windows: yolo8_custom\Scripts\activate
   ```

3. Install dependencies

   ```bash
   pip install -r requirements.txt
   ```

4. Run the API locally

   ```bash
   uvicorn app:app --host 0.0.0.0 --port 8000 --reload
   ```

5. Access the interactive docs at http://localhost:8000/docs

## 📊 Training Process

### Step 1: Data Labeling

- Used LabelImg for manual annotation
- Labeled 15-20 construction blueprint images
- Created bounding boxes for doors and windows only
- Generated YOLO-format labels (`.txt` files)
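LabelImg's YOLO export writes one `.txt` file per image, with one line per box in the form `<class_id> <x_center> <y_center> <width> <height>`, all coordinates normalized to the 0-1 range. A hypothetical two-box label file (the values are illustrative, assuming class `0` = door and `1` = window per `classes.txt`):

```text
0 0.4531 0.6172 0.0625 0.0984
1 0.7266 0.3340 0.0938 0.0547
```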

*(Screenshots: labeling images in LabelImg)*

### Step 2: Model Training

```bash
yolo task=detect mode=train epochs=100 data=data_custom.yaml model=yolov8m.pt imgsz=640
```

**Training Configuration:**

- Base Model: YOLOv8 Medium (`yolov8m.pt`)
- Epochs: 100
- Image Size: 640x640
- Classes: 2 (door, window)
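The `data_custom.yaml` referenced above follows Ultralytics' standard dataset schema. The paths below are assumptions based on the project tree (`datasets/images`, `datasets/labels`), not taken from the actual repo file:

```yaml
# data_custom.yaml - illustrative sketch; paths are assumed
train: datasets/images   # training images (labels resolved from datasets/labels)
val: datasets/images     # validation images (path assumed)
nc: 2                    # number of classes
names: ["door", "window"]
```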

*(Screenshots: training runs)*

### Step 3: Model Testing

```bash
yolo task=detect mode=predict model=best.pt show=true conf=0.5 source=12.png line_thickness=1
```

*(Screenshots: test predictions)*

## 🔌 API Usage

### Main Endpoint

`POST /predict`

### Parameters

- `file` (required): PNG or JPG image upload (max 10MB)
- `response_type` (required): `json` or `image`

### Response Modes

#### 1. JSON Response (`response_type=json`)

Returns detection data in JSON format:

```json
{
  "detections": [
    {
      "label": "door",
      "confidence": 0.91,
      "bbox": [x, y, width, height]
    },
    {
      "label": "window",
      "confidence": 0.84,
      "bbox": [x, y, width, height]
    }
  ]
}
```
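Ultralytics models return corner-format boxes (`x1, y1, x2, y2`) by default, while the response above reports `[x, y, width, height]`. A minimal sketch of that conversion, assuming `(x, y)` is the top-left corner (the schema above does not specify this):

```python
def xyxy_to_xywh(box):
    """Convert a (x1, y1, x2, y2) corner box to (x, y, width, height),
    where (x, y) is taken as the top-left corner."""
    x1, y1, x2, y2 = box
    return [x1, y1, x2 - x1, y2 - y1]

print(xyxy_to_xywh([120, 80, 200, 260]))  # → [120, 80, 80, 180]
```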

#### 2. Image Response (`response_type=image`)

Returns an annotated PNG image with:

- Bounding boxes around detected objects
- Labels with confidence scores
- Color-coded detection classes
- Detection count in the response headers


### Usage Examples

#### cURL - JSON Response

```bash
curl -X POST "https://huggingface.co/spaces/kurakula-Prashanth2004/door-window-detection/predict" \
     -F "file=@your_blueprint.png" \
     -F "response_type=json"
```

#### cURL - Image Response

```bash
curl -X POST "https://huggingface.co/spaces/kurakula-Prashanth2004/door-window-detection/predict" \
     -F "file=@your_blueprint.png" \
     -F "response_type=image" \
     --output detected_result.png
```

#### Python - JSON Response

```python
import requests

url = "https://huggingface.co/spaces/kurakula-Prashanth2004/door-window-detection/predict"

with open("blueprint.png", "rb") as f:
    response = requests.post(url, files={"file": f}, data={"response_type": "json"})
response.raise_for_status()

detections = response.json()["detections"]
print(f"Found {len(detections)} objects")
```

#### Python - Image Response

```python
import requests

url = "https://huggingface.co/spaces/kurakula-Prashanth2004/door-window-detection/predict"

with open("blueprint.png", "rb") as f:
    response = requests.post(url, files={"file": f}, data={"response_type": "image"})
response.raise_for_status()

with open("annotated_result.png", "wb") as f:
    f.write(response.content)
```

## 🐳 Docker Deployment

The application is containerized with Docker:

```dockerfile
FROM python:3.10-slim

ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1

WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y \
    libglib2.0-0 libgl1-mesa-glx \
 && rm -rf /var/lib/apt/lists/*

# Install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860"]
```

## 📦 Dependencies

```text
fastapi
uvicorn
ultralytics
opencv-python-headless
pillow
torch
numpy
python-multipart
```

## ⚡ Performance Features

- **GPU Acceleration**: Automatically uses CUDA with FP16 precision when available
- **Model Warmup**: Dummy inference on startup for a faster first request
- **Async Processing**: Non-blocking image processing with a ThreadPoolExecutor (2 workers)
- **Smart Resizing**: Large images automatically resized to a maximum side of 1280px
- **Memory Efficient**: Optimized for production deployment
- **Confidence Thresholding**: Filters out detections below 0.5 confidence
- **IoU Filtering**: Non-maximum suppression with a 0.45 threshold
- **Color Consistency**: Hash-based color generation for detection labels
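The async-processing pattern can be sketched with the standard library alone. This is not the app's actual code: `run_inference` is a hypothetical stand-in for the blocking YOLO call, but the executor hand-off mirrors the 2-worker setup described above.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

# Two workers, matching the configuration described above.
executor = ThreadPoolExecutor(max_workers=2)

def run_inference(image_bytes: bytes) -> dict:
    # Placeholder for the blocking model call (e.g. model.predict(...)).
    return {"detections": [], "size": len(image_bytes)}

async def predict(image_bytes: bytes) -> dict:
    # Off-load the blocking call to the pool so the event loop
    # keeps serving other requests while inference runs.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(executor, run_inference, image_bytes)

result = asyncio.run(predict(b"\x89PNG"))
print(result["size"])  # → 4
```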

## 📁 Project Structure

```text
door-window-detection/
├── app.py                # FastAPI application
├── requirements.txt      # Python dependencies
├── Dockerfile            # Container configuration
├── yolov8m_custom.pt     # Trained model weights
├── data_custom.yaml      # Training configuration
├── classes.txt           # Class names
├── datasets/             # Training data
│   ├── images/
│   └── labels/
└── README.md             # This file
```

## 🔍 Model Configuration

- **Architecture**: YOLOv8 Medium (`yolov8m_custom.pt`)
- **Input Processing**: Auto-resize to a maximum side of 1280px, preserving aspect ratio
- **Inference Settings**:
  - Confidence Threshold: 0.5
  - IoU Threshold: 0.45
  - Max Detections: 100
  - Half Precision: Enabled on GPU
- **Classes**: 2 (door, window)
- **Training Data**: Custom-labeled blueprint images
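The aspect-ratio-preserving resize can be expressed as a small pure function. This is a sketch of the arithmetic, not the app's actual implementation (which may round differently):

```python
def resized_dims(width: int, height: int, max_side: int = 1280) -> tuple[int, int]:
    """Scale dimensions down so the longest side is at most max_side,
    preserving aspect ratio; smaller images are left unchanged."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    return round(width * scale), round(height * scale)

print(resized_dims(2560, 1440))  # → (1280, 720)
print(resized_dims(640, 480))   # → (640, 480)
```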

## 🎨 Visual Features

- **Dynamic Bounding Boxes**: Color-coded by detection class
- **Confidence Labels**: Show class name and confidence score
- **Hash-based Colors**: Consistent colors for each label type
- **High-Quality Output**: PNG format with preserved image quality
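Hash-based coloring means a label's color is derived from the label string itself, so `door` and `window` boxes look the same on every request. The exact scheme isn't shown in this README; one possible sketch uses an MD5 digest:

```python
import hashlib

def label_color(label: str) -> tuple[int, int, int]:
    """Derive a stable RGB color from a label name. MD5 (rather than
    Python's built-in hash) keeps the result identical across runs."""
    digest = hashlib.md5(label.encode("utf-8")).digest()
    return digest[0], digest[1], digest[2]

print(label_color("door") == label_color("door"))  # → True
```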

## 🔧 API Configuration

- **File Size Limit**: 10MB maximum
- **Supported Formats**: JPG, PNG
- **Concurrent Processing**: 2 worker threads
- **Response Headers**: Include detection-count metadata
- **Error Handling**: Validation with descriptive error messages

## 📈 Results & Screenshots

### Training Progress

- Loss curves and training metrics
- Model performance on the validation set
- Convergence after 100 epochs

### Evaluation Plots

- Confusion matrix (raw and normalized)
- F1, precision, recall, and precision-recall curves
- Label distribution plots
- Overall training results

### API Responses

- JSON detection data examples
- Annotated image outputs
- Performance benchmarks

### Interactive Documentation

- Swagger UI at `/docs`
- Parameter descriptions
- Live API testing interface

## 🤝 Contributing

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request

## 📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

## 🙏 Acknowledgments

- YOLOv8 by Ultralytics
- FastAPI framework
- Hugging Face Spaces for deployment
- LabelImg annotation tool