---
title: Door Window Detection
emoji: 💻
colorFrom: pink
colorTo: indigo
sdk: docker
pinned: false
license: mit
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
# Door & Window Detection using YOLOv8

A custom-trained YOLOv8 model for detecting doors and windows in construction blueprint-style images, deployed as a FastAPI service with dual response modes.

## 🚀 Demo

- **Live API:** https://huggingface.co/spaces/kurakula-Prashanth2004/door-window-detection
- **GitHub Repository:** https://github.com/kurakula-prashanth/door-window-detection
## 📋 Project Overview

This project implements a complete machine learning pipeline for detecting doors and windows in architectural blueprints:

1. **Manual Data Labeling** - Created a custom dataset with bounding box annotations
2. **Model Training** - Trained a YOLOv8 model from scratch using only custom-labeled data
3. **API Development** - Built a FastAPI service with dual response modes (JSON + annotated images)
4. **Deployment** - Deployed to Hugging Face Spaces with Docker
## 🎯 Classes Detected

- `door` - Door symbols in blueprints
- `window` - Window symbols in blueprints
## ✨ Key Features

- **Dual Response Modes**: Get JSON data or annotated images
- **Interactive Swagger UI**: Built-in API documentation at `/docs`
- **Smart Image Processing**: Automatic resizing for large images (max 1280 px)
- **GPU Acceleration**: CUDA support with FP16 precision
- **Async Processing**: Non-blocking inference with a ThreadPoolExecutor
- **Dynamic Color Coding**: Consistent colors for each detection class
- **Confidence Filtering**: Configurable confidence threshold (default: 0.5)
## 🛠️ Setup & Installation

### Local Development

1. Clone the repository

   ```bash
   git clone https://github.com/kurakula-prashanth/door-window-detection.git
   cd door-window-detection
   ```

2. Create a virtual environment

   ```bash
   python3.12 -m venv yolo8_custom
   source yolo8_custom/bin/activate  # On Windows: yolo8_custom\Scripts\activate
   ```

3. Install dependencies

   ```bash
   pip install -r requirements.txt
   ```

4. Run the API locally

   ```bash
   uvicorn app:app --host 0.0.0.0 --port 8000 --reload
   ```

5. Access the API
   - Interactive documentation: http://localhost:8000/docs
   - API endpoint: http://localhost:8000/predict
## 📊 Training Process

### Step 1: Data Labeling

- Used LabelImg for manual annotation
- Labeled 15-20 construction blueprint images
- Created bounding boxes for doors and windows only
- Generated YOLO-format labels (`.txt` files)
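Each YOLO-format label file holds one line per bounding box: a class index followed by normalized center coordinates and box size. The sketch below shows how such a line maps back to pixel coordinates; the label values are illustrative, not taken from the actual dataset:

```python
# Parse one line of a YOLO-format label file.
# Format: <class_id> <x_center> <y_center> <width> <height>,
# with all coordinates normalized to [0, 1] relative to the image size.
CLASS_NAMES = ["door", "window"]  # assumed to match the order in classes.txt

def parse_yolo_line(line: str, img_w: int, img_h: int) -> dict:
    cls, cx, cy, w, h = line.split()
    # Scale normalized values up to pixels
    cx, cy = float(cx) * img_w, float(cy) * img_h
    w, h = float(w) * img_w, float(h) * img_h
    # Convert center-based box to top-left corner + size
    return {
        "label": CLASS_NAMES[int(cls)],
        "bbox": [cx - w / 2, cy - h / 2, w, h],
    }

# Illustrative label line for a door on a 640x640 image
print(parse_yolo_line("0 0.5 0.5 0.1 0.2", 640, 640))
# → {'label': 'door', 'bbox': [288.0, 256.0, 64.0, 128.0]}
```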
### Step 2: Model Training

```bash
yolo task=detect mode=train epochs=100 data=data_custom.yaml model=yolov8m.pt imgsz=640
```

**Training Configuration:**

- Base Model: YOLOv8 Medium (`yolov8m.pt`)
- Epochs: 100
- Image Size: 640x640
- Classes: 2 (door, window)
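The `data_custom.yaml` referenced by the training command is not reproduced in this README; a minimal Ultralytics dataset config for this two-class setup would look something like the following (the paths are assumptions based on the project structure, not the actual file):

```yaml
# data_custom.yaml - sketch; actual paths and splits may differ
train: datasets/images   # training images (labels resolved from datasets/labels)
val: datasets/images     # validation images
nc: 2                    # number of classes
names: ["door", "window"]
```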
### Step 3: Model Testing

```bash
yolo task=detect mode=predict model=best.pt show=true conf=0.5 source=12.png line_thickness=1
```
## 🌐 API Usage

### Main Endpoint

```
POST /predict
```

### Parameters

- `file` (required): PNG or JPG image upload (max 10 MB)
- `response_type` (required): `json` or `image`

### Response Modes

#### 1. JSON Response (`response_type=json`)

Returns detection data in JSON format:

```json
{
  "detections": [
    {
      "label": "door",
      "confidence": 0.91,
      "bbox": [x, y, width, height]
    },
    {
      "label": "window",
      "confidence": 0.84,
      "bbox": [x, y, width, height]
    }
  ]
}
```
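A small sketch of consuming this response: drawing or cropping typically needs corner coordinates rather than width/height. The top-left-corner convention below is inferred from the `[x, y, width, height]` schema, not confirmed by the source:

```python
def bbox_to_corners(bbox):
    """Convert a [x, y, width, height] box to (x1, y1, x2, y2) corner form."""
    x, y, w, h = bbox
    return (x, y, x + w, y + h)

# Hypothetical detection, shaped like one entry of the JSON response above
detection = {"label": "door", "confidence": 0.91, "bbox": [120, 80, 64, 128]}
print(bbox_to_corners(detection["bbox"]))  # → (120, 80, 184, 208)
```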
#### 2. Image Response (`response_type=image`)

Returns an annotated PNG image with:

- Bounding boxes around detected objects
- Labels with confidence scores
- Color-coded detection classes
- Detection count in the response headers
### Usage Examples

Note that API requests go to the Space's direct `*.hf.space` URL, not the `huggingface.co/spaces/...` page URL.

**cURL - JSON Response**

```bash
curl -X POST "https://kurakula-prashanth2004-door-window-detection.hf.space/predict" \
  -F "file=@your_blueprint.png" \
  -F "response_type=json"
```

**cURL - Image Response**

```bash
curl -X POST "https://kurakula-prashanth2004-door-window-detection.hf.space/predict" \
  -F "file=@your_blueprint.png" \
  -F "response_type=image" \
  --output detected_result.png
```

**Python - JSON Response**

```python
import requests

url = "https://kurakula-prashanth2004-door-window-detection.hf.space/predict"
with open("blueprint.png", "rb") as f:
    response = requests.post(url, files={"file": f}, data={"response_type": "json"})
response.raise_for_status()

detections = response.json()["detections"]
print(f"Found {len(detections)} objects")
```

**Python - Image Response**

```python
import requests

url = "https://kurakula-prashanth2004-door-window-detection.hf.space/predict"
with open("blueprint.png", "rb") as f:
    response = requests.post(url, files={"file": f}, data={"response_type": "image"})
response.raise_for_status()

with open("annotated_result.png", "wb") as f:
    f.write(response.content)
```
## 🐳 Docker Deployment

The application is containerized with Docker:

```dockerfile
FROM python:3.10-slim

ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1

WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y \
    libglib2.0-0 libgl1-mesa-glx \
    && rm -rf /var/lib/apt/lists/*

# Install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860"]
```
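To build and run the container locally (the image tag below is arbitrary; port 7860 matches the `CMD` in the Dockerfile):

```bash
docker build -t door-window-detection .
docker run -p 7860:7860 door-window-detection
```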
## 📦 Dependencies

```text
fastapi
uvicorn
ultralytics
opencv-python-headless
pillow
torch
numpy
python-multipart
```
## ⚡ Performance Features

- **GPU Acceleration**: Automatically uses CUDA, with FP16 precision, when available
- **Model Warmup**: Dummy inference on startup for a faster first request
- **Async Processing**: Non-blocking image processing with a ThreadPoolExecutor (2 workers)
- **Smart Resizing**: Large images automatically resized to a max of 1280 px
- **Memory Efficient**: Optimized for production deployment
- **Confidence Thresholding**: Filters out low-confidence detections (keeps ≥ 0.5)
- **IoU Filtering**: Non-maximum suppression with a 0.45 threshold
- **Color Consistency**: Hash-based color generation for detection labels
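The hash-based color scheme mentioned above can be sketched as follows. This illustrates the idea (same label, same color, on every run) and is not the actual implementation in `app.py`:

```python
import hashlib

def label_color(label: str) -> tuple:
    """Map a label to a stable RGB color by hashing the label text.

    The same label always yields the same color, across runs and machines,
    because the hash depends only on the string itself.
    """
    digest = hashlib.md5(label.encode("utf-8")).digest()
    # Use the first three hash bytes as R, G, B channels (each 0-255)
    return (digest[0], digest[1], digest[2])

door_color = label_color("door")      # identical on every call
window_color = label_color("window")  # independent of draw order
```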
## 📁 Project Structure

```text
door-window-detection/
├── app.py               # FastAPI application
├── requirements.txt     # Python dependencies
├── Dockerfile           # Container configuration
├── yolov8m_custom.pt    # Trained model weights
├── data_custom.yaml     # Training configuration
├── classes.txt          # Class names
├── datasets/            # Training data
│   ├── images/
│   └── labels/
└── README.md            # This file
```
## 🔍 Model Configuration

- **Architecture**: YOLOv8 Medium (`yolov8m_custom.pt`)
- **Input Processing**: Auto-resize to max 1280 px, preserving aspect ratio
- **Inference Settings**:
  - Confidence threshold: 0.5
  - IoU threshold: 0.45
  - Max detections: 100
  - Half precision: enabled on GPU
- **Classes**: 2 (door, window)
- **Training Data**: Custom-labeled blueprint images
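The resize rule described above (cap the longer side at 1280 px while preserving aspect ratio) comes down to a single scale factor. A minimal sketch, independent of any imaging library and not the actual `app.py` code:

```python
MAX_SIDE = 1280  # matches the input-processing cap described above

def fit_within(width: int, height: int, max_side: int = MAX_SIDE):
    """Return a (width, height) with the longer side capped at max_side.

    Images already within the limit are returned unchanged; otherwise both
    sides are scaled by the same factor, preserving the aspect ratio.
    """
    scale = min(1.0, max_side / max(width, height))
    return round(width * scale), round(height * scale)

print(fit_within(2560, 1440))  # → (1280, 720)
print(fit_within(800, 600))    # → (800, 600)  (already small enough)
```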
## 🎨 Visual Features

- **Dynamic Bounding Boxes**: Color-coded by detection class
- **Confidence Labels**: Show class name and confidence score
- **Hash-based Colors**: Consistent color for each label type
- **High-Quality Output**: PNG format with preserved image quality

## 🔧 API Configuration

- **File Size Limit**: 10 MB maximum
- **Supported Formats**: JPG, PNG
- **Concurrent Processing**: 2 worker threads
- **Response Headers**: Include detection-count metadata
- **Error Handling**: Comprehensive validation and error messages
## 📈 Results & Screenshots

### Training Progress

- Loss curves and training metrics
- Model performance on the validation set
- Convergence after 100 epochs

*(Training artifact plots available in the repository: confusion matrix, normalized confusion matrix, F1 curve, label distribution, P curve, PR curve, R curve, and overall results.)*

### API Responses

- JSON detection data examples
- Annotated image outputs
- Performance benchmarks

### Interactive Documentation

- Swagger UI at `/docs`
- Parameter descriptions
- Live API testing interface
## 🤝 Contributing

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request
## 📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

## 🙏 Acknowledgments

- YOLOv8 by Ultralytics
- FastAPI framework
- Hugging Face Spaces for deployment
- LabelImg annotation tool