DETR + Keypoint Estimation (COCO Subset)
Author: @Koushik
Model Overview
This project combines:
- facebook/detr-resnet-50 (object detector)
- Custom PyTorch keypoint head
- Trained on a 500-person subset of COCO 2017 Keypoints
The system detects people using DETR, then predicts 17 COCO-style keypoints (top-down) using heatmap regression.
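To make "heatmap regression" concrete, here is a minimal sketch of how 17 predicted heatmaps could be turned into keypoint coordinates by taking the peak of each map. The function name `decode_heatmaps` and the 64×64 heatmap size are assumptions for illustration; the notebook's actual decoding step may differ.

```python
import numpy as np

def decode_heatmaps(heatmaps):
    """Return one (x, y, score) row per keypoint from an array of shape (K, H, W)."""
    num_kpts, height, width = heatmaps.shape
    flat = heatmaps.reshape(num_kpts, -1)
    idx = flat.argmax(axis=1)            # position of the peak in each flattened map
    scores = flat.max(axis=1)            # peak value, usable as a rough confidence
    ys, xs = np.unravel_index(idx, (height, width))
    return np.stack([xs, ys, scores], axis=1).astype(np.float32)

# Toy input: 17 empty heatmaps (assumed 64x64) with a single peak in the first one.
heatmaps = np.zeros((17, 64, 64), dtype=np.float32)
heatmaps[0, 20, 10] = 1.0               # row (y) = 20, column (x) = 10
keypoints = decode_heatmaps(heatmaps)
print(keypoints[0])                      # x, y, score for keypoint 0
```

The coordinates returned here are in heatmap space; they still need to be scaled back to the person crop and then to the original image.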
Files Included
File | Description |
---|---|
pytorch_model.bin | Trained PyTorch model weights |
05_detr_pose_coco_colab.ipynb | Full Colab notebook (training + inference) |
config.json | Basic model metadata |
README.md | Project description |
Dataset
- Subset: 500 images from COCO val2017 with visible persons
- Annotations: 17 keypoints per person
- Source: COCO Keypoints
Architecture
[ Input Image ]
│
▼
[ DETR (Person BBox) ]
│
▼
[ Crop + Resize (256×256) ]
│
▼
[ CNN Keypoint Head ]
│
▼
[ 17 Heatmaps (Keypoints) ]
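Because the keypoint head only sees the resized crop, its predictions have to be mapped back through the crop step to land on the original image. The sketch below shows one way to do that, assuming the crop is stretched to a square without preserving aspect ratio and that the heatmaps are 64×64; both are assumptions, and `heatmap_to_image_coords` is a hypothetical helper, not part of this repository.

```python
def heatmap_to_image_coords(x_hm, y_hm, box, heatmap_size):
    """Map a heatmap-space point back to original-image pixels.

    box is (x_min, y_min, x_max, y_max) of the person crop;
    heatmap_size is (width, height) of one heatmap.
    """
    x_min, y_min, x_max, y_max = box
    hm_w, hm_h = heatmap_size
    # The crop was stretched to a square, so x and y scale independently.
    scale_x = (x_max - x_min) / hm_w
    scale_y = (y_max - y_min) / hm_h
    # +0.5 targets the centre of the heatmap cell rather than its corner.
    return x_min + (x_hm + 0.5) * scale_x, y_min + (y_hm + 0.5) * scale_y

box = (100.0, 50.0, 228.0, 306.0)    # hypothetical DETR person box, 128 x 256 px
x_img, y_img = heatmap_to_image_coords(32, 16, box, heatmap_size=(64, 64))
print(x_img, y_img)
```

If the crop were instead padded to keep the person's aspect ratio, the same idea applies but the padding offset would have to be subtracted first.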
Quick Start
import torch
from model import KeypointHead
model = KeypointHead()
model.load_state_dict(torch.load('pytorch_model.bin', map_location='cpu'))  # map_location lets the weights load on CPU-only machines
model.eval()
Inference Demo
from PIL import Image
import cv2, numpy as np
from transformers import DetrImageProcessor, DetrForObjectDetection
img = Image.open('sample.jpg').convert('RGB')  # force 3 channels (handles grayscale/RGBA inputs)
processor = DetrImageProcessor.from_pretrained("facebook/detr-resnet-50")
detector = DetrForObjectDetection.from_pretrained("facebook/detr-resnet-50")
inputs = processor(images=img, return_tensors="pt")
outputs = detector(**inputs)
results = processor.post_process_object_detection(outputs, target_sizes=[img.size[::-1]], threshold=0.8)[0]
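# results['boxes'] can include non-person objects: keep only boxes whose entry in
# results['labels'] maps to "person" in detector.config.id2label, then crop that box
# Feed the cropped, resized (256x256) person image into model(...) to get 17 heatmaps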
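The box-selection step in the comments above can be sketched as follows. `best_person_box` is a hypothetical helper, and the toy `id2label` dict stands in for `detector.config.id2label`; the function accepts plain lists here but should also work on the tensors returned by `post_process_object_detection`.

```python
def best_person_box(scores, labels, boxes, id2label, image_size):
    """Return the highest-scoring 'person' box as integer
    (left, top, right, bottom), or None if no person was detected."""
    width, height = image_size
    best = None
    for score, label, box in zip(scores, labels, boxes):
        if id2label[int(label)] != "person":
            continue                      # skip every non-person detection
        if best is None or score > best[0]:
            best = (float(score), box)
    if best is None:
        return None
    x_min, y_min, x_max, y_max = (float(v) for v in best[1])
    # Clamp to the image so the crop never reaches outside it.
    left, top = max(0, int(x_min)), max(0, int(y_min))
    right = min(width, int(round(x_max)))
    bottom = min(height, int(round(y_max)))
    return left, top, right, bottom

# Toy detections: a person, a higher-scoring dog, and a weaker person.
id2label = {1: "person", 18: "dog"}      # stand-in for detector.config.id2label
scores = [0.95, 0.99, 0.85]
labels = [1, 18, 1]
boxes = [[10.2, 20.7, 110.4, 300.9], [0.0, 0.0, 50.0, 50.0], [-5.0, 3.0, 700.0, 500.0]]
crop_box = best_person_box(scores, labels, boxes, id2label, image_size=(640, 480))
print(crop_box)
```

With the demo's variables, the call would be `best_person_box(results["scores"], results["labels"], results["boxes"], detector.config.id2label, img.size)`, and the crop itself can then be taken with `img.crop(crop_box).resize((256, 256))`.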
Training (optional)
To fine-tune on your own dataset:
- Convert your data to COCO format
- Use the notebook provided (05_detr_pose_coco_colab.ipynb)
- Change paths and re-train
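When re-training, each annotated keypoint has to be turned into a target heatmap that the head regresses against. A common choice is a Gaussian centred on the keypoint, compared to the prediction with mean squared error. The sketch below illustrates that idea; the 64×64 size, `sigma=2.0`, and both function names are assumptions and may not match what the notebook does.

```python
import numpy as np

def gaussian_heatmap(x, y, size=64, sigma=2.0):
    """Render one (size, size) target heatmap with a Gaussian peak at (x, y)."""
    xs = np.arange(size, dtype=np.float32)[None, :]   # column coordinates
    ys = np.arange(size, dtype=np.float32)[:, None]   # row coordinates
    return np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2.0 * sigma ** 2))

def heatmap_mse(pred, target):
    """Mean squared error over all pixels of the given heatmaps."""
    return float(np.mean((pred - target) ** 2))

target = gaussian_heatmap(10, 20)         # keypoint at x=10, y=20 in heatmap space
print(target.shape, target[20, 10])       # the peak sits at row 20, column 10
print(heatmap_mse(np.zeros_like(target), target))
```

Keypoint coordinates from the COCO annotations would first need to be mapped into the crop and scaled down to the heatmap resolution before being passed to `gaussian_heatmap`.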
Credit
Evaluation results
- Heatmap MSE on COCO 2017 (50-person subset), self-reported: ~0.02