---
title: Detection Metrics
emoji: π
colorFrom: green
colorTo: indigo
sdk: static
app_file: README.md
pinned: true
---
![Detection Metrics](https://huggingface.co/spaces/rafaelpadilla/detection_metrics/resolve/main/assets/metrics_small.png)

This project implements the COCO-style **Average Precision** metrics for object detection.

With `Detection Metrics` you can easily compute all 12 COCO metrics from the bounding boxes output by your object detection model:
### Average Precision (AP):
1. **AP**: AP at IoU=.50:.05:.95
2. **AP<sup>IoU=.50</sup>**: AP at IoU=.50 (similar to the PASCAL VOC mAP metric)
3. **AP<sup>IoU=.75</sup>**: AP at IoU=.75 (strict metric)
### AP Across Scales:
4. **AP<sup>small</sup>**: AP for small objects: area < 32<sup>2</sup>
5. **AP<sup>medium</sup>**: AP for medium objects: 32<sup>2</sup> < area < 96<sup>2</sup>
6. **AP<sup>large</sup>**: AP for large objects: area > 96<sup>2</sup>
### Average Recall (AR):
7. **AR<sup>max=1</sup>**: AR given 1 detection per image
8. **AR<sup>max=10</sup>**: AR given 10 detections per image
9. **AR<sup>max=100</sup>**: AR given 100 detections per image
### AR Across Scales:
10. **AR<sup>small</sup>**: AR for small objects: area < 32<sup>2</sup>
11. **AR<sup>medium</sup>**: AR for medium objects: 32<sup>2</sup> < area < 96<sup>2</sup>
12. **AR<sup>large</sup>**: AR for large objects: area > 96<sup>2</sup>
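The area thresholds above (in pixels²) can be illustrated with a small helper that classifies a box by area. This is only a sketch to make the buckets concrete; `size_bucket` is a hypothetical name, and the evaluator applies these thresholds internally:

```python
def size_bucket(w: float, h: float) -> str:
    """Classify a detection by area using COCO's scale thresholds.

    small:  area < 32^2
    medium: 32^2 <= area < 96^2
    large:  area >= 96^2
    """
    area = w * h
    if area < 32 ** 2:
        return "small"
    if area < 96 ** 2:
        return "medium"
    return "large"

print(size_bucket(20, 20))    # 400 px^2   -> small
print(size_bucket(50, 50))    # 2500 px^2  -> medium
print(size_bucket(100, 100))  # 10000 px^2 -> large
```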
## How to use Detection Metrics?
In short, you create your ground-truth annotations once and adapt your evaluation loop to output boxes, confidences and classes in the required format. Follow these steps:
### Step 1: Prepare your ground-truth dataset
Convert your ground-truth annotations to JSON following the COCO format.
COCO ground-truth annotations are represented by a dictionary containing 3 elements: "images", "annotations" and "categories".
The snippet below shows an example of this dictionary; a more detailed walkthrough can be found [here](https://towardsdatascience.com/how-to-work-with-object-detection-datasets-in-coco-format-9bf4fb5848a4).
```
{
    "images": [
        {
            "id": 212226,
            "width": 500,
            "height": 335
        },
        ...
    ],
    "annotations": [
        {
            "id": 489885,
            "category_id": 1,
            "iscrowd": 0,
            "image_id": 212226,
            "area": 12836,
            "bbox": [
                235.6300048828125,  # x
                84.30999755859375,  # y
                158.08999633789062, # w
                185.9499969482422   # h
            ]
        },
        ...
    ],
    "categories": [
        {
            "supercategory": "none",
            "id": 1,
            "name": "person"
        },
        ...
    ]
}
```
You do not need to save the JSON to disk; you can keep it in memory as a dictionary.
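The structure above can be assembled directly in Python. A minimal in-memory example with a single image, annotation and category (the ids and box values below are just the ones from the snippet above):

```python
# COCO-format ground truth kept in memory as a plain dictionary
ground_truth_annotations = {
    "images": [
        {"id": 212226, "width": 500, "height": 335},
    ],
    "annotations": [
        {
            "id": 489885,
            "category_id": 1,
            "iscrowd": 0,
            "image_id": 212226,
            "area": 12836,
            # [x, y, w, h]: top-left corner plus width and height, in pixels
            "bbox": [235.63, 84.31, 158.09, 185.95],
        },
    ],
    "categories": [
        {"supercategory": "none", "id": 1, "name": "person"},
    ],
}

print(sorted(ground_truth_annotations.keys()))
```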
### Step 2: Load the object detection evaluator
Install Hugging Face's `evaluate` library (`pip install evaluate`) to load the evaluator. More instructions [here](https://huggingface.co/docs/evaluate/installation).
Load the object detection evaluator, passing the JSON created in the previous step through the argument `json_gt`:

`evaluator = evaluate.load("rafaelpadilla/detection_metrics", json_gt=ground_truth_annotations, iou_type="bbox")`
### Step 3: Loop through your dataset samples to obtain the predictions:
```python
# Loop through your dataset
for batch in dataloader:
    # Get the image(s) from the batch
    images = batch["images"]
    # Get the ids of the images in the batch
    image_ids = batch["image_ids"]
    # Pass the image(s) to your model to obtain bounding boxes, scores and labels
    predictions = model.predict_boxes(images)
    # Pass the predictions and image ids to the evaluator
    evaluator.add(prediction=predictions, reference=image_ids)

# After the loop, call compute to obtain your results
results = evaluator.compute()
print(results)
```
Regardless of your model's architecture, your predictions must be converted to a list of dictionaries, each containing 3 fields as shown below:
```python
predictions = [
    {
        "scores": [0.55, 0.95, 0.87],
        "labels": [6, 1, 1],
        "boxes": [[100, 30, 40, 28], [40, 32, 50, 28], [128, 44, 23, 69]]
    },
    ...
]
```
* `scores`: list or torch tensor containing the confidences of your detections. Each confidence is a value between 0 and 1.
* `labels`: list or torch tensor with the indices representing the labels of your detections.
* `boxes`: list or torch tensor with the detected bounding boxes in the format `x,y,w,h`.
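If your model outputs corner-format boxes (`x1,y1,x2,y2`), they must be converted to `x,y,w,h` first. A minimal helper for that conversion (`xyxy_to_xywh` is a hypothetical name, not part of the evaluator's API):

```python
def xyxy_to_xywh(box):
    """Convert a corner-format box [x1, y1, x2, y2] to COCO's [x, y, w, h]."""
    x1, y1, x2, y2 = box
    return [x1, y1, x2 - x1, y2 - y1]

# A box spanning (40, 32) to (90, 60) becomes x=40, y=32, w=50, h=28
print(xyxy_to_xywh([40, 32, 90, 60]))  # [40, 32, 50, 28]
```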
The `reference` added to the evaluator in each iteration is a list of dictionaries, each containing the id of one image in the batch.
For example, for a batch containing two images with ids 508101 and 1853, the `reference` argument must receive `image_ids` in the following format:
```python
image_ids = [ {'image_id': [508101]}, {'image_id': [1853]} ]
```
After the loop, call `evaluator.compute()` to obtain your results as a dictionary. The metrics are also printed to the console as:
```
IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.415
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.613
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.436
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.209
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.449
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.601
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.333
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.531
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.572
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.321
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.624
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.794
```
The scheme below illustrates how your `for` loop should look:

![Evaluation loop scheme](https://huggingface.co/spaces/rafaelpadilla/detection_metrics/resolve/main/assets/scheme_coco_evaluate.png)
-----------------------
## References and further reading:
1. [COCO Evaluation Metrics](https://cocodataset.org/#detection-eval)
2. [A Survey on Performance Metrics for Object-Detection Algorithms](https://www.researchgate.net/profile/Rafael-Padilla/publication/343194514_A_Survey_on_Performance_Metrics_for_Object-Detection_Algorithms/links/5f1b5a5e45851515ef478268/A-Survey-on-Performance-Metrics-for-Object-Detection-Algorithms.pdf)
3. [A Comparative Analysis of Object Detection Metrics with a Companion Open-Source Toolkit](https://www.mdpi.com/2079-9292/10/3/279/pdf)
4. [COCO ground-truth annotations for your datasets in JSON](https://towardsdatascience.com/how-to-work-with-object-detection-datasets-in-coco-format-9bf4fb5848a4)