Detection

vision_architectures.metrics.detection.map_mar(pred_bboxes, pred_confidence_scores, target_bboxes, target_classes, iou_thresholds=[0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95], average_precision_num_points=101, min_confidence_threshold=0.0, max_bboxes_per_image=100, return_intermediates=False)

Calculate the COCO mean average precision (mAP) and mean average recall (mAR) for object detection.

Parameters:
  • pred_bboxes (list[Tensor]) – A list of length B of tensors of shape (NP, 4) or (NP, 6) holding the predicted bounding box parameters in xyxy or xyzxyz format.

  • pred_confidence_scores (list[Tensor]) – A list of length B of tensors of shape (NP, 1+num_classes) holding the predicted confidence scores for each class. The first column corresponds to the “no-object” class; bounding boxes that fall in this category are ignored.

  • target_bboxes (list[Tensor]) – A list of length B of tensors of shape (NT, 4) or (NT, 6) holding the target bounding box parameters in xyxy or xyzxyz format.

  • target_classes (list[Tensor]) – A list of length B of tensors of shape (NT,) holding the target class labels of the objects in each image.

  • iou_thresholds (list[float]) – A list of IoU thresholds to use for calculating mAP and mAR.

  • average_precision_num_points (int) – Number of points over which to calculate average precision.

  • min_confidence_threshold (float) – Minimum confidence score a prediction must have to be considered.

  • max_bboxes_per_image (int | None) – Maximum number of bounding boxes to consider per image. If more are present, only the top max_bboxes_per_image boxes based on confidence scores are considered. If set to None, all bounding boxes are considered.

  • return_intermediates (bool) – If True, return intermediate values used to calculate mAP and mAR.

Return type:

tuple[float, float] | tuple[float, float, dict[float, dict[int, float]], dict[float, dict[int, float]]]

Returns:

The mean average precision (mAP) and mean average recall (mAR) across all classes and IoU thresholds for the entire dataset. If return_intermediates is True, also returns two dictionaries containing the average precision and average recall for each class at each IoU threshold.
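The IoU thresholds above are compared against the overlap between a predicted and a target box. The following is a minimal sketch of that overlap for axis-aligned boxes in xyxy or xyzxyz format, written in plain Python. The helper name `box_iou` is hypothetical and is not part of this module; the library's own implementation may differ.

```python
def box_iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as xyxy (4 values) or xyzxyz (6 values).

    Hypothetical helper for illustration only; not the library's implementation.
    """
    if len(box_a) != len(box_b) or len(box_a) not in (4, 6):
        raise ValueError("boxes must both have 4 (xyxy) or 6 (xyzxyz) values")
    dims = len(box_a) // 2  # 2 spatial dimensions for xyxy, 3 for xyzxyz

    def volume(box):
        # Product of the side lengths (area in 2D, volume in 3D).
        result = 1.0
        for d in range(dims):
            result *= max(0.0, box[d + dims] - box[d])
        return result

    # The intersection is the product of the per-axis overlaps.
    intersection = 1.0
    for d in range(dims):
        low = max(box_a[d], box_b[d])
        high = min(box_a[d + dims], box_b[d + dims])
        intersection *= max(0.0, high - low)

    union = volume(box_a) + volume(box_b) - intersection
    return intersection / union if union > 0 else 0.0


# Two 2x2 squares overlapping in a 1x1 region: IoU = 1 / 7, about 0.143.
iou = box_iou([0, 0, 2, 2], [1, 1, 3, 3])
# Under the default thresholds (0.5 to 0.95) this pair would not count as a match.
is_match_at_50 = iou >= 0.5
```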

vision_architectures.metrics.detection.mean_average_precision_recall(pred_bboxes, pred_confidence_scores, target_bboxes, target_classes, iou_thresholds=[0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95], average_precision_num_points=101, min_confidence_threshold=0.0, max_bboxes_per_image=100, return_intermediates=False)

Calculate the COCO mean average precision (mAP) and mean average recall (mAR) for object detection.

Parameters:
  • pred_bboxes (list[Tensor]) – A list of length B of tensors of shape (NP, 4) or (NP, 6) holding the predicted bounding box parameters in xyxy or xyzxyz format.

  • pred_confidence_scores (list[Tensor]) – A list of length B of tensors of shape (NP, 1+num_classes) holding the predicted confidence scores for each class. The first column corresponds to the “no-object” class; bounding boxes that fall in this category are ignored.

  • target_bboxes (list[Tensor]) – A list of length B of tensors of shape (NT, 4) or (NT, 6) holding the target bounding box parameters in xyxy or xyzxyz format.

  • target_classes (list[Tensor]) – A list of length B of tensors of shape (NT,) holding the target class labels of the objects in each image.

  • iou_thresholds (list[float]) – A list of IoU thresholds to use for calculating mAP and mAR.

  • average_precision_num_points (int) – Number of points over which to calculate average precision.

  • min_confidence_threshold (float) – Minimum confidence score a prediction must have to be considered.

  • max_bboxes_per_image (int | None) – Maximum number of bounding boxes to consider per image. If more are present, only the top max_bboxes_per_image boxes based on confidence scores are considered. If set to None, all bounding boxes are considered.

  • return_intermediates (bool) – If True, return intermediate values used to calculate mAP and mAR.

Return type:

tuple[float, float] | tuple[float, float, dict[float, dict[int, float]], dict[float, dict[int, float]]]

Returns:

The mean average precision (mAP) and mean average recall (mAR) across all classes and IoU thresholds for the entire dataset. If return_intermediates is True, also returns two dictionaries containing the average precision and average recall for each class at each IoU threshold.
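The interaction between the “no-object” column, min_confidence_threshold and max_bboxes_per_image can be sketched for a single image as follows. This sketch assumes that a box "falls in" the no-object category when column 0 has the highest score, and that boxes scoring exactly at the threshold are kept; neither detail is confirmed by the documentation above. The function name `select_predictions` is hypothetical.

```python
def select_predictions(confidence_scores, min_confidence_threshold=0.0,
                       max_bboxes_per_image=100):
    """Return (box_index, class_index, confidence) tuples kept for one image.

    confidence_scores is a nested list of shape (NP, 1 + num_classes).
    Hypothetical helper; the argmax rule below is an assumption.
    """
    kept = []
    for box_index, row in enumerate(confidence_scores):
        # Predicted class = column with the highest score (assumed rule).
        class_index = max(range(len(row)), key=row.__getitem__)
        if class_index == 0:
            continue  # column 0 is the "no-object" class, so the box is ignored
        confidence = row[class_index]
        if confidence < min_confidence_threshold:
            continue  # below the minimum confidence, so the box is ignored
        kept.append((box_index, class_index, confidence))

    # Most confident boxes first, then keep at most max_bboxes_per_image of them.
    kept.sort(key=lambda item: item[2], reverse=True)
    if max_bboxes_per_image is not None:
        kept = kept[:max_bboxes_per_image]
    return kept


scores = [
    [0.70, 0.20, 0.10],  # no-object wins: ignored
    [0.10, 0.60, 0.30],  # class 1 with confidence 0.60
    [0.05, 0.05, 0.90],  # class 2 with confidence 0.90
    [0.30, 0.40, 0.30],  # class 1 with confidence 0.40
]
top_two = select_predictions(scores, min_confidence_threshold=0.5)
# top_two == [(2, 2, 0.9), (1, 1, 0.6)]
```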

vision_architectures.metrics.detection.mean_average_precision_mean_average_recall(pred_bboxes, pred_confidence_scores, target_bboxes, target_classes, iou_thresholds=[0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95], average_precision_num_points=101, min_confidence_threshold=0.0, max_bboxes_per_image=100, return_intermediates=False)

Calculate the COCO mean average precision (mAP) and mean average recall (mAR) for object detection.

Parameters:
  • pred_bboxes (list[Tensor]) – A list of length B of tensors of shape (NP, 4) or (NP, 6) holding the predicted bounding box parameters in xyxy or xyzxyz format.

  • pred_confidence_scores (list[Tensor]) – A list of length B of tensors of shape (NP, 1+num_classes) holding the predicted confidence scores for each class. The first column corresponds to the “no-object” class; bounding boxes that fall in this category are ignored.

  • target_bboxes (list[Tensor]) – A list of length B of tensors of shape (NT, 4) or (NT, 6) holding the target bounding box parameters in xyxy or xyzxyz format.

  • target_classes (list[Tensor]) – A list of length B of tensors of shape (NT,) holding the target class labels of the objects in each image.

  • iou_thresholds (list[float]) – A list of IoU thresholds to use for calculating mAP and mAR.

  • average_precision_num_points (int) – Number of points over which to calculate average precision.

  • min_confidence_threshold (float) – Minimum confidence score a prediction must have to be considered.

  • max_bboxes_per_image (int | None) – Maximum number of bounding boxes to consider per image. If more are present, only the top max_bboxes_per_image boxes based on confidence scores are considered. If set to None, all bounding boxes are considered.

  • return_intermediates (bool) – If True, return intermediate values used to calculate mAP and mAR.

Return type:

tuple[float, float] | tuple[float, float, dict[float, dict[int, float]], dict[float, dict[int, float]]]

Returns:

The mean average precision (mAP) and mean average recall (mAR) across all classes and IoU thresholds for the entire dataset. If return_intermediates is True, also returns two dictionaries containing the average precision and average recall for each class at each IoU threshold.
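The two dictionaries returned with return_intermediates=True have the type dict[float, dict[int, float]], which reads as IoU threshold, then class, then value. A minimal sketch of how such a dictionary could be reduced to a single number, or to one number per threshold, is shown below. It assumes an unweighted mean over every (threshold, class) entry; the library may weight or filter entries differently. The helper names are hypothetical.

```python
def mean_over_intermediates(per_threshold):
    """Unweighted mean over every (IoU threshold, class) entry.

    per_threshold maps IoU threshold -> {class index -> AP or AR value}.
    Hypothetical helper; unweighted averaging is an assumption.
    """
    values = [
        value
        for per_class in per_threshold.values()
        for value in per_class.values()
    ]
    return sum(values) / len(values) if values else 0.0


def mean_per_threshold(per_threshold):
    """Mean over classes, reported separately for each IoU threshold."""
    return {
        threshold: sum(per_class.values()) / len(per_class)
        for threshold, per_class in per_threshold.items()
        if per_class
    }


# Made-up per-class AP values at two IoU thresholds, for illustration only.
example_ap = {0.5: {1: 0.8, 2: 0.6}, 0.75: {1: 0.4, 2: 0.2}}
overall = mean_over_intermediates(example_ap)   # about 0.5
by_threshold = mean_per_threshold(example_ap)   # about {0.5: 0.7, 0.75: 0.3}
```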

class vision_architectures.metrics.detection.MeanAveragePrecision(iou_thresholds=[0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95], average_precision_num_points=101, min_confidence_threshold=0.0, max_bboxes_per_image=100, *args, **kwargs)

Bases: _MeanAveragePrecisionMeanAverageRecallBase

Calculate the COCO mean average precision (mAP) for object detection.

compute()

Compute the final metric value.

State variables are synchronized automatically when running in a distributed backend.

class vision_architectures.metrics.detection.MeanAverageRecall(iou_thresholds=[0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95], average_precision_num_points=101, min_confidence_threshold=0.0, max_bboxes_per_image=100, *args, **kwargs)

Bases: _MeanAveragePrecisionMeanAverageRecallBase

Calculate the COCO mean average recall (mAR) for object detection.

compute()

Compute the final metric value.

State variables are synchronized automatically when running in a distributed backend.

class vision_architectures.metrics.detection.AveragePrecision(iou_threshold, *args, **kwargs)

Bases: MeanAveragePrecision

Calculate the COCO average precision (AP) for object detection.

__init__(iou_threshold, *args, **kwargs)

Initialize the AveragePrecision metric.

Parameters:
  • num_classes – Number of classes in the dataset.

  • iou_thresholds – A list of IoU thresholds to use for calculating mAP and mAR.

  • average_precision_num_points – Number of points over which to calculate average precision.

  • min_confidence_threshold – Minimum confidence score threshold to consider a prediction.

  • max_bboxes_per_image – Maximum number of bounding boxes to consider per image. If more are present, only the top max_bboxes_per_image boxes based on confidence scores are considered.
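For a single class at a single IoU threshold, COCO-style average precision samples an interpolated precision-recall curve at equally spaced recall points (101 by default, matching average_precision_num_points). The sketch below shows that calculation in plain Python, taking as input a true-positive flag per prediction, already sorted by descending confidence. The function name is hypothetical, and returning 0.0 when there are no targets is an assumption; the library may handle that case differently.

```python
def interpolated_average_precision(is_true_positive, num_targets, num_points=101):
    """COCO-style interpolated AP for one class at one IoU threshold.

    is_true_positive: one bool per prediction, sorted by descending confidence.
    num_targets: number of ground-truth objects of this class.
    Hypothetical helper; not the library's implementation.
    """
    if num_points < 2:
        raise ValueError("num_points must be at least 2")
    if num_targets == 0:
        return 0.0  # assumption: no targets means AP is reported as 0

    # Running precision and recall after each prediction.
    precisions, recalls = [], []
    true_positives = 0
    for rank, hit in enumerate(is_true_positive, start=1):
        true_positives += bool(hit)
        precisions.append(true_positives / rank)
        recalls.append(true_positives / num_targets)

    # Precision envelope: make precision non-increasing as recall grows.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Sample the envelope at num_points equally spaced recall levels in [0, 1].
    total = 0.0
    for k in range(num_points):
        recall_level = k / (num_points - 1)
        sampled = 0.0  # stays 0 if this recall level is never reached
        for precision, recall in zip(precisions, recalls):
            if recall >= recall_level:
                sampled = precision
                break
        total += sampled
    return total / num_points


# One hit followed by one miss, with two targets: recall never exceeds 0.5,
# so 51 of the 101 recall levels are reached at precision 1.0.
ap = interpolated_average_precision([True, False], num_targets=2)  # 51 / 101
```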

class vision_architectures.metrics.detection.AverageRecall(iou_threshold, *args, **kwargs)

Bases: MeanAverageRecall

Calculate the COCO average recall (AR) for object detection.

__init__(iou_threshold, *args, **kwargs)

Initialize the AverageRecall metric.

Parameters:
  • num_classes – Number of classes in the dataset.

  • iou_thresholds – A list of IoU thresholds to use for calculating mAP and mAR.

  • average_precision_num_points – Number of points over which to calculate average precision.

  • min_confidence_threshold – Minimum confidence score threshold to consider a prediction.

  • max_bboxes_per_image – Maximum number of bounding boxes to consider per image. If more are present, only the top max_bboxes_per_image boxes based on confidence scores are considered.
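Recall at a single IoU threshold depends on how predictions are matched to targets. One common scheme, sketched below for a single class, is greedy matching: predictions are visited in descending confidence order and each claims the still-unmatched target with the highest IoU at or above the threshold. Whether this module uses exactly this scheme is an assumption, and the function name `recall_at_iou` is hypothetical.

```python
def recall_at_iou(iou_matrix, num_targets, iou_threshold):
    """Fraction of targets matched by greedy assignment at one IoU threshold.

    iou_matrix[p][t] is the IoU between prediction p and target t, with
    predictions (rows) ordered by descending confidence.
    Hypothetical helper; the matching rule is an assumption.
    """
    if num_targets == 0:
        return 0.0  # assumption: recall is reported as 0 without targets

    matched_targets = set()
    for row in iou_matrix:
        best_target, best_iou = None, iou_threshold
        for target_index, iou in enumerate(row):
            # Each target can be claimed by at most one prediction.
            if target_index not in matched_targets and iou >= best_iou:
                best_target, best_iou = target_index, iou
        if best_target is not None:
            matched_targets.add(best_target)
    return len(matched_targets) / num_targets


# Three predictions against two targets (made-up IoU values).
ious = [
    [0.9, 0.1],
    [0.8, 0.6],
    [0.2, 0.3],
]
recall_50 = recall_at_iou(ious, num_targets=2, iou_threshold=0.5)   # 1.0
recall_75 = recall_at_iou(ious, num_targets=2, iou_threshold=0.75)  # 0.5
```

Raising the threshold from 0.5 to 0.75 drops the second match, which is why recall values generally fall as the IoU threshold increases.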