Detection

vision_architectures.metrics.detection.map_mar(pred_bboxes, pred_confidence_scores, target_bboxes, target_classes, iou_thresholds=[0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95], average_precision_num_points=101, min_confidence_threshold=0.0, max_bboxes_per_image=100, return_intermediates=False)

Calculate the COCO mean average precision (mAP) and mean average recall (mAR) for object detection.

Parameters:
  • pred_bboxes (list[Tensor]) – A list of length B containing tensors of shape (NP, 4) or (NP, 6) containing the predicted bounding box parameters in xyxy or xyzxyz format.

  • pred_confidence_scores (list[Tensor]) – A list of length B containing tensors of shape (NP, 1+num_classes) containing the predicted confidence scores for each class. Note that the first column corresponds to the “no-object” class, and bounding boxes which fall in this category are ignored.

  • target_bboxes (list[Tensor]) – A list of length B containing tensors of shape (NT, 4) or (NT, 6) containing the target bounding box parameters in xyxy or xyzxyz format.

  • target_classes (list[Tensor]) – A list of length B containing tensors of shape (NT,) containing the target class labels for the objects in the image.

  • iou_thresholds (list[float]) – A list of IoU thresholds to use for calculating mAP and mAR.

  • average_precision_num_points (int) – Number of points over which to calculate average precision.

  • min_confidence_threshold (float) – Minimum confidence probability threshold to consider a prediction.

  • max_bboxes_per_image (int | None) – Maximum number of bounding boxes to consider per image. If more are present, only the top max_bboxes_per_image boxes based on confidence scores are considered. If set to None, all bounding boxes are considered.

  • return_intermediates (bool) – If True, return intermediate values used to calculate mAP and mAR.

Return type:

tuple[float, float] | tuple[float, float, dict[float, dict[int, float]], dict[float, dict[int, float]]]

Returns:

The mean average precision (mAP) and mean average recall (mAR) across all classes and IoU thresholds for the entire dataset. If return_intermediates is True, also returns two dictionaries containing the average precision and average recall for each class at each IoU threshold.
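The intermediate dictionaries have the layout dict[float, dict[int, float]], that is, IoU threshold, then class index, then value. The following is a minimal sketch of how such a dictionary could be reduced to a single number, assuming a flat mean over every (IoU threshold, class) entry. The averaging order is not stated here, the helper name is hypothetical, and the values are invented for illustration.

```python
from statistics import fmean

# Hypothetical intermediates in the documented layout:
# {iou_threshold: {class_index: average_precision}}. Values are invented.
average_precisions = {
    0.5: {1: 0.9, 2: 0.7},
    0.75: {1: 0.6, 2: 0.4},
}


def mean_over_thresholds_and_classes(per_threshold):
    """Flat mean over every (IoU threshold, class) entry (an assumption)."""
    values = [
        value
        for per_class in per_threshold.values()
        for value in per_class.values()
    ]
    return fmean(values)


mean_ap = mean_over_thresholds_and_classes(average_precisions)
```

When every class has an entry at every threshold, a flat mean and a mean of per-threshold means give the same result.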

vision_architectures.metrics.detection.mean_average_precision_recall(pred_bboxes, pred_confidence_scores, target_bboxes, target_classes, iou_thresholds=[0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95], average_precision_num_points=101, min_confidence_threshold=0.0, max_bboxes_per_image=100, return_intermediates=False)

Calculate the COCO mean average precision (mAP) and mean average recall (mAR) for object detection.

Parameters:
  • pred_bboxes (list[Tensor]) – A list of length B containing tensors of shape (NP, 4) or (NP, 6) containing the predicted bounding box parameters in xyxy or xyzxyz format.

  • pred_confidence_scores (list[Tensor]) – A list of length B containing tensors of shape (NP, 1+num_classes) containing the predicted confidence scores for each class. Note that the first column corresponds to the “no-object” class, and bounding boxes which fall in this category are ignored.

  • target_bboxes (list[Tensor]) – A list of length B containing tensors of shape (NT, 4) or (NT, 6) containing the target bounding box parameters in xyxy or xyzxyz format.

  • target_classes (list[Tensor]) – A list of length B containing tensors of shape (NT,) containing the target class labels for the objects in the image.

  • iou_thresholds (list[float]) – A list of IoU thresholds to use for calculating mAP and mAR.

  • average_precision_num_points (int) – Number of points over which to calculate average precision.

  • min_confidence_threshold (float) – Minimum confidence probability threshold to consider a prediction.

  • max_bboxes_per_image (int | None) – Maximum number of bounding boxes to consider per image. If more are present, only the top max_bboxes_per_image boxes based on confidence scores are considered. If set to None, all bounding boxes are considered.

  • return_intermediates (bool) – If True, return intermediate values used to calculate mAP and mAR.

Return type:

tuple[float, float] | tuple[float, float, dict[float, dict[int, float]], dict[float, dict[int, float]]]

Returns:

The mean average precision (mAP) and mean average recall (mAR) across all classes and IoU thresholds for the entire dataset. If return_intermediates is True, also returns two dictionaries containing the average precision and average recall for each class at each IoU threshold.
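Each entry in iou_thresholds decides whether a predicted box overlaps a target box enough to count as a match. A minimal pure-Python sketch of that comparison for 2D boxes in xyxy format follows. The helper iou_xyxy is hypothetical, and the use of ">=" at the threshold is an assumption rather than something stated in this reference. The xyzxyz case would add a third axis to the same computation.

```python
def iou_xyxy(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes have no intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Default thresholds copied from the signature above.
iou_thresholds = [0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95]

predicted = (0.0, 0.0, 10.0, 10.0)
target = (0.0, 0.0, 10.0, 7.2)
iou = iou_xyxy(predicted, target)

# Thresholds at which this pair would count as a match (assuming ">=").
matched_at = [t for t in iou_thresholds if iou >= t]
```

Here the boxes overlap with an IoU of about 0.72, so the pair counts as a match only at the five thresholds from 0.5 to 0.7.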

vision_architectures.metrics.detection.MeanAveragePrecisionRecall

alias of MeanAveragePrecisionMeanAverageRecall

vision_architectures.metrics.detection.mean_average_precision_mean_average_recall(pred_bboxes, pred_confidence_scores, target_bboxes, target_classes, iou_thresholds=[0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95], average_precision_num_points=101, min_confidence_threshold=0.0, max_bboxes_per_image=100, return_intermediates=False)

Calculate the COCO mean average precision (mAP) and mean average recall (mAR) for object detection.

Parameters:
  • pred_bboxes (list[Tensor]) – A list of length B containing tensors of shape (NP, 4) or (NP, 6) containing the predicted bounding box parameters in xyxy or xyzxyz format.

  • pred_confidence_scores (list[Tensor]) – A list of length B containing tensors of shape (NP, 1+num_classes) containing the predicted confidence scores for each class. Note that the first column corresponds to the “no-object” class, and bounding boxes which fall in this category are ignored.

  • target_bboxes (list[Tensor]) – A list of length B containing tensors of shape (NT, 4) or (NT, 6) containing the target bounding box parameters in xyxy or xyzxyz format.

  • target_classes (list[Tensor]) – A list of length B containing tensors of shape (NT,) containing the target class labels for the objects in the image.

  • iou_thresholds (list[float]) – A list of IoU thresholds to use for calculating mAP and mAR.

  • average_precision_num_points (int) – Number of points over which to calculate average precision.

  • min_confidence_threshold (float) – Minimum confidence probability threshold to consider a prediction.

  • max_bboxes_per_image (int | None) – Maximum number of bounding boxes to consider per image. If more are present, only the top max_bboxes_per_image boxes based on confidence scores are considered. If set to None, all bounding boxes are considered.

  • return_intermediates (bool) – If True, return intermediate values used to calculate mAP and mAR.

Return type:

tuple[float, float] | tuple[float, float, dict[float, dict[int, float]], dict[float, dict[int, float]]]

Returns:

The mean average precision (mAP) and mean average recall (mAR) across all classes and IoU thresholds for the entire dataset. If return_intermediates is True, also returns two dictionaries containing the average precision and average recall for each class at each IoU threshold.
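The interaction between the "no-object" column, min_confidence_threshold, and max_bboxes_per_image can be sketched in plain Python. This is an illustration under stated assumptions, not the library's code: it assumes a box "falls in" the no-object class when column 0 holds its highest score, and that a box's confidence is the score of its highest-scoring class. The function select_boxes is hypothetical.

```python
def select_boxes(confidence_scores, min_confidence_threshold=0.0,
                 max_bboxes_per_image=100):
    """Return (box_index, class_index, confidence) tuples kept for scoring.

    confidence_scores holds one row per predicted box, with
    1 + num_classes columns; column 0 is the "no-object" class.
    """
    kept = []
    for box_index, row in enumerate(confidence_scores):
        # Assumption: the predicted class is the highest-scoring column.
        class_index = max(range(len(row)), key=row.__getitem__)
        if class_index == 0:
            continue  # "no-object" predictions are ignored
        confidence = row[class_index]
        if confidence < min_confidence_threshold:
            continue
        kept.append((box_index, class_index, confidence))
    # Most confident boxes first, then keep only the top ones.
    kept.sort(key=lambda item: item[2], reverse=True)
    if max_bboxes_per_image is not None:
        kept = kept[:max_bboxes_per_image]
    return kept


# Four predicted boxes, two object classes plus the "no-object" column.
scores = [
    [0.1, 0.8, 0.1],
    [0.7, 0.2, 0.1],
    [0.2, 0.3, 0.5],
    [0.3, 0.1, 0.6],
]
selected = select_boxes(scores, min_confidence_threshold=0.4,
                        max_bboxes_per_image=2)
unlimited = select_boxes(scores, max_bboxes_per_image=None)
```

With these inputs the second box is dropped as "no-object", and the cap of 2 keeps only the two most confident remaining boxes. Passing None keeps all three.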

class vision_architectures.metrics.detection.MeanAveragePrecisionMeanAverageRecall(iou_thresholds=[0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95], average_precision_num_points=101, min_confidence_threshold=0.0, max_bboxes_per_image=100)

Bases: Metric

Calculate the COCO mean average precision (mAP) and mean average recall (mAR) for object detection.

__init__(iou_thresholds=[0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95], average_precision_num_points=101, min_confidence_threshold=0.0, max_bboxes_per_image=100)

Initialize the MeanAveragePrecisionMeanAverageRecall metric.

Parameters:
  • num_classes – Number of classes in the dataset.

  • iou_thresholds (list[float]) – A list of IoU thresholds to use for calculating mAP and mAR.

  • average_precision_num_points (int) – Number of points over which to calculate average precision.

  • min_confidence_threshold (float) – Minimum confidence score threshold to consider a prediction.

  • max_bboxes_per_image (int | None) – Maximum number of bounding boxes to consider per image. If more are present, only the top max_bboxes_per_image boxes based on confidence scores are considered.
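The default of 101 for average_precision_num_points matches the COCO convention of sampling the precision-recall curve at 101 evenly spaced recall levels. A minimal sketch of that interpolation follows, assuming a single precision-recall curve is already available as two lists. The function name is hypothetical and this is not necessarily how the library computes it.

```python
def interpolated_average_precision(recalls, precisions, num_points=101):
    """Average interpolated precision at evenly spaced recall levels.

    recalls and precisions describe one precision-recall curve, ordered
    by decreasing confidence. num_points must be at least 2.
    """
    total = 0.0
    for i in range(num_points):
        level = i / (num_points - 1)
        # Interpolated precision: best precision at any recall >= level,
        # or zero when the curve never reaches that recall.
        candidates = [p for r, p in zip(recalls, precisions) if r >= level]
        total += max(candidates, default=0.0)
    return total / num_points


# Two detections: the first is correct (recall 0.5, precision 1.0); after
# the second correct one both targets are found at precision 0.5.
ap = interpolated_average_precision([0.5, 1.0], [1.0, 0.5])
perfect = interpolated_average_precision([1.0], [1.0])
```

In the first example, 51 recall levels (0.00 to 0.50) see precision 1.0 and the remaining 50 see precision 0.5, giving 76/101.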

update(pred_bboxes, pred_confidence_scores, target_bboxes, target_classes)

Update the state variables of the metric with a batch of predictions and targets.

compute()

Compute the final metric value from the accumulated state.

This method automatically synchronizes state variables when running in a distributed backend.

forward(*args, return_metrics='map_mar', **kwargs)

Aggregate and evaluate batch input directly.

Computes the metric on the current batch of inputs and also adds the batch statistics to the overall accumulated metric state. The input arguments are exactly the same as those of the corresponding update method. The returned output is exactly the same as the output of compute.

Parameters:
  • args – Any arguments as required by the metric update method.

  • kwargs – Any keyword arguments as required by the metric update method.

Returns:

The output of the compute method evaluated on the current batch.

Raises:

TorchMetricsUserError – If the metric is already synced and forward is called again.
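The division of labor between update, compute, and forward can be illustrated with a toy stand-in. The RunningMean class below is hypothetical and is not a torchmetrics class. It only mirrors the behavior described above: forward adds the batch to the accumulated state and returns the value for that batch alone, while compute reports the value over everything seen so far.

```python
class RunningMean:
    """Toy stateful metric used only to illustrate the call pattern."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update(self, values):
        # Fold one batch into the accumulated state.
        self.total += sum(values)
        self.count += len(values)

    def compute(self):
        # Final value over everything accumulated so far.
        return self.total / self.count if self.count else 0.0

    def forward(self, values):
        # Accumulate the batch, then evaluate compute on this batch only.
        self.update(values)
        batch_only = RunningMean()
        batch_only.update(values)
        return batch_only.compute()


metric = RunningMean()
first = metric.forward([1.0, 3.0])   # value for the first batch only
second = metric.forward([5.0])       # value for the second batch only
overall = metric.compute()           # value over both batches
```

The same pattern applies to the detection metrics: call forward (or update) once per batch during evaluation, then call compute once at the end for the dataset-level result.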

class vision_architectures.metrics.detection.MeanAveragePrecision(iou_thresholds=[0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95], average_precision_num_points=101, min_confidence_threshold=0.0, max_bboxes_per_image=100)

Bases: MeanAveragePrecisionMeanAverageRecall

Calculate the COCO mean average precision (mAP) for object detection.

forward(*args, **kwargs)

Aggregate and evaluate batch input directly.

Computes the metric on the current batch of inputs and also adds the batch statistics to the overall accumulated metric state. The input arguments are exactly the same as those of the corresponding update method. The returned output is exactly the same as the output of compute.

Parameters:
  • args – Any arguments as required by the metric update method.

  • kwargs – Any keyword arguments as required by the metric update method.

Returns:

The output of the compute method evaluated on the current batch.

Raises:

TorchMetricsUserError – If the metric is already synced and forward is called again.

class vision_architectures.metrics.detection.MeanAverageRecall(iou_thresholds=[0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95], average_precision_num_points=101, min_confidence_threshold=0.0, max_bboxes_per_image=100)

Bases: MeanAveragePrecisionMeanAverageRecall

Calculate the COCO mean average recall (mAR) for object detection.

forward(*args, **kwargs)

Aggregate and evaluate batch input directly.

Computes the metric on the current batch of inputs and also adds the batch statistics to the overall accumulated metric state. The input arguments are exactly the same as those of the corresponding update method. The returned output is exactly the same as the output of compute.

Parameters:
  • args – Any arguments as required by the metric update method.

  • kwargs – Any keyword arguments as required by the metric update method.

Returns:

The output of the compute method evaluated on the current batch.

Raises:

TorchMetricsUserError – If the metric is already synced and forward is called again.

class vision_architectures.metrics.detection.AveragePrecision(iou_threshold, *args, **kwargs)

Bases: MeanAveragePrecision

Calculate the COCO average precision (AP) for object detection.

__init__(iou_threshold, *args, **kwargs)

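Initialize the AveragePrecision metric for the single IoU threshold given by iou_threshold. The parameters listed below are those documented for the MeanAveragePrecisionMeanAverageRecall base class.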

Parameters:
  • num_classes – Number of classes in the dataset.

  • iou_thresholds – A list of IoU thresholds to use for calculating mAP and mAR.

  • average_precision_num_points – Number of points over which to calculate average precision.

  • min_confidence_threshold – Minimum confidence score threshold to consider a prediction.

  • max_bboxes_per_image – Maximum number of bounding boxes to consider per image. If more are present, only the top max_bboxes_per_image boxes based on confidence scores are considered.

class vision_architectures.metrics.detection.AverageRecall(iou_threshold, *args, **kwargs)

Bases: MeanAverageRecall

Calculate the COCO average recall (AR) for object detection.

__init__(iou_threshold, *args, **kwargs)

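Initialize the AverageRecall metric for the single IoU threshold given by iou_threshold. The parameters listed below are those documented for the MeanAveragePrecisionMeanAverageRecall base class.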

Parameters:
  • num_classes – Number of classes in the dataset.

  • iou_thresholds – A list of IoU thresholds to use for calculating mAP and mAR.

  • average_precision_num_points – Number of points over which to calculate average precision.

  • min_confidence_threshold – Minimum confidence score threshold to consider a prediction.

  • max_bboxes_per_image – Maximum number of bounding boxes to consider per image. If more are present, only the top max_bboxes_per_image boxes based on confidence scores are considered.