Bounding Boxes

vision_architectures.utils.bounding_boxes.sort_by_first_column_descending(tensor)

Helper function that sorts the rows of a tensor in descending order of the values in the first column.

Return type:

Tensor
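The ordering described above can be illustrated without tensors. The following is a minimal sketch on plain nested lists; the name `sort_rows_by_first_column_descending` is a hypothetical stand-in and not part of the library, and how the real helper orders rows with equal first-column values is not stated in this documentation.

```python
def sort_rows_by_first_column_descending(rows):
    """Hypothetical pure-Python stand-in: order rows by their first value, largest first."""
    # sorted() is stable, so rows with equal first values keep their original order.
    # That tie behaviour is an assumption of this sketch, not a documented guarantee.
    return sorted(rows, key=lambda row: row[0], reverse=True)


# Each row is (confidence, payload); the payload column travels with its row.
rows = [[0.2, 10.0], [0.9, 20.0], [0.5, 30.0]]
sorted_rows = sort_rows_by_first_column_descending(rows)
print(sorted_rows)
```

On a 2D PyTorch tensor, a comparable effect can presumably be obtained by indexing the tensor with `torch.argsort(tensor[:, 0], descending=True)`, although this page does not confirm that the helper is implemented that way.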

vision_architectures.utils.bounding_boxes.get_tps_fps_fns(pred_bboxes, pred_confidence_scores, target_bboxes, iou_threshold, matching_method='coco', min_confidence_threshold=0.0, max_bboxes_per_image=None, return_intermediate_counts=False)

Given predicted bounding boxes with their confidence scores, target bounding boxes, and an IOU threshold, match predictions to targets and return the true positive matches together with the sets of false positives and false negatives.

Parameters:
  • pred_bboxes (list[Tensor]) – A list of length B containing tensors of shape (NP, 4) or (NP, 6) containing the predicted bounding box parameters in xyxy or xyzxyz format.

  • pred_confidence_scores (list[Tensor]) – A list of length B containing tensors of shape (NP,) containing the predicted confidence scores for the corresponding bounding boxes.

  • target_bboxes (list[Tensor]) – A list of length B containing tensors of shape (NT, 4) or (NT, 6) containing the target bounding box parameters in xyxy or xyzxyz format.

  • iou_threshold (float) – The IOU threshold above which a predicted box is considered a match for a target box.

  • matching_method (Literal['coco', 'hungarian']) – The method to use for matching predicted boxes to target boxes. ‘coco’ implements the greedy matching algorithm used in COCO evaluation, in which predictions are considered in order of confidence. ‘hungarian’ implements the Hungarian algorithm for optimal matching. Note that ‘hungarian’ is more computationally expensive and may not scale well to large numbers of boxes.

  • min_confidence_threshold (float) – Minimum confidence score for a predicted box to be considered for matching.

  • max_bboxes_per_image (Optional[int]) – If not None, consider only the top K predicted boxes per image based on confidence scores.

  • return_intermediate_counts (bool) – Whether to return intermediate counts of true positives, false positives and false negatives after each prediction is considered. Useful for plotting precision-recall curves.
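To make the greedy ('coco') option concrete, here is a minimal pure-Python sketch of confidence-ordered greedy matching for a single image with 2D xyxy boxes. It is an illustration under stated assumptions, not the library's implementation: `iou_xyxy` and `greedy_match` are hypothetical names, the strict `>` comparison against the threshold is an assumption based on the word "above" in the parameter description, and the batch index, `min_confidence_threshold`, and `max_bboxes_per_image` are left out.

```python
def iou_xyxy(a, b):
    """IoU of two axis-aligned 2D boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_match(pred_boxes, pred_scores, target_boxes, iou_threshold):
    """Hypothetical single-image sketch: returns (tps, fps, fns) as index sets."""
    # Visit predictions from highest to lowest confidence.
    order = sorted(range(len(pred_boxes)), key=lambda p: pred_scores[p], reverse=True)
    unmatched_targets = set(range(len(target_boxes)))
    tps, fps = set(), set()
    for p in order:
        best_t, best_iou = None, iou_threshold
        for t in sorted(unmatched_targets):
            iou = iou_xyxy(pred_boxes[p], target_boxes[t])
            if iou > best_iou:  # assumption: a match needs IoU strictly above the threshold
                best_t, best_iou = t, iou
        if best_t is None:
            fps.add(p)  # no free target overlaps enough
        else:
            tps.add((p, best_t))
            unmatched_targets.remove(best_t)  # each target is matched at most once
    return tps, fps, unmatched_targets


preds = [(0, 0, 10, 10), (0, 0, 9, 10), (20, 20, 30, 30)]
scores = [0.6, 0.9, 0.8]
targets = [(0, 0, 10, 10), (50, 50, 60, 60)]
tps, fps, fns = greedy_match(preds, scores, targets, iou_threshold=0.5)
print(tps, fps, fns)
```

In this example prediction 1 claims target 0 because it has the highest confidence, even though prediction 0 overlaps that target perfectly. Prediction 0 then becomes a false positive. This is the kind of case where a globally optimal assignment such as 'hungarian' can differ from greedy matching.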

Return type:

tuple[set[tuple[int, int, int]], set[tuple[int, int]], set[tuple[int, int]]] | tuple[set, set, set, list[tuple[int, int, int]]]

Returns:

The first set contains tuples of (b, p, t) where b is the batch index, p is the index of the predicted box and t is the index of the matched target box. The second set contains tuples of (b, p) where b is the batch index and p is the index of the false positive predicted box. The third set contains tuples of (b, t) where b is the batch index and t is the index of the false negative target box. If return_intermediate_counts is True, also returns a list of tuples of (TP, FP, FN) counts after each prediction.
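As a sketch of how the intermediate counts might feed a precision-recall curve, the snippet below converts a list of (TP, FP, FN) tuples into (precision, recall) points. It assumes the counts are cumulative and ordered by descending confidence, which this documentation implies but does not state explicitly. `precision_recall_points` is a hypothetical helper and the counts are made-up illustrative values, not output from the library.

```python
def precision_recall_points(intermediate_counts):
    """Convert (TP, FP, FN) tuples into (precision, recall) pairs."""
    points = []
    for tp, fp, fn in intermediate_counts:
        # Guard against empty denominators (no predictions or no targets so far).
        precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
        recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
        points.append((precision, recall))
    return points


# Illustrative counts after the 1st, 2nd and 3rd prediction (assumed cumulative).
counts = [(1, 0, 1), (1, 1, 1), (2, 1, 0)]
points = precision_recall_points(counts)
for precision, recall in points:
    print(f"precision={precision:.3f} recall={recall:.3f}")
```

The same formulas apply to the final result when the intermediate counts are not requested, using the sizes of the three returned sets as TP, FP, and FN.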