nucleus.metrics.polygon_metrics

PolygonAveragePrecision

Calculates the average precision between box or polygon annotations and predictions.

PolygonIOU

Calculates the average IOU between box or polygon annotations and predictions.

PolygonMAP

Calculates the mean average precision between box or polygon annotations and predictions.

PolygonMetric

Abstract class for metrics of boxes and polygons.

PolygonPrecision

Calculates the precision between box or polygon annotations and predictions.

PolygonRecall

Calculates the recall between box or polygon annotations and predictions.

class nucleus.metrics.polygon_metrics.PolygonAveragePrecision(label, iou_threshold=0.5, annotation_filters=None, prediction_filters=None)

Calculates the average precision between box or polygon annotations and predictions.

from nucleus import BoxAnnotation, Point, PolygonPrediction
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import PolygonAveragePrecision

box_anno = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

polygon_pred = PolygonPrediction(
    label="bus",
    vertices=[Point(100, 100), Point(150, 200), Point(200, 100)],
    reference_id="image_2",
    annotation_id="image_2_bus_polygon_1",
    confidence=0.8,
    metadata={"vehicle_color": "yellow"}
)

annotations = AnnotationList(box_annotations=[box_anno])
predictions = PredictionList(polygon_predictions=[polygon_pred])
metric = PolygonAveragePrecision(label="car")
metric(annotations, predictions)
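For intuition about the quantity being reported, here is a minimal sketch of one common definition of average precision (the uninterpolated form). It assumes each prediction has already been marked as a true or false positive at the chosen IOU threshold. The function name and the input layout are illustrative assumptions, not part of the nucleus API, and the library may use a different variant.

```python
from typing import List, Tuple


def average_precision(scored_hits: List[Tuple[float, bool]], num_annotations: int) -> float:
    """Uninterpolated AP (hypothetical helper, not a nucleus function).

    scored_hits holds (confidence, is_true_positive) for every prediction.
    """
    if num_annotations == 0:
        return 0.0
    # Rank predictions from most to least confident.
    ranked = sorted(scored_hits, key=lambda pair: pair[0], reverse=True)
    true_positives = 0
    precision_sum = 0.0
    for rank, (_, is_hit) in enumerate(ranked, start=1):
        if is_hit:
            true_positives += 1
            # Precision measured at the rank where recall increases.
            precision_sum += true_positives / rank
    return precision_sum / num_annotations


hits = [(0.9, True), (0.8, False), (0.7, True)]
ap = average_precision(hits, num_annotations=2)
print(round(ap, 4))  # 0.8333
```

The false positive ranked second lowers the precision at the second hit to 2/3, so the result is (1 + 2/3) / 2.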

Initializes PolygonAveragePrecision object.

Parameters:
  • iou_threshold (float) – IOU threshold to consider detection as valid. Must be in [0, 1]. Default 0.5

  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], …]. DNF allows arbitrary Boolean combinations of single-field predicates. Each innermost structure describes a single column predicate. Each inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], …]. DNF allows arbitrary Boolean combinations of single-field predicates. Each innermost structure describes a single column predicate. Each inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).
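The OR-of-ANDs semantics described for the filter parameters can be sketched in plain Python. The tuples below are hypothetical stand-ins for filter objects, and `matches_dnf` is an illustrative helper rather than a nucleus function. The sketch only shows how a nested list in disjunctive normal form is evaluated against one item.

```python
import operator
from typing import Any, Dict, List, Tuple

# Hypothetical stand-in for a filter: (field name, operator string, value).
Predicate = Tuple[str, str, Any]

OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
    "in": lambda field_value, allowed: field_value in allowed,
}


def matches_dnf(item: Dict[str, Any], or_and_filters: List[List[Predicate]]) -> bool:
    """True if at least one inner AND-group has all of its predicates satisfied."""
    return any(
        all(field in item and OPS[op](item[field], value) for field, op, value in and_group)
        for and_group in or_and_filters
    )


filters = [
    # Group 1: short-haired AND label is cat or dog.
    [("short_haired", "==", True), ("label", "in", ["cat", "dog"])],
    # Group 2: label is bird.
    [("label", "==", "bird")],
]

print(matches_dnf({"label": "cat", "short_haired": True}, filters))    # True
print(matches_dnf({"label": "cat", "short_haired": False}, filters))   # False
print(matches_dnf({"label": "bird", "short_haired": False}, filters))  # True
```

A flat list of predicates (the ListOfAndFilters form) corresponds to a single inner group in this sketch.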

aggregate_score(results)

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g. to calculate an R2 score with sklearn, you could define a custom result class

from typing import List

import sklearn.metrics

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    # Collect the per-item ground-truth and predicted values.
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)

Parameters:

results (List[nucleus.metrics.base.ScalarResult])

Return type:

nucleus.metrics.base.ScalarResult
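To make the aggregation step concrete without depending on sklearn or nucleus, the following sketch computes the same R2 score by hand from a list of per-item results. `PairResult` and `aggregate_r2` are hypothetical names used only for illustration, and the sketch assumes the true values are not all identical (otherwise the denominator is zero).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PairResult:
    """Hypothetical per-item result holding one true/predicted pair."""
    y_true: float
    y_pred: float


def aggregate_r2(results: List[PairResult]) -> float:
    """Reduce per-item results to a single R2 score."""
    y_trues = [r.y_true for r in results]
    y_preds = [r.y_pred for r in results]
    mean_true = sum(y_trues) / len(y_trues)
    # Residual sum of squares and total sum of squares.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_trues, y_preds))
    ss_tot = sum((t - mean_true) ** 2 for t in y_trues)
    return 1.0 - ss_res / ss_tot


results = [PairResult(1.0, 1.5), PairResult(2.0, 2.0), PairResult(3.0, 2.5)]
score = aggregate_r2(results)
print(score)  # 0.75
```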

call_metric(annotations, predictions)

A metric must override this method and return a metric result, given annotations and predictions.

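Parameters:
  • annotations (nucleus.annotation.AnnotationList)

  • predictions (nucleus.prediction.PredictionList)

Return type: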

nucleus.metrics.base.ScalarResult

class nucleus.metrics.polygon_metrics.PolygonIOU(enforce_label_match=False, iou_threshold=0.0, confidence_threshold=0.0, annotation_filters=None, prediction_filters=None)

Calculates the average IOU between box or polygon annotations and predictions.

from nucleus import BoxAnnotation, Point, PolygonPrediction
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import PolygonIOU

box_anno = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

polygon_pred = PolygonPrediction(
    label="bus",
    vertices=[Point(100, 100), Point(150, 200), Point(200, 100)],
    reference_id="image_2",
    annotation_id="image_2_bus_polygon_1",
    confidence=0.8,
    metadata={"vehicle_color": "yellow"}
)

annotations = AnnotationList(box_annotations=[box_anno])
predictions = PredictionList(polygon_predictions=[polygon_pred])
metric = PolygonIOU()
metric(annotations, predictions)
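As background for the IOU value itself, here is a minimal sketch of intersection over union for two axis-aligned boxes. It assumes (x, y) is the corner with the smallest coordinates, and `Box` and `box_iou` are hypothetical helpers rather than nucleus classes. Polygon IOU needs a geometry library and is not covered by this sketch.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Hypothetical axis-aligned box; (x, y) is the minimum-coordinate corner."""
    x: float
    y: float
    width: float
    height: float


def box_iou(a: Box, b: Box) -> float:
    """Intersection area divided by union area, in [0, 1]."""
    inter_w = max(0.0, min(a.x + a.width, b.x + b.width) - max(a.x, b.x))
    inter_h = max(0.0, min(a.y + a.height, b.y + b.height) - max(a.y, b.y))
    intersection = inter_w * inter_h
    union = a.width * a.height + b.width * b.height - intersection
    return intersection / union if union > 0 else 0.0


iou = box_iou(Box(0, 0, 10, 10), Box(5, 5, 10, 10))
print(round(iou, 4))  # 0.1429
```

The two boxes overlap in a 5 by 5 square, so the IOU is 25 / 175.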

Initializes PolygonIOU object.

Parameters:
  • enforce_label_match (bool) – whether to enforce that annotation and prediction labels must match. Defaults to False

  • iou_threshold (float) – IOU threshold to consider detection as valid. Must be in [0, 1]. Default 0.0

  • confidence_threshold (float) – minimum confidence threshold for predictions. Must be in [0, 1]. Default 0.0

  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], …]. DNF allows arbitrary Boolean combinations of single-field predicates. Each innermost structure describes a single column predicate. Each inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], …]. DNF allows arbitrary Boolean combinations of single-field predicates. Each innermost structure describes a single column predicate. Each inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

aggregate_score(results)

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g. to calculate an R2 score with sklearn, you could define a custom result class

from typing import List

import sklearn.metrics

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    # Collect the per-item ground-truth and predicted values.
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)

Parameters:

results (List[nucleus.metrics.base.ScalarResult])

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)

A metric must override this method and return a metric result, given annotations and predictions.

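Parameters:
  • annotations (nucleus.annotation.AnnotationList)

  • predictions (nucleus.prediction.PredictionList)

Return type: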

nucleus.metrics.base.ScalarResult

class nucleus.metrics.polygon_metrics.PolygonMAP(iou_threshold=0.5, annotation_filters=None, prediction_filters=None)

Calculates the mean average precision between box or polygon annotations and predictions.

from nucleus import BoxAnnotation, Point, PolygonPrediction
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import PolygonMAP

box_anno = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

polygon_pred = PolygonPrediction(
    label="bus",
    vertices=[Point(100, 100), Point(150, 200), Point(200, 100)],
    reference_id="image_2",
    annotation_id="image_2_bus_polygon_1",
    confidence=0.8,
    metadata={"vehicle_color": "yellow"}
)

annotations = AnnotationList(box_annotations=[box_anno])
predictions = PredictionList(polygon_predictions=[polygon_pred])
metric = PolygonMAP()
metric(annotations, predictions)
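Mean average precision is commonly defined as the unweighted mean of the per-label average precision values. The sketch below illustrates that definition under the assumption that each detection has already been marked as a true or false positive. All names are hypothetical, and whether nucleus averages labels in exactly this way is not confirmed here.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical detection record: (label, confidence, is_true_positive).
Detection = Tuple[str, float, bool]


def average_precision(scored_hits: List[Tuple[float, bool]], num_annotations: int) -> float:
    """Uninterpolated AP for a single label."""
    if num_annotations == 0:
        return 0.0
    ranked = sorted(scored_hits, key=lambda pair: pair[0], reverse=True)
    true_positives = 0
    precision_sum = 0.0
    for rank, (_, is_hit) in enumerate(ranked, start=1):
        if is_hit:
            true_positives += 1
            precision_sum += true_positives / rank
    return precision_sum / num_annotations


def mean_average_precision(detections: List[Detection], annotation_counts: Dict[str, int]) -> float:
    """Average the per-label AP over every label that has annotations."""
    by_label: Dict[str, List[Tuple[float, bool]]] = defaultdict(list)
    for label, confidence, is_hit in detections:
        by_label[label].append((confidence, is_hit))
    # Labels with annotations but no detections contribute an AP of 0.
    aps = [average_precision(by_label[label], count) for label, count in annotation_counts.items()]
    return sum(aps) / len(aps) if aps else 0.0


detections = [("car", 0.9, True), ("car", 0.6, False), ("bus", 0.8, False), ("bus", 0.5, True)]
m_ap = mean_average_precision(detections, {"car": 1, "bus": 1})
print(m_ap)  # 0.75
```

Here the "car" label scores an AP of 1.0 and the "bus" label 0.5, giving a mean of 0.75.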

Initializes PolygonMAP object.

Parameters:
  • iou_threshold (float) – IOU threshold to consider detection as valid. Must be in [0, 1]. Default 0.5

  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], …]. DNF allows arbitrary Boolean combinations of single-field predicates. Each innermost structure describes a single column predicate. Each inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], …]. DNF allows arbitrary Boolean combinations of single-field predicates. Each innermost structure describes a single column predicate. Each inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

aggregate_score(results)

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g. to calculate an R2 score with sklearn, you could define a custom result class

from typing import List

import sklearn.metrics

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    # Collect the per-item ground-truth and predicted values.
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)

Parameters:

results (List[nucleus.metrics.base.ScalarResult])

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)

A metric must override this method and return a metric result, given annotations and predictions.

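Parameters:
  • annotations (nucleus.annotation.AnnotationList)

  • predictions (nucleus.prediction.PredictionList)

Return type: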

nucleus.metrics.base.ScalarResult

class nucleus.metrics.polygon_metrics.PolygonMetric(enforce_label_match=False, confidence_threshold=0.0, annotation_filters=None, prediction_filters=None)

Abstract class for metrics of boxes and polygons.

The PolygonMetric class automatically filters incoming annotations and predictions so that only box and polygon items remain. It also drops predictions whose confidence is less than the provided confidence_threshold. Finally, it supports enforcing matching labels: if enforce_label_match is set to True, annotations and predictions are only matched if they have the same label.

To create a new concrete PolygonMetric, override the eval function with logic to define a metric between box/polygon annotations and predictions.

from typing import List
from nucleus import BoxAnnotation, Point, PolygonPrediction
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import ScalarResult, PolygonMetric
from nucleus.metrics.polygon_utils import BoxOrPolygonAnnotation, BoxOrPolygonPrediction

class MyPolygonMetric(PolygonMetric):
    def eval(
        self,
        annotations: List[BoxOrPolygonAnnotation],
        predictions: List[BoxOrPolygonPrediction],
    ) -> ScalarResult:
        value = (len(annotations) - len(predictions)) ** 2
        weight = len(annotations)
        return ScalarResult(value, weight)

box_anno = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

polygon_pred = PolygonPrediction(
    label="bus",
    vertices=[Point(100, 100), Point(150, 200), Point(200, 100)],
    reference_id="image_2",
    annotation_id="image_2_bus_polygon_1",
    confidence=0.8,
    metadata={"vehicle_color": "yellow"}
)

annotations = AnnotationList(box_annotations=[box_anno])
predictions = PredictionList(polygon_predictions=[polygon_pred])
metric = MyPolygonMetric()
metric(annotations, predictions)
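The effect of confidence_threshold and enforce_label_match, as described for this class, can be illustrated with a small stand-alone sketch. `Item` and `candidate_pairs` are hypothetical names and do not mirror the library's internals. The sketch only enumerates which annotation/prediction pairs remain eligible for matching.

```python
from typing import List, NamedTuple, Tuple


class Item(NamedTuple):
    """Hypothetical stand-in for a box or polygon annotation/prediction."""
    label: str
    confidence: float = 1.0


def candidate_pairs(
    annotations: List[Item],
    predictions: List[Item],
    enforce_label_match: bool = False,
    confidence_threshold: float = 0.0,
) -> List[Tuple[int, int]]:
    """Return (annotation index, prediction index) pairs eligible for matching."""
    # Drop predictions whose confidence is below the threshold.
    kept = [(j, p) for j, p in enumerate(predictions) if p.confidence >= confidence_threshold]
    return [
        (i, j)
        for i, a in enumerate(annotations)
        for j, p in kept
        # With label enforcement, only same-label pairs survive.
        if not enforce_label_match or a.label == p.label
    ]


annotations = [Item("car"), Item("bus")]
predictions = [Item("car", 0.9), Item("bus", 0.3), Item("bus", 0.8)]

print(len(candidate_pairs(annotations, predictions)))  # 6
print(candidate_pairs(annotations, predictions, enforce_label_match=True, confidence_threshold=0.5))
# [(0, 0), (1, 2)]
```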

Initializes PolygonMetric abstract object.

Parameters:
  • enforce_label_match (bool) – whether to enforce that annotation and prediction labels must match. Default False

  • confidence_threshold (float) – minimum confidence threshold for predictions. Must be in [0, 1]. Default 0.0

  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], …]. DNF allows arbitrary Boolean combinations of single-field predicates. Each innermost structure describes a single column predicate. Each inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], …]. DNF allows arbitrary Boolean combinations of single-field predicates. Each innermost structure describes a single column predicate. Each inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

aggregate_score(results)

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g. to calculate an R2 score with sklearn, you could define a custom result class

from typing import List

import sklearn.metrics

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    # Collect the per-item ground-truth and predicted values.
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)

Parameters:

results (List[nucleus.metrics.base.ScalarResult])

Return type:

nucleus.metrics.base.ScalarResult
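The MyPolygonMetric example earlier in this class returns ScalarResult(value, weight) for each item. One plausible way to aggregate such value/weight pairs is a weighted mean, sketched below. `Scalar` and `weighted_mean` are hypothetical names, and treating the aggregation as a weighted mean is an assumption made for illustration, not a statement about the library's implementation.

```python
from typing import List, NamedTuple


class Scalar(NamedTuple):
    """Hypothetical stand-in for a per-item value with a weight."""
    value: float
    weight: float = 1.0


def weighted_mean(results: List[Scalar]) -> Scalar:
    """Combine per-item results into one value, weighting each by its weight."""
    total_weight = sum(r.weight for r in results)
    if total_weight == 0:
        return Scalar(0.0, 0.0)
    total = sum(r.value * r.weight for r in results)
    return Scalar(total / total_weight, total_weight)


combined = weighted_mean([Scalar(1.0, 1), Scalar(0.0, 3)])
print(combined.value, combined.weight)  # 0.25 4
```

Weighting by the number of annotations, as the example does, keeps items with many annotations from being averaged on equal footing with nearly empty ones.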

call_metric(annotations, predictions)

A metric must override this method and return a metric result, given annotations and predictions.

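Parameters:
  • annotations (nucleus.annotation.AnnotationList)

  • predictions (nucleus.prediction.PredictionList)

Return type: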

nucleus.metrics.base.ScalarResult

class nucleus.metrics.polygon_metrics.PolygonPrecision(enforce_label_match=False, iou_threshold=0.5, confidence_threshold=0.0, annotation_filters=None, prediction_filters=None)

Calculates the precision between box or polygon annotations and predictions.

from nucleus import BoxAnnotation, Point, PolygonPrediction
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import PolygonPrecision

box_anno = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

polygon_pred = PolygonPrediction(
    label="bus",
    vertices=[Point(100, 100), Point(150, 200), Point(200, 100)],
    reference_id="image_2",
    annotation_id="image_2_bus_polygon_1",
    confidence=0.8,
    metadata={"vehicle_color": "yellow"}
)

annotations = AnnotationList(box_annotations=[box_anno])
predictions = PredictionList(polygon_predictions=[polygon_pred])
metric = PolygonPrecision()
metric(annotations, predictions)
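To show how iou_threshold feeds into precision, the sketch below counts true positives from a precomputed IOU matrix and divides by the number of predictions. It uses a simple greedy one-to-one matching purely for illustration. The matching strategy and all function names are assumptions, and the library may match annotations to predictions differently.

```python
from typing import List


def count_true_positives(iou_matrix: List[List[float]], iou_threshold: float = 0.5) -> int:
    """Greedy one-to-one matching; rows are annotations, columns are predictions."""
    # Consider only pairs that clear the threshold, best IOU first.
    pairs = sorted(
        (
            (iou, i, j)
            for i, row in enumerate(iou_matrix)
            for j, iou in enumerate(row)
            if iou >= iou_threshold
        ),
        reverse=True,
    )
    used_rows, used_cols = set(), set()
    for _, i, j in pairs:
        if i not in used_rows and j not in used_cols:
            used_rows.add(i)
            used_cols.add(j)
    return len(used_rows)


def precision(iou_matrix: List[List[float]], num_predictions: int, iou_threshold: float = 0.5) -> float:
    """True positives divided by the total number of predictions."""
    if num_predictions == 0:
        return 0.0
    return count_true_positives(iou_matrix, iou_threshold) / num_predictions


# Two annotations (rows) against three predictions (columns).
ious = [
    [0.8, 0.2, 0.0],
    [0.1, 0.6, 0.0],
]
print(round(precision(ious, num_predictions=3), 4))                     # 0.6667
print(round(precision(ious, num_predictions=3, iou_threshold=0.7), 4))  # 0.3333
```

Raising the threshold from 0.5 to 0.7 disqualifies the 0.6 match, so one fewer prediction counts as a true positive.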

Initializes PolygonPrecision object.

Parameters:
  • enforce_label_match (bool) – whether to enforce that annotation and prediction labels must match. Defaults to False

  • iou_threshold (float) – IOU threshold to consider detection as valid. Must be in [0, 1]. Default 0.5

  • confidence_threshold (float) – minimum confidence threshold for predictions. Must be in [0, 1]. Default 0.0

  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], …]. DNF allows arbitrary Boolean combinations of single-field predicates. Each innermost structure describes a single column predicate. Each inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], …]. DNF allows arbitrary Boolean combinations of single-field predicates. Each innermost structure describes a single column predicate. Each inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

aggregate_score(results)

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g. to calculate an R2 score with sklearn, you could define a custom result class

from typing import List

import sklearn.metrics

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    # Collect the per-item ground-truth and predicted values.
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)

Parameters:

results (List[nucleus.metrics.base.ScalarResult])

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)

A metric must override this method and return a metric result, given annotations and predictions.

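Parameters:
  • annotations (nucleus.annotation.AnnotationList)

  • predictions (nucleus.prediction.PredictionList)

Return type: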

nucleus.metrics.base.ScalarResult

class nucleus.metrics.polygon_metrics.PolygonRecall(enforce_label_match=False, iou_threshold=0.5, confidence_threshold=0.0, annotation_filters=None, prediction_filters=None)

Calculates the recall between box or polygon annotations and predictions.

from nucleus import BoxAnnotation, Point, PolygonPrediction
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import PolygonRecall

box_anno = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

polygon_pred = PolygonPrediction(
    label="bus",
    vertices=[Point(100, 100), Point(150, 200), Point(200, 100)],
    reference_id="image_2",
    annotation_id="image_2_bus_polygon_1",
    confidence=0.8,
    metadata={"vehicle_color": "yellow"}
)

annotations = AnnotationList(box_annotations=[box_anno])
predictions = PredictionList(polygon_predictions=[polygon_pred])
metric = PolygonRecall()
metric(annotations, predictions)
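Recall depends on how many annotations find a sufficiently overlapping prediction. The sketch below assumes matching has already happened and that each annotation carries the IOU of its matched prediction (0.0 if unmatched). It then shows how recall changes as iou_threshold is swept. The `recall` helper and its input layout are hypothetical.

```python
from typing import List


def recall(matched_ious: List[float], iou_threshold: float = 0.5) -> float:
    """Fraction of annotations whose matched prediction clears the threshold."""
    if not matched_ious:
        return 0.0
    hits = sum(1 for iou in matched_ious if iou >= iou_threshold)
    return hits / len(matched_ious)


# One entry per annotation; 0.0 means no prediction was matched to it.
matched = [0.9, 0.55, 0.3, 0.0]

for threshold in (0.25, 0.5, 0.75):
    print(threshold, recall(matched, threshold))
# 0.25 0.75
# 0.5 0.5
# 0.75 0.25
```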

Initializes PolygonRecall object.

Parameters:
  • enforce_label_match (bool) – whether to enforce that annotation and prediction labels must match. Defaults to False

  • iou_threshold (float) – IOU threshold to consider detection as valid. Must be in [0, 1]. Default 0.5

  • confidence_threshold (float) – minimum confidence threshold for predictions. Must be in [0, 1]. Default 0.0

  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], …]. DNF allows arbitrary Boolean combinations of single-field predicates. Each innermost structure describes a single column predicate. Each inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], …]. DNF allows arbitrary Boolean combinations of single-field predicates. Each innermost structure describes a single column predicate. Each inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

aggregate_score(results)

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g. to calculate an R2 score with sklearn, you could define a custom result class

from typing import List

import sklearn.metrics

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    # Collect the per-item ground-truth and predicted values.
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)

Parameters:

results (List[nucleus.metrics.base.ScalarResult])

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)

A metric must override this method and return a metric result, given annotations and predictions.

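Parameters:
  • annotations (nucleus.annotation.AnnotationList)

  • predictions (nucleus.prediction.PredictionList)

Return type: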

nucleus.metrics.base.ScalarResult