nucleus.metrics.segmentation_metrics

SegmentationFWAVACC

Calculates the frequency weighted average of the class-wise Jaccard index

SegmentationIOU

Abstract class for defining a metric, which takes a list of annotations and predictions and returns a scalar.

SegmentationMAP

Calculates the mean average precision per class for segmentation masks

SegmentationMaskMetric

Abstract class for defining a metric, which takes a list of annotations and predictions and returns a scalar.

SegmentationPrecision

Abstract class for defining a metric, which takes a list of annotations and predictions and returns a scalar.

SegmentationRecall

Calculates the recall for a segmentation mask

class nucleus.metrics.segmentation_metrics.SegmentationFWAVACC(annotation_filters=None, prediction_filters=None, iou_threshold=0.5)

Calculates the frequency weighted average of the class-wise Jaccard index
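
As a rough illustration of the quantity named above, the frequency weighted average of the class-wise Jaccard index (IoU) can be sketched from a pixel confusion matrix. This is a standalone sketch, not the library's implementation: the function name `frequency_weighted_iou` and the confusion-matrix input are assumptions made for the example, and it does not model the `iou_threshold` parameter.

```python
from typing import List


def frequency_weighted_iou(confusion: List[List[int]]) -> float:
    """Frequency weighted IoU from a square confusion matrix.

    confusion[i][j] counts pixels of ground-truth class i predicted as class j.
    (Hypothetical helper for illustration only.)
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        return 0.0
    score = 0.0
    for c in range(n):
        true_positive = confusion[c][c]
        gt_pixels = sum(confusion[c])  # row sum: ground-truth pixels of class c
        pred_pixels = sum(confusion[r][c] for r in range(n))  # column sum
        union = gt_pixels + pred_pixels - true_positive
        if union == 0:
            continue  # class absent from both masks
        iou = true_positive / union
        # Weight each class IoU by how often the class occurs in the ground truth.
        score += (gt_pixels / total) * iou
    return score


confusion = [
    [50, 10],
    [5, 35],
]
fw_iou = frequency_weighted_iou(confusion)
```

Classes that cover more ground-truth pixels contribute more to the score, which is what distinguishes this from a plain mean IoU.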

from nucleus import BoxAnnotation, Point, PolygonPrediction
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import PolygonRecall

box_anno = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

polygon_pred = PolygonPrediction(
    label="bus",
    vertices=[Point(100, 100), Point(150, 200), Point(200, 100)],
    reference_id="image_2",
    annotation_id="image_2_bus_polygon_1",
    confidence=0.8,
    metadata={"vehicle_color": "yellow"}
)

annotations = AnnotationList(box_annotations=[box_anno])
predictions = PredictionList(polygon_predictions=[polygon_pred])
metric = PolygonRecall()
metric(annotations, predictions)

Initializes SegmentationFWAVACC object.

Parameters:
  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. Each inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multiple-field predicate. Finally, the outermost list combines these filters as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. Each inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multiple-field predicate. Finally, the outermost list combines these filters as a disjunction (OR).

  • iou_threshold (float)
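
The OR-of-ANDs semantics of the filter parameters above can be sketched with plain Python callables. This sketch deliberately does not use the library's `MetadataFilter` or `FieldFilter`; the predicates are stand-in lambdas, `matches_dnf` is a hypothetical helper, and the second ("bird") group is invented for the example.

```python
from typing import Any, Callable, Dict, List

# A predicate tests a single field of an item (stand-in for one filter).
Predicate = Callable[[Dict[str, Any]], bool]


def matches_dnf(item: Dict[str, Any], or_of_ands: List[List[Predicate]]) -> bool:
    """Return True if any inner AND-group is fully satisfied by the item."""
    return any(all(pred(item) for pred in and_group) for and_group in or_of_ands)


filters = [
    # Group 1: short_haired == True AND label in ["cat", "dog"]
    [lambda it: it["short_haired"] is True, lambda it: it["label"] in ["cat", "dog"]],
    # Group 2 (hypothetical): label == "bird"
    [lambda it: it["label"] == "bird"],
]

items = [
    {"label": "cat", "short_haired": True},
    {"label": "dog", "short_haired": False},
    {"label": "bird", "short_haired": False},
]
kept = [it["label"] for it in items if matches_dnf(it, filters)]
```

An item is kept as soon as one inner group matches in full, so adding predicates to a group narrows the selection while adding groups widens it.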

aggregate_score(results)

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn, you could define a custom metric result class:

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score:

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    # Collect the per-item values before scoring them together
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)

Parameters:

results (List[nucleus.metrics.base.MetricResult])

Return type:

nucleus.metrics.base.ScalarResult
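
For readers who want to run the aggregation idea without the library or sklearn installed, the following is a minimal self-contained sketch. `R2Result` and `ScalarResult` here are local stand-in dataclasses, not the library's classes, and the R2 formula (1 - SS_res / SS_tot) is written out by hand; it assumes the true values are not all identical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class R2Result:  # stand-in for a MetricResult subclass
    y_true: float
    y_pred: float


@dataclass
class ScalarResult:  # stand-in for the library's ScalarResult
    value: float


def aggregate_score(results: List[R2Result]) -> ScalarResult:
    """Combine per-item results into one coefficient of determination."""
    y_trues = [r.y_true for r in results]
    y_preds = [r.y_pred for r in results]
    mean_true = sum(y_trues) / len(y_trues)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_trues, y_preds))
    ss_tot = sum((t - mean_true) ** 2 for t in y_trues)
    return ScalarResult(1.0 - ss_res / ss_tot)


results = [R2Result(1.0, 1.0), R2Result(2.0, 2.0), R2Result(3.0, 2.0)]
score = aggregate_score(results)
```

The point of the pattern is that each item contributes raw values, and the score is computed once over all of them rather than averaged per item.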

call_metric(annotations, predictions)

A metric must override this method and return a metric result, given annotations and predictions.

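Parameters:
  • annotations (nucleus.annotation.AnnotationList)

  • predictions (nucleus.prediction.PredictionList)

Return type: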

nucleus.metrics.base.ScalarResult

get_mask_channel(ann_or_pred)

Some annotations are stored as RGB instead of L (single-channel). We expect the image to be faux-single-channel with all the channels repeating so we choose the first one.
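
The channel selection described above can be sketched with NumPy. This is an illustration under assumptions, not the library's code: `first_channel` is a hypothetical name, and the mask is assumed to already be loaded as an array of shape (H, W) or (H, W, C).

```python
import numpy as np


def first_channel(mask: np.ndarray) -> np.ndarray:
    """Return a 2-D mask, taking channel 0 when the input has a channel axis."""
    if mask.ndim == 3:
        # Faux single-channel image: every channel repeats the same values.
        return mask[:, :, 0]
    return mask


single = np.array([[0, 1], [2, 2]], dtype=np.uint8)
faux_rgb = np.stack([single, single, single], axis=-1)  # shape (2, 2, 3)
```

Both `first_channel(single)` and `first_channel(faux_rgb)` yield the same 2-D class-id mask.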

class nucleus.metrics.segmentation_metrics.SegmentationIOU(annotation_filters=None, prediction_filters=None, iou_threshold=0.5)

Abstract class for defining a metric, which takes a list of annotations and predictions and returns a scalar.

To create a new concrete Metric, override the __call__ function with logic to define a metric between annotations and predictions.

from nucleus import BoxAnnotation, CuboidPrediction, Point3D
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import Metric, MetricResult
from nucleus.metrics.polygon_utils import BoxOrPolygonAnnotation, BoxOrPolygonPrediction

class MyMetric(Metric):
    def __call__(
        self, annotations: AnnotationList, predictions: PredictionList
    ) -> MetricResult:
        value = (len(annotations) - len(predictions)) ** 2
        weight = len(annotations)
        return MetricResult(value, weight)

box = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

cuboid = CuboidPrediction(
    label="car",
    position=Point3D(100, 100, 10),
    dimensions=Point3D(5, 10, 5),
    yaw=0,
    reference_id="pointcloud_1",
    confidence=0.8,
    annotation_id="pointcloud_1_car_cuboid_1",
    metadata={"vehicle_color": "green"}
)

metric = MyMetric()
annotations = AnnotationList(box_annotations=[box])
predictions = PredictionList(cuboid_predictions=[cuboid])
metric(annotations, predictions)

Initializes SegmentationIOU object.

Parameters:
  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. Each inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multiple-field predicate. Finally, the outermost list combines these filters as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. Each inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multiple-field predicate. Finally, the outermost list combines these filters as a disjunction (OR).

  • iou_threshold (float)

aggregate_score(results)

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn, you could define a custom metric result class:

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score:

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    # Collect the per-item values before scoring them together
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)

Parameters:

results (List[nucleus.metrics.base.MetricResult])

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)

A metric must override this method and return a metric result, given annotations and predictions.

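Parameters:
  • annotations (nucleus.annotation.AnnotationList)

  • predictions (nucleus.prediction.PredictionList)

Return type: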

nucleus.metrics.base.ScalarResult

get_mask_channel(ann_or_pred)

Some annotations are stored as RGB instead of L (single-channel). We expect the image to be faux-single-channel with all the channels repeating so we choose the first one.

class nucleus.metrics.segmentation_metrics.SegmentationMAP(annotation_filters=None, prediction_filters=None, iou_thresholds='coco')

Calculates the mean average precision per class for segmentation masks

from nucleus import BoxAnnotation, Point, PolygonPrediction
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import PolygonMAP

box_anno = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

polygon_pred = PolygonPrediction(
    label="bus",
    vertices=[Point(100, 100), Point(150, 200), Point(200, 100)],
    reference_id="image_2",
    annotation_id="image_2_bus_polygon_1",
    confidence=0.8,
    metadata={"vehicle_color": "yellow"}
)

annotations = AnnotationList(box_annotations=[box_anno])
predictions = PredictionList(polygon_predictions=[polygon_pred])
metric = PolygonMAP()
metric(annotations, predictions)

Initializes SegmentationMAP object.

Parameters:
  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. Each inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multiple-field predicate. Finally, the outermost list combines these filters as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. Each inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multiple-field predicate. Finally, the outermost list combines these filters as a disjunction (OR).

  • iou_thresholds (Union[List[float], str]) – Provide a list of thresholds to compute over, or the literal "coco".
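
One plausible way to interpret the threshold argument is sketched below. The COCO evaluation convention uses IoU thresholds from 0.50 to 0.95 in steps of 0.05; whether the library expands "coco" to exactly this list is an assumption here, and `resolve_iou_thresholds` is a hypothetical helper, not part of the documented API.

```python
from typing import List, Union


def resolve_iou_thresholds(iou_thresholds: Union[List[float], str]) -> List[float]:
    """Turn the argument into an explicit list of IoU thresholds."""
    if isinstance(iou_thresholds, str):
        if iou_thresholds != "coco":
            raise ValueError(f"Unsupported threshold preset: {iou_thresholds!r}")
        # COCO convention: 0.50, 0.55, ..., 0.95 (ten thresholds).
        return [round(0.5 + 0.05 * i, 2) for i in range(10)]
    return list(iou_thresholds)


coco = resolve_iou_thresholds("coco")
custom = resolve_iou_thresholds([0.5, 0.75])
```

Passing an explicit list keeps the evaluation at exactly those thresholds, while the preset averages over a range of stricter and looser matches.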

aggregate_score(results)

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn, you could define a custom metric result class:

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score:

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    # Collect the per-item values before scoring them together
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)

Parameters:

results (List[nucleus.metrics.base.MetricResult])

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)

A metric must override this method and return a metric result, given annotations and predictions.

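Parameters:
  • annotations (nucleus.annotation.AnnotationList)

  • predictions (nucleus.prediction.PredictionList)

Return type: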

nucleus.metrics.base.ScalarResult

get_mask_channel(ann_or_pred)

Some annotations are stored as RGB instead of L (single-channel). We expect the image to be faux-single-channel with all the channels repeating so we choose the first one.

class nucleus.metrics.segmentation_metrics.SegmentationMaskMetric(annotation_filters=None, prediction_filters=None, iou_threshold=0.5)

Abstract class for defining a metric, which takes a list of annotations and predictions and returns a scalar.

To create a new concrete Metric, override the __call__ function with logic to define a metric between annotations and predictions.

from nucleus import BoxAnnotation, CuboidPrediction, Point3D
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import Metric, MetricResult
from nucleus.metrics.polygon_utils import BoxOrPolygonAnnotation, BoxOrPolygonPrediction

class MyMetric(Metric):
    def __call__(
        self, annotations: AnnotationList, predictions: PredictionList
    ) -> MetricResult:
        value = (len(annotations) - len(predictions)) ** 2
        weight = len(annotations)
        return MetricResult(value, weight)

box = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

cuboid = CuboidPrediction(
    label="car",
    position=Point3D(100, 100, 10),
    dimensions=Point3D(5, 10, 5),
    yaw=0,
    reference_id="pointcloud_1",
    confidence=0.8,
    annotation_id="pointcloud_1_car_cuboid_1",
    metadata={"vehicle_color": "green"}
)

metric = MyMetric()
annotations = AnnotationList(box_annotations=[box])
predictions = PredictionList(cuboid_predictions=[cuboid])
metric(annotations, predictions)

Initializes SegmentationMaskMetric abstract object.

Parameters:
  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. Each inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multiple-field predicate. Finally, the outermost list combines these filters as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. Each inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multiple-field predicate. Finally, the outermost list combines these filters as a disjunction (OR).

  • iou_threshold (float)

abstract aggregate_score(results)

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn, you could define a custom metric result class:

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score:

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    # Collect the per-item values before scoring them together
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)

Parameters:

results (List[MetricResult])

Return type:

ScalarResult

call_metric(annotations, predictions)

A metric must override this method and return a metric result, given annotations and predictions.

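Parameters:
  • annotations (nucleus.annotation.AnnotationList)

  • predictions (nucleus.prediction.PredictionList)

Return type: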

nucleus.metrics.base.ScalarResult

get_mask_channel(ann_or_pred)

Some annotations are stored as RGB instead of L (single-channel). We expect the image to be faux-single-channel with all the channels repeating so we choose the first one.

class nucleus.metrics.segmentation_metrics.SegmentationPrecision(annotation_filters=None, prediction_filters=None, iou_threshold=0.5)

Abstract class for defining a metric, which takes a list of annotations and predictions and returns a scalar.

To create a new concrete Metric, override the __call__ function with logic to define a metric between annotations and predictions.

from nucleus import BoxAnnotation, CuboidPrediction, Point3D
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import Metric, MetricResult
from nucleus.metrics.polygon_utils import BoxOrPolygonAnnotation, BoxOrPolygonPrediction

class MyMetric(Metric):
    def __call__(
        self, annotations: AnnotationList, predictions: PredictionList
    ) -> MetricResult:
        value = (len(annotations) - len(predictions)) ** 2
        weight = len(annotations)
        return MetricResult(value, weight)

box = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

cuboid = CuboidPrediction(
    label="car",
    position=Point3D(100, 100, 10),
    dimensions=Point3D(5, 10, 5),
    yaw=0,
    reference_id="pointcloud_1",
    confidence=0.8,
    annotation_id="pointcloud_1_car_cuboid_1",
    metadata={"vehicle_color": "green"}
)

metric = MyMetric()
annotations = AnnotationList(box_annotations=[box])
predictions = PredictionList(cuboid_predictions=[cuboid])
metric(annotations, predictions)

Calculates mean per-class precision

Parameters:
  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. Each inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multiple-field predicate. Finally, the outermost list combines these filters as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. Each inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multiple-field predicate. Finally, the outermost list combines these filters as a disjunction (OR).

  • iou_threshold (float)
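
Mean per-class precision, as named in the constructor description above, can be illustrated from per-class true-positive and false-positive counts. This is a hedged sketch only: `mean_per_class_precision` and the count dictionaries are assumptions for the example, and the library may derive its counts differently (for instance from pixel confusion matrices).

```python
from typing import Dict


def mean_per_class_precision(tp: Dict[str, int], fp: Dict[str, int]) -> float:
    """Average TP / (TP + FP) over classes that were predicted at least once."""
    precisions = []
    for label in tp:
        predicted = tp[label] + fp.get(label, 0)
        if predicted == 0:
            continue  # class never predicted: precision is undefined, skip it
        precisions.append(tp[label] / predicted)
    return sum(precisions) / len(precisions) if precisions else 0.0


tp = {"car": 8, "bus": 3}
fp = {"car": 2, "bus": 1}
mean_precision = mean_per_class_precision(tp, fp)
```

Averaging per class gives rare classes the same influence on the score as frequent ones.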

aggregate_score(results)

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn, you could define a custom metric result class:

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score:

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    # Collect the per-item values before scoring them together
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)

Parameters:

results (List[nucleus.metrics.base.MetricResult])

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)

A metric must override this method and return a metric result, given annotations and predictions.

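Parameters:
  • annotations (nucleus.annotation.AnnotationList)

  • predictions (nucleus.prediction.PredictionList)

Return type: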

nucleus.metrics.base.ScalarResult

get_mask_channel(ann_or_pred)

Some annotations are stored as RGB instead of L (single-channel). We expect the image to be faux-single-channel with all the channels repeating so we choose the first one.

class nucleus.metrics.segmentation_metrics.SegmentationRecall(annotation_filters=None, prediction_filters=None, iou_threshold=0.5)

Calculates the recall for a segmentation mask
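
As an illustration of recall on a segmentation mask, the sketch below computes pixel-level recall for one class from flattened ground-truth and predicted class-id masks. The function name `pixel_recall` and the flat-list input format are assumptions for the example; the library's own computation, including how `iou_threshold` enters, is not shown here.

```python
from typing import List


def pixel_recall(gt: List[int], pred: List[int], class_id: int) -> float:
    """Recall = TP / (TP + FN) over pixels whose ground truth is class_id."""
    tp = sum(1 for g, p in zip(gt, pred) if g == class_id and p == class_id)
    fn = sum(1 for g, p in zip(gt, pred) if g == class_id and p != class_id)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Flattened masks: each entry is the class id of one pixel.
gt = [1, 1, 1, 0, 0, 1]
pred = [1, 0, 1, 0, 1, 1]
recall = pixel_recall(gt, pred, class_id=1)
```

Here four pixels belong to class 1 in the ground truth and three of them are predicted correctly.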

Initializes SegmentationRecall object.

Parameters:
  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. Each inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multiple-field predicate. Finally, the outermost list combines these filters as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. Each inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multiple-field predicate. Finally, the outermost list combines these filters as a disjunction (OR).

  • iou_threshold (float)

aggregate_score(results)

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn, you could define a custom metric result class:

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score:

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    # Collect the per-item values before scoring them together
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)

Parameters:

results (List[nucleus.metrics.base.MetricResult])

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)

A metric must override this method and return a metric result, given annotations and predictions.

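Parameters:
  • annotations (nucleus.annotation.AnnotationList)

  • predictions (nucleus.prediction.PredictionList)

Return type: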

nucleus.metrics.base.ScalarResult

get_mask_channel(ann_or_pred)

Some annotations are stored as RGB instead of L (single-channel). We expect the image to be faux-single-channel with all the channels repeating so we choose the first one.