nucleus.metrics#

CategorizationF1

Evaluation method that matches categories and returns a CategorizationF1Result that aggregates to the F1 score

CuboidIOU

Calculates the average IOU between cuboid annotations and predictions.

CuboidPrecision

Calculates the average precision between cuboid annotations and predictions.

CuboidRecall

Calculates the average recall between cuboid annotations and predictions.

FieldFilter

Filter on standard field of AnnotationTypes or PredictionTypes

MetadataFilter

Filter on customer provided metadata associated with AnnotationTypes or PredictionTypes

Metric

Abstract class for defining a metric, which takes a list of annotations and predictions and returns a scalar.

PolygonAveragePrecision

Calculates the average precision between box or polygon annotations and predictions.

PolygonIOU

Calculates the average IOU between box or polygon annotations and predictions.

PolygonMAP

Calculates the mean average precision between box or polygon annotations and predictions.

PolygonMetric

Abstract class for metrics of box and polygons.

PolygonPrecision

Calculates the precision between box or polygon annotations and predictions.

PolygonRecall

Calculates the recall between box or polygon annotations and predictions.

ScalarResult

A scalar result contains the value of an evaluation, as well as its weight.

SegmentFieldFilter

Filter on standard field of Segment(s) of SegmentationAnnotation and SegmentationPrediction

SegmentMetadataFilter

Filter on customer provided metadata associated with Segments of a SegmentationAnnotation or SegmentationPrediction

SegmentationFWAVACC

Calculates the frequency weighted average of the class-wise Jaccard index

SegmentationIOU

Abstract class for defining a metric, which takes a list of annotations and predictions and returns a scalar.

SegmentationMAP

Calculates the mean average precision per class for segmentation masks

SegmentationMaskMetric

Abstract class for defining a metric, which takes a list of annotations and predictions and returns a scalar.

SegmentationMaskToPolyMetric

Abstract class for defining a metric, which takes a list of annotations and predictions and returns a scalar.

SegmentationPrecision

Abstract class for defining a metric, which takes a list of annotations and predictions and returns a scalar.

SegmentationRecall

Calculates the recall for a segmentation mask

SegmentationToPolyAveragePrecision

Calculates the average precision between box or polygon annotations and predictions.

SegmentationToPolyIOU

Abstract class for defining a metric, which takes a list of annotations and predictions and returns a scalar.

SegmentationToPolyMAP

Calculates the mean average precision between box or polygon annotations and predictions.

SegmentationToPolyPrecision

Abstract class for defining a metric, which takes a list of annotations and predictions and returns a scalar.

SegmentationToPolyRecall

Calculates the recall between box or polygon annotations and predictions.

class nucleus.metrics.CategorizationF1(confidence_threshold=0.0, f1_method='macro', annotation_filters=None, prediction_filters=None)#

Evaluation method that matches categories and returns a CategorizationF1Result that aggregates to the F1 score

Parameters:
  • confidence_threshold (float) – minimum confidence threshold for predictions to be taken into account for evaluation. Must be in [0, 1]. Default 0.0

  • f1_method (str) – {'micro', 'macro', 'samples', 'weighted', 'binary'}, default='macro'. This parameter is required for multiclass/multilabel targets. If None, the scores for each class are returned. Otherwise, this determines the type of averaging performed on the data:

  • 'binary' – Only report results for the class specified by pos_label. This is applicable only if targets (y_{true,pred}) are binary.

  • 'micro' – Calculate metrics globally by counting the total true positives, false negatives and false positives.

  • 'macro' – Calculate metrics for each label, and find their unweighted mean. This does not take label imbalance into account.

  • 'weighted' – Calculate metrics for each label, and find their average weighted by support (the number of true instances for each label). This alters ‘macro’ to account for label imbalance; it can result in an F-score that is not between precision and recall.

  • 'samples' – Calculate metrics for each instance, and find their average (only meaningful for multilabel classification where this differs from accuracy_score()).

  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).
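
Below is a minimal usage sketch. It assumes categorization ground truth and predictions are provided as CategoryAnnotation and CategoryPrediction objects (importable from nucleus) collected into AnnotationList and PredictionList, mirroring the pattern used by the other metrics in this module; the exact constructor arguments shown are illustrative.

from nucleus import CategoryAnnotation, CategoryPrediction
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import CategorizationF1

# Ground-truth category and predicted category for the same dataset item
# (constructor arguments shown here are illustrative assumptions)
cat_anno = CategoryAnnotation(label="dog", reference_id="image_1")
cat_pred = CategoryPrediction(label="dog", reference_id="image_1", confidence=0.9)

annotations = AnnotationList(category_annotations=[cat_anno])
predictions = PredictionList(category_predictions=[cat_pred])

# Matching happens per item in eval(); the F1 score itself is computed
# when the per-item results are combined in aggregate_score()
metric = CategorizationF1(confidence_threshold=0.5, f1_method="macro")
metric(annotations, predictions)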

aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[CategorizationResult]) –

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:
Return type:

CategorizationResult

eval(annotations, predictions)#

Notes: This eval function is somewhat unusual. It essentially only performs the matching of annotations to labels; the actual metric computation happens in the aggregate step, since the F1 score is only meaningful over a collection of items.

Parameters:
Return type:

CategorizationResult

class nucleus.metrics.CuboidIOU(enforce_label_match=True, iou_threshold=0.0, confidence_threshold=0.0, iou_2d=False, annotation_filters=None, prediction_filters=None)#

Calculates the average IOU between cuboid annotations and predictions.

Initializes CuboidIOU object.

Parameters:
  • enforce_label_match (bool) – whether to enforce that annotation and prediction labels must match. Defaults to True

  • iou_threshold (float) – IOU threshold to consider detection as valid. Must be in [0, 1]. Default 0.0

  • iou_2d (bool) – whether to return the BEV 2D IOU if true, or the 3D IOU if false.

  • confidence_threshold (float) – minimum confidence threshold for predictions. Must be in [0, 1]. Default 0.0

  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) – MetadataFilter predicates. Predicates are expressed in disjunctive normal form (DNF), like [[MetadataFilter(‘x’, ‘=’, 0), …], …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple column predicate. Finally, the most outer list combines these filters as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) – MetadataFilter predicates. Predicates are expressed in disjunctive normal form (DNF), like [[MetadataFilter(‘x’, ‘=’, 0), …], …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple column predicate. Finally, the most outer list combines these filters as a disjunction (OR).
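
A minimal usage sketch, following the pattern of the polygon metric examples below; it assumes CuboidAnnotation takes the same fields as the CuboidPrediction shown under Metric (minus confidence), and that AnnotationList and PredictionList expose cuboid_annotations and cuboid_predictions fields.

from nucleus import CuboidAnnotation, CuboidPrediction, Point3D
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import CuboidIOU

# Assumes CuboidAnnotation mirrors CuboidPrediction's fields without confidence
cuboid_anno = CuboidAnnotation(
    label="car",
    position=Point3D(100, 100, 10),
    dimensions=Point3D(5, 10, 5),
    yaw=0,
    reference_id="pointcloud_1",
    annotation_id="pointcloud_1_car_cuboid_1",
    metadata={"vehicle_color": "green"}
)

cuboid_pred = CuboidPrediction(
    label="car",
    position=Point3D(100, 101, 10),
    dimensions=Point3D(5, 10, 5),
    yaw=0,
    reference_id="pointcloud_1",
    confidence=0.8,
    annotation_id="pointcloud_1_car_cuboid_pred_1",
    metadata={"vehicle_color": "green"}
)

annotations = AnnotationList(cuboid_annotations=[cuboid_anno])
predictions = PredictionList(cuboid_predictions=[cuboid_pred])
metric = CuboidIOU(iou_2d=False)  # 3D IOU; set iou_2d=True for the BEV 2D IOU
metric(annotations, predictions)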

aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[nucleus.metrics.base.ScalarResult]) –

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:
Return type:

nucleus.metrics.base.ScalarResult

class nucleus.metrics.CuboidPrecision(enforce_label_match=True, iou_threshold=0.0, confidence_threshold=0.0, annotation_filters=None, prediction_filters=None)#

Calculates the average precision between cuboid annotations and predictions.

Initializes CuboidPrecision object.

Parameters:
  • enforce_label_match (bool) – whether to enforce that annotation and prediction labels must match. Defaults to True

  • iou_threshold (float) – IOU threshold to consider detection as valid. Must be in [0, 1]. Default 0.0

  • confidence_threshold (float) – minimum confidence threshold for predictions. Must be in [0, 1]. Default 0.0

  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) – MetadataFilter predicates. Predicates are expressed in disjunctive normal form (DNF), like [[MetadataFilter(‘x’, ‘==’, 0), …], …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple column predicate. Finally, the most outer list combines these filters as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) – MetadataFilter predicates. Predicates are expressed in disjunctive normal form (DNF), like [[MetadataFilter(‘x’, ‘==’, 0), …], …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple column predicate. Finally, the most outer list combines these filters as a disjunction (OR).

aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[nucleus.metrics.base.ScalarResult]) –

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:
Return type:

nucleus.metrics.base.ScalarResult

class nucleus.metrics.CuboidRecall(enforce_label_match=True, iou_threshold=0.0, confidence_threshold=0.0, annotation_filters=None, prediction_filters=None)#

Calculates the average recall between cuboid annotations and predictions.

Initializes CuboidRecall object.

Parameters:
  • enforce_label_match (bool) – whether to enforce that annotation and prediction labels must match. Defaults to True

  • iou_threshold (float) – IOU threshold to consider detection as valid. Must be in [0, 1]. Default 0.0

  • confidence_threshold (float) – minimum confidence threshold for predictions. Must be in [0, 1]. Default 0.0

  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[nucleus.metrics.base.ScalarResult]) –

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:
Return type:

nucleus.metrics.base.ScalarResult

class nucleus.metrics.FieldFilter#

Filter on standard field of AnnotationTypes or PredictionTypes

Examples

FieldFilter(“x”, “>”, 10) would pass every BoxAnnotation with x attribute larger than 10.

FieldFilter(“label”, “in”, [“car”, “truck”]) would pass every BoxAnnotation with label in [“car”, “truck”].

key#

key to compare with value

op#

FilterOp or one of [“>”, “>=”, “<”, “<=”, “=”, “==”, “!=”, “in”, “not in”] to define comparison with value field

value#

bool, str, float or int to compare the field given by key against, or a list of such values for the ‘in’ and ‘not in’ ops

allow_missing#

Allow missing field values. Will REMOVE the object with the missing field from the selection

type#

DO NOT USE. Internal type for serialization over the wire. Changing this will change the NamedTuple type as well.

class nucleus.metrics.MetadataFilter#

Filter on customer provided metadata associated with AnnotationTypes or PredictionTypes

key#

key to compare with value

op#

FilterOp or one of [“>”, “>=”, “<”, “<=”, “=”, “==”, “!=”, “in”, “not in”] to define comparison with value field

value#

bool, str, float or int to compare the field given by key against, or a list of such values for the ‘in’ and ‘not in’ ops

allow_missing#

Allow missing metadata values. Will REMOVE the object with the missing field from the selection

type#

DO NOT USE. Internal type for serialization over the wire. Changing this will change the NamedTuple type as well.
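
The following sketch shows how FieldFilter and MetadataFilter predicates can be combined in the DNF format described above and passed to a metric; PolygonRecall is used purely as an example consumer of prediction_filters.

from nucleus.metrics import FieldFilter, MetadataFilter, PolygonRecall

# DNF: the outer list is an OR over inner lists, each inner list an AND of predicates.
# Keep predictions that are (label in ["car", "truck"] AND vehicle_color == "red")
# OR (label == "bus").
prediction_filters = [
    [
        FieldFilter("label", "in", ["car", "truck"]),
        MetadataFilter("vehicle_color", "==", "red"),
    ],
    [FieldFilter("label", "==", "bus")],
]

metric = PolygonRecall(prediction_filters=prediction_filters)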

class nucleus.metrics.Metric(annotation_filters=None, prediction_filters=None)#

Abstract class for defining a metric, which takes a list of annotations and predictions and returns a scalar.

To create a new concrete Metric, override the __call__ function with logic to define a metric between annotations and predictions.

from nucleus import BoxAnnotation, CuboidPrediction, Point3D
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import Metric, MetricResult
from nucleus.metrics.polygon_utils import BoxOrPolygonAnnotation, BoxOrPolygonPrediction

class MyMetric(Metric):
    def __call__(
        self, annotations: AnnotationList, predictions: PredictionList
    ) -> MetricResult:
        value = (len(annotations) - len(predictions)) ** 2
        weight = len(annotations)
        return MetricResult(value, weight)

box = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

cuboid = CuboidPrediction(
    label="car",
    position=Point3D(100, 100, 10),
    dimensions=Point3D(5, 10, 5),
    yaw=0,
    reference_id="pointcloud_1",
    confidence=0.8,
    annotation_id="pointcloud_1_car_cuboid_1",
    metadata={"vehicle_color": "green"}
)

metric = MyMetric()
annotations = AnnotationList(box_annotations=[box])
predictions = PredictionList(cuboid_predictions=[cuboid])
metric(annotations, predictions)
Parameters:
  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).

abstract aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[MetricResult]) –

Return type:

ScalarResult

abstract call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:
Return type:

MetricResult

class nucleus.metrics.PolygonAveragePrecision(label, iou_threshold=0.5, annotation_filters=None, prediction_filters=None)#

Calculates the average precision between box or polygon annotations and predictions.

from nucleus import BoxAnnotation, Point, PolygonPrediction
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import PolygonAveragePrecision

box_anno = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

polygon_pred = PolygonPrediction(
    label="bus",
    vertices=[Point(100, 100), Point(150, 200), Point(200, 100)],
    reference_id="image_2",
    annotation_id="image_2_bus_polygon_1",
    confidence=0.8,
    metadata={"vehicle_color": "yellow"}
)

annotations = AnnotationList(box_annotations=[box_anno])
predictions = PredictionList(polygon_predictions=[polygon_pred])
metric = PolygonAveragePrecision(label="car")
metric(annotations, predictions)

Initializes PolygonAveragePrecision object.

Parameters:
  • iou_threshold (float) – IOU threshold to consider detection as valid. Must be in [0, 1]. Default 0.5

  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).

aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[nucleus.metrics.base.ScalarResult]) –

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:
Return type:

nucleus.metrics.base.ScalarResult

class nucleus.metrics.PolygonIOU(enforce_label_match=False, iou_threshold=0.0, confidence_threshold=0.0, annotation_filters=None, prediction_filters=None)#

Calculates the average IOU between box or polygon annotations and predictions.

from nucleus import BoxAnnotation, Point, PolygonPrediction
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import PolygonIOU

box_anno = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

polygon_pred = PolygonPrediction(
    label="bus",
    vertices=[Point(100, 100), Point(150, 200), Point(200, 100)],
    reference_id="image_2",
    annotation_id="image_2_bus_polygon_1",
    confidence=0.8,
    metadata={"vehicle_color": "yellow"}
)

annotations = AnnotationList(box_annotations=[box_anno])
predictions = PredictionList(polygon_predictions=[polygon_pred])
metric = PolygonIOU()
metric(annotations, predictions)

Initializes PolygonIOU object.

Parameters:
  • enforce_label_match (bool) – whether to enforce that annotation and prediction labels must match. Defaults to False

  • iou_threshold (float) – IOU threshold to consider detection as valid. Must be in [0, 1]. Default 0.0

  • confidence_threshold (float) – minimum confidence threshold for predictions. Must be in [0, 1]. Default 0.0

  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).

aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[nucleus.metrics.base.ScalarResult]) –

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:
Return type:

nucleus.metrics.base.ScalarResult

class nucleus.metrics.PolygonMAP(iou_threshold=0.5, annotation_filters=None, prediction_filters=None)#

Calculates the mean average precision between box or polygon annotations and predictions.

from nucleus import BoxAnnotation, Point, PolygonPrediction
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import PolygonMAP

box_anno = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

polygon_pred = PolygonPrediction(
    label="bus",
    vertices=[Point(100, 100), Point(150, 200), Point(200, 100)],
    reference_id="image_2",
    annotation_id="image_2_bus_polygon_1",
    confidence=0.8,
    metadata={"vehicle_color": "yellow"}
)

annotations = AnnotationList(box_annotations=[box_anno])
predictions = PredictionList(polygon_predictions=[polygon_pred])
metric = PolygonMAP()
metric(annotations, predictions)

Initializes PolygonMAP object.

Parameters:
  • iou_threshold (float) – IOU threshold to consider detection as valid. Must be in [0, 1]. Default 0.5

  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).

aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[nucleus.metrics.base.ScalarResult]) –

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:
Return type:

nucleus.metrics.base.ScalarResult

class nucleus.metrics.PolygonMetric(enforce_label_match=False, confidence_threshold=0.0, annotation_filters=None, prediction_filters=None)#

Abstract class for metrics of box and polygons.

The PolygonMetric class automatically filters incoming annotations and predictions for only box and polygon annotations. It also filters predictions whose confidence is less than the provided confidence_threshold. Finally, it provides support for enforcing matching labels. If enforce_label_match is set to True, then annotations and predictions will only be matched if they have the same label.

To create a new concrete PolygonMetric, override the eval function with logic to define a metric between box/polygon annotations and predictions.

from typing import List
from nucleus import BoxAnnotation, Point, PolygonPrediction
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import ScalarResult, PolygonMetric
from nucleus.metrics.polygon_utils import BoxOrPolygonAnnotation, BoxOrPolygonPrediction

class MyPolygonMetric(PolygonMetric):
    def eval(
        self,
        annotations: List[BoxOrPolygonAnnotation],
        predictions: List[BoxOrPolygonPrediction],
    ) -> ScalarResult:
        value = (len(annotations) - len(predictions)) ** 2
        weight = len(annotations)
        return ScalarResult(value, weight)

box_anno = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

polygon_pred = PolygonPrediction(
    label="bus",
    vertices=[Point(100, 100), Point(150, 200), Point(200, 100)],
    reference_id="image_2",
    annotation_id="image_2_bus_polygon_1",
    confidence=0.8,
    metadata={"vehicle_color": "yellow"}
)

annotations = AnnotationList(box_annotations=[box_anno])
predictions = PredictionList(polygon_predictions=[polygon_pred])
metric = MyPolygonMetric()
metric(annotations, predictions)

Initializes PolygonMetric abstract object.

Parameters:
  • enforce_label_match (bool) – whether to enforce that annotation and prediction labels must match. Default False

  • confidence_threshold (float) – minimum confidence threshold for predictions. Must be in [0, 1]. Default 0.0

  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).

aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[nucleus.metrics.base.ScalarResult]) –

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:
Return type:

nucleus.metrics.base.ScalarResult

class nucleus.metrics.PolygonPrecision(enforce_label_match=False, iou_threshold=0.5, confidence_threshold=0.0, annotation_filters=None, prediction_filters=None)#

Calculates the precision between box or polygon annotations and predictions.

from nucleus import BoxAnnotation, Point, PolygonPrediction
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import PolygonPrecision

box_anno = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

polygon_pred = PolygonPrediction(
    label="bus",
    vertices=[Point(100, 100), Point(150, 200), Point(200, 100)],
    reference_id="image_2",
    annotation_id="image_2_bus_polygon_1",
    confidence=0.8,
    metadata={"vehicle_color": "yellow"}
)

annotations = AnnotationList(box_annotations=[box_anno])
predictions = PredictionList(polygon_predictions=[polygon_pred])
metric = PolygonPrecision()
metric(annotations, predictions)

Initializes PolygonPrecision object.

Parameters:
  • enforce_label_match (bool) – whether to enforce that annotation and prediction labels must match. Defaults to False

  • iou_threshold (float) – IOU threshold to consider detection as valid. Must be in [0, 1]. Default 0.5

  • confidence_threshold (float) – minimum confidence threshold for predictions. Must be in [0, 1]. Default 0.0

  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).

aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[nucleus.metrics.base.ScalarResult]) –

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:
Return type:

nucleus.metrics.base.ScalarResult

class nucleus.metrics.PolygonRecall(enforce_label_match=False, iou_threshold=0.5, confidence_threshold=0.0, annotation_filters=None, prediction_filters=None)#

Calculates the recall between box or polygon annotations and predictions.

from nucleus import BoxAnnotation, Point, PolygonPrediction
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import PolygonRecall

box_anno = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

polygon_pred = PolygonPrediction(
    label="bus",
    vertices=[Point(100, 100), Point(150, 200), Point(200, 100)],
    reference_id="image_2",
    annotation_id="image_2_bus_polygon_1",
    confidence=0.8,
    metadata={"vehicle_color": "yellow"}
)

annotations = AnnotationList(box_annotations=[box_anno])
predictions = PredictionList(polygon_predictions=[polygon_pred])
metric = PolygonRecall()
metric(annotations, predictions)

Initializes PolygonRecall object.

Parameters:
  • enforce_label_match (bool) – whether to enforce that annotation and prediction labels must match. Defaults to False

  • iou_threshold (float) – IOU threshold to consider detection as valid. Must be in [0, 1]. Default 0.5

  • confidence_threshold (float) – minimum confidence threshold for predictions. Must be in [0, 1]. Default 0.0

  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).

aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[nucleus.metrics.base.ScalarResult]) –

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:
Return type:

nucleus.metrics.base.ScalarResult

class nucleus.metrics.ScalarResult#

A scalar result contains the value of an evaluation, as well as its weight. The weight is useful when aggregating metrics where each dataset item may hold a different relative weight. For example, when calculating precision over a dataset, the denominator of the precision is the number of annotations, and therefore the weight can be set as the number of annotations.

value#

The value of the evaluation result

Type:

float

weight#

The weight of the evaluation result.

Type:

float

static aggregate(results)#

Aggregates results using a weighted average.

Parameters:

results (Iterable[ScalarResult]) –

Return type:

ScalarResult
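
A small sketch of the weighted aggregation described above, assuming ScalarResult can be constructed directly from a value and a weight:

from nucleus.metrics import ScalarResult

# Per-item results weighted, e.g., by the number of annotations on each item
# (assumes ScalarResult(value, weight) construction)
results = [ScalarResult(1.0, weight=10), ScalarResult(0.5, weight=2)]

aggregated = ScalarResult.aggregate(results)
# Weighted average: (1.0 * 10 + 0.5 * 2) / (10 + 2) = 11 / 12 ≈ 0.917
print(aggregated.value)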

class nucleus.metrics.SegmentFieldFilter#

Filter on standard field of Segment(s) of SegmentationAnnotation and SegmentationPrediction

Examples

SegmentFieldFilter(“label”, “in”, [“grass”, “tree”]) would pass every Segment of a SegmentationAnnotation or SegmentationPrediction.

key#

key to compare with value

op#

FilterOp or one of [“>”, “>=”, “<”, “<=”, “=”, “==”, “!=”, “in”, “not in”] to define comparison with value field

value#

bool, str, float or int to compare the field given by key against, or a list of such values for the ‘in’ and ‘not in’ ops

allow_missing#

Allow missing field values. Will REMOVE the object with the missing field from the selection

type#

DO NOT USE. Internal type for serialization over the wire. Changing this will change the NamedTuple type as well.

class nucleus.metrics.SegmentMetadataFilter#

Filter on customer provided metadata associated with Segments of a SegmentationAnnotation or SegmentationPrediction

key#

key to compare with value

op#

FilterOp or one of [“>”, “>=”, “<”, “<=”, “=”, “==”, “!=”, “in”, “not in”] to define comparison with value field

value#

bool, str, float or int to compare the field given by key against, or a list of such values for the ‘in’ and ‘not in’ ops

allow_missing#

Allow missing metadata values. Will REMOVE the object with the missing field from the selection

type#

DO NOT USE. Internal type for serialization over the wire. Changing this will change the NamedTuple type as well.
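
Analogous to FieldFilter and MetadataFilter above, segment-level predicates can be combined in the same DNF list-of-lists form; where exactly they are passed depends on the segmentation metric being configured, so the construction below is only a sketch.

from nucleus.metrics import SegmentFieldFilter, SegmentMetadataFilter

# Keep Segments labeled "grass" or "tree" whose metadata marks them as foreground.
# (How these filters are supplied to a segmentation metric is not shown here.)
segment_filters = [
    [
        SegmentFieldFilter("label", "in", ["grass", "tree"]),
        SegmentMetadataFilter("is_foreground", "==", True),
    ],
]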

class nucleus.metrics.SegmentationFWAVACC(annotation_filters=None, prediction_filters=None, iou_threshold=0.5)#

Calculates the frequency weighted average of the class-wise Jaccard index
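
In the usual notation (a sketch of the standard frequency-weighted IoU definition, where n_ij is the number of pixels of class i predicted as class j and t_i = Σ_j n_ij is the total pixel count of class i):

\mathrm{FWAVACC} = \frac{1}{\sum_k t_k} \sum_i \frac{t_i \, n_{ii}}{t_i + \sum_j n_{ji} - n_{ii}}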

from nucleus import BoxAnnotation, Point, PolygonPrediction
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import PolygonRecall

box_anno = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

polygon_pred = PolygonPrediction(
    label="bus",
    vertices=[Point(100, 100), Point(150, 200), Point(200, 100)],
    reference_id="image_2",
    annotation_id="image_2_bus_polygon_1",
    confidence=0.8,
    metadata={"vehicle_color": "yellow"}
)

annotations = AnnotationList(box_annotations=[box_anno])
predictions = PredictionList(polygon_predictions=[polygon_pred])
metric = PolygonRecall()
metric(annotations, predictions)

Initializes SegmentationFWAVACC object.

Parameters:
  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).

  • iou_threshold (float) –

aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[nucleus.metrics.base.MetricResult]) –

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:
Return type:

nucleus.metrics.base.ScalarResult

get_mask_channel(ann_or_pred)#

Some annotations are stored as RGB instead of L (single-channel). We expect the image to be faux-single-channel with all the channels repeating so we choose the first one.
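
A hedged numpy sketch of the channel selection this describes (illustrative only, not the library's implementation):

import numpy as np

def first_channel(mask: np.ndarray) -> np.ndarray:
    # Faux-single-channel RGB mask: all channels repeat, so keep the first one.
    if mask.ndim == 3:
        return mask[..., 0]
    return mask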

class nucleus.metrics.SegmentationIOU(annotation_filters=None, prediction_filters=None, iou_threshold=0.5)#

Abstract class for defining a metric, which takes a list of annotations and predictions and returns a scalar.

To create a new concrete Metric, override the __call__ function with logic to define a metric between annotations and predictions.

from nucleus import BoxAnnotation, CuboidPrediction, Point3D
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import Metric, MetricResult
from nucleus.metrics.polygon_utils import BoxOrPolygonAnnotation, BoxOrPolygonPrediction

class MyMetric(Metric):
    def __call__(
        self, annotations: AnnotationList, predictions: PredictionList
    ) -> MetricResult:
        value = (len(annotations) - len(predictions)) ** 2
        weight = len(annotations)
        return MetricResult(value, weight)

box = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

cuboid = CuboidPrediction(
    label="car",
    position=Point3D(100, 100, 10),
    dimensions=Point3D(5, 10, 5),
    yaw=0,
    reference_id="pointcloud_1",
    confidence=0.8,
    annotation_id="pointcloud_1_car_cuboid_1",
    metadata={"vehicle_color": "green"}
)

metric = MyMetric()
annotations = AnnotationList(box_annotations=[box])
predictions = PredictionList(cuboid_predictions=[cuboid])
metric(annotations, predictions)

Initializes SegmentationIOU object.

Parameters:
  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).

  • iou_threshold (float) –

aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[nucleus.metrics.base.MetricResult]) –

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:
Return type:

nucleus.metrics.base.ScalarResult

get_mask_channel(ann_or_pred)#

Some annotations are stored as RGB instead of L (single-channel). We expect the image to be faux-single-channel with all the channels repeating so we choose the first one.

class nucleus.metrics.SegmentationMAP(annotation_filters=None, prediction_filters=None, iou_thresholds='coco')#

Calculates the mean average precision per class for segmentation masks

from nucleus import BoxAnnotation, Point, PolygonPrediction
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import PolygonMAP

box_anno = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

polygon_pred = PolygonPrediction(
    label="bus",
    vertices=[Point(100, 100), Point(150, 200), Point(200, 100)],
    reference_id="image_2",
    annotation_id="image_2_bus_polygon_1",
    confidence=0.8,
    metadata={"vehicle_color": "yellow"}
)

annotations = AnnotationList(box_annotations=[box_anno])
predictions = PredictionList(polygon_predictions=[polygon_pred])
metric = PolygonMAP()
metric(annotations, predictions)

Initializes SegmentationMAP object.

Parameters:
  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).

  • iou_thresholds (Union[List[float], str]) – Provide a list of IoU thresholds to compute over, or the literal “coco”.
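
As an illustration, the “coco” literal conventionally refers to IoU thresholds from 0.50 to 0.95 in steps of 0.05; whether SegmentationMAP expands it exactly this way is an assumption.

from nucleus.metrics import SegmentationMAP

# Conventional COCO-style IoU thresholds: 0.50, 0.55, ..., 0.95
# (assumes the "coco" literal expands to this range)
coco_thresholds = [0.5 + 0.05 * i for i in range(10)]

metric = SegmentationMAP(iou_thresholds=coco_thresholds)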

aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[nucleus.metrics.base.MetricResult]) –

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:

  • annotations (nucleus.annotation.AnnotationList) –

  • predictions (nucleus.prediction.PredictionList) –

Return type:

nucleus.metrics.base.ScalarResult

get_mask_channel(ann_or_pred)#

Some annotations are stored as RGB instead of L (single-channel). We expect the image to be faux-single-channel, with all channels repeating, so we choose the first one.

class nucleus.metrics.SegmentationMaskMetric(annotation_filters=None, prediction_filters=None, iou_threshold=0.5)#

Abstract class for defining a metric, which takes a list of annotations and predictions and returns a scalar.

To create a new concrete Metric, override the __call__ function with logic to define a metric between annotations and predictions.

from nucleus import BoxAnnotation, CuboidPrediction, Point3D
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import Metric, MetricResult
from nucleus.metrics.polygon_utils import BoxOrPolygonAnnotation, BoxOrPolygonPrediction

class MyMetric(Metric):
    def __call__(
        self, annotations: AnnotationList, predictions: PredictionList
    ) -> MetricResult:
        value = (len(annotations) - len(predictions)) ** 2
        weight = len(annotations)
        return MetricResult(value, weight)

box = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

cuboid = CuboidPrediction(
    label="car",
    position=Point3D(100, 100, 10),
    dimensions=Point3D(5, 10, 5),
    yaw=0,
    reference_id="pointcloud_1",
    confidence=0.8,
    annotation_id="pointcloud_1_car_cuboid_1",
    metadata={"vehicle_color": "green"}
)

metric = MyMetric()
annotations = AnnotationList(box_annotations=[box])
predictions = PredictionList(cuboid_predictions=[cuboid])
metric(annotations, predictions)

Initializes SegmentationMaskMetric abstract object.

Parameters:
  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), e.g. [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], ...]. DNF allows arbitrary boolean logical combinations of single-field predicates. The innermost structures each describe a single field predicate. The inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), e.g. [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], ...]. DNF allows arbitrary boolean logical combinations of single-field predicates. The innermost structures each describe a single field predicate. The inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

  • iou_threshold (float) –

abstract aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn, you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[MetricResult]) –

Return type:

ScalarResult
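
Besides the sklearn-based sketch above, a very common aggregation is simply a weighted mean of the per-item results. A minimal version, assuming ScalarResult exposes value and weight attributes as described in this module, could look like:

from typing import List

from nucleus.metrics import ScalarResult

def aggregate_score(self, results: List[ScalarResult]) -> ScalarResult:
    # Weighted mean of per-item values, using each result's weight.
    total_weight = sum(result.weight for result in results)
    if total_weight == 0:
        return ScalarResult(0.0, weight=0)
    weighted_sum = sum(result.value * result.weight for result in results)
    return ScalarResult(weighted_sum / total_weight, weight=total_weight)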

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:

  • annotations (nucleus.annotation.AnnotationList) –

  • predictions (nucleus.prediction.PredictionList) –

Return type:

nucleus.metrics.base.ScalarResult

get_mask_channel(ann_or_pred)#

Some annotations are stored as RGB instead of L (single-channel). We expect the image to be faux-single-channel, with all channels repeating, so we choose the first one.

class nucleus.metrics.SegmentationMaskToPolyMetric(enforce_label_match=False, confidence_threshold=0.0, annotation_filters=None, prediction_filters=None, mode=SegToPolyMode.GENERATE_GT_FROM_POLY)#

Abstract class for defining a metric, which takes a list of annotations and predictions and returns a scalar.

To create a new concrete Metric, override the __call__ function with logic to define a metric between annotations and predictions.

from nucleus import BoxAnnotation, CuboidPrediction, Point3D
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import Metric, MetricResult
from nucleus.metrics.polygon_utils import BoxOrPolygonAnnotation, BoxOrPolygonPrediction

class MyMetric(Metric):
    def __call__(
        self, annotations: AnnotationList, predictions: PredictionList
    ) -> MetricResult:
        value = (len(annotations) - len(predictions)) ** 2
        weight = len(annotations)
        return MetricResult(value, weight)

box = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

cuboid = CuboidPrediction(
    label="car",
    position=Point3D(100, 100, 10),
    dimensions=Point3D(5, 10, 5),
    yaw=0,
    reference_id="pointcloud_1",
    confidence=0.8,
    annotation_id="pointcloud_1_car_cuboid_1",
    metadata={"vehicle_color": "green"}
)

metric = MyMetric()
annotations = AnnotationList(box_annotations=[box])
predictions = PredictionList(cuboid_predictions=[cuboid])
metric(annotations, predictions)

Initializes SegmentationMaskToPolyMetric abstract object.

Parameters:
  • enforce_label_match (bool) – whether to enforce that annotation and prediction labels must match. Default False

  • confidence_threshold (float) – minimum confidence threshold for predictions. Must be in [0, 1]. Default 0.0

  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), e.g. [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], ...]. DNF allows arbitrary boolean logical combinations of single-field predicates. The innermost structures each describe a single field predicate. The inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), e.g. [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], ...]. DNF allows arbitrary boolean logical combinations of single-field predicates. The innermost structures each describe a single field predicate. The inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

  • mode (SegToPolyMode) –

aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn, you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[nucleus.metrics.base.MetricResult]) –

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:

  • annotations (nucleus.annotation.AnnotationList) –

  • predictions (nucleus.prediction.PredictionList) –

Return type:

nucleus.metrics.base.MetricResult

class nucleus.metrics.SegmentationPrecision(annotation_filters=None, prediction_filters=None, iou_threshold=0.5)#

Abstract class for defining a metric, which takes a list of annotations and predictions and returns a scalar.

To create a new concrete Metric, override the __call__ function with logic to define a metric between annotations and predictions.

from nucleus import BoxAnnotation, CuboidPrediction, Point3D
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import Metric, MetricResult
from nucleus.metrics.polygon_utils import BoxOrPolygonAnnotation, BoxOrPolygonPrediction

class MyMetric(Metric):
    def __call__(
        self, annotations: AnnotationList, predictions: PredictionList
    ) -> MetricResult:
        value = (len(annotations) - len(predictions)) ** 2
        weight = len(annotations)
        return MetricResult(value, weight)

box = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

cuboid = CuboidPrediction(
    label="car",
    position=Point3D(100, 100, 10),
    dimensions=Point3D(5, 10, 5),
    yaw=0,
    reference_id="pointcloud_1",
    confidence=0.8,
    annotation_id="pointcloud_1_car_cuboid_1",
    metadata={"vehicle_color": "green"}
)

metric = MyMetric()
annotations = AnnotationList(box_annotations=[box])
predictions = PredictionList(cuboid_predictions=[cuboid])
metric(annotations, predictions)

Calculates mean per-class precision
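
One common way to compute per-class precision for segmentation masks is pixel-wise TP / (TP + FP), averaged over classes. The NumPy sketch below is illustrative only and is not necessarily how this class implements it (in particular, it ignores the iou_threshold parameter):

import numpy as np

gt = np.array([[1, 1], [2, 0]])    # ground-truth class index per pixel
pred = np.array([[1, 2], [2, 0]])  # predicted class index per pixel

precisions = []
for cls in np.unique(gt):
    tp = np.sum((pred == cls) & (gt == cls))
    fp = np.sum((pred == cls) & (gt != cls))
    precisions.append(tp / (tp + fp) if (tp + fp) > 0 else 0.0)

mean_precision = float(np.mean(precisions))  # average of per-class precisions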

Parameters:
  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), e.g. [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], ...]. DNF allows arbitrary boolean logical combinations of single-field predicates. The innermost structures each describe a single field predicate. The inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), e.g. [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], ...]. DNF allows arbitrary boolean logical combinations of single-field predicates. The innermost structures each describe a single field predicate. The inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

  • iou_threshold (float) –

aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn, you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[nucleus.metrics.base.MetricResult]) –

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:

  • annotations (nucleus.annotation.AnnotationList) –

  • predictions (nucleus.prediction.PredictionList) –

Return type:

nucleus.metrics.base.ScalarResult

get_mask_channel(ann_or_pred)#

Some annotations are stored as RGB instead of L (single-channel). We expect the image to be faux-single-channel, with all channels repeating, so we choose the first one.

class nucleus.metrics.SegmentationRecall(annotation_filters=None, prediction_filters=None, iou_threshold=0.5)#

Calculates the recall for a segmentation mask

Initializes SegmentationRecall object.

Parameters:
  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), e.g. [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], ...]. DNF allows arbitrary boolean logical combinations of single-field predicates. The innermost structures each describe a single field predicate. The inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), e.g. [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], ...]. DNF allows arbitrary boolean logical combinations of single-field predicates. The innermost structures each describe a single field predicate. The inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

  • iou_threshold (float) –

aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn, you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[nucleus.metrics.base.MetricResult]) –

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:

  • annotations (nucleus.annotation.AnnotationList) –

  • predictions (nucleus.prediction.PredictionList) –

Return type:

nucleus.metrics.base.ScalarResult

get_mask_channel(ann_or_pred)#

Some annotations are stored as RGB instead of L (single-channel). We expect the image to be faux-single-channel, with all channels repeating, so we choose the first one.

class nucleus.metrics.SegmentationToPolyAveragePrecision(label, iou_threshold=0.5, annotation_filters=None, prediction_filters=None, mode=SegToPolyMode.GENERATE_GT_FROM_POLY)#

Calculates the average precision between box or polygon annotations and predictions.

from nucleus import BoxAnnotation, Point, PolygonPrediction
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import PolygonAveragePrecision

box_anno = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

polygon_pred = PolygonPrediction(
    label="bus",
    vertices=[Point(100, 100), Point(150, 200), Point(200, 100)],
    reference_id="image_2",
    annotation_id="image_2_bus_polygon_1",
    confidence=0.8,
    metadata={"vehicle_color": "yellow"}
)

annotations = AnnotationList(box_annotations=[box_anno])
predictions = PredictionList(polygon_predictions=[polygon_pred])
metric = PolygonAveragePrecision(label="car")
metric(annotations, predictions)

Initializes SegmentationToPolyAveragePrecision object.

Parameters:
  • iou_threshold (float) – IOU threshold to consider detection as valid. Must be in [0, 1]. Default 0.5

  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), e.g. [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], ...]. DNF allows arbitrary boolean logical combinations of single-field predicates. The innermost structures each describe a single field predicate. The inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), e.g. [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], ...]. DNF allows arbitrary boolean logical combinations of single-field predicates. The innermost structures each describe a single field predicate. The inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

  • mode (SegToPolyMode) –

aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn, you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[nucleus.metrics.base.MetricResult]) –

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:

  • annotations (nucleus.annotation.AnnotationList) –

  • predictions (nucleus.prediction.PredictionList) –

Return type:

nucleus.metrics.base.MetricResult

class nucleus.metrics.SegmentationToPolyIOU(enforce_label_match=False, iou_threshold=0.0, confidence_threshold=0.0, annotation_filters=None, prediction_filters=None, mode=SegToPolyMode.GENERATE_GT_FROM_POLY)#

Abstract class for defining a metric, which takes a list of annotations and predictions and returns a scalar.

To create a new concrete Metric, override the __call__ function with logic to define a metric between annotations and predictions.

from nucleus import BoxAnnotation, CuboidPrediction, Point3D
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import Metric, MetricResult
from nucleus.metrics.polygon_utils import BoxOrPolygonAnnotation, BoxOrPolygonPrediction

class MyMetric(Metric):
    def __call__(
        self, annotations: AnnotationList, predictions: PredictionList
    ) -> MetricResult:
        value = (len(annotations) - len(predictions)) ** 2
        weight = len(annotations)
        return MetricResult(value, weight)

box = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

cuboid = CuboidPrediction(
    label="car",
    position=Point3D(100, 100, 10),
    dimensions=Point3D(5, 10, 5),
    yaw=0,
    reference_id="pointcloud_1",
    confidence=0.8,
    annotation_id="pointcloud_1_car_cuboid_1",
    metadata={"vehicle_color": "green"}
)

metric = MyMetric()
annotations = AnnotationList(box_annotations=[box])
predictions = PredictionList(cuboid_predictions=[cuboid])
metric(annotations, predictions)

Initializes SegmentationToPolyIOU object.

Parameters:
  • enforce_label_match (bool) – whether to enforce that annotation and prediction labels must match. Defaults to False

  • iou_threshold (float) – IOU threshold to consider detection as valid. Must be in [0, 1]. Default 0.0

  • confidence_threshold (float) – minimum confidence threshold for predictions. Must be in [0, 1]. Default 0.0

  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), e.g. [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], ...]. DNF allows arbitrary boolean logical combinations of single-field predicates. The innermost structures each describe a single field predicate. The inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), e.g. [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], ...]. DNF allows arbitrary boolean logical combinations of single-field predicates. The innermost structures each describe a single field predicate. The inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

  • mode (SegToPolyMode) –

aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn, you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[nucleus.metrics.base.MetricResult]) –

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:

  • annotations (nucleus.annotation.AnnotationList) –

  • predictions (nucleus.prediction.PredictionList) –

Return type:

nucleus.metrics.base.MetricResult

class nucleus.metrics.SegmentationToPolyMAP(iou_threshold=-1, iou_thresholds='coco', annotation_filters=None, prediction_filters=None, mode=SegToPolyMode.GENERATE_GT_FROM_POLY)#

Calculates the mean average precision between box or polygon annotations and predictions.

from nucleus import BoxAnnotation, Point, PolygonPrediction
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import PolygonMAP

box_anno = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

polygon_pred = PolygonPrediction(
    label="bus",
    vertices=[Point(100, 100), Point(150, 200), Point(200, 100)],
    reference_id="image_2",
    annotation_id="image_2_bus_polygon_1",
    confidence=0.8,
    metadata={"vehicle_color": "yellow"}
)

annotations = AnnotationList(box_annotations=[box_anno])
predictions = PredictionList(polygon_predictions=[polygon_pred])
metric = PolygonMAP()
metric(annotations, predictions)
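
As with SegmentationMAP above, the snippet shown is the generic PolygonMAP example. A hedged sketch of wiring up this class instead, with box/polygon ground truth compared against a predicted segmentation mask (mask URL and constructors assumed as in the earlier segmentation examples):

from nucleus import BoxAnnotation, Segment, SegmentationPrediction
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import SegmentationToPolyMAP

box_anno = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
)

mask_pred = SegmentationPrediction(
    mask_url="s3://bucket/image_1_pred_mask.png",  # hypothetical path
    annotations=[Segment(label="car", index=1)],
    reference_id="image_1",
)

annotations = AnnotationList(box_annotations=[box_anno])
predictions = PredictionList(segmentation_predictions=[mask_pred])
metric = SegmentationToPolyMAP(iou_thresholds="coco")
metric(annotations, predictions)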

Initializes SegmentationToPolyMAP object.

Parameters:
  • iou_thresholds (Union[List[float], str]) – IOU thresholds at which to compute AP, or the literal "coco"

  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), e.g. [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], ...]. DNF allows arbitrary boolean logical combinations of single-field predicates. The innermost structures each describe a single field predicate. The inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), e.g. [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], ...]. DNF allows arbitrary boolean logical combinations of single-field predicates. The innermost structures each describe a single field predicate. The inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

  • iou_threshold (float) –

  • mode (SegToPolyMode) –

aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn, you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[nucleus.metrics.base.MetricResult]) –

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:

  • annotations (nucleus.annotation.AnnotationList) –

  • predictions (nucleus.prediction.PredictionList) –

Return type:

nucleus.metrics.base.MetricResult

class nucleus.metrics.SegmentationToPolyPrecision(enforce_label_match=False, iou_threshold=0.5, confidence_threshold=0.0, annotation_filters=None, prediction_filters=None, mode=SegToPolyMode.GENERATE_GT_FROM_POLY)#

Abstract class for defining a metric, which takes a list of annotations and predictions and returns a scalar.

To create a new concrete Metric, override the __call__ function with logic to define a metric between annotations and predictions.

from nucleus import BoxAnnotation, CuboidPrediction, Point3D
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import Metric, MetricResult
from nucleus.metrics.polygon_utils import BoxOrPolygonAnnotation, BoxOrPolygonPrediction

class MyMetric(Metric):
    def __call__(
        self, annotations: AnnotationList, predictions: PredictionList
    ) -> MetricResult:
        value = (len(annotations) - len(predictions)) ** 2
        weight = len(annotations)
        return MetricResult(value, weight)

box = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

cuboid = CuboidPrediction(
    label="car",
    position=Point3D(100, 100, 10),
    dimensions=Point3D(5, 10, 5),
    yaw=0,
    reference_id="pointcloud_1",
    confidence=0.8,
    annotation_id="pointcloud_1_car_cuboid_1",
    metadata={"vehicle_color": "green"}
)

metric = MyMetric()
annotations = AnnotationList(box_annotations=[box])
predictions = PredictionList(cuboid_predictions=[cuboid])
metric(annotations, predictions)

Initializes SegmentationToPolyPrecision object.

Parameters:
  • enforce_label_match (bool) – whether to enforce that annotation and prediction labels must match. Defaults to False

  • iou_threshold (float) – IOU threshold to consider detection as valid. Must be in [0, 1]. Default 0.5

  • confidence_threshold (float) – minimum confidence threshold for predictions. Must be in [0, 1]. Default 0.0

  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), e.g. [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], ...]. DNF allows arbitrary boolean logical combinations of single-field predicates. The innermost structures each describe a single field predicate. The inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), e.g. [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], ...]. DNF allows arbitrary boolean logical combinations of single-field predicates. The innermost structures each describe a single field predicate. The inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

  • mode (SegToPolyMode) –

aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn, you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[nucleus.metrics.base.MetricResult]) –

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:

  • annotations (nucleus.annotation.AnnotationList) –

  • predictions (nucleus.prediction.PredictionList) –

Return type:

nucleus.metrics.base.MetricResult

class nucleus.metrics.SegmentationToPolyRecall(enforce_label_match=False, iou_threshold=0.5, confidence_threshold=0.0, annotation_filters=None, prediction_filters=None, mode=SegToPolyMode.GENERATE_GT_FROM_POLY)#

Calculates the recall between box or polygon annotations and predictions.

from nucleus import BoxAnnotation, Point, PolygonPrediction
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import PolygonRecall

box_anno = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

polygon_pred = PolygonPrediction(
    label="bus",
    vertices=[Point(100, 100), Point(150, 200), Point(200, 100)],
    reference_id="image_2",
    annotation_id="image_2_bus_polygon_1",
    confidence=0.8,
    metadata={"vehicle_color": "yellow"}
)

annotations = AnnotationList(box_annotations=[box_anno])
predictions = PredictionList(polygon_predictions=[polygon_pred])
metric = PolygonRecall()
metric(annotations, predictions)

Initializes SegmentationToPolyRecall object.

Parameters:
  • enforce_label_match (bool) – whether to enforce that annotation and prediction labels must match. Defaults to False

  • iou_threshold (float) – IOU threshold to consider detection as valid. Must be in [0, 1]. Default 0.5

  • confidence_threshold (float) – minimum confidence threshold for predictions. Must be in [0, 1]. Default 0.0

  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), e.g. [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], ...]. DNF allows arbitrary boolean logical combinations of single-field predicates. The innermost structures each describe a single field predicate. The inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), e.g. [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], ...]. DNF allows arbitrary boolean logical combinations of single-field predicates. The innermost structures each describe a single field predicate. The inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

  • mode (SegToPolyMode) –

aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn, you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[nucleus.metrics.base.MetricResult]) –

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:

  • annotations (nucleus.annotation.AnnotationList) –

  • predictions (nucleus.prediction.PredictionList) –

Return type:

nucleus.metrics.base.MetricResult