nucleus.metrics#

CategorizationF1

Evaluation method that matches categories and returns a CategorizationF1Result that aggregates to the F1 score

CuboidIOU

Calculates the average IOU between cuboid annotations and predictions.

CuboidPrecision

Calculates the average precision between cuboid annotations and predictions.

CuboidRecall

Calculates the average recall between cuboid annotations and predictions.

FieldFilter

Filter on standard field of AnnotationTypes or PredictionTypes

MetadataFilter

Filter on customer provided metadata associated with AnnotationTypes or PredictionTypes

Metric

Abstract class for defining a metric, which takes a list of annotations and predictions and returns a scalar.

PolygonAveragePrecision

Calculates the average precision between box or polygon annotations and predictions.

PolygonIOU

Calculates the average IOU between box or polygon annotations and predictions.

PolygonMAP

Calculates the mean average precision between box or polygon annotations and predictions.

PolygonMetric

Abstract class for metrics of box and polygons.

PolygonPrecision

Calculates the precision between box or polygon annotations and predictions.

PolygonRecall

Calculates the recall between box or polygon annotations and predictions.

ScalarResult

A scalar result contains the value of an evaluation, as well as its weight.

SegmentFieldFilter

Filter on standard field of Segment(s) of SegmentationAnnotation and SegmentationPrediction

SegmentMetadataFilter

Filter on customer provided metadata associated with Segments of a SegmentationAnnotation or SegmentationPrediction

SegmentationFWAVACC

Calculates the frequency weighted average of the class-wise Jaccard index

SegmentationIOU

Abstract class for defining a metric, which takes a list of annotations and predictions and returns a scalar.

SegmentationMAP

Calculates the mean average precision per class for segmentation masks

SegmentationMaskMetric

Abstract class for defining a metric, which takes a list of annotations and predictions and returns a scalar.

SegmentationMaskToPolyMetric

Abstract class for defining a metric, which takes a list of annotations and predictions and returns a scalar.

SegmentationPrecision

Abstract class for defining a metric, which takes a list of annotations and predictions and returns a scalar.

SegmentationRecall

Calculates the recall for a segmentation mask

SegmentationToPolyAveragePrecision

Calculates the average precision between box or polygon annotations and predictions.

SegmentationToPolyIOU

Abstract class for defining a metric, which takes a list of annotations and predictions and returns a scalar.

SegmentationToPolyMAP

Calculates the mean average precision between box or polygon annotations and predictions.

SegmentationToPolyPrecision

Abstract class for defining a metric, which takes a list of annotations and predictions and returns a scalar.

SegmentationToPolyRecall

Calculates the recall between box or polygon annotations and predictions.

class nucleus.metrics.CategorizationF1(confidence_threshold=0.0, f1_method='macro', annotation_filters=None, prediction_filters=None)#

Evaluation method that matches categories and returns a CategorizationF1Result that aggregates to the F1 score

Parameters:
  • confidence_threshold (float) – minimum confidence threshold for predictions to be taken into account for evaluation. Must be in [0, 1]. Default 0.0

  • f1_method (str) – {'micro', 'macro', 'samples', 'weighted', 'binary'}, default='macro'. This parameter is required for multiclass/multilabel targets. If None, the scores for each class are returned. Otherwise, this determines the type of averaging performed on the data:

  • 'binary' – Only report results for the class specified by pos_label. This is applicable only if targets (y_{true,pred}) are binary.

  • 'micro' – Calculate metrics globally by counting the total true positives, false negatives and false positives.

  • 'macro' – Calculate metrics for each label, and find their unweighted mean. This does not take label imbalance into account.

  • 'weighted' – Calculate metrics for each label, and find their average weighted by support (the number of true instances for each label). This alters ‘macro’ to account for label imbalance; it can result in an F-score that is not between precision and recall.

  • 'samples' – Calculate metrics for each instance, and find their average (only meaningful for multilabel classification where this differs from accuracy_score()).

  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).
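
Below is a minimal usage sketch. It assumes categorization ground truth and predictions are provided as CategoryAnnotation and CategoryPrediction objects (importable from nucleus) collected into AnnotationList and PredictionList, mirroring the pattern used by the other metrics in this module; the exact constructor arguments shown are illustrative.

from nucleus import CategoryAnnotation, CategoryPrediction
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import CategorizationF1

# Ground-truth category and predicted category for the same dataset item
# (constructor arguments shown here are illustrative assumptions)
cat_anno = CategoryAnnotation(label="dog", reference_id="image_1")
cat_pred = CategoryPrediction(label="dog", reference_id="image_1", confidence=0.9)

annotations = AnnotationList(category_annotations=[cat_anno])
predictions = PredictionList(category_predictions=[cat_pred])

# Matching happens per item in eval(); the F1 score itself is computed
# when the per-item results are combined in aggregate_score()
metric = CategorizationF1(confidence_threshold=0.5, f1_method="macro")
metric(annotations, predictions)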

aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[CategorizationResult]) –

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:
Return type:

CategorizationResult

eval(annotations, predictions)#

Notes: This eval function is somewhat unusual. It essentially only performs the matching of annotations to labels; the actual metric computation happens in the aggregate step, since the F1 score is only meaningful over a collection of items.

Parameters:
Return type:

CategorizationResult

class nucleus.metrics.CuboidIOU(enforce_label_match=True, iou_threshold=0.0, confidence_threshold=0.0, iou_2d=False, annotation_filters=None, prediction_filters=None)#

Calculates the average IOU between cuboid annotations and predictions.

Initializes CuboidIOU object.

Parameters:
  • enforce_label_match (bool) – whether to enforce that annotation and prediction labels must match. Defaults to True

  • iou_threshold (float) – IOU threshold to consider detection as valid. Must be in [0, 1]. Default 0.0

  • iou_2d (bool) – whether to return the BEV 2D IOU if true, or the 3D IOU if false.

  • confidence_threshold (float) – minimum confidence threshold for predictions. Must be in [0, 1]. Default 0.0

  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) – MetadataFilter predicates. Predicates are expressed in disjunctive normal form (DNF), like [[MetadataFilter(‘x’, ‘=’, 0), …], …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple column predicate. Finally, the most outer list combines these filters as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) – MetadataFilter predicates. Predicates are expressed in disjunctive normal form (DNF), like [[MetadataFilter(‘x’, ‘=’, 0), …], …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple column predicate. Finally, the most outer list combines these filters as a disjunction (OR).
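
A minimal usage sketch, following the pattern of the polygon metric examples below; it assumes CuboidAnnotation takes the same fields as the CuboidPrediction shown under Metric (minus confidence), and that AnnotationList and PredictionList expose cuboid_annotations and cuboid_predictions fields.

from nucleus import CuboidAnnotation, CuboidPrediction, Point3D
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import CuboidIOU

# Assumes CuboidAnnotation mirrors CuboidPrediction's fields without confidence
cuboid_anno = CuboidAnnotation(
    label="car",
    position=Point3D(100, 100, 10),
    dimensions=Point3D(5, 10, 5),
    yaw=0,
    reference_id="pointcloud_1",
    annotation_id="pointcloud_1_car_cuboid_1",
    metadata={"vehicle_color": "green"}
)

cuboid_pred = CuboidPrediction(
    label="car",
    position=Point3D(100, 101, 10),
    dimensions=Point3D(5, 10, 5),
    yaw=0,
    reference_id="pointcloud_1",
    confidence=0.8,
    annotation_id="pointcloud_1_car_cuboid_pred_1",
    metadata={"vehicle_color": "green"}
)

annotations = AnnotationList(cuboid_annotations=[cuboid_anno])
predictions = PredictionList(cuboid_predictions=[cuboid_pred])
metric = CuboidIOU(iou_2d=False)  # 3D IOU; set iou_2d=True for the BEV 2D IOU
metric(annotations, predictions)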

aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[nucleus.metrics.base.ScalarResult]) –

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:
Return type:

nucleus.metrics.base.ScalarResult

class nucleus.metrics.CuboidPrecision(enforce_label_match=True, iou_threshold=0.0, confidence_threshold=0.0, annotation_filters=None, prediction_filters=None)#

Calculates the average precision between cuboid annotations and predictions.

Initializes CuboidPrecision object.

Parameters:
  • enforce_label_match (bool) – whether to enforce that annotation and prediction labels must match. Defaults to True

  • iou_threshold (float) – IOU threshold to consider detection as valid. Must be in [0, 1]. Default 0.0

  • confidence_threshold (float) – minimum confidence threshold for predictions. Must be in [0, 1]. Default 0.0

  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) – MetadataFilter predicates. Predicates are expressed in disjunctive normal form (DNF), like [[MetadataFilter(‘x’, ‘==’, 0), …], …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple column predicate. Finally, the most outer list combines these filters as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) – MetadataFilter predicates. Predicates are expressed in disjunctive normal form (DNF), like [[MetadataFilter(‘x’, ‘==’, 0), …], …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple column predicate. Finally, the most outer list combines these filters as a disjunction (OR).

aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[nucleus.metrics.base.ScalarResult]) –

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:
Return type:

nucleus.metrics.base.ScalarResult

class nucleus.metrics.CuboidRecall(enforce_label_match=True, iou_threshold=0.0, confidence_threshold=0.0, annotation_filters=None, prediction_filters=None)#

Calculates the average recall between cuboid annotations and predictions.

Initializes CuboidRecall object.

Parameters:
  • enforce_label_match (bool) – whether to enforce that annotation and prediction labels must match. Defaults to True

  • iou_threshold (float) – IOU threshold to consider detection as valid. Must be in [0, 1]. Default 0.0

  • confidence_threshold (float) – minimum confidence threshold for predictions. Must be in [0, 1]. Default 0.0

  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[nucleus.metrics.base.ScalarResult]) –

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:
Return type:

nucleus.metrics.base.ScalarResult

class nucleus.metrics.FieldFilter#

Filter on standard field of AnnotationTypes or PredictionTypes

Examples

FieldFilter(“x”, “>”, 10) would pass every BoxAnnotation with x attribute larger than 10.

FieldFilter(“label”, “in”, [“car”, “truck”]) would pass every BoxAnnotation with label in [“car”, “truck”].

key#

key to compare with value

op#

FilterOp or one of [“>”, “>=”, “<”, “<=”, “=”, “==”, “!=”, “in”, “not in”] to define comparison with value field

value#

bool, str, float or int to compare the field given by key against, or a list of such values for the ‘in’ and ‘not in’ ops

allow_missing#

Allow missing field values. Will REMOVE the object with the missing field from the selection

type#

DO NOT USE. Internal type for serialization over the wire. Changing this will change the NamedTuple type as well.

class nucleus.metrics.MetadataFilter#

Filter on customer provided metadata associated with AnnotationTypes or PredictionTypes

key#

key to compare with value

op#

FilterOp or one of [“>”, “>=”, “<”, “<=”, “=”, “==”, “!=”, “in”, “not in”] to define comparison with value field

value#

bool, str, float or int to compare the field given by key against, or a list of such values for the ‘in’ and ‘not in’ ops

allow_missing#

Allow missing metadata values. Will REMOVE the object with the missing field from the selection

type#

DO NOT USE. Internal type for serialization over the wire. Changing this will change the NamedTuple type as well.
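
The following sketch shows how FieldFilter and MetadataFilter predicates can be combined in the DNF format described above and passed to a metric; PolygonRecall is used purely as an example consumer of prediction_filters.

from nucleus.metrics import FieldFilter, MetadataFilter, PolygonRecall

# DNF: the outer list is an OR over inner lists, each inner list an AND of predicates.
# Keep predictions that are (label in ["car", "truck"] AND vehicle_color == "red")
# OR (label == "bus").
prediction_filters = [
    [
        FieldFilter("label", "in", ["car", "truck"]),
        MetadataFilter("vehicle_color", "==", "red"),
    ],
    [FieldFilter("label", "==", "bus")],
]

metric = PolygonRecall(prediction_filters=prediction_filters)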

class nucleus.metrics.Metric(annotation_filters=None, prediction_filters=None)#

Abstract class for defining a metric, which takes a list of annotations and predictions and returns a scalar.

To create a new concrete Metric, override the __call__ function with logic to define a metric between annotations and predictions.

from nucleus import BoxAnnotation, CuboidPrediction, Point3D
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import Metric, MetricResult
from nucleus.metrics.polygon_utils import BoxOrPolygonAnnotation, BoxOrPolygonPrediction

class MyMetric(Metric):
    def __call__(
        self, annotations: AnnotationList, predictions: PredictionList
    ) -> MetricResult:
        value = (len(annotations) - len(predictions)) ** 2
        weight = len(annotations)
        return MetricResult(value, weight)

box = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

cuboid = CuboidPrediction(
    label="car",
    position=Point3D(100, 100, 10),
    dimensions=Point3D(5, 10, 5),
    yaw=0,
    reference_id="pointcloud_1",
    confidence=0.8,
    annotation_id="pointcloud_1_car_cuboid_1",
    metadata={"vehicle_color": "green"}
)

metric = MyMetric()
annotations = AnnotationList(box_annotations=[box])
predictions = PredictionList(cuboid_predictions=[cuboid])
metric(annotations, predictions)
Parameters:
  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).

abstract aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[MetricResult]) –

Return type:

ScalarResult

abstract call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:
Return type:

MetricResult

class nucleus.metrics.PolygonAveragePrecision(label, iou_threshold=0.5, annotation_filters=None, prediction_filters=None)#

Calculates the average precision between box or polygon annotations and predictions.

from nucleus import BoxAnnotation, Point, PolygonPrediction
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import PolygonAveragePrecision

box_anno = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

polygon_pred = PolygonPrediction(
    label="bus",
    vertices=[Point(100, 100), Point(150, 200), Point(200, 100)],
    reference_id="image_2",
    annotation_id="image_2_bus_polygon_1",
    confidence=0.8,
    metadata={"vehicle_color": "yellow"}
)

annotations = AnnotationList(box_annotations=[box_anno])
predictions = PredictionList(polygon_predictions=[polygon_pred])
metric = PolygonAveragePrecision(label="car")
metric(annotations, predictions)

Initializes PolygonAveragePrecision object.

Parameters:
  • iou_threshold (float) – IOU threshold to consider detection as valid. Must be in [0, 1]. Default 0.5

  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).

aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[nucleus.metrics.base.ScalarResult]) –

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:
Return type:

nucleus.metrics.base.ScalarResult

class nucleus.metrics.PolygonIOU(enforce_label_match=False, iou_threshold=0.0, confidence_threshold=0.0, annotation_filters=None, prediction_filters=None)#

Calculates the average IOU between box or polygon annotations and predictions.

from nucleus import BoxAnnotation, Point, PolygonPrediction
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import PolygonIOU

box_anno = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

polygon_pred = PolygonPrediction(
    label="bus",
    vertices=[Point(100, 100), Point(150, 200), Point(200, 100)],
    reference_id="image_2",
    annotation_id="image_2_bus_polygon_1",
    confidence=0.8,
    metadata={"vehicle_color": "yellow"}
)

annotations = AnnotationList(box_annotations=[box_anno])
predictions = PredictionList(polygon_predictions=[polygon_pred])
metric = PolygonIOU()
metric(annotations, predictions)

Initializes PolygonIOU object.

Parameters:
  • enforce_label_match (bool) – whether to enforce that annotation and prediction labels must match. Defaults to False

  • iou_threshold (float) – IOU threshold to consider detection as valid. Must be in [0, 1]. Default 0.0

  • confidence_threshold (float) – minimum confidence threshold for predictions. Must be in [0, 1]. Default 0.0

  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).

aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[nucleus.metrics.base.ScalarResult]) –

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:
Return type:

nucleus.metrics.base.ScalarResult

class nucleus.metrics.PolygonMAP(iou_threshold=0.5, annotation_filters=None, prediction_filters=None)#

Calculates the mean average precision between box or polygon annotations and predictions.

from nucleus import BoxAnnotation, Point, PolygonPrediction
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import PolygonMAP

box_anno = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

polygon_pred = PolygonPrediction(
    label="bus",
    vertices=[Point(100, 100), Point(150, 200), Point(200, 100)],
    reference_id="image_2",
    annotation_id="image_2_bus_polygon_1",
    confidence=0.8,
    metadata={"vehicle_color": "yellow"}
)

annotations = AnnotationList(box_annotations=[box_anno])
predictions = PredictionList(polygon_predictions=[polygon_pred])
metric = PolygonMAP()
metric(annotations, predictions)

Initializes PolygonMAP object.

Parameters:
  • iou_threshold (float) – IOU threshold to consider detection as valid. Must be in [0, 1]. Default 0.5

  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).

aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[nucleus.metrics.base.ScalarResult]) –

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:
Return type:

nucleus.metrics.base.ScalarResult

class nucleus.metrics.PolygonMetric(enforce_label_match=False, confidence_threshold=0.0, annotation_filters=None, prediction_filters=None)#

Abstract class for metrics of box and polygons.

The PolygonMetric class automatically filters incoming annotations and predictions for only box and polygon annotations. It also filters predictions whose confidence is less than the provided confidence_threshold. Finally, it provides support for enforcing matching labels. If enforce_label_match is set to True, then annotations and predictions will only be matched if they have the same label.

To create a new concrete PolygonMetric, override the eval function with logic to define a metric between box/polygon annotations and predictions.

from typing import List
from nucleus import BoxAnnotation, Point, PolygonPrediction
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import ScalarResult, PolygonMetric
from nucleus.metrics.polygon_utils import BoxOrPolygonAnnotation, BoxOrPolygonPrediction

class MyPolygonMetric(PolygonMetric):
    def eval(
        self,
        annotations: List[BoxOrPolygonAnnotation],
        predictions: List[BoxOrPolygonPrediction],
    ) -> ScalarResult:
        value = (len(annotations) - len(predictions)) ** 2
        weight = len(annotations)
        return ScalarResult(value, weight)

box_anno = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

polygon_pred = PolygonPrediction(
    label="bus",
    vertices=[Point(100, 100), Point(150, 200), Point(200, 100)],
    reference_id="image_2",
    annotation_id="image_2_bus_polygon_1",
    confidence=0.8,
    metadata={"vehicle_color": "yellow"}
)

annotations = AnnotationList(box_annotations=[box_anno])
predictions = PredictionList(polygon_predictions=[polygon_pred])
metric = MyPolygonMetric()
metric(annotations, predictions)

Initializes PolygonMetric abstract object.

Parameters:
  • enforce_label_match (bool) – whether to enforce that annotation and prediction labels must match. Default False

  • confidence_threshold (float) – minimum confidence threshold for predictions. Must be in [0, 1]. Default 0.0

  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).

aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[nucleus.metrics.base.ScalarResult]) –

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:
Return type:

nucleus.metrics.base.ScalarResult

class nucleus.metrics.PolygonPrecision(enforce_label_match=False, iou_threshold=0.5, confidence_threshold=0.0, annotation_filters=None, prediction_filters=None)#

Calculates the precision between box or polygon annotations and predictions.

from nucleus import BoxAnnotation, Point, PolygonPrediction
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import PolygonPrecision

box_anno = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

polygon_pred = PolygonPrediction(
    label="bus",
    vertices=[Point(100, 100), Point(150, 200), Point(200, 100)],
    reference_id="image_2",
    annotation_id="image_2_bus_polygon_1",
    confidence=0.8,
    metadata={"vehicle_color": "yellow"}
)

annotations = AnnotationList(box_annotations=[box_anno])
predictions = PredictionList(polygon_predictions=[polygon_pred])
metric = PolygonPrecision()
metric(annotations, predictions)

Initializes PolygonPrecision object.

Parameters:
  • enforce_label_match (bool) – whether to enforce that annotation and prediction labels must match. Defaults to False

  • iou_threshold (float) – IOU threshold to consider detection as valid. Must be in [0, 1]. Default 0.5

  • confidence_threshold (float) – minimum confidence threshold for predictions. Must be in [0, 1]. Default 0.0

  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).

aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[nucleus.metrics.base.ScalarResult]) –

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:
Return type:

nucleus.metrics.base.ScalarResult

class nucleus.metrics.PolygonRecall(enforce_label_match=False, iou_threshold=0.5, confidence_threshold=0.0, annotation_filters=None, prediction_filters=None)#

Calculates the recall between box or polygon annotations and predictions.

from nucleus import BoxAnnotation, Point, PolygonPrediction
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import PolygonRecall

box_anno = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

polygon_pred = PolygonPrediction(
    label="bus",
    vertices=[Point(100, 100), Point(150, 200), Point(200, 100)],
    reference_id="image_2",
    annotation_id="image_2_bus_polygon_1",
    confidence=0.8,
    metadata={"vehicle_color": "yellow"}
)

annotations = AnnotationList(box_annotations=[box_anno])
predictions = PredictionList(polygon_predictions=[polygon_pred])
metric = PolygonRecall()
metric(annotations, predictions)

Initializes PolygonRecall object.

Parameters:
  • enforce_label_match (bool) – whether to enforce that annotation and prediction labels must match. Defaults to False

  • iou_threshold (float) – IOU threshold to consider detection as valid. Must be in [0, 1]. Default 0.5

  • confidence_threshold (float) – minimum confidence threshold for predictions. Must be in [0, 1]. Default 0.0

  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).

aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[nucleus.metrics.base.ScalarResult]) –

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:
Return type:

nucleus.metrics.base.ScalarResult

class nucleus.metrics.ScalarResult#

A scalar result contains the value of an evaluation, as well as its weight. The weight is useful when aggregating metrics where each dataset item may hold a different relative weight. For example, when calculating precision over a dataset, the denominator of the precision is the number of annotations, and therefore the weight can be set as the number of annotations.

value#

The value of the evaluation result

Type:

float

weight#

The weight of the evaluation result.

Type:

float

static aggregate(results)#

Aggregates results using a weighted average.

Parameters:

results (Iterable[ScalarResult]) –

Return type:

ScalarResult
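
A small sketch of the weighted aggregation described above, assuming ScalarResult can be constructed directly from a value and a weight:

from nucleus.metrics import ScalarResult

# Per-item results weighted, e.g., by the number of annotations on each item
# (assumes ScalarResult(value, weight) construction)
results = [ScalarResult(1.0, weight=10), ScalarResult(0.5, weight=2)]

aggregated = ScalarResult.aggregate(results)
# Weighted average: (1.0 * 10 + 0.5 * 2) / (10 + 2) = 11 / 12 ≈ 0.917
print(aggregated.value)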

class nucleus.metrics.SegmentFieldFilter#

Filter on standard field of Segment(s) of SegmentationAnnotation and SegmentationPrediction

Examples

SegmentFieldFilter(“label”, “in”, [“grass”, “tree”]) would pass every Segment of a SegmentationAnnotation or SegmentationPrediction.

key#

key to compare with value

op#

FilterOp or one of [“>”, “>=”, “<”, “<=”, “=”, “==”, “!=”, “in”, “not in”] to define comparison with value field

value#

bool, str, float or int to compare the field given by key against, or a list of such values for the ‘in’ and ‘not in’ ops

allow_missing#

Allow missing field values. Will REMOVE the object with the missing field from the selection

type#

DO NOT USE. Internal type for serialization over the wire. Changing this will change the NamedTuple type as well.

class nucleus.metrics.SegmentMetadataFilter#

Filter on customer provided metadata associated with Segments of a SegmentationAnnotation or SegmentationPrediction

key#

key to compare with value

op#

FilterOp or one of [“>”, “>=”, “<”, “<=”, “=”, “==”, “!=”, “in”, “not in”] to define comparison with value field

value#

bool, str, float or int to compare the field given by key against, or a list of such values for the ‘in’ and ‘not in’ ops

allow_missing#

Allow missing metadata values. Will REMOVE the object with the missing field from the selection

type#

DO NOT USE. Internal type for serialization over the wire. Changing this will change the NamedTuple type as well.
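
Analogous to FieldFilter and MetadataFilter above, segment-level predicates can be combined in the same DNF list-of-lists form; where exactly they are passed depends on the segmentation metric being configured, so the construction below is only a sketch.

from nucleus.metrics import SegmentFieldFilter, SegmentMetadataFilter

# Keep Segments labeled "grass" or "tree" whose metadata marks them as foreground.
# (How these filters are supplied to a segmentation metric is not shown here.)
segment_filters = [
    [
        SegmentFieldFilter("label", "in", ["grass", "tree"]),
        SegmentMetadataFilter("is_foreground", "==", True),
    ],
]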

class nucleus.metrics.SegmentationFWAVACC(annotation_filters=None, prediction_filters=None, iou_threshold=0.5)#

Calculates the frequency weighted average of the class-wise Jaccard index
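
In the usual notation (a sketch of the standard frequency-weighted IoU definition, where n_ij is the number of pixels of class i predicted as class j and t_i = Σ_j n_ij is the total pixel count of class i):

\mathrm{FWAVACC} = \frac{1}{\sum_k t_k} \sum_i \frac{t_i \, n_{ii}}{t_i + \sum_j n_{ji} - n_{ii}}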

from nucleus import BoxAnnotation, Point, PolygonPrediction
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import PolygonRecall

box_anno = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

polygon_pred = PolygonPrediction(
    label="bus",
    vertices=[Point(100, 100), Point(150, 200), Point(200, 100)],
    reference_id="image_2",
    annotation_id="image_2_bus_polygon_1",
    confidence=0.8,
    metadata={"vehicle_color": "yellow"}
)

annotations = AnnotationList(box_annotations=[box_anno])
predictions = PredictionList(polygon_predictions=[polygon_pred])
metric = PolygonRecall()
metric(annotations, predictions)

Initializes SegmentationFWAVACC object.

Parameters:
  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).

  • iou_threshold (float) –

aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[nucleus.metrics.base.MetricResult]) –

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:
Return type:

nucleus.metrics.base.ScalarResult

get_mask_channel(ann_or_pred)#

Some annotations are stored as RGB instead of L (single-channel). We expect the image to be faux-single-channel with all the channels repeating so we choose the first one.
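
A hedged numpy sketch of the channel selection this describes (illustrative only, not the library's implementation):

import numpy as np

def first_channel(mask: np.ndarray) -> np.ndarray:
    # Faux-single-channel RGB mask: all channels repeat, so keep the first one.
    if mask.ndim == 3:
        return mask[..., 0]
    return mask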

class nucleus.metrics.SegmentationIOU(annotation_filters=None, prediction_filters=None, iou_threshold=0.5)#

Abstract class for defining a metric, which takes a list of annotations and predictions and returns a scalar.

To create a new concrete Metric, override the __call__ function with logic to define a metric between annotations and predictions.

from nucleus import BoxAnnotation, CuboidPrediction, Point3D
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import Metric, MetricResult
from nucleus.metrics.polygon_utils import BoxOrPolygonAnnotation, BoxOrPolygonPrediction

class MyMetric(Metric):
    def __call__(
        self, annotations: AnnotationList, predictions: PredictionList
    ) -> MetricResult:
        value = (len(annotations) - len(predictions)) ** 2
        weight = len(annotations)
        return MetricResult(value, weight)

box = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

cuboid = CuboidPrediction(
    label="car",
    position=Point3D(100, 100, 10),
    dimensions=Point3D(5, 10, 5),
    yaw=0,
    reference_id="pointcloud_1",
    confidence=0.8,
    annotation_id="pointcloud_1_car_cuboid_1",
    metadata={"vehicle_color": "green"}
)

metric = MyMetric()
annotations = AnnotationList(box_annotations=[box])
predictions = PredictionList(cuboid_predictions=[cuboid])
metric(annotations, predictions)

Initializes SegmentationIOU object.

Parameters:
  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).

  • iou_threshold (float) –

aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[nucleus.metrics.base.MetricResult]) –

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:
Return type:

nucleus.metrics.base.ScalarResult

get_mask_channel(ann_or_pred)#

Some annotations are stored as RGB instead of L (single-channel). We expect the image to be faux-single-channel with all the channels repeating so we choose the first one.

class nucleus.metrics.SegmentationMAP(annotation_filters=None, prediction_filters=None, iou_thresholds='coco')#

Calculates the mean average precision per class for segmentation masks

from nucleus import BoxAnnotation, Point, PolygonPrediction
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import PolygonMAP

box_anno = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

polygon_pred = PolygonPrediction(
    label="bus",
    vertices=[Point(100, 100), Point(150, 200), Point(200, 100)],
    reference_id="image_2",
    annotation_id="image_2_bus_polygon_1",
    confidence=0.8,
    metadata={"vehicle_color": "yellow"}
)

annotations = AnnotationList(box_annotations=[box_anno])
predictions = PredictionList(polygon_predictions=[polygon_pred])
metric = PolygonMAP()
metric(annotations, predictions)

Initializes SegmentationMAP object.

Parameters:
  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), like [[MetadataFilter(“short_haired”, “==”, True), FieldFilter(“label”, “in”, [“cat”, “dog”]), …]. DNF allows arbitrary boolean logical combinations of single field predicates. The innermost structures each describe a single column predicate. The list of inner predicates is interpreted as a conjunction (AND), forming a more selective and multiple field predicate. Finally, the most outer list combines these filters as a disjunction (OR).

  • iou_thresholds (Union[List[float], str]) – Provide a list of IoU thresholds to compute over, or the literal “coco”.
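
As an illustration, the “coco” literal conventionally refers to IoU thresholds from 0.50 to 0.95 in steps of 0.05; whether SegmentationMAP expands it exactly this way is an assumption.

from nucleus.metrics import SegmentationMAP

# Conventional COCO-style IoU thresholds: 0.50, 0.55, ..., 0.95
# (assumes the "coco" literal expands to this range)
coco_thresholds = [0.5 + 0.05 * i for i in range(10)]

metric = SegmentationMAP(iou_thresholds=coco_thresholds)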

aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[nucleus.metrics.base.MetricResult]) –

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:

  • annotations (nucleus.annotation.AnnotationList) –

  • predictions (nucleus.prediction.PredictionList) –

Return type:

nucleus.metrics.base.ScalarResult

get_mask_channel(ann_or_pred)#

Some annotations are stored as RGB instead of L (single-channel). We expect the image to be faux-single-channel, with all channels repeating, so we choose the first one.

class nucleus.metrics.SegmentationMaskMetric(annotation_filters=None, prediction_filters=None, iou_threshold=0.5)#

Abstract class for defining a metric, which takes a list of annotations and predictions and returns a scalar.

To create a new concrete Metric, override the __call__ function with logic to define a metric between annotations and predictions.

from nucleus import BoxAnnotation, CuboidPrediction, Point3D
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import Metric, MetricResult
from nucleus.metrics.polygon_utils import BoxOrPolygonAnnotation, BoxOrPolygonPrediction

class MyMetric(Metric):
    def __call__(
        self, annotations: AnnotationList, predictions: PredictionList
    ) -> MetricResult:
        value = (len(annotations) - len(predictions)) ** 2
        weight = len(annotations)
        return MetricResult(value, weight)

box = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

cuboid = CuboidPrediction(
    label="car",
    position=Point3D(100, 100, 10),
    dimensions=Point3D(5, 10, 5),
    yaw=0,
    reference_id="pointcloud_1",
    confidence=0.8,
    annotation_id="pointcloud_1_car_cuboid_1",
    metadata={"vehicle_color": "green"}
)

metric = MyMetric()
annotations = AnnotationList(box_annotations=[box])
predictions = PredictionList(cuboid_predictions=[cuboid])
metric(annotations, predictions)

Initializes SegmentationMaskMetric abstract object.

Parameters:
  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), e.g. [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], ...]. DNF allows arbitrary boolean logical combinations of single-field predicates. The innermost structures each describe a single field predicate. The inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), e.g. [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], ...]. DNF allows arbitrary boolean logical combinations of single-field predicates. The innermost structures each describe a single field predicate. The inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

  • iou_threshold (float) –

abstract aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn, you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[MetricResult]) –

Return type:

ScalarResult
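
Besides the sklearn-based sketch above, a very common aggregation is simply a weighted mean of the per-item results. A minimal version, assuming ScalarResult exposes value and weight attributes as described in this module, could look like:

from typing import List

from nucleus.metrics import ScalarResult

def aggregate_score(self, results: List[ScalarResult]) -> ScalarResult:
    # Weighted mean of per-item values, using each result's weight.
    total_weight = sum(result.weight for result in results)
    if total_weight == 0:
        return ScalarResult(0.0, weight=0)
    weighted_sum = sum(result.value * result.weight for result in results)
    return ScalarResult(weighted_sum / total_weight, weight=total_weight)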

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:

  • annotations (nucleus.annotation.AnnotationList) –

  • predictions (nucleus.prediction.PredictionList) –

Return type:

nucleus.metrics.base.ScalarResult

get_mask_channel(ann_or_pred)#

Some annotations are stored as RGB instead of L (single-channel). We expect the image to be faux-single-channel, with all channels repeating, so we choose the first one.

class nucleus.metrics.SegmentationMaskToPolyMetric(enforce_label_match=False, confidence_threshold=0.0, annotation_filters=None, prediction_filters=None, mode=SegToPolyMode.GENERATE_GT_FROM_POLY)#

Abstract class for defining a metric, which takes a list of annotations and predictions and returns a scalar.

To create a new concrete Metric, override the __call__ function with logic to define a metric between annotations and predictions.

from nucleus import BoxAnnotation, CuboidPrediction, Point3D
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import Metric, MetricResult
from nucleus.metrics.polygon_utils import BoxOrPolygonAnnotation, BoxOrPolygonPrediction

class MyMetric(Metric):
    def __call__(
        self, annotations: AnnotationList, predictions: PredictionList
    ) -> MetricResult:
        value = (len(annotations) - len(predictions)) ** 2
        weight = len(annotations)
        return MetricResult(value, weight)

box = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

cuboid = CuboidPrediction(
    label="car",
    position=Point3D(100, 100, 10),
    dimensions=Point3D(5, 10, 5),
    yaw=0,
    reference_id="pointcloud_1",
    confidence=0.8,
    annotation_id="pointcloud_1_car_cuboid_1",
    metadata={"vehicle_color": "green"}
)

metric = MyMetric()
annotations = AnnotationList(box_annotations=[box])
predictions = PredictionList(cuboid_predictions=[cuboid])
metric(annotations, predictions)

Initializes SegmentationMaskToPolyMetric abstract object.

Parameters:
  • enforce_label_match (bool) – whether to enforce that annotation and prediction labels must match. Default False

  • confidence_threshold (float) – minimum confidence threshold for predictions. Must be in [0, 1]. Default 0.0

  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), e.g. [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], ...]. DNF allows arbitrary boolean logical combinations of single-field predicates. The innermost structures each describe a single field predicate. The inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), e.g. [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], ...]. DNF allows arbitrary boolean logical combinations of single-field predicates. The innermost structures each describe a single field predicate. The inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

  • mode (SegToPolyMode) –

aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn, you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[nucleus.metrics.base.MetricResult]) –

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:

  • annotations (nucleus.annotation.AnnotationList) –

  • predictions (nucleus.prediction.PredictionList) –

Return type:

nucleus.metrics.base.MetricResult

class nucleus.metrics.SegmentationPrecision(annotation_filters=None, prediction_filters=None, iou_threshold=0.5)#

Abstract class for defining a metric, which takes a list of annotations and predictions and returns a scalar.

To create a new concrete Metric, override the __call__ function with logic to define a metric between annotations and predictions.

from nucleus import BoxAnnotation, CuboidPrediction, Point3D
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import Metric, MetricResult
from nucleus.metrics.polygon_utils import BoxOrPolygonAnnotation, BoxOrPolygonPrediction

class MyMetric(Metric):
    def __call__(
        self, annotations: AnnotationList, predictions: PredictionList
    ) -> MetricResult:
        value = (len(annotations) - len(predictions)) ** 2
        weight = len(annotations)
        return MetricResult(value, weight)

box = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

cuboid = CuboidPrediction(
    label="car",
    position=Point3D(100, 100, 10),
    dimensions=Point3D(5, 10, 5),
    yaw=0,
    reference_id="pointcloud_1",
    confidence=0.8,
    annotation_id="pointcloud_1_car_cuboid_1",
    metadata={"vehicle_color": "green"}
)

metric = MyMetric()
annotations = AnnotationList(box_annotations=[box])
predictions = PredictionList(cuboid_predictions=[cuboid])
metric(annotations, predictions)

Calculates mean per-class precision
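
One common way to compute per-class precision for segmentation masks is pixel-wise TP / (TP + FP), averaged over classes. The NumPy sketch below is illustrative only and is not necessarily how this class implements it (in particular, it ignores the iou_threshold parameter):

import numpy as np

gt = np.array([[1, 1], [2, 0]])    # ground-truth class index per pixel
pred = np.array([[1, 2], [2, 0]])  # predicted class index per pixel

precisions = []
for cls in np.unique(gt):
    tp = np.sum((pred == cls) & (gt == cls))
    fp = np.sum((pred == cls) & (gt != cls))
    precisions.append(tp / (tp + fp) if (tp + fp) > 0 else 0.0)

mean_precision = float(np.mean(precisions))  # average of per-class precisions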

Parameters:
  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), e.g. [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], ...]. DNF allows arbitrary boolean logical combinations of single-field predicates. The innermost structures each describe a single field predicate. The inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), e.g. [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], ...]. DNF allows arbitrary boolean logical combinations of single-field predicates. The innermost structures each describe a single field predicate. The inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

  • iou_threshold (float) –

aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn, you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[nucleus.metrics.base.MetricResult]) –

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:

  • annotations (nucleus.annotation.AnnotationList) –

  • predictions (nucleus.prediction.PredictionList) –

Return type:

nucleus.metrics.base.ScalarResult

get_mask_channel(ann_or_pred)#

Some annotations are stored as RGB instead of L (single-channel). We expect the image to be faux-single-channel, with all channels repeating, so we choose the first one.

class nucleus.metrics.SegmentationRecall(annotation_filters=None, prediction_filters=None, iou_threshold=0.5)#

Calculates the recall for a segmentation mask

Initializes SegmentationRecall object.

Parameters:
  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), e.g. [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], ...]. DNF allows arbitrary boolean logical combinations of single-field predicates. The innermost structures each describe a single field predicate. The inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), e.g. [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], ...]. DNF allows arbitrary boolean logical combinations of single-field predicates. The innermost structures each describe a single field predicate. The inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

  • iou_threshold (float) –

aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn, you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[nucleus.metrics.base.MetricResult]) –

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:

  • annotations (nucleus.annotation.AnnotationList) –

  • predictions (nucleus.prediction.PredictionList) –

Return type:

nucleus.metrics.base.ScalarResult

get_mask_channel(ann_or_pred)#

Some annotations are stored as RGB instead of L (single-channel). We expect the image to be faux-single-channel, with all channels repeating, so we choose the first one.

class nucleus.metrics.SegmentationToPolyAveragePrecision(label, iou_threshold=0.5, annotation_filters=None, prediction_filters=None, mode=SegToPolyMode.GENERATE_GT_FROM_POLY)#

Calculates the average precision between box or polygon annotations and predictions.

from nucleus import BoxAnnotation, Point, PolygonPrediction
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import PolygonAveragePrecision

box_anno = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

polygon_pred = PolygonPrediction(
    label="bus",
    vertices=[Point(100, 100), Point(150, 200), Point(200, 100)],
    reference_id="image_2",
    annotation_id="image_2_bus_polygon_1",
    confidence=0.8,
    metadata={"vehicle_color": "yellow"}
)

annotations = AnnotationList(box_annotations=[box_anno])
predictions = PredictionList(polygon_predictions=[polygon_pred])
metric = PolygonAveragePrecision(label="car")
metric(annotations, predictions)

Initializes SegmentationToPolyAveragePrecision object.

Parameters:
  • iou_threshold (float) – IOU threshold to consider detection as valid. Must be in [0, 1]. Default 0.5

  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), e.g. [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], ...]. DNF allows arbitrary boolean logical combinations of single-field predicates. The innermost structures each describe a single field predicate. The inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), e.g. [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], ...]. DNF allows arbitrary boolean logical combinations of single-field predicates. The innermost structures each describe a single field predicate. The inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

  • mode (SegToPolyMode) –

aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn, you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[nucleus.metrics.base.MetricResult]) –

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:

  • annotations (nucleus.annotation.AnnotationList) –

  • predictions (nucleus.prediction.PredictionList) –

Return type:

nucleus.metrics.base.MetricResult

class nucleus.metrics.SegmentationToPolyIOU(enforce_label_match=False, iou_threshold=0.0, confidence_threshold=0.0, annotation_filters=None, prediction_filters=None, mode=SegToPolyMode.GENERATE_GT_FROM_POLY)#

Abstract class for defining a metric, which takes a list of annotations and predictions and returns a scalar.

To create a new concrete Metric, override the __call__ function with logic to define a metric between annotations and predictions.

from nucleus import BoxAnnotation, CuboidPrediction, Point3D
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import Metric, MetricResult
from nucleus.metrics.polygon_utils import BoxOrPolygonAnnotation, BoxOrPolygonPrediction

class MyMetric(Metric):
    def __call__(
        self, annotations: AnnotationList, predictions: PredictionList
    ) -> MetricResult:
        value = (len(annotations) - len(predictions)) ** 2
        weight = len(annotations)
        return MetricResult(value, weight)

box = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

cuboid = CuboidPrediction(
    label="car",
    position=Point3D(100, 100, 10),
    dimensions=Point3D(5, 10, 5),
    yaw=0,
    reference_id="pointcloud_1",
    confidence=0.8,
    annotation_id="pointcloud_1_car_cuboid_1",
    metadata={"vehicle_color": "green"}
)

metric = MyMetric()
annotations = AnnotationList(box_annotations=[box])
predictions = PredictionList(cuboid_predictions=[cuboid])
metric(annotations, predictions)

Initializes SegmentationToPolyIOU object.

Parameters:
  • enforce_label_match (bool) – whether to enforce that annotation and prediction labels must match. Defaults to False

  • iou_threshold (float) – IOU threshold to consider detection as valid. Must be in [0, 1]. Default 0.0

  • confidence_threshold (float) – minimum confidence threshold for predictions. Must be in [0, 1]. Default 0.0

  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), e.g. [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], ...]. DNF allows arbitrary boolean logical combinations of single-field predicates. The innermost structures each describe a single field predicate. The inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), e.g. [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], ...]. DNF allows arbitrary boolean logical combinations of single-field predicates. The innermost structures each describe a single field predicate. The inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

  • mode (SegToPolyMode) –

aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn, you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[nucleus.metrics.base.MetricResult]) –

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:

  • annotations (nucleus.annotation.AnnotationList) –

  • predictions (nucleus.prediction.PredictionList) –

Return type:

nucleus.metrics.base.MetricResult

class nucleus.metrics.SegmentationToPolyMAP(iou_threshold=-1, iou_thresholds='coco', annotation_filters=None, prediction_filters=None, mode=SegToPolyMode.GENERATE_GT_FROM_POLY)#

Calculates the mean average precision between box or polygon annotations and predictions.

from nucleus import BoxAnnotation, Point, PolygonPrediction
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import PolygonMAP

box_anno = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

polygon_pred = PolygonPrediction(
    label="bus",
    vertices=[Point(100, 100), Point(150, 200), Point(200, 100)],
    reference_id="image_2",
    annotation_id="image_2_bus_polygon_1",
    confidence=0.8,
    metadata={"vehicle_color": "yellow"}
)

annotations = AnnotationList(box_annotations=[box_anno])
predictions = PredictionList(polygon_predictions=[polygon_pred])
metric = PolygonMAP()
metric(annotations, predictions)
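
As with SegmentationMAP above, the snippet shown is the generic PolygonMAP example. A hedged sketch of wiring up this class instead, with box/polygon ground truth compared against a predicted segmentation mask (mask URL and constructors assumed as in the earlier segmentation examples):

from nucleus import BoxAnnotation, Segment, SegmentationPrediction
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import SegmentationToPolyMAP

box_anno = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
)

mask_pred = SegmentationPrediction(
    mask_url="s3://bucket/image_1_pred_mask.png",  # hypothetical path
    annotations=[Segment(label="car", index=1)],
    reference_id="image_1",
)

annotations = AnnotationList(box_annotations=[box_anno])
predictions = PredictionList(segmentation_predictions=[mask_pred])
metric = SegmentationToPolyMAP(iou_thresholds="coco")
metric(annotations, predictions)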

Initializes SegmentationToPolyMAP object.

Parameters:
  • iou_thresholds (Union[List[float], str]) – IOU thresholds at which to compute AP, or the literal "coco"

  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), e.g. [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], ...]. DNF allows arbitrary boolean logical combinations of single-field predicates. The innermost structures each describe a single field predicate. The inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), e.g. [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], ...]. DNF allows arbitrary boolean logical combinations of single-field predicates. The innermost structures each describe a single field predicate. The inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

  • iou_threshold (float) –

  • mode (SegToPolyMode) –

aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn, you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[nucleus.metrics.base.MetricResult]) –

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:

  • annotations (nucleus.annotation.AnnotationList) –

  • predictions (nucleus.prediction.PredictionList) –

Return type:

nucleus.metrics.base.MetricResult

class nucleus.metrics.SegmentationToPolyPrecision(enforce_label_match=False, iou_threshold=0.5, confidence_threshold=0.0, annotation_filters=None, prediction_filters=None, mode=SegToPolyMode.GENERATE_GT_FROM_POLY)#

Abstract class for defining a metric, which takes a list of annotations and predictions and returns a scalar.

To create a new concrete Metric, override the __call__ function with logic to define a metric between annotations and predictions.

from nucleus import BoxAnnotation, CuboidPrediction, Point3D
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import Metric, MetricResult
from nucleus.metrics.polygon_utils import BoxOrPolygonAnnotation, BoxOrPolygonPrediction

class MyMetric(Metric):
    def __call__(
        self, annotations: AnnotationList, predictions: PredictionList
    ) -> MetricResult:
        value = (len(annotations) - len(predictions)) ** 2
        weight = len(annotations)
        return MetricResult(value, weight)

box = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

cuboid = CuboidPrediction(
    label="car",
    position=Point3D(100, 100, 10),
    dimensions=Point3D(5, 10, 5),
    yaw=0,
    reference_id="pointcloud_1",
    confidence=0.8,
    annotation_id="pointcloud_1_car_cuboid_1",
    metadata={"vehicle_color": "green"}
)

metric = MyMetric()
annotations = AnnotationList(box_annotations=[box])
predictions = PredictionList(cuboid_predictions=[cuboid])
metric(annotations, predictions)

Initializes SegmentationToPolyPrecision object.

Parameters:
  • enforce_label_match (bool) – whether to enforce that annotation and prediction labels must match. Defaults to False

  • iou_threshold (float) – IOU threshold to consider detection as valid. Must be in [0, 1]. Default 0.5

  • confidence_threshold (float) – minimum confidence threshold for predictions. Must be in [0, 1]. Default 0.0

  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), e.g. [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], ...]. DNF allows arbitrary boolean logical combinations of single-field predicates. The innermost structures each describe a single field predicate. The inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), e.g. [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], ...]. DNF allows arbitrary boolean logical combinations of single-field predicates. The innermost structures each describe a single field predicate. The inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

  • mode (SegToPolyMode) –

aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn, you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[nucleus.metrics.base.MetricResult]) –

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:

  • annotations (nucleus.annotation.AnnotationList) –

  • predictions (nucleus.prediction.PredictionList) –

Return type:

nucleus.metrics.base.MetricResult

class nucleus.metrics.SegmentationToPolyRecall(enforce_label_match=False, iou_threshold=0.5, confidence_threshold=0.0, annotation_filters=None, prediction_filters=None, mode=SegToPolyMode.GENERATE_GT_FROM_POLY)#

Calculates the recall between box or polygon annotations and predictions.

from nucleus import BoxAnnotation, Point, PolygonPrediction
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import PolygonRecall

box_anno = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"}
)

polygon_pred = PolygonPrediction(
    label="bus",
    vertices=[Point(100, 100), Point(150, 200), Point(200, 100)],
    reference_id="image_2",
    annotation_id="image_2_bus_polygon_1",
    confidence=0.8,
    metadata={"vehicle_color": "yellow"}
)

annotations = AnnotationList(box_annotations=[box_anno])
predictions = PredictionList(polygon_predictions=[polygon_pred])
metric = PolygonRecall()
metric(annotations, predictions)

Initializes SegmentationToPolyRecall object.

Parameters:
  • enforce_label_match (bool) – whether to enforce that annotation and prediction labels must match. Defaults to False

  • iou_threshold (float) – IOU threshold to consider detection as valid. Must be in [0, 1]. Default 0.5

  • confidence_threshold (float) – minimum confidence threshold for predictions. Must be in [0, 1]. Default 0.0

  • annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), e.g. [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], ...]. DNF allows arbitrary boolean logical combinations of single-field predicates. The innermost structures each describe a single field predicate. The inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

  • prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) –

    Filter predicates. Allowed formats are: ListOfAndFilters where each Filter forms a chain of AND predicates.

    or

    ListOfOrAndFilters where Filters are expressed in disjunctive normal form (DNF), e.g. [[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"])], ...]. DNF allows arbitrary boolean logical combinations of single-field predicates. The innermost structures each describe a single field predicate. The inner list of predicates is interpreted as a conjunction (AND), forming a more selective, multi-field predicate. Finally, the outermost list combines these conjunctions as a disjunction (OR).

  • mode (SegToPolyMode) –

aggregate_score(results)#

A metric must define how to aggregate results from single items to a single ScalarResult.

E.g., to calculate an R2 score with sklearn, you could define a custom metric class

class R2Result(MetricResult):
    y_true: float
    y_pred: float

And then define an aggregate_score

def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
Parameters:

results (List[nucleus.metrics.base.MetricResult]) –

Return type:

nucleus.metrics.base.ScalarResult

call_metric(annotations, predictions)#

A metric must override this method and return a metric result, given annotations and predictions.

Parameters:

  • annotations (nucleus.annotation.AnnotationList) –

  • predictions (nucleus.prediction.PredictionList) –

Return type:

nucleus.metrics.base.MetricResult