nucleus.metrics¶
Evaluation metrics for comparing predictions against ground truth annotations.
- `Metric`: Abstract class for defining a metric, which takes a list of annotations and predictions and returns a scalar.
- `ScalarResult`: A scalar result contains the value of an evaluation, as well as its weight.
- class nucleus.metrics.Metric(annotation_filters=None, prediction_filters=None)¶
Abstract class for defining a metric, which takes a list of annotations and predictions and returns a scalar.
To create a new concrete Metric, override the __call__ function with logic to define a metric between annotations and predictions.
```python
from nucleus import BoxAnnotation, CuboidPrediction, Point3D
from nucleus.annotation import AnnotationList
from nucleus.prediction import PredictionList
from nucleus.metrics import Metric, MetricResult
from nucleus.metrics.polygon_utils import BoxOrPolygonAnnotation, BoxOrPolygonPrediction

class MyMetric(Metric):
    def __call__(
        self, annotations: AnnotationList, predictions: PredictionList
    ) -> MetricResult:
        value = (len(annotations) - len(predictions)) ** 2
        weight = len(annotations)
        return MetricResult(value, weight)

box = BoxAnnotation(
    label="car",
    x=0,
    y=0,
    width=10,
    height=10,
    reference_id="image_1",
    annotation_id="image_1_car_box_1",
    metadata={"vehicle_color": "red"},
)

cuboid = CuboidPrediction(
    label="car",
    position=Point3D(100, 100, 10),
    dimensions=Point3D(5, 10, 5),
    yaw=0,
    reference_id="pointcloud_1",
    confidence=0.8,
    annotation_id="pointcloud_1_car_cuboid_1",
    metadata={"vehicle_color": "green"},
)

metric = MyMetric()
annotations = AnnotationList(box_annotations=[box])
predictions = PredictionList(cuboid_predictions=[cuboid])
metric(annotations, predictions)
```
- Parameters:
annotation_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) – Filter predicates. Accepts either ListOfAndFilters (where each Filter forms a chain of AND predicates) or ListOfOrAndFilters (filters in disjunctive normal form). DNF allows arbitrary boolean logical combinations of single field predicates.
prediction_filters (Optional[Union[nucleus.metrics.filtering.ListOfOrAndFilters, nucleus.metrics.filtering.ListOfAndFilters]]) – Filter predicates. Accepts either ListOfAndFilters (where each Filter forms a chain of AND predicates) or ListOfOrAndFilters (filters in disjunctive normal form). DNF allows arbitrary boolean logical combinations of single field predicates.
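The OR-of-AND structure of the DNF filters can be illustrated with a plain-Python sketch. The `(field, op, value)` predicate shape below is hypothetical, chosen only to show how the outer list ORs together inner AND chains; it is not the actual `nucleus.metrics.filtering` API.

```python
# Illustrative sketch of disjunctive-normal-form (DNF) filter evaluation.
# Predicates here are hypothetical (field, op, value) tuples, NOT the real
# nucleus filter objects; only the OR-of-AND combination logic is the point.

def matches(item: dict, predicate: tuple) -> bool:
    """Evaluate a single (field, op, value) predicate against a dict."""
    field, op, value = predicate
    if op == "==":
        return item.get(field) == value
    if op == ">=":
        return item.get(field) >= value
    raise ValueError(f"unsupported op: {op}")

def evaluate_dnf(item: dict, or_and_filters: list) -> bool:
    """Outer list: OR of chains. Inner lists: AND of predicates (DNF)."""
    return any(
        all(matches(item, pred) for pred in and_chain)
        for and_chain in or_and_filters
    )

# ("label" == "car" AND confidence >= 0.5) OR ("label" == "truck")
filters = [
    [("label", "==", "car"), ("confidence", ">=", 0.5)],
    [("label", "==", "truck")],
]

car = {"label": "car", "confidence": 0.8}
bike = {"label": "bike", "confidence": 0.9}
```

Here `car` passes via the first AND chain, while `bike` satisfies neither chain and is filtered out.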
- abstract aggregate_score(results)¶
A metric must define how to aggregate results from single items to a single ScalarResult.
E.g., to calculate an R2 score with sklearn you could define a custom metric class:
```python
class R2Result(MetricResult):
    y_true: float
    y_pred: float
```
And then define an aggregate_score:
```python
def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
    y_trues = []
    y_preds = []
    for result in results:
        y_trues.append(result.y_true)
        y_preds.append(result.y_pred)
    r2_score = sklearn.metrics.r2_score(y_trues, y_preds)
    return ScalarResult(r2_score)
```
- Parameters:
results (List[MetricResult])
- Return type:
ScalarResult
- abstract call_metric(annotations, predictions)¶
A metric must override this method and return a metric result, given annotations and predictions.
- Parameters:
annotations (nucleus.annotation.AnnotationList)
predictions (nucleus.prediction.PredictionList)
- Return type:
MetricResult
- class nucleus.metrics.ScalarResult¶
A scalar result contains the value of an evaluation, as well as its weight. The weight is useful when aggregating metrics where each dataset item may hold a different relative weight. For example, when calculating precision over a dataset, the denominator of the precision is the number of annotations, and therefore the weight can be set as the number of annotations.
- value¶
The value of the evaluation result
- Type:
float
- weight¶
The weight of the evaluation result.
- Type:
float
- static aggregate(results)¶
Aggregates results using a weighted average.
- Parameters:
results (Iterable[ScalarResult])
- Return type:
ScalarResult
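As a sketch of the behavior described above, assuming the weighted average is sum(value × weight) / sum(weight), a minimal stand-in for `ScalarResult` could look like this (this is an illustration of the semantics, not the library's implementation):

```python
from dataclasses import dataclass
from typing import Iterable

# Minimal stand-in for nucleus.metrics.ScalarResult, illustrating how a
# weighted-average aggregation behaves. Assumes the weighted mean is
# sum(value * weight) / sum(weight).
@dataclass
class ScalarResult:
    value: float
    weight: float = 1.0

    @staticmethod
    def aggregate(results: Iterable["ScalarResult"]) -> "ScalarResult":
        results = list(results)
        total_weight = sum(r.weight for r in results)
        weighted_sum = sum(r.value * r.weight for r in results)
        # The aggregate carries the combined weight so it can itself be
        # aggregated further with other results.
        return ScalarResult(weighted_sum / total_weight, total_weight)

# Two dataset items: precision 1.0 over 4 annotations, 0.5 over 2.
combined = ScalarResult.aggregate([ScalarResult(1.0, 4), ScalarResult(0.5, 2)])
```

The item with more annotations contributes proportionally more, matching the precision example above where the weight is the number of annotations.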