nucleus.validate.scenario_test

Scenario Tests combine collections of data and evaluation metrics to accelerate model evaluation.

With Model CI Scenario Tests, an ML engineer can define a Scenario Test from critical edge case scenarios that the model must get right (e.g. pedestrians at night), and have confidence that they’re always shipping the best model.

ScenarioTest

A Scenario Test combines a slice and at least one evaluation criterion. A ScenarioTest is not created through the default constructor but via the instructions shown in Validate.

class nucleus.validate.scenario_test.ScenarioTest

A Scenario Test combines a slice and at least one evaluation criterion. A ScenarioTest is not created through the default constructor but via the instructions shown in Validate. The ScenarioTest class simply provides a convenient interface for working with scenario tests from this SDK.
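
For illustration, a minimal sketch of how a ScenarioTest instance is typically obtained, either by creating one from an existing slice (the slice ID below is a placeholder) or by listing the tests that already exist:

import nucleus

client = nucleus.NucleusClient("YOUR_SCALE_API_KEY")

# Create a scenario test from an existing slice (slice ID is a placeholder)
scenario_test = client.validate.create_scenario_test(
    "sample_scenario_test", "slc_bx86ea222a6g057x4380"
)

# Or work with scenario tests that were created previously
existing_tests = client.validate.scenario_tests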

id

The ID of the scenario test.

Type:

str

connection

The connection to Nucleus API.

Type:

Connection

name

The name of the scenario test.

Type:

str

slice_id

The ID of the associated Nucleus slice.

Type:

str

add_eval_function(eval_function)

Creates and adds a new evaluation metric to the ScenarioTest.

import nucleus
client = nucleus.NucleusClient("YOUR_SCALE_API_KEY")
scenario_test = client.validate.create_scenario_test(
    "sample_scenario_test", "slc_bx86ea222a6g057x4380"
)

e = client.validate.eval_functions
# Assuming a user would like to add all available public evaluation functions as criteria
scenario_test.add_eval_function(e.bbox_iou)
scenario_test.add_eval_function(e.bbox_map)
scenario_test.add_eval_function(e.bbox_precision)
scenario_test.add_eval_function(e.bbox_recall)
Parameters:

eval_function (nucleus.validate.eval_functions.available_eval_functions.EvalFunction) – EvalFunction

Raises:

NucleusAPIError – If adding this function would mix external with non-external functions, which is not permitted.

Returns:

The created ScenarioTestMetric object.

Return type:

nucleus.validate.scenario_test_metric.ScenarioTestMetric

get_eval_functions()

Retrieves all criteria of the ScenarioTest.

import nucleus
client = nucleus.NucleusClient("YOUR_SCALE_API_KEY")
scenario_test = client.validate.scenario_tests[0]

scenario_test.get_eval_functions()
Returns:

A list of ScenarioTestMetric objects.

Return type:

List[nucleus.validate.scenario_test_metric.ScenarioTestMetric]

get_eval_history()

Retrieves the evaluation history for the ScenarioTest.

import nucleus
client = nucleus.NucleusClient("YOUR_SCALE_API_KEY")
scenario_test = client.validate.scenario_tests[0]

scenario_test.get_eval_history()
Returns:

A list of ScenarioTestEvaluation objects.

Return type:

List[nucleus.validate.scenario_test_evaluation.ScenarioTestEvaluation]

get_items(level=EntityLevel.ITEM)

Gets items within a scenario test at a given level, returning a list of Track, DatasetItem, or Scene objects.

Parameters:

level (nucleus.validate.constants.EntityLevel) – EntityLevel

Returns:

A list of Track, DatasetItem, or Scene objects, depending on the given level.

Return type:

Union[List[nucleus.track.Track], List[nucleus.dataset_item.DatasetItem], List[nucleus.scene.Scene]]
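
For illustration, a minimal sketch of retrieving entities at different levels. The import path for EntityLevel follows the parameter type above, and the SCENE member is assumed to exist alongside ITEM:

import nucleus
from nucleus.validate.constants import EntityLevel  # import path assumed from the parameter type above

client = nucleus.NucleusClient("YOUR_SCALE_API_KEY")
scenario_test = client.validate.scenario_tests[0]

# Dataset items covered by the scenario test's slice
items = scenario_test.get_items(level=EntityLevel.ITEM)

# Scenes instead, if the underlying slice is scene-based (EntityLevel.SCENE assumed)
scenes = scenario_test.get_items(level=EntityLevel.SCENE)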

set_baseline_model(model_id)

Sets a new baseline model for the ScenarioTest. In order to be eligible to be a baseline, this scenario test must have been evaluated using that model. The baseline model’s performance is used as the threshold for all metrics against which other models are compared.

import nucleus
client = nucleus.NucleusClient("YOUR_SCALE_API_KEY")
scenario_test = client.validate.scenario_tests[0]

scenario_test.set_baseline_model("my_baseline_model_id")

Returns:

A list of ScenarioTestEvaluation objects.

Parameters:

model_id (str)