nucleus.validate.scenario_test¶
Scenario Tests combine collections of data and evaluation metrics to accelerate model evaluation.
With Model CI Scenario Tests, an ML engineer can define a Scenario Test from critical edge case scenarios that the model must get right (e.g. pedestrians at night), and have confidence that they’re always shipping the best model.
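For orientation, a minimal sketch of the workflow is shown below; the slice ID and evaluation function are taken from the examples later in this reference, and the test name is a placeholder:

import nucleus

client = nucleus.NucleusClient("YOUR_SCALE_API_KEY")

# Create a scenario test from a slice of critical edge cases
scenario_test = client.validate.create_scenario_test(
    "sample_scenario_test", "slc_bx86ea222a6g057x4380"
)

# Attach at least one evaluation criterion
scenario_test.add_eval_function(client.validate.eval_functions.bbox_recall)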
- class nucleus.validate.scenario_test.ScenarioTest¶
A Scenario Test combines a slice and at least one evaluation criterion. A ScenarioTest is not created through the default constructor, but by following the instructions shown in Validate. This ScenarioTest class simplifies interaction with scenario tests from the SDK.
- id¶
The ID of the scenario test.
- Type:
str
- connection¶
The connection to Nucleus API.
- Type:
Connection
- name¶
The name of the scenario test.
- Type:
str
- slice_id¶
The ID of the associated Nucleus slice.
- Type:
str
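A short sketch of reading these attributes, assuming at least one scenario test already exists for the client:

import nucleus

client = nucleus.NucleusClient("YOUR_SCALE_API_KEY")
scenario_test = client.validate.scenario_tests[0]

print(scenario_test.id)        # str: the ID of the scenario test
print(scenario_test.name)      # str: the name of the scenario test
print(scenario_test.slice_id)  # str: the ID of the associated Nucleus slice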
- add_eval_function(eval_function)¶
Creates and adds a new evaluation metric to the ScenarioTest.

import nucleus

client = nucleus.NucleusClient("YOUR_SCALE_API_KEY")
scenario_test = client.validate.create_scenario_test(
    "sample_scenario_test", "slc_bx86ea222a6g057x4380"
)

e = client.validate.eval_functions
# Assuming a user would like to add all available public evaluation functions as criteria
scenario_test.add_eval_function(e.bbox_iou)
scenario_test.add_eval_function(e.bbox_map)
scenario_test.add_eval_function(e.bbox_precision)
scenario_test.add_eval_function(e.bbox_recall)
- Parameters:
eval_function (nucleus.validate.eval_functions.available_eval_functions.EvalFunction) –
EvalFunction
- Raises:
NucleusAPIError – Raised if adding this function would mix external and non-external functions in the scenario test, which is not permitted.
- Returns:
The created ScenarioTestMetric object.
- Return type:
nucleus.validate.scenario_test_metric.ScenarioTestMetric
- get_eval_functions()¶
Retrieves all criteria of the ScenarioTest.

import nucleus

client = nucleus.NucleusClient("YOUR_SCALE_API_KEY")
scenario_test = client.validate.scenario_tests[0]
scenario_test.get_eval_functions()
- Returns:
A list of ScenarioTestMetric objects.
- Return type:
List[nucleus.validate.scenario_test_metric.ScenarioTestMetric]
- get_eval_history()¶
Retrieves the evaluation history for the ScenarioTest.

import nucleus

client = nucleus.NucleusClient("YOUR_SCALE_API_KEY")
scenario_test = client.validate.scenario_tests[0]
scenario_test.get_eval_history()
- Returns:
A list of ScenarioTestEvaluation objects.
- Return type:
List[nucleus.validate.scenario_test_evaluation.ScenarioTestEvaluation]
- get_items(level=EntityLevel.ITEM)¶
Gets items within a scenario test at a given level, returning a list of Track, DatasetItem, or Scene objects.
- Parameters:
level (nucleus.validate.constants.EntityLevel) – The EntityLevel at which to retrieve entities. Defaults to EntityLevel.ITEM.
- Returns:
A list of Track, DatasetItem, or Scene objects, depending on the requested level.
- Return type:
Union[List[nucleus.track.Track], List[nucleus.dataset_item.DatasetItem], List[nucleus.scene.Scene]]
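Since this method has no example above, here is a hedged sketch; it assumes EntityLevel is importable from nucleus.validate.constants (the module named in the parameter type) and exposes ITEM and SCENE members matching the return types listed above:

import nucleus
from nucleus.validate.constants import EntityLevel  # assumed import path, per the parameter type above

client = nucleus.NucleusClient("YOUR_SCALE_API_KEY")
scenario_test = client.validate.scenario_tests[0]

# Retrieve the dataset items covered by this scenario test
items = scenario_test.get_items(level=EntityLevel.ITEM)

# Or retrieve whole scenes instead (assuming a SCENE level exists)
scenes = scenario_test.get_items(level=EntityLevel.SCENE)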
- set_baseline_model(model_id)¶
Sets a new baseline model for the ScenarioTest. To be eligible as a baseline, the scenario test must already have been evaluated using that model. The baseline model's performance is used as the threshold for all metrics against which other models are compared.
import nucleus

client = nucleus.NucleusClient("YOUR_SCALE_API_KEY")
scenario_test = client.validate.scenario_tests[0]
scenario_test.set_baseline_model("my_baseline_model_id")
- Parameters:
model_id (str) – The ID of the model to set as the new baseline.