nucleus.validate.scenario_test_evaluation

Data types for Scenario Test Evaluation results.

ScenarioTestEvaluation

The results and attributes of an evaluation of a scenario test.

ScenarioTestEvaluationStatus

The job status of a scenario test evaluation.

ScenarioTestItemEvaluation

Dataset item-level results of an evaluation of a scenario test.

class nucleus.validate.scenario_test_evaluation.ScenarioTestEvaluation

The results and attributes of an evaluation of a scenario test.

id

The ID of this scenario test evaluation.

Type:

str

scenario_test_id

The ID of the associated scenario test.

Type:

str

eval_function_id

The ID of the associated evaluation function.

Type:

str

model_id

The ID of the associated model.

Type:

str

status

The status of the evaluation job.

Type:

str

result

The numerical result of the evaluation.

Type:

Optional[float]

passed

Whether the scenario test passed.

Type:

bool

item_evals

The individual results for each dataset item.

Type:

List[ScenarioTestItemEvaluation]

connection

The connection to the Nucleus API.

Type:

Connection
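
A minimal sketch of how the attributes above might be read together. `EvaluationSketch` and `ItemResultSketch` are hypothetical stand-ins that only mirror the documented attribute names (`connection` is omitted); they are not the library classes, and the status string `"completed"` is a placeholder, since this page does not list the possible status values.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ItemResultSketch:
    """Hypothetical stand-in for an item-level result (subset of fields)."""
    dataset_item_id: str
    result: Optional[float]
    passed: bool


@dataclass
class EvaluationSketch:
    """Hypothetical stand-in mirroring the documented attributes."""
    id: str
    scenario_test_id: str
    eval_function_id: str
    model_id: str
    status: str
    result: Optional[float]
    passed: bool
    item_evals: List[ItemResultSketch] = field(default_factory=list)


def summarize(evaluation: EvaluationSketch) -> str:
    """Build a one-line summary; `result` is Optional, so handle None."""
    total = len(evaluation.item_evals)
    n_passed = sum(1 for item in evaluation.item_evals if item.passed)
    result_text = "n/a" if evaluation.result is None else f"{evaluation.result:.2f}"
    return (
        f"{evaluation.id}: status={evaluation.status}, result={result_text}, "
        f"passed={evaluation.passed}, items passed={n_passed}/{total}"
    )


# Placeholder IDs and values, for illustration only.
evaluation = EvaluationSketch(
    id="eval_1",
    scenario_test_id="test_1",
    eval_function_id="fn_1",
    model_id="model_1",
    status="completed",  # assumed value, not confirmed by this page
    result=0.75,
    passed=True,
    item_evals=[
        ItemResultSketch("item_a", 0.9, True),
        ItemResultSketch("item_b", 0.6, False),
    ],
)
summary = summarize(evaluation)
```

Because `result` is typed `Optional[float]`, code that formats or compares it should check for `None` first, as `summarize` does here.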

class nucleus.validate.scenario_test_evaluation.ScenarioTestEvaluationStatus

The job status of a scenario test evaluation.

class nucleus.validate.scenario_test_evaluation.ScenarioTestItemEvaluation

Dataset item-level results of an evaluation of a scenario test. Note that this class is immutable.

evaluation_id

The ID of the associated scenario test evaluation.

Type:

str

scenario_test_id

The ID of the associated scenario test.

Type:

str

eval_function_id

The ID of the associated evaluation function.

Type:

str

dataset_item_id

The ID of the dataset item of this evaluation.

Type:

str

result

The numerical result of the evaluation on this item.

Type:

Optional[float]

passed

Whether the result was sufficient to pass the test for this item.

Type:

bool
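
A minimal sketch of working with item-level results. `ItemEvalSketch` is a hypothetical stand-in that mirrors the attributes documented above; it is not the library class. Using a frozen dataclass is an assumption made here to illustrate the immutability note, not a statement about how the library implements it.

```python
from dataclasses import FrozenInstanceError, dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ItemEvalSketch:
    """Hypothetical stand-in mirroring the documented attributes."""
    evaluation_id: str
    scenario_test_id: str
    eval_function_id: str
    dataset_item_id: str
    result: Optional[float]
    passed: bool


def failing_item_ids(item_evals: List[ItemEvalSketch]) -> List[str]:
    """Return the dataset item IDs whose evaluation did not pass."""
    return [e.dataset_item_id for e in item_evals if not e.passed]


# Placeholder IDs and values, for illustration only.
items = [
    ItemEvalSketch("eval_1", "test_1", "fn_1", "item_a", 0.91, True),
    ItemEvalSketch("eval_1", "test_1", "fn_1", "item_b", 0.42, False),
    ItemEvalSketch("eval_1", "test_1", "fn_1", "item_c", None, False),
]
failed = failing_item_ids(items)

# Assigning to a field of a frozen dataclass raises FrozenInstanceError.
try:
    items[0].passed = False
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Filtering on `passed` rather than on `result` avoids having to guess the pass threshold, which this page does not specify.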