dacapo.experiments.tasks.evaluators.evaluator
Attributes
Classes
Evaluator – Base class of all evaluators: An abstract class representing an evaluator that compares and evaluates the output array against the evaluation array.
Module Contents
- dacapo.experiments.tasks.evaluators.evaluator.OutputIdentifier
- dacapo.experiments.tasks.evaluators.evaluator.Iteration
- dacapo.experiments.tasks.evaluators.evaluator.Score
- dacapo.experiments.tasks.evaluators.evaluator.BestScore
- class dacapo.experiments.tasks.evaluators.evaluator.Evaluator
Base class of all evaluators: An abstract class representing an evaluator that compares and evaluates the output array against the evaluation array.
An evaluator takes a post-processor’s output and compares it against the ground truth. It then returns a set of scores that measure the quality of the post-processor’s output.
- best_scores
Dict[OutputIdentifier, BestScore] the best scores for each dataset/post-processing parameter/criterion combination
- evaluate(output_array_identifier, evaluation_array)
Compare and evaluate the output array against the evaluation array.
- is_best(dataset, parameter, criterion, score)
Check if the provided score is the best for this dataset/parameter/criterion combo.
- get_overall_best(dataset, criterion)
Return the best score for the given dataset and criterion.
- get_overall_best_parameters(dataset, criterion)
Return the best parameters for the given dataset and criterion.
- compare(score_1, score_2, criterion)
Compare two scores for the given criterion.
- set_best(validation_scores)
Find the best iteration for each dataset/post_processing_parameter/criterion.
- higher_is_better(criterion)
Return whether higher is better for the given criterion.
- bounds(criterion)
Return the bounds for the given criterion.
- store_best(criterion)
Return whether to store the best score for the given criterion.
Note
The Evaluator class is used to compare and evaluate the output array against the evaluation array.
- abstract evaluate(output_array_identifier: dacapo.store.local_array_store.LocalArrayIdentifier, evaluation_array: dacapo.experiments.datasplits.datasets.arrays.Array) dacapo.experiments.tasks.evaluators.evaluation_scores.EvaluationScores
Compares and evaluates the output array against the evaluation array.
- Parameters:
output_array_identifier – LocalArrayIdentifier The identifier of the output array.
evaluation_array – Array The evaluation array.
- Returns:
- EvaluationScores
The evaluation scores.
- Raises:
NotImplementedError – if the function is not implemented
Examples
>>> evaluator = Evaluator()
>>> output_array_identifier = LocalArrayIdentifier("output_array")
>>> evaluation_array = Array()
>>> evaluator.evaluate(output_array_identifier, evaluation_array)
EvaluationScores()
Note
This function is used to compare and evaluate the output array against the evaluation array.
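Because evaluate is abstract, a concrete evaluator has to supply it. The following is a minimal, self-contained sketch of that pattern, not DaCapo code: SketchEvaluator and BinaryOverlapEvaluator are hypothetical stand-ins, flat Python lists replace the LocalArrayIdentifier and Array arguments, and a plain dict replaces EvaluationScores.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class SketchEvaluator(ABC):
    """Hypothetical stand-in for the Evaluator base class (not DaCapo code)."""

    @abstractmethod
    def evaluate(self, output: List[int], ground_truth: List[int]) -> Dict[str, float]:
        """Compare a prediction against ground truth and return scores."""


class BinaryOverlapEvaluator(SketchEvaluator):
    """Scores binary masks given as flat lists of 0/1 values."""

    def evaluate(self, output, ground_truth):
        pairs = list(zip(output, ground_truth))
        tp = sum(1 for o, g in pairs if o and g)      # predicted and present
        fp = sum(1 for o, g in pairs if o and not g)  # predicted but absent
        fn = sum(1 for o, g in pairs if not o and g)  # present but missed
        return {
            "precision": tp / (tp + fp) if tp + fp else 0.0,
            "recall": tp / (tp + fn) if tp + fn else 0.0,
            "jaccard": tp / (tp + fp + fn) if tp + fp + fn else 0.0,
        }


scores = BinaryOverlapEvaluator().evaluate([1, 1, 0, 0], [1, 0, 1, 0])
print(scores)
```

In this sketch, instantiating SketchEvaluator directly raises TypeError because its abstract method has no implementation; only the subclass can be constructed.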
- property best_scores: Dict[OutputIdentifier, BestScore]
The best scores for each dataset/post-processing parameter/criterion combination.
- Returns:
- Dict[OutputIdentifier, BestScore]
the best scores for each dataset/post-processing parameter/criterion combination
- Raises:
AttributeError – if the best scores are not set
Examples
>>> evaluator = Evaluator()
>>> evaluator.best_scores
{}
Note
This function is used to return the best scores for each dataset/post-processing parameter/criterion combination.
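To make the shape of this mapping concrete, here is a small sketch. It assumes that OutputIdentifier is a (dataset, post-processing parameters, criterion) tuple and that BestScore is an optional (iteration, score) pair; this page lists those aliases without defining them, so treat both shapes as assumptions. Strings stand in for the Dataset and PostProcessorParameters objects, and all values are made up for illustration.

```python
from typing import Dict, Optional, Tuple

# Assumed shapes for the module-level aliases (not confirmed by this page).
Iteration = int
Score = float
BestScore = Optional[Tuple[Iteration, Score]]
# (dataset, post-processing parameters, criterion), with strings as stand-ins.
OutputIdentifier = Tuple[str, str, str]

best_scores: Dict[OutputIdentifier, BestScore] = {
    ("val_a", "threshold=0.5", "jaccard"): (4000, 0.81),
    ("val_a", "threshold=0.7", "jaccard"): (6000, 0.84),
    ("val_a", "threshold=0.5", "precision"): None,  # nothing recorded yet
}

# Look up the best iteration and score for one combination.
iteration, score = best_scores[("val_a", "threshold=0.7", "jaccard")]
print(iteration, score)
```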
- is_best(dataset: dacapo.experiments.datasplits.datasets.Dataset, parameter: dacapo.experiments.tasks.post_processors.PostProcessorParameters, criterion: str, score: dacapo.experiments.tasks.evaluators.evaluation_scores.EvaluationScores) bool
Check if the provided score is the best for this dataset/parameter/criterion combo.
- Parameters:
dataset – Dataset the dataset
parameter – PostProcessorParameters the post-processor parameters
criterion – str the criterion
score – EvaluationScores the evaluation scores
- Returns:
- bool
whether the provided score is the best for this dataset/parameter/criterion combo
- Raises:
NotImplementedError – if the function is not implemented
Examples
>>> evaluator = Evaluator()
>>> dataset = Dataset()
>>> parameter = PostProcessorParameters()
>>> criterion = "criterion"
>>> score = EvaluationScores()
>>> evaluator.is_best(dataset, parameter, criterion, score)
False
Note
This function is used to check if the provided score is the best for this dataset/parameter/criterion combo.
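One plausible way to implement such a check is sketched below. It is an assumption-laden illustration rather than the library's logic: is_best_sketch is a hypothetical helper, keys are string tuples instead of Dataset and PostProcessorParameters objects, a missing previous score is treated as "any candidate is best", and the comparison is strict.

```python
from typing import Dict, Optional, Tuple

BestScore = Optional[Tuple[int, float]]  # assumed (iteration, score) pair
Key = Tuple[str, str, str]               # assumed (dataset, parameters, criterion)


def is_best_sketch(
    best_scores: Dict[Key, BestScore],
    key: Key,
    candidate: float,
    higher_is_better: bool,
) -> bool:
    """Return True if `candidate` beats the score recorded under `key`."""
    previous = best_scores.get(key)
    if previous is None:
        # Assumption: with nothing recorded, any candidate counts as the best.
        return True
    _, previous_score = previous
    if higher_is_better:
        return candidate > previous_score
    return candidate < previous_score


recorded = {
    ("val_a", "threshold=0.5", "jaccard"): (4000, 0.81),
    ("val_a", "threshold=0.7", "jaccard"): None,
}
improved = is_best_sketch(recorded, ("val_a", "threshold=0.5", "jaccard"), 0.85, True)
worse = is_best_sketch(recorded, ("val_a", "threshold=0.5", "jaccard"), 0.80, True)
first = is_best_sketch(recorded, ("val_a", "threshold=0.7", "jaccard"), 0.10, True)
```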
- get_overall_best(dataset: dacapo.experiments.datasplits.datasets.Dataset, criterion: str)
Return the best score for the given dataset and criterion.
- Parameters:
dataset – Dataset the dataset
criterion – str the criterion
- Returns:
- Optional[float]
the best score for the given dataset and criterion
- Raises:
NotImplementedError – if the function is not implemented
Examples
>>> evaluator = Evaluator()
>>> dataset = Dataset()
>>> criterion = "criterion"
>>> evaluator.get_overall_best(dataset, criterion)
None
Note
This function is used to return the best score for the given dataset and criterion.
- get_overall_best_parameters(dataset: dacapo.experiments.datasplits.datasets.Dataset, criterion: str)
Return the best parameters for the given dataset and criterion.
- Parameters:
dataset – Dataset the dataset
criterion – str the criterion
- Returns:
- Optional[PostProcessorParameters]
the best parameters for the given dataset and criterion
- Raises:
NotImplementedError – if the function is not implemented
Examples
>>> evaluator = Evaluator()
>>> dataset = Dataset()
>>> criterion = "criterion"
>>> evaluator.get_overall_best_parameters(dataset, criterion)
None
Note
This function is used to return the best parameters for the given dataset and criterion.
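The two "overall best" lookups can be pictured as a scan over the recorded best scores that keeps only entries for the requested dataset and criterion. The sketch below is a hedged illustration under the same assumptions as before: overall_best is a hypothetical function, keys are (dataset, parameters, criterion) string tuples, values are optional (iteration, score) pairs, and the numbers are invented.

```python
from typing import Dict, Optional, Tuple

BestScore = Optional[Tuple[int, float]]  # assumed (iteration, score) pair
Key = Tuple[str, str, str]               # assumed (dataset, parameters, criterion)


def overall_best(
    best_scores: Dict[Key, BestScore],
    dataset: str,
    criterion: str,
    higher_is_better: bool = True,
) -> Tuple[Optional[float], Optional[str]]:
    """Return (best score, parameters that produced it) for one dataset/criterion."""
    best_score: Optional[float] = None
    best_parameters: Optional[str] = None
    for (ds, parameters, crit), entry in best_scores.items():
        if ds != dataset or crit != criterion or entry is None:
            continue  # different combination, or nothing recorded yet
        _, score = entry
        improved = best_score is None or (
            score > best_score if higher_is_better else score < best_score
        )
        if improved:
            best_score, best_parameters = score, parameters
    return best_score, best_parameters


recorded = {
    ("val_a", "threshold=0.5", "jaccard"): (4000, 0.81),
    ("val_a", "threshold=0.7", "jaccard"): (6000, 0.84),
    ("val_b", "threshold=0.5", "jaccard"): (2000, 0.90),
}
result = overall_best(recorded, "val_a", "jaccard")
missing = overall_best(recorded, "val_c", "jaccard")
```

When no entry matches, both elements are None, which mirrors the Optional return types documented for the two methods.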
- compare(score_1, score_2, criterion)
Compare two scores for the given criterion.
- Parameters:
score_1 – float the first score
score_2 – float the second score
criterion – str the criterion
- Returns:
- bool
whether the first score is better than the second score for the given criterion
- Raises:
NotImplementedError – if the function is not implemented
Examples
>>> evaluator = Evaluator()
>>> score_1 = 0.0
>>> score_2 = 0.0
>>> criterion = "criterion"
>>> evaluator.compare(score_1, score_2, criterion)
False
Note
This function is used to compare two scores for the given criterion.
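A comparison of this kind depends on the direction of the criterion. The sketch below assumes a strict comparison (ties are not "better") and uses a hypothetical HIGHER_IS_BETTER table in place of the evaluator's higher_is_better method; the criterion names are illustrative.

```python
# Hypothetical direction table; a real evaluator would answer this
# through its higher_is_better(criterion) method.
HIGHER_IS_BETTER = {"jaccard": True, "voi": False}


def compare_sketch(score_1: float, score_2: float, criterion: str) -> bool:
    """Return True when score_1 is strictly better than score_2."""
    if HIGHER_IS_BETTER[criterion]:
        return score_1 > score_2
    return score_1 < score_2


print(compare_sketch(0.8, 0.7, "jaccard"))  # higher wins for jaccard
print(compare_sketch(1.2, 1.9, "voi"))      # lower wins for voi
print(compare_sketch(0.0, 0.0, "jaccard"))  # equal scores are not "better"
```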
- set_best(validation_scores: dacapo.experiments.validation_scores.ValidationScores) None
Find the best iteration for each dataset/post_processing_parameter/criterion.
- Parameters:
validation_scores – ValidationScores the validation scores
- Raises:
NotImplementedError – if the function is not implemented
Examples
>>> evaluator = Evaluator()
>>> validation_scores = ValidationScores()
>>> evaluator.set_best(validation_scores)
None
Note
This function is used to find the best iteration for each dataset/post_processing_parameter/criterion. Typically, this function is called after the validation scores have been computed.
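As a rough picture of what "finding the best iteration" involves, the sketch below scans a per-combination history of (iteration, score) pairs and keeps the best one according to the criterion's direction. It is not the library's algorithm: best_iterations is a hypothetical function, a plain dict replaces ValidationScores, and the history values are made up.

```python
from typing import Dict, List, Optional, Tuple

Key = Tuple[str, str, str]  # assumed (dataset, parameters, criterion)


def best_iterations(
    history: Dict[Key, List[Tuple[int, float]]],
    higher_is_better: Dict[str, bool],
) -> Dict[Key, Optional[Tuple[int, float]]]:
    """Pick the best (iteration, score) pair for every combination."""
    best: Dict[Key, Optional[Tuple[int, float]]] = {}
    for key, entries in history.items():
        criterion = key[2]
        if not entries:
            best[key] = None  # no validation run recorded for this combination
            continue
        pick = max if higher_is_better[criterion] else min
        best[key] = pick(entries, key=lambda entry: entry[1])
    return best


history = {
    ("val_a", "threshold=0.5", "jaccard"): [(2000, 0.70), (4000, 0.81), (6000, 0.79)],
    ("val_a", "threshold=0.5", "voi"): [(2000, 1.9), (4000, 1.2), (6000, 1.4)],
    ("val_b", "threshold=0.5", "jaccard"): [],
}
best = best_iterations(history, {"jaccard": True, "voi": False})
```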
- property criteria: List[str]
- Abstractmethod:
A list of all criteria for which a model might be “best”. For example, your criteria might be “precision”, “recall”, and “jaccard”. It is unlikely that the best iteration/post-processing parameters will be the same for all three of these criteria.
- Returns:
- List[str]
the evaluation criteria
- Raises:
NotImplementedError – if the function is not implemented
Examples
>>> evaluator = Evaluator()
>>> evaluator.criteria
[]
Note
This function is used to return the evaluation criteria.
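The point that different criteria can favour different iterations is easy to see with a toy example. The numbers below are invented for illustration (they are not results from any model), and all three criteria are treated as higher-is-better.

```python
# Made-up validation scores per training iteration.
scores_by_iteration = {
    1000: {"precision": 0.95, "recall": 0.40, "jaccard": 0.39},
    2000: {"precision": 0.85, "recall": 0.80, "jaccard": 0.70},
    3000: {"precision": 0.60, "recall": 0.92, "jaccard": 0.57},
}
criteria = ["precision", "recall", "jaccard"]

# For each criterion, find the iteration with the highest score.
best_iteration = {
    criterion: max(
        scores_by_iteration,
        key=lambda iteration: scores_by_iteration[iteration][criterion],
    )
    for criterion in criteria
}
print(best_iteration)
```

Here each criterion selects a different iteration, which is why best scores are tracked per criterion rather than once per model.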
- higher_is_better(criterion: str) bool
Whether or not higher is better for this criterion.
- Parameters:
criterion – str the criterion
- Returns:
- bool
whether higher is better for the given criterion
- Raises:
NotImplementedError – if the function is not implemented
Examples
>>> evaluator = Evaluator()
>>> criterion = "criterion"
>>> evaluator.higher_is_better(criterion)
False
Note
This function is used to determine whether higher is better for the given criterion.
- bounds(criterion: str) Tuple[int | float | None, int | float | None]
The bounds for this criterion.
- Parameters:
criterion – str the criterion
- Returns:
- Tuple[Union[int, float, None], Union[int, float, None]]
the bounds for the given criterion
- Raises:
NotImplementedError – if the function is not implemented
Examples
>>> evaluator = Evaluator()
>>> criterion = "criterion"
>>> evaluator.bounds(criterion)
(0, 1)
Note
This function is used to return the bounds for the given criterion.
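The return type allows either side of the bounds to be None. The sketch below shows one way such a tuple could be consumed, assuming the order is (lower, upper) and that None means "unbounded on that side"; neither assumption is stated on this page, and within_bounds is a hypothetical helper.

```python
from typing import Optional, Tuple, Union

Number = Union[int, float]
Bounds = Tuple[Optional[Number], Optional[Number]]


def within_bounds(score: float, bounds: Bounds) -> bool:
    """Check a score against (lower, upper); None is treated as unbounded."""
    lower, upper = bounds
    if lower is not None and score < lower:
        return False
    if upper is not None and score > upper:
        return False
    return True


print(within_bounds(0.5, (0, 1)))      # inside a closed [0, 1] range
print(within_bounds(3.2, (0, None)))   # no upper limit
print(within_bounds(-0.1, (0, None)))  # below the lower limit
```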
- store_best(criterion: str) bool
Whether to store the best score for this criterion.
- Parameters:
criterion – str the criterion
- Returns:
- bool
whether to store the best score for the given criterion
- Raises:
NotImplementedError – if the function is not implemented
Examples
>>> evaluator = Evaluator()
>>> criterion = "criterion"
>>> evaluator.store_best(criterion)
False
Note
This function is used to return whether to store the best score for the given criterion.
- property score: dacapo.experiments.tasks.evaluators.evaluation_scores.EvaluationScores
- Abstractmethod:
The evaluation scores.
- Returns:
- EvaluationScores
the evaluation scores
- Raises:
NotImplementedError – if the function is not implemented
Examples
>>> evaluator = Evaluator()
>>> evaluator.score
EvaluationScores()
Note
This function is used to return the evaluation scores.