dacapo.experiments
Subpackages
Submodules
Classes
- Model
A trainable DaCapo model. Consists of an Architecture and a prediction head.
- RunConfig
A class to represent a configuration of a run that helps to structure all the tasks, architecture, training, and datasplit configurations.
- TrainingIterationStats
A class to represent the training iteration statistics. It contains the loss and time taken for each iteration.
- TrainingStats
A class used to represent Training Statistics. It contains a list of training iteration statistics.
- ValidationIterationScores
A class used to represent the validation iteration scores in an organized structure.
- ValidationScores
Class representing the validation scores for a set of parameters and datasets.
Package Contents
- class dacapo.experiments.Model(architecture: dacapo.experiments.architectures.architecture.Architecture, prediction_head: torch.nn.Module, eval_activation: torch.nn.Module | None = None)
A trainable DaCapo model. Consists of an Architecture and a prediction head. Models are generated by Predictors.
May include an optional eval_activation that is only executed when the model is in eval mode. This is particularly useful if you want to train with something like BCEWithLogitsLoss, since the final activation (for example a sigmoid or softmax) should be skipped while training but applied during evaluation.
- architecture
The architecture of the model.
- Type:
Architecture
- prediction_head
The prediction head of the model.
- Type:
torch.nn.Module
- chain
The architecture followed by the prediction head.
- Type:
torch.nn.Sequential
- num_in_channels
The number of input channels.
- Type:
int
- input_shape
The shape of the input tensor.
- Type:
Coordinate
- eval_input_shape
The shape of the input tensor during evaluation.
- Type:
Coordinate
- num_out_channels
The number of output channels.
- Type:
int
- output_shape
The shape of the output tensor.
- Type:
Coordinate
- eval_activation
The activation function to apply during evaluation.
- Type:
torch.nn.Module | None
- forward(x: torch.Tensor) -> torch.Tensor
Forward pass of the model.
- compute_output_shape(input_shape: Coordinate) -> Tuple[int, Coordinate]
Compute the spatial shape of this model, when fed a tensor of the given spatial shape as input.
- scale(voxel_size: Coordinate) -> Coordinate
Scale the model by the given voxel size.
Note
The output shape is the spatial shape of the model, i.e., not accounting for channels and batch dimensions.
- num_out_channels: int
- num_in_channels: int
- architecture
- prediction_head
- chain
- input_shape
- eval_input_shape
- eval_activation
- forward(x)
Forward pass of the model.
- Parameters:
x (torch.Tensor) – The input tensor.
- Returns:
The output tensor.
- Return type:
torch.Tensor
Examples
>>> model = Model(architecture, prediction_head)
>>> model.forward(x)
torch.Tensor
Note
The eval_activation is only applied during evaluation. This is particularly useful if you want to train with something like BCEWithLogitsLoss, since the final activation (for example a sigmoid or softmax) should be skipped while training but applied during evaluation.
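The eval-only activation can be pictured with a small stand-in. This is a minimal sketch in plain Python, not DaCapo's implementation: `TinyModel`, its `training` flag (meant to mirror `torch.nn.Module.training`), and the `sigmoid` helper are hypothetical names introduced only for illustration.

```python
import math
from typing import Callable, List, Optional

Vector = List[float]


def sigmoid(values: Vector) -> Vector:
    """Element-wise logistic function, standing in for an eval activation."""
    return [1.0 / (1.0 + math.exp(-v)) for v in values]


class TinyModel:
    """Hypothetical stand-in: a chain plus an activation used only in eval mode."""

    def __init__(
        self,
        chain: Callable[[Vector], Vector],
        eval_activation: Optional[Callable[[Vector], Vector]] = None,
    ) -> None:
        self.chain = chain
        self.eval_activation = eval_activation
        self.training = True  # assumption: starts in training mode

    def eval(self) -> "TinyModel":
        """Switch to evaluation mode and return self for chaining."""
        self.training = False
        return self

    def forward(self, x: Vector) -> Vector:
        result = self.chain(x)
        # The activation is skipped while training, so a logits-based loss
        # sees raw outputs; it is applied only in evaluation mode.
        if not self.training and self.eval_activation is not None:
            result = self.eval_activation(result)
        return result


model = TinyModel(chain=lambda xs: [2.0 * x for x in xs], eval_activation=sigmoid)
train_out = model.forward([0.0, 1.0])        # raw logits
eval_out = model.eval().forward([0.0, 1.0])  # values squashed into (0, 1)
```

In training mode the caller receives the raw logits, which is what a logits-based loss expects; after `eval()` the same input yields activated values.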
- compute_output_shape(input_shape: funlib.geometry.Coordinate) -> Tuple[int, funlib.geometry.Coordinate]
Compute the spatial shape (i.e., not accounting for channels and batch dimensions) of this model, when fed a tensor of the given spatial shape as input.
- Parameters:
input_shape (Coordinate) – The shape of the input tensor.
- Returns:
The number of output channels and the spatial shape of the output.
- Return type:
Tuple[int, Coordinate]
- Raises:
AssertionError – If the input_shape is not a Coordinate.
Examples
>>> model = Model(architecture, prediction_head)
>>> model.compute_output_shape(input_shape)
(1, Coordinate(1, 1, 1))
Note
The output shape is the spatial shape of the model, i.e., not accounting for channels and batch dimensions.
- scale(voxel_size: funlib.geometry.Coordinate) -> funlib.geometry.Coordinate
Scale the model by the given voxel size.
- Parameters:
voxel_size (Coordinate) – The voxel size to scale the model by.
- Returns:
The voxel size after scaling.
- Return type:
Coordinate
- Raises:
AssertionError – If the voxel_size is not a Coordinate.
Examples
>>> model = Model(architecture, prediction_head)
>>> model.scale(voxel_size)
Coordinate(1, 1, 1)
- class dacapo.experiments.RunConfig
A class to represent a configuration of a run that helps to structure all the tasks, architecture, training, and datasplit configurations.
…
Attributes:
- task_config: TaskConfig
A config defining the Task to run that includes deciding the output of the model and different methods to achieve the goal.
- architecture_config: ArchitectureConfig
A config that defines the backbone architecture of the model. It impacts the model’s performance significantly.
- trainer_config: TrainerConfig
Defines how batches are generated and passed for training the model along with defining configurations like batch size, learning rate, number of cpu workers and snapshot logging.
- datasplit_config: DataSplitConfig
Configures the data available for the model during training or validation phases.
- name: str
A unique name for this run to distinguish it.
- repetition: int
The repetition number of this run.
- num_iterations: int
The total number of iterations to train for during this run.
- validation_interval: int
Specifies how often to perform validation during the run. It defaults to 1000.
- start_config: Optional[StartConfig]
A starting point for continued training. It is optional and can be left out.
- task_config: dacapo.experiments.tasks.TaskConfig
- architecture_config: dacapo.experiments.architectures.ArchitectureConfig
- trainer_config: dacapo.experiments.trainers.TrainerConfig
- datasplit_config: dacapo.experiments.datasplits.DataSplitConfig
- name: str
- repetition: int
- num_iterations: int
- validation_interval: int
- start_config: dacapo.experiments.starts.StartConfig | None
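To make the shape of these fields concrete, here is a minimal sketch using a plain dataclass. `RunConfigSketch` is a hypothetical stand-in, not the real `RunConfig`; the nested configs are replaced by placeholder strings, and the rule that validation happens at every multiple of `validation_interval` is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class RunConfigSketch:
    """Hypothetical stand-in mirroring the documented RunConfig fields."""

    task_config: Any
    architecture_config: Any
    trainer_config: Any
    datasplit_config: Any
    name: str
    repetition: int
    num_iterations: int
    validation_interval: int = 1000    # documented default
    start_config: Optional[Any] = None  # optional starting point


config = RunConfigSketch(
    task_config="task",              # placeholders for the real config objects
    architecture_config="architecture",
    trainer_config="trainer",
    datasplit_config="datasplit",
    name="example_run",
    repetition=0,
    num_iterations=5000,
)

# Assumption: validation runs whenever the iteration count reaches a
# multiple of validation_interval.
validation_points: List[int] = list(
    range(config.validation_interval, config.num_iterations + 1, config.validation_interval)
)
```

Under that assumption, a 5000-iteration run with the default interval would be validated five times.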
- class dacapo.experiments.TrainingIterationStats
A class to represent the training iteration statistics. It contains the loss and time taken for each iteration.
- iteration
The iteration that produced these stats.
- Type:
int
- loss
The loss value of this iteration.
- Type:
float
- time
The time it took to process this iteration.
- Type:
float
Note
Each TrainingIterationStats object holds the stats for a single training iteration; TrainingStats collects these objects in an ordered list.
- iteration: int
- loss: float
- time: float
- class dacapo.experiments.TrainingStats
A class used to represent Training Statistics. It contains a list of training iteration statistics. It also provides methods to add new iteration stats, delete stats after a specified iteration, get the number of iterations trained for, and convert the stats to an xarray DataArray.
- iteration_stats
An ordered list of training stats.
- Type:
List[TrainingIterationStats]
- add_iteration_stats(iteration_stats: TrainingIterationStats) -> None
Add a new set of iteration stats to the existing list of iteration stats.
- delete_after(iteration: int) -> None
Deletes training stats after a specified iteration number.
- trained_until() -> int
Gets the number of iterations that the model has been trained for.
- to_xarray() -> xr.DataArray
Converts the iteration statistics to an xarray DataArray.
Note
The iteration stats list is flat and ordered: it holds one TrainingIterationStats entry per training iteration.
- iteration_stats: List[dacapo.experiments.training_iteration_stats.TrainingIterationStats]
- add_iteration_stats(iteration_stats: dacapo.experiments.training_iteration_stats.TrainingIterationStats) -> None
Append a new iteration stats object to the current list of iteration stats.
- Parameters:
iteration_stats (TrainingIterationStats) – a new iteration stats object.
- Raises:
AssertionError – If the new iteration stats do not follow the order of existing iteration stats.
Examples
>>> training_stats = TrainingStats()
>>> training_stats.add_iteration_stats(TrainingIterationStats(0, 0.1))
>>> training_stats.add_iteration_stats(TrainingIterationStats(1, 0.2))
>>> training_stats.add_iteration_stats(TrainingIterationStats(2, 0.3))
>>> training_stats.iteration_stats
[TrainingIterationStats(iteration=0, loss=0.1), TrainingIterationStats(iteration=1, loss=0.2), TrainingIterationStats(iteration=2, loss=0.3)]
- delete_after(iteration: int) -> None
Deletes training stats from the specified iteration onward; as the example shows, the stats recorded at that iteration are removed as well.
- Parameters:
iteration (int) – the first iteration whose stats are deleted.
- Raises:
AssertionError – If the iteration number is less than the maximum iteration number.
Examples
>>> training_stats = TrainingStats()
>>> training_stats.add_iteration_stats(TrainingIterationStats(0, 0.1))
>>> training_stats.add_iteration_stats(TrainingIterationStats(1, 0.2))
>>> training_stats.add_iteration_stats(TrainingIterationStats(2, 0.3))
>>> training_stats.delete_after(1)
>>> training_stats.iteration_stats
[TrainingIterationStats(iteration=0, loss=0.1)]
- trained_until() -> int
The number of iterations trained for (the maximum iteration plus one). Returns zero if no iterations trained yet.
- Returns:
number of iterations that the model has been trained for.
- Return type:
int
- Raises:
AssertionError – If the iteration stats list is empty.
Examples
>>> training_stats = TrainingStats()
>>> training_stats.add_iteration_stats(TrainingIterationStats(0, 0.1))
>>> training_stats.add_iteration_stats(TrainingIterationStats(1, 0.2))
>>> training_stats.add_iteration_stats(TrainingIterationStats(2, 0.3))
>>> training_stats.trained_until()
3
- to_xarray() -> xarray.DataArray
Converts the iteration stats to an xarray DataArray, which is easier to manipulate.
- Returns:
xarray DataArray of iteration losses.
- Return type:
xr.DataArray
- Raises:
AssertionError – If the iteration stats list is empty.
Examples
>>> training_stats = TrainingStats()
>>> training_stats.add_iteration_stats(TrainingIterationStats(0, 0.1))
>>> training_stats.add_iteration_stats(TrainingIterationStats(1, 0.2))
>>> training_stats.add_iteration_stats(TrainingIterationStats(2, 0.3))
>>> training_stats.to_xarray()
<xarray.DataArray (iterations: 3)>
array([0.1, 0.2, 0.3])
Coordinates:
  * iterations  (iterations) int64 0 1 2
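The bookkeeping described for TrainingStats can be sketched with plain dataclasses. This is an illustrative stand-in rather than DaCapo's code: `IterationStatsSketch` and `TrainingStatsSketch` are hypothetical names, the ordering check is assumed to mean "strictly increasing iteration numbers", and `delete_after` follows the documented example in which `delete_after(1)` leaves only iteration 0.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IterationStatsSketch:
    """Hypothetical stand-in for TrainingIterationStats."""

    iteration: int
    loss: float
    time: float


@dataclass
class TrainingStatsSketch:
    """Hypothetical stand-in for TrainingStats."""

    iteration_stats: List[IterationStatsSketch] = field(default_factory=list)

    def add_iteration_stats(self, stats: IterationStatsSketch) -> None:
        # Assumption: "following the order" means strictly increasing iterations.
        if self.iteration_stats:
            assert stats.iteration > self.iteration_stats[-1].iteration
        self.iteration_stats.append(stats)

    def delete_after(self, iteration: int) -> None:
        # Keep only entries strictly before `iteration`.
        self.iteration_stats = [
            s for s in self.iteration_stats if s.iteration < iteration
        ]

    def trained_until(self) -> int:
        # Maximum iteration plus one, or zero when nothing was trained yet.
        if not self.iteration_stats:
            return 0
        return self.iteration_stats[-1].iteration + 1


stats = TrainingStatsSketch()
for i, loss in enumerate([0.1, 0.2, 0.3]):
    stats.add_iteration_stats(IterationStatsSketch(iteration=i, loss=loss, time=0.5))

trained_before = stats.trained_until()
stats.delete_after(1)
remaining = [s.iteration for s in stats.iteration_stats]
trained_after = stats.trained_until()
```

Rolling back with `delete_after` and then reading `trained_until` gives the iteration at which training would resume in this sketch.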
- class dacapo.experiments.ValidationIterationScores
A class used to represent the validation iteration scores in an organized structure.
- iteration
The iteration associated with these validation scores.
- Type:
int
- scores
A list of scores per dataset, post processor parameters, and evaluation criterion.
- Type:
List[List[List[float]]]
Note
The scores list is structured as follows:
- The outer list contains the scores for each dataset.
- The middle list contains the scores for each post processor parameter.
- The inner list contains the scores for each evaluation criterion.
- iteration: int
- scores: List[List[List[float]]]
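The three nesting levels of `scores` are easiest to see with a small lookup. The sketch below is illustrative only: `IterationScoresSketch`, the dataset, parameter, and criterion names, and all the numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IterationScoresSketch:
    """Hypothetical stand-in for ValidationIterationScores."""

    iteration: int
    scores: List[List[List[float]]]  # [dataset][parameters][criterion]


# Hypothetical labels for each nesting level.
datasets = ["val_a", "val_b"]
parameters = ["threshold=0.3", "threshold=0.5"]
criteria = ["f1", "iou"]

entry = IterationScoresSketch(
    iteration=1000,
    scores=[
        [[0.70, 0.55], [0.74, 0.60]],  # val_a: one inner list per parameter set
        [[0.65, 0.50], [0.68, 0.52]],  # val_b
    ],
)


def lookup(dataset: str, parameter: str, criterion: str) -> float:
    """Index outer -> middle -> inner: dataset, parameters, criterion."""
    return entry.scores[datasets.index(dataset)][parameters.index(parameter)][
        criteria.index(criterion)
    ]


value = lookup("val_b", "threshold=0.5", "iou")
```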
- class dacapo.experiments.ValidationScores
Class representing the validation scores for a set of parameters and datasets.
- parameters
The list of parameters that are being evaluated.
- Type:
List[PostProcessorParameters]
- evaluation_scores
The scores that are collected on each iteration per PostProcessorParameters and Dataset.
- Type:
EvaluationScores
- scores
A list of evaluation scores and their associated post-processing parameters.
- Type:
List[ValidationIterationScores]
- subscores(iteration_scores)
Create a new ValidationScores object with a subset of the iteration scores.
- add_iteration_scores(iteration_scores)
Add iteration scores to the list of scores.
- delete_after(iteration)
Delete scores after a specified iteration.
- validated_until()
Get the number of iterations validated for (the maximum iteration plus one).
- compare(existing_iteration_scores)
Compare iteration stats provided from elsewhere to scores we have saved locally.
- criteria
Get the list of evaluation criteria (a property).
- parameter_names
Get the list of parameter names (a property).
- to_xarray()
Convert the validation scores to an xarray DataArray.
- get_best(data, dim)
Compute the best scores along dimension “dim” per criterion.
Notes
The scores attribute is a list of ValidationIterationScores objects, each of which contains the scores for a single iteration.
- parameters: List[dacapo.experiments.tasks.post_processors.PostProcessorParameters]
- datasets: List[dacapo.experiments.datasplits.datasets.Dataset]
- evaluation_scores: dacapo.experiments.tasks.evaluators.EvaluationScores
- subscores(iteration_scores: List[dacapo.experiments.validation_iteration_scores.ValidationIterationScores]) -> ValidationScores
Create a new ValidationScores object with a subset of the iteration scores.
- Parameters:
iteration_scores – The iteration scores to include in the new ValidationScores object.
- Returns:
A new ValidationScores object with the specified iteration scores.
- Raises:
ValueError – If the iteration scores are not in the list of scores.
Examples
>>> validation_scores.subscores([validation_scores.scores[0]])
Note
Useful for building a ValidationScores object that only contains the scores up to a certain iteration.
- add_iteration_scores(iteration_scores: dacapo.experiments.validation_iteration_scores.ValidationIterationScores) -> None
Add iteration scores to the list of scores.
- Parameters:
iteration_scores – The iteration scores to add.
- Raises:
ValueError – If the iteration scores are already in the list of scores.
Examples
>>> validation_scores.add_iteration_scores(validation_scores.scores[0])
Note
Use this to record the scores for a new iteration in the ValidationScores object.
- delete_after(iteration: int) -> None
Delete scores after a specified iteration.
- Parameters:
iteration – The iteration after which to delete the scores.
- Raises:
ValueError – If the iteration scores are not in the list of scores.
Examples
>>> validation_scores.delete_after(0)
Note
Scores recorded after the given iteration are discarded; earlier scores are kept.
- validated_until() -> int
Get the number of iterations validated for (the maximum iteration plus one).
- Returns:
The number of iterations validated for.
- Raises:
ValueError – If there are no scores.
Examples
>>> validation_scores.validated_until()
Note
Useful for checking how many iterations have been validated so far.
- compare(existing_iteration_scores: List[dacapo.experiments.validation_iteration_scores.ValidationIterationScores]) -> Tuple[bool, int]
Compare iteration stats provided from elsewhere to scores we have saved locally. Local scores take priority. If local scores are at a lower iteration than the existing ones, delete the existing ones and replace with local. If local iteration > existing iteration, just update existing scores with the last overhanging local scores.
- Parameters:
existing_iteration_scores – The existing iteration scores to compare with.
- Returns:
A tuple indicating whether the local scores should replace the existing ones and the existing iteration number.
- Raises:
ValueError – If the iteration scores are not in the list of scores.
Examples
>>> validation_scores.compare([validation_scores.scores[0]])
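The reconciliation rule for compare can be sketched on bare iteration numbers. This is a hedged illustration, not DaCapo's implementation: `compare_sketch`, `scores_to_write`, and the convention that the returned integer is the iteration from which local scores still need to be written are assumptions, as is returning 0 from `validated_until_sketch` when there are no scores.

```python
from typing import List, Tuple


def validated_until_sketch(iterations: List[int]) -> int:
    """Maximum iteration plus one; assumed to be 0 when nothing is validated."""
    return max(iterations) + 1 if iterations else 0


def compare_sketch(local: List[int], existing: List[int]) -> Tuple[bool, int]:
    """Return (replace_existing, start_iteration) with local scores taking priority."""
    local_until = validated_until_sketch(local)
    existing_until = validated_until_sketch(existing)
    if existing_until > local_until:
        # Existing scores run past the local ones: discard them and
        # rewrite everything from the local copy.
        return True, 0
    # Otherwise only the overhanging local scores need to be appended.
    return False, existing_until


def scores_to_write(local: List[int], existing: List[int]) -> Tuple[bool, List[int]]:
    """Select which local iterations would be written to the existing store."""
    replace, start = compare_sketch(local, existing)
    return replace, [i for i in local if i >= start]


ahead = scores_to_write(local=[1000, 2000, 3000], existing=[1000])
behind = scores_to_write(local=[1000], existing=[1000, 2000])
```

When the local scores are ahead, only the overhang is written; when the existing store is ahead, it is replaced wholesale by the local scores.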
- property criteria: List[str]
Get the list of evaluation criteria.
- Returns:
The list of evaluation criteria.
- Raises:
ValueError – If there are no scores.
Examples
>>> validation_scores.criteria
Note
Useful for checking which criteria are being used to evaluate the scores.
- property parameter_names: List[str]
Get the list of parameter names.
- Returns:
The list of parameter names.
- Raises:
ValueError – If there are no scores.
Examples
>>> validation_scores.parameter_names
Note
Useful for checking which parameters are being used to evaluate the scores.
- to_xarray() -> xarray.DataArray
Convert the validation scores to an xarray DataArray.
- Returns:
An xarray DataArray representing the validation scores.
- Raises:
ValueError – If there are no scores.
Examples
>>> validation_scores.to_xarray()
Note
Useful when you want to work with the validation scores as an xarray DataArray.
- get_best(data: xarray.DataArray, dim: str) -> Tuple[xarray.DataArray, xarray.DataArray]
Compute the best scores along dimension “dim” per criterion. Returns both the index associated with the best value, and the best value in two separate arrays.
- Parameters:
data – The data array to compute the best scores from.
dim – The dimension along which to compute the best scores.
- Returns:
A tuple containing the index associated with the best value and the best value in two separate arrays.
- Raises:
ValueError – If the criteria are not in the data array.
Examples
>>> validation_scores.get_best(data, "iterations")
Note
The method returns both the index associated with the best value and the best value itself, in two separate arrays. Known limitation: the method does not currently handle the case where the criteria are not in the data array; a check that raises an error in that case still needs to be added.
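The idea of picking a best index and a best value per criterion can be sketched without xarray. The code below is an assumption-laden illustration: `get_best_sketch`, the `higher_is_better` mapping, and the criterion names are hypothetical, and plain lists stand in for the scores along the chosen dimension.

```python
from typing import Dict, List, Tuple


def get_best_sketch(
    scores_by_criterion: Dict[str, List[float]],
    higher_is_better: Dict[str, bool],
) -> Tuple[Dict[str, int], Dict[str, float]]:
    """Return (best_indexes, best_values), one entry per criterion."""
    best_indexes: Dict[str, int] = {}
    best_values: Dict[str, float] = {}
    for criterion, values in scores_by_criterion.items():
        if criterion not in higher_is_better:
            # The kind of explicit check the note above asks for.
            raise ValueError(f"Unknown criterion: {criterion}")
        pick = max if higher_is_better[criterion] else min
        # Choose the position whose value is best for this criterion.
        index = pick(range(len(values)), key=values.__getitem__)
        best_indexes[criterion] = index
        best_values[criterion] = values[index]
    return best_indexes, best_values


# Hypothetical scores along an "iterations" dimension with three entries.
scores = {"f1": [0.60, 0.74, 0.71], "voi": [1.9, 1.2, 1.4]}
indexes, values = get_best_sketch(scores, {"f1": True, "voi": False})
```

Here a higher `f1` and a lower `voi` both count as better, so the two criteria can be compared along the same dimension even though their directions differ.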