dacapo.experiments
==================

.. py:module:: dacapo.experiments


Subpackages
-----------

.. toctree::
   :maxdepth: 1

   /autoapi/dacapo/experiments/architectures/index
   /autoapi/dacapo/experiments/arraytypes/index
   /autoapi/dacapo/experiments/datasplits/index
   /autoapi/dacapo/experiments/starts/index
   /autoapi/dacapo/experiments/tasks/index
   /autoapi/dacapo/experiments/trainers/index


Submodules
----------

.. toctree::
   :maxdepth: 1

   /autoapi/dacapo/experiments/model/index
   /autoapi/dacapo/experiments/run/index
   /autoapi/dacapo/experiments/run_config/index
   /autoapi/dacapo/experiments/training_iteration_stats/index
   /autoapi/dacapo/experiments/training_stats/index
   /autoapi/dacapo/experiments/validation_iteration_scores/index
   /autoapi/dacapo/experiments/validation_scores/index


Classes
-------

.. autoapisummary::

   dacapo.experiments.Model
   dacapo.experiments.RunConfig
   dacapo.experiments.TrainingIterationStats
   dacapo.experiments.TrainingStats
   dacapo.experiments.ValidationIterationScores
   dacapo.experiments.ValidationScores


Package Contents
----------------

.. py:class:: Model(architecture: dacapo.experiments.architectures.architecture.Architecture, prediction_head: torch.nn.Module, eval_activation: torch.nn.Module | None = None)

   A trainable DaCapo model. Consists of an ``Architecture`` and a
   prediction head. Models are generated by ``Predictor`` objects.

   May include an optional ``eval_activation`` that is only executed when the
   model is in eval mode. This is particularly useful if you want to train
   with a loss that expects raw logits, such as ``BCEWithLogitsLoss``: the
   final activation (for example a sigmoid) is skipped during training but
   applied during evaluation.

   .. attribute:: architecture

      The architecture of the model.

      :type: Architecture

   .. attribute:: prediction_head

      The prediction head of the model.

      :type: torch.nn.Module

   .. attribute:: chain

      The architecture followed by the prediction head.

      :type: torch.nn.Sequential

   .. attribute:: num_in_channels

      The number of input channels.

      :type: int

   .. attribute:: input_shape

      The shape of the input tensor.

      :type: Coordinate

   .. attribute:: eval_input_shape

      The shape of the input tensor during evaluation.

      :type: Coordinate

   .. attribute:: num_out_channels

      The number of output channels.

      :type: int

   .. attribute:: output_shape

      The shape of the output tensor.

      :type: Coordinate

   .. attribute:: eval_activation

      The activation function to apply during evaluation.

      :type: torch.nn.Module | None

   .. method:: forward(x: torch.Tensor) -> torch.Tensor

      Forward pass of the model.

   .. method:: compute_output_shape(input_shape: Coordinate) -> Tuple[int, Coordinate]

      Compute the spatial output shape of this model when fed a tensor of the
      given spatial shape as input.

   .. method:: scale(voxel_size: Coordinate) -> Coordinate

      Scale the model by the given voxel size.

   .. note::

      The output shape is the spatial shape of the model, i.e., not
      accounting for channels and batch dimensions.

   .. py:attribute:: num_out_channels
      :type: int

   .. py:attribute:: num_in_channels
      :type: int

   .. py:attribute:: architecture

   .. py:attribute:: prediction_head

   .. py:attribute:: chain

   .. py:attribute:: input_shape

   .. py:attribute:: eval_input_shape

   .. py:attribute:: eval_activation

   .. py:method:: forward(x)

      Forward pass of the model.

      :param x: The input tensor.
      :type x: torch.Tensor

      :returns: The output tensor.
      :rtype: torch.Tensor

      .. rubric:: Examples

      >>> model = Model(architecture, prediction_head)
      >>> model.forward(x)
      torch.Tensor

      .. note::

         The ``eval_activation`` is only applied during evaluation. This is
         particularly useful if you want to train with a loss that expects
         raw logits, such as ``BCEWithLogitsLoss``: the final activation
         (for example a sigmoid) is skipped during training but applied
         during evaluation.

   .. py:method:: compute_output_shape(input_shape: funlib.geometry.Coordinate) -> Tuple[int, funlib.geometry.Coordinate]

      Compute the spatial shape (i.e., not accounting for channels and batch
      dimensions) of this model, when fed a tensor of the given spatial shape as input.

      :param input_shape: The shape of the input tensor.
      :type input_shape: Coordinate

      :returns: The number of output channels and the spatial shape of the output.
      :rtype: Tuple[int, Coordinate]

      :raises AssertionError: If the input_shape is not a Coordinate.

      .. rubric:: Examples

      >>> model = Model(architecture, prediction_head)
      >>> model.compute_output_shape(input_shape)
      (1, Coordinate(1, 1, 1))

      .. note::

         The output shape is the spatial shape of the model, i.e., not
         accounting for channels and batch dimensions.

   .. py:method:: scale(voxel_size: funlib.geometry.Coordinate) -> funlib.geometry.Coordinate

      Scale the model by the given voxel size.

      :param voxel_size: The voxel size to scale the model by.
      :type voxel_size: Coordinate

      :returns: The scaled voxel size.
      :rtype: Coordinate

      :raises AssertionError: If the voxel_size is not a Coordinate.

      .. rubric:: Examples

      >>> model = Model(architecture, prediction_head)
      >>> model.scale(voxel_size)
      Coordinate(1, 1, 1)


.. py:class:: RunConfig

   The configuration of a run. It bundles the task, architecture, trainer,
   and datasplit configurations.

   .. attribute:: task_config

      A config defining the Task to run, including the output of the model
      and the different methods used to achieve the goal.

      :type: TaskConfig

   .. attribute:: architecture_config

      A config that defines the backbone architecture of the model. The
      choice of architecture significantly affects the model's performance.

      :type: ArchitectureConfig

   .. attribute:: trainer_config

      Defines how batches are generated and passed to the model for
      training, along with settings such as batch size, learning rate,
      number of CPU workers, and snapshot logging.

      :type: TrainerConfig

   .. attribute:: datasplit_config

      Configures the data available to the model during the training and
      validation phases.

      :type: DataSplitConfig

   .. attribute:: name

      A unique name for this run to distinguish it.

      :type: str
   .. attribute:: repetition

      The repetition number of this run.

      :type: int

   .. attribute:: num_iterations

      The total number of iterations to train for during this run.

      :type: int

   .. attribute:: validation_interval

      How often to perform validation during the run. Defaults to 1000.

      :type: int

   .. attribute:: start_config

      A starting point for continued training. It is optional and can be left out.

      :type: Optional[StartConfig]

   .. py:attribute:: task_config
      :type: dacapo.experiments.tasks.TaskConfig

   .. py:attribute:: architecture_config
      :type: dacapo.experiments.architectures.ArchitectureConfig

   .. py:attribute:: trainer_config
      :type: dacapo.experiments.trainers.TrainerConfig

   .. py:attribute:: datasplit_config
      :type: dacapo.experiments.datasplits.DataSplitConfig

   .. py:attribute:: name
      :type: str

   .. py:attribute:: repetition
      :type: int

   .. py:attribute:: num_iterations
      :type: int

   .. py:attribute:: validation_interval
      :type: int

   .. py:attribute:: start_config
      :type: Optional[dacapo.experiments.starts.StartConfig]


.. py:class:: TrainingIterationStats

   A class to represent the statistics of a single training iteration. It
   contains the loss and the time taken for that iteration.

   .. attribute:: iteration

      The iteration that produced these stats.

      :type: int

   .. attribute:: loss

      The loss value of this iteration.

      :type: float

   .. attribute:: time

      The time it took to process this iteration.

      :type: float

   .. note::

      Instances are collected, in iteration order, in
      ``TrainingStats.iteration_stats``.

   .. py:attribute:: iteration
      :type: int

   .. py:attribute:: loss
      :type: float

   .. py:attribute:: time
      :type: float


.. py:class:: TrainingStats

   A class used to represent training statistics. It contains a list of
   training iteration statistics. It also provides methods to add new
   iteration stats, delete stats after a specified iteration, get the number
   of iterations trained for, and convert the stats to an xarray data array.

   .. attribute:: iteration_stats

      An ordered list of training stats.

      :type: List[TrainingIterationStats]

   .. method:: add_iteration_stats(iteration_stats: TrainingIterationStats) -> None

      Add a new set of iteration stats to the existing list of iteration stats.

   .. method:: delete_after(iteration: int) -> None

      Deletes training stats after a specified iteration number.

   .. method:: trained_until() -> int

      Gets the number of iterations that the model has been trained for.

   .. method:: to_xarray() -> xr.DataArray

      Converts the iteration statistics to an xarray data array.

   .. note::

      ``iteration_stats`` is a flat list ordered by iteration. Each entry
      holds the stats of one training iteration.

   .. py:attribute:: iteration_stats
      :type: List[dacapo.experiments.training_iteration_stats.TrainingIterationStats]

   .. py:method:: add_iteration_stats(iteration_stats: dacapo.experiments.training_iteration_stats.TrainingIterationStats) -> None

      Add a new iteration stats object to the current iteration stats.

      :param iteration_stats: a new iteration stats object.
      :type iteration_stats: TrainingIterationStats

      :raises AssertionError: if the new iteration stats do not follow the order of existing iteration stats.

      .. rubric:: Examples

      >>> training_stats = TrainingStats()
      >>> training_stats.add_iteration_stats(TrainingIterationStats(0, 0.1, 1.0))
      >>> training_stats.add_iteration_stats(TrainingIterationStats(1, 0.2, 1.0))
      >>> training_stats.add_iteration_stats(TrainingIterationStats(2, 0.3, 1.0))
      >>> training_stats.iteration_stats
      [TrainingIterationStats(iteration=0, loss=0.1, time=1.0), TrainingIterationStats(iteration=1, loss=0.2, time=1.0), TrainingIterationStats(iteration=2, loss=0.3, time=1.0)]

   .. py:method:: delete_after(iteration: int) -> None

      Deletes training stats after a specified iteration.

      :param iteration: the iteration from which stats are deleted. As the example below shows, stats recorded at this iteration and later are removed.
      :type iteration: int

      :raises AssertionError: if the iteration number is less than the maximum iteration number.

      .. rubric:: Examples

      >>> training_stats = TrainingStats()
      >>> training_stats.add_iteration_stats(TrainingIterationStats(0, 0.1, 1.0))
      >>> training_stats.add_iteration_stats(TrainingIterationStats(1, 0.2, 1.0))
      >>> training_stats.add_iteration_stats(TrainingIterationStats(2, 0.3, 1.0))
      >>> training_stats.delete_after(1)
      >>> training_stats.iteration_stats
      [TrainingIterationStats(iteration=0, loss=0.1, time=1.0)]

   .. py:method:: trained_until() -> int

      The number of iterations trained for (the maximum iteration plus
      one). Returns zero if no iterations have been trained yet.

      :returns: number of iterations that the model has been trained for.
      :rtype: int

      .. rubric:: Examples

      >>> training_stats = TrainingStats()
      >>> training_stats.add_iteration_stats(TrainingIterationStats(0, 0.1, 1.0))
      >>> training_stats.add_iteration_stats(TrainingIterationStats(1, 0.2, 1.0))
      >>> training_stats.add_iteration_stats(TrainingIterationStats(2, 0.3, 1.0))
      >>> training_stats.trained_until()
      3

   .. py:method:: to_xarray() -> xarray.DataArray

      Converts the iteration stats to an xarray data array, which is easy to manipulate.

      :returns: xarray DataArray of iteration losses.
      :rtype: xr.DataArray

      :raises AssertionError: if the iteration stats list is empty.

      .. rubric:: Examples

      >>> training_stats = TrainingStats()
      >>> training_stats.add_iteration_stats(TrainingIterationStats(0, 0.1, 1.0))
      >>> training_stats.add_iteration_stats(TrainingIterationStats(1, 0.2, 1.0))
      >>> training_stats.add_iteration_stats(TrainingIterationStats(2, 0.3, 1.0))
      >>> training_stats.to_xarray()
      array([0.1, 0.2, 0.3])
      Coordinates:
        * iterations  (iterations) int64 0 1 2


.. py:class:: ValidationIterationScores

   A class used to represent the validation iteration scores in an organized structure.

   .. attribute:: iteration

      The iteration associated with these validation scores.

      :type: int

   .. attribute:: scores

      A list of scores per dataset, post processor parameters, and evaluation criterion.

      :type: List[List[List[float]]]

   .. note::

      The scores list is structured as follows:

      - The outer list contains the scores for each dataset.
      - The middle list contains the scores for each post processor parameter.
      - The inner list contains the scores for each evaluation criterion.

   .. py:attribute:: iteration
      :type: int

   .. py:attribute:: scores
      :type: List[List[List[float]]]


.. py:class:: ValidationScores

   Class representing the validation scores for a set of parameters and datasets.

   .. attribute:: parameters

      The list of parameters that are being evaluated.

      :type: List[PostProcessorParameters]

   .. attribute:: datasets

      The datasets that will be evaluated at each iteration.

      :type: List[Dataset]

   .. attribute:: evaluation_scores

      The scores that are collected on each iteration per
      `PostProcessorParameters` and `Dataset`.

      :type: EvaluationScores

   .. attribute:: scores

      A list of evaluation scores and their associated post-processing parameters.

      :type: List[ValidationIterationScores]

   .. method:: subscores(iteration_scores)

      Create a new ValidationScores object with a subset of the iteration scores.

   .. method:: add_iteration_scores(iteration_scores)

      Add iteration scores to the list of scores.

   .. method:: delete_after(iteration)

      Delete scores after a specified iteration.

   .. method:: validated_until()

      Get the number of iterations validated for (the maximum iteration plus one).

   .. method:: compare(existing_iteration_scores)

      Compare iteration stats provided from elsewhere to scores we have saved locally.

   .. method:: criteria()

      Get the list of evaluation criteria.

   .. method:: parameter_names()

      Get the list of parameter names.

   .. method:: to_xarray()

      Convert the validation scores to an xarray DataArray.

   .. method:: get_best(data, dim)

      Compute the best scores along dimension ``dim`` per criterion.

   .. rubric:: Notes

   The `scores` attribute is a list of `ValidationIterationScores` objects,
   each of which contains the scores for a single iteration.

   .. py:attribute:: parameters
      :type: List[dacapo.experiments.tasks.post_processors.PostProcessorParameters]

   .. py:attribute:: datasets
      :type: List[dacapo.experiments.datasplits.datasets.Dataset]

   .. py:attribute:: evaluation_scores
      :type: dacapo.experiments.tasks.evaluators.EvaluationScores

   .. py:attribute:: scores
      :type: List[dacapo.experiments.validation_iteration_scores.ValidationIterationScores]

   .. py:method:: subscores(iteration_scores: List[dacapo.experiments.validation_iteration_scores.ValidationIterationScores]) -> ValidationScores

      Create a new ValidationScores object with a subset of the iteration scores.

      :param iteration_scores: The iteration scores to include in the new ValidationScores object.

      :returns: A new ValidationScores object with the specified iteration scores.

      :raises ValueError: If the iteration scores are not in the list of scores.

      .. rubric:: Examples

      >>> validation_scores.subscores([validation_scores.scores[0]])

      .. note::

         This method is used to create a new ValidationScores object with a subset of the iteration scores.
         This is useful when you want to create a new ValidationScores object that only contains the scores up to a certain iteration.

   .. py:method:: add_iteration_scores(iteration_scores: dacapo.experiments.validation_iteration_scores.ValidationIterationScores) -> None

      Add iteration scores to the list of scores.

      :param iteration_scores: The iteration scores to add.

      :raises ValueError: If the iteration scores are already in the list of scores.

      .. rubric:: Examples

      >>> validation_scores.add_iteration_scores(iteration_scores)

      .. note::

         Use this to record the scores of a newly validated iteration.

   .. py:method:: delete_after(iteration: int) -> None

      Delete scores after a specified iteration.

      :param iteration: The iteration after which to delete the scores.

      :raises ValueError: If the iteration scores are not in the list of scores.

      .. rubric:: Examples

      >>> validation_scores.delete_after(0)

   .. py:method:: validated_until() -> int

      Get the number of iterations validated for (the maximum iteration plus one).

      :returns: The number of iterations validated for.

      :raises ValueError: If there are no scores.

      .. rubric:: Examples

      >>> validation_scores.validated_until()

   .. py:method:: compare(existing_iteration_scores: List[dacapo.experiments.validation_iteration_scores.ValidationIterationScores]) -> Tuple[bool, int]

      Compare iteration stats provided from elsewhere to scores we have saved locally.

      Local scores take priority. If local scores are at a lower iteration than the existing ones, delete the existing ones and replace with local.
      If the local iteration is higher than the existing iteration, just update the existing scores with the last overhanging local scores.

      :param existing_iteration_scores: The existing iteration scores to compare with.

      :returns: A tuple indicating whether the local scores should replace the existing ones and the existing iteration number.

      :raises ValueError: If the iteration scores are not in the list of scores.

      .. rubric:: Examples

      >>> validation_scores.compare([validation_scores.scores[0]])

   .. py:property:: criteria
      :type: List[str]

      Get the list of evaluation criteria.

      :returns: The list of evaluation criteria.

      :raises ValueError: If there are no scores.

      .. rubric:: Examples

      >>> validation_scores.criteria

      .. note::

         Use this to find out which criteria are used to evaluate the scores.

   .. py:property:: parameter_names
      :type: List[str]

      Get the list of parameter names.

      :returns: The list of parameter names.

      :raises ValueError: If there are no scores.

      .. rubric:: Examples

      >>> validation_scores.parameter_names

      .. note::

         Use this to find out which parameters are used to evaluate the scores.

   .. py:method:: to_xarray() -> xarray.DataArray

      Convert the validation scores to an xarray DataArray.

      :returns: An xarray DataArray representing the validation scores.

      :raises ValueError: If there are no scores.

      .. rubric:: Examples

      >>> validation_scores.to_xarray()

      .. note::

         This method is used to convert the validation scores to an xarray DataArray.
         This is useful when you want to work with the validation scores as an xarray DataArray.

   .. py:method:: get_best(data: xarray.DataArray, dim: str) -> Tuple[xarray.DataArray, xarray.DataArray]

      Compute the best scores along dimension ``dim`` per criterion.
      Returns both the index associated with the best value and the
      best value, in two separate arrays.

      :param data: The data array to compute the best scores from.
      :param dim: The dimension along which to compute the best scores.

      :returns: A tuple containing the index associated with the best value and the best value in two separate arrays.

      :raises ValueError: If the criteria are not in the data array.

      .. rubric:: Examples

      >>> validation_scores.get_best(data, "iterations")
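
The selection that ``get_best`` describes, returning the best index and the best value per criterion along one dimension, can be sketched in plain Python. This is a minimal illustration and not DaCapo's implementation: the ``best_per_criterion`` helper, the ``higher_is_better`` mapping, the criterion names, and the score values are all hypothetical, and plain lists stand in for the ``xarray.DataArray`` that the documented method operates on.

```python
from typing import Dict, List, Tuple


def best_per_criterion(
    scores: Dict[str, List[float]],
    higher_is_better: Dict[str, bool],
) -> Tuple[Dict[str, int], Dict[str, float]]:
    """Return (best index, best value) per criterion along one dimension.

    Hypothetical helper: ``scores`` maps a criterion name to its values
    along the chosen dimension (for example, one value per iteration).
    """
    best_index: Dict[str, int] = {}
    best_value: Dict[str, float] = {}
    for criterion, values in scores.items():
        # Mirror the documented failure mode: unknown criteria are an error.
        if criterion not in higher_is_better:
            raise ValueError(f"unknown criterion: {criterion}")
        if not values:
            raise ValueError(f"no scores for criterion: {criterion}")
        # Assumption: each criterion is either maximized or minimized.
        pick = max if higher_is_better[criterion] else min
        index = pick(range(len(values)), key=values.__getitem__)
        best_index[criterion] = index
        best_value[criterion] = values[index]
    return best_index, best_value


# Made-up scores for three validated iterations.
scores = {"f1_score": [0.61, 0.74, 0.70], "voi": [1.9, 1.2, 1.4]}
higher_is_better = {"f1_score": True, "voi": False}

indexes, values = best_per_criterion(scores, higher_is_better)
print(indexes)  # {'f1_score': 1, 'voi': 1}
print(values)   # {'f1_score': 0.74, 'voi': 1.2}
```

Returning the index and the value separately matches the two-array return described above, so a caller can look up which iteration or parameter set produced the best score.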