dacapo.experiments
==================

.. py:module:: dacapo.experiments


Subpackages
-----------

.. toctree::
   :maxdepth: 1

   /autoapi/dacapo/experiments/architectures/index
   /autoapi/dacapo/experiments/arraytypes/index
   /autoapi/dacapo/experiments/datasplits/index
   /autoapi/dacapo/experiments/starts/index
   /autoapi/dacapo/experiments/tasks/index
   /autoapi/dacapo/experiments/trainers/index


Submodules
----------

.. toctree::
   :maxdepth: 1

   /autoapi/dacapo/experiments/model/index
   /autoapi/dacapo/experiments/run/index
   /autoapi/dacapo/experiments/run_config/index
   /autoapi/dacapo/experiments/training_iteration_stats/index
   /autoapi/dacapo/experiments/training_stats/index
   /autoapi/dacapo/experiments/validation_iteration_scores/index
   /autoapi/dacapo/experiments/validation_scores/index


Classes
-------

.. autoapisummary::

   dacapo.experiments.Model
   dacapo.experiments.RunConfig
   dacapo.experiments.TrainingIterationStats
   dacapo.experiments.TrainingStats
   dacapo.experiments.ValidationIterationScores
   dacapo.experiments.ValidationScores


Package Contents
----------------

.. py:class:: Model(architecture: dacapo.experiments.architectures.architecture.Architecture, prediction_head: torch.nn.Module, eval_activation: torch.nn.Module | None = None)


   A trainable DaCapo model. Consists of an ``Architecture`` and a
   prediction head. Models are generated by ``Predictor`` objects.

   May include an optional ``eval_activation`` that is only executed when the model
   is in eval mode. This is particularly useful if you want to train with a loss such as
   ``BCEWithLogitsLoss``, which expects raw logits: the activation (for example a sigmoid
   or softmax) is skipped while training but applied during evaluation.
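
   A minimal, framework-free sketch of this eval-only activation pattern follows. The
   ``TinyModel`` class, its ``training`` flag, and the ``sigmoid`` helper are hypothetical
   stand-ins chosen for illustration; they are not DaCapo or PyTorch API.

   .. code-block:: python

      import math
      from typing import Callable, Optional


      def sigmoid(value: float) -> float:
          """Map a raw logit to a probability in (0, 1)."""
          return 1.0 / (1.0 + math.exp(-value))


      class TinyModel:
          """Hypothetical stand-in: a chain plus an optional eval-only activation."""

          def __init__(
              self,
              chain: Callable[[float], float],
              eval_activation: Optional[Callable[[float], float]] = None,
          ) -> None:
              self.chain = chain
              self.eval_activation = eval_activation
              self.training = True  # assumed train/eval switch

          def forward(self, x: float) -> float:
              result = self.chain(x)
              # Skip the activation while training so that a loss operating on
              # raw logits receives them unchanged.
              if not self.training and self.eval_activation is not None:
                  result = self.eval_activation(result)
              return result


      model = TinyModel(chain=lambda x: 2.0 * x, eval_activation=sigmoid)
      train_output = model.forward(0.0)  # raw logit, activation skipped
      model.training = False
      eval_output = model.forward(0.0)  # activation applied in eval mode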

   .. attribute:: architecture

      The architecture of the model.

      :type: Architecture

   .. attribute:: prediction_head

      The prediction head of the model.

      :type: torch.nn.Module

   .. attribute:: chain

      The architecture followed by the prediction head.

      :type: torch.nn.Sequential

   .. attribute:: num_in_channels

      The number of input channels.

      :type: int

   .. attribute:: input_shape

      The shape of the input tensor.

      :type: Coordinate

   .. attribute:: eval_input_shape

      The shape of the input tensor during evaluation.

      :type: Coordinate

   .. attribute:: num_out_channels

      The number of output channels.

      :type: int

   .. attribute:: output_shape

      The shape of the output tensor.

      :type: Coordinate

   .. attribute:: eval_activation

      The activation function to apply during evaluation.

      :type: torch.nn.Module | None

   .. method:: forward(x: torch.Tensor) -> torch.Tensor

      Forward pass of the model.

   .. method:: compute_output_shape(input_shape: Coordinate) -> Tuple[int, Coordinate]

      Compute the spatial shape of this model, when fed a tensor of the given spatial shape as input.

   .. method:: scale(voxel_size: Coordinate) -> Coordinate

      Scale the model by the given voxel size.

   .. note:: The output shape is the spatial shape of the model, i.e., not accounting for channels and batch dimensions.


   .. py:attribute:: num_out_channels
      :type:  int


   .. py:attribute:: num_in_channels
      :type:  int


   .. py:attribute:: architecture


   .. py:attribute:: prediction_head


   .. py:attribute:: chain


   .. py:attribute:: input_shape


   .. py:attribute:: eval_input_shape


   .. py:attribute:: eval_activation


   .. py:method:: forward(x)

      Forward pass of the model.

      :param x: The input tensor.
      :type x: torch.Tensor

      :returns: The output tensor.
      :rtype: torch.Tensor

      .. rubric:: Examples

      >>> model = Model(architecture, prediction_head)
      >>> model.forward(x)
      torch.Tensor

      .. note:: The ``eval_activation`` is only applied during evaluation. This is particularly useful if you want to train with a loss such as ``BCEWithLogitsLoss``, which expects raw logits: the activation is skipped while training but applied during evaluation.


   .. py:method:: compute_output_shape(input_shape: funlib.geometry.Coordinate) -> Tuple[int, funlib.geometry.Coordinate]

      Compute the spatial shape (i.e., not accounting for channels and
      batch dimensions) of this model, when fed a tensor of the given spatial
      shape as input.

      :param input_shape: The shape of the input tensor.
      :type input_shape: Coordinate

      :returns: The number of output channels and the spatial shape of the output.
      :rtype: Tuple[int, Coordinate]

      :raises AssertionError: If the input_shape is not a Coordinate.

      .. rubric:: Examples

      >>> model = Model(architecture, prediction_head)
      >>> model.compute_output_shape(input_shape)
      (1, Coordinate(1, 1, 1))

      .. note:: The output shape is the spatial shape of the model, i.e., not accounting for channels and batch dimensions.
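
      To make the distinction between spatial and non-spatial dimensions concrete, here is a
      small sketch. It assumes a ``(batch, channels, *spatial)`` tensor layout, and
      ``split_shape`` is a hypothetical helper, not part of DaCapo.

      .. code-block:: python

         from typing import Tuple


         def split_shape(tensor_shape: Tuple[int, ...]) -> Tuple[int, int, Tuple[int, ...]]:
             """Separate batch and channel sizes from the spatial shape."""
             # Assumed layout: (batch, channels, *spatial)
             batch, channels, *spatial = tensor_shape
             return batch, channels, tuple(spatial)


         batch, channels, spatial = split_shape((1, 3, 64, 128, 128))
         # Only `spatial` corresponds to the kind of shape discussed above.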


   .. py:method:: scale(voxel_size: funlib.geometry.Coordinate) -> funlib.geometry.Coordinate

      Scale the model by the given voxel size.

      :param voxel_size: The voxel size to scale the model by.
      :type voxel_size: Coordinate

      :returns: The scaled voxel size.
      :rtype: Coordinate

      :raises AssertionError: If the voxel_size is not a Coordinate.

      .. rubric:: Examples

      >>> model = Model(architecture, prediction_head)
      >>> model.scale(voxel_size)
      Coordinate(1, 1, 1)



.. py:class:: RunConfig

   A class representing the configuration of a run. It ties together the task,
   architecture, trainer, and datasplit configurations.

   .. attribute:: task_config

      A config defining the Task to run. This includes the output of the model and
      the different methods used to achieve the goal.

      :type: TaskConfig

   .. attribute:: architecture_config

      A config that defines the backbone architecture of the model. It impacts the model's
      performance significantly.

      :type: ArchitectureConfig

   .. attribute:: trainer_config

      Defines how batches are generated and passed to the model for training, along with
      settings such as batch size, learning rate, number of CPU workers, and snapshot logging.

      :type: TrainerConfig

   .. attribute:: datasplit_config

      Configures the data available to the model during the training and validation phases.

      :type: DataSplitConfig

   .. attribute:: name

      A unique name that distinguishes this run.

      :type: str

   .. attribute:: repetition

      The repetition number of this run.

      :type: int

   .. attribute:: num_iterations

      The total number of iterations to train for during this run.

      :type: int

   .. attribute:: validation_interval

      How often to perform validation during the run. Defaults to 1000.

      :type: int

   .. attribute:: start_config

      A starting point for continued training. Optional.

      :type: Optional[StartConfig]

   .. py:attribute:: task_config
      :type:  dacapo.experiments.tasks.TaskConfig


   .. py:attribute:: architecture_config
      :type:  dacapo.experiments.architectures.ArchitectureConfig


   .. py:attribute:: trainer_config
      :type:  dacapo.experiments.trainers.TrainerConfig


   .. py:attribute:: datasplit_config
      :type:  dacapo.experiments.datasplits.DataSplitConfig


   .. py:attribute:: name
      :type:  str


   .. py:attribute:: repetition
      :type:  int


   .. py:attribute:: num_iterations
      :type:  int


   .. py:attribute:: validation_interval
      :type:  int


   .. py:attribute:: start_config
      :type:  Optional[dacapo.experiments.starts.StartConfig]


.. py:class:: TrainingIterationStats

   A class to represent the training iteration statistics. It contains the loss and time taken for each iteration.

   .. attribute:: iteration

      The iteration that produced these stats.

      :type: int

   .. attribute:: loss

      The loss value of this iteration.

      :type: float

   .. attribute:: time

      The time it took to process this iteration.

      :type: float

   .. note::

      Each instance holds the stats of a single training iteration. ``TrainingStats``
      collects these instances in an ordered list.


   .. py:attribute:: iteration
      :type:  int


   .. py:attribute:: loss
      :type:  float


   .. py:attribute:: time
      :type:  float


.. py:class:: TrainingStats

   A class used to represent Training Statistics. It contains a list of training
   iteration statistics. It also provides methods to add new iteration stats,
   delete stats after a specified iteration, get the number of iterations trained
   for, and convert the stats to an xarray data array.

   .. attribute:: iteration_stats

      An ordered list of training stats.

      :type: List[TrainingIterationStats]

   .. method:: add_iteration_stats(iteration_stats: TrainingIterationStats) -> None

      Add a new set of iteration stats to the existing list of iteration
      stats.

   .. method:: delete_after(iteration: int) -> None

      Deletes training stats after a specified iteration number.

   .. method:: trained_until() -> int

      Gets the number of iterations that the model has been trained for.

   .. method:: to_xarray() -> xr.DataArray

      Converts the iteration statistics to an xarray data array.

   .. note::

      ``iteration_stats`` is a flat list ordered by iteration. Each entry is a
      ``TrainingIterationStats`` object holding the stats of one training iteration.
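
   The bookkeeping described above can be sketched with plain dataclasses. ``IterationStats``
   and ``StatsSketch`` are hypothetical stand-ins, not the DaCapo classes. The consecutive-order
   assertion and the ``< iteration`` cut-off in ``delete_after`` are assumptions inferred from
   the method descriptions and examples in this section.

   .. code-block:: python

      from dataclasses import dataclass, field
      from typing import List


      @dataclass
      class IterationStats:
          iteration: int
          loss: float
          time: float


      @dataclass
      class StatsSketch:
          iteration_stats: List[IterationStats] = field(default_factory=list)

          def add_iteration_stats(self, stats: IterationStats) -> None:
              # Assumed ordering rule: each new entry directly follows the last one.
              if self.iteration_stats:
                  assert stats.iteration == self.iteration_stats[-1].iteration + 1
              self.iteration_stats.append(stats)

          def delete_after(self, iteration: int) -> None:
              # Keep only the stats from iterations before the given one.
              self.iteration_stats = [
                  s for s in self.iteration_stats if s.iteration < iteration
              ]

          def trained_until(self) -> int:
              # Maximum iteration plus one, or zero when nothing was trained.
              if not self.iteration_stats:
                  return 0
              return self.iteration_stats[-1].iteration + 1


      stats = StatsSketch()
      for index, loss in enumerate([0.1, 0.2, 0.3]):
          stats.add_iteration_stats(IterationStats(iteration=index, loss=loss, time=0.5))
      trained = stats.trained_until()
      stats.delete_after(1)
      remaining = [s.iteration for s in stats.iteration_stats]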


   .. py:attribute:: iteration_stats
      :type:  List[dacapo.experiments.training_iteration_stats.TrainingIterationStats]


   .. py:method:: add_iteration_stats(iteration_stats: dacapo.experiments.training_iteration_stats.TrainingIterationStats) -> None

      Append a new iteration stats object to the current list of iteration stats.

      :param iteration_stats: a new iteration stats object.
      :type iteration_stats: TrainingIterationStats

      :raises AssertionError: if the new iteration stats do not follow the order of existing iteration stats.

      .. rubric:: Examples

      >>> # the time values below are illustrative
      >>> training_stats = TrainingStats()
      >>> training_stats.add_iteration_stats(TrainingIterationStats(iteration=0, loss=0.1, time=0.5))
      >>> training_stats.add_iteration_stats(TrainingIterationStats(iteration=1, loss=0.2, time=0.5))
      >>> training_stats.add_iteration_stats(TrainingIterationStats(iteration=2, loss=0.3, time=0.5))
      >>> training_stats.iteration_stats
      [TrainingIterationStats(iteration=0, loss=0.1, time=0.5),
       TrainingIterationStats(iteration=1, loss=0.2, time=0.5),
       TrainingIterationStats(iteration=2, loss=0.3, time=0.5)]



   .. py:method:: delete_after(iteration: int) -> None

      Deletes training stats recorded at or after the specified iteration. Only stats
      from earlier iterations are kept.

      :param iteration: the first iteration whose stats are deleted.
      :type iteration: int

      :raises AssertionError: if the iteration number is less than the maximum iteration number.

      .. rubric:: Examples

      >>> # the time values below are illustrative
      >>> training_stats = TrainingStats()
      >>> training_stats.add_iteration_stats(TrainingIterationStats(iteration=0, loss=0.1, time=0.5))
      >>> training_stats.add_iteration_stats(TrainingIterationStats(iteration=1, loss=0.2, time=0.5))
      >>> training_stats.add_iteration_stats(TrainingIterationStats(iteration=2, loss=0.3, time=0.5))
      >>> training_stats.delete_after(1)
      >>> training_stats.iteration_stats
      [TrainingIterationStats(iteration=0, loss=0.1, time=0.5)]



   .. py:method:: trained_until() -> int

      The number of iterations trained for (the maximum iteration plus one).
      Returns zero if no iterations have been trained yet.

      :returns: number of iterations that the model has been trained for.
      :rtype: int

      .. rubric:: Examples

      >>> # the time values below are illustrative
      >>> training_stats = TrainingStats()
      >>> training_stats.add_iteration_stats(TrainingIterationStats(iteration=0, loss=0.1, time=0.5))
      >>> training_stats.add_iteration_stats(TrainingIterationStats(iteration=1, loss=0.2, time=0.5))
      >>> training_stats.add_iteration_stats(TrainingIterationStats(iteration=2, loss=0.3, time=0.5))
      >>> training_stats.trained_until()
      3



   .. py:method:: to_xarray() -> xarray.DataArray

      Converts the iteration stats to an xarray DataArray, which is easier to manipulate.

      :returns: xarray DataArray of iteration losses.
      :rtype: xr.DataArray

      :raises AssertionError: if the iteration stats list is empty.

      .. rubric:: Examples

      >>> # the time values below are illustrative
      >>> training_stats = TrainingStats()
      >>> training_stats.add_iteration_stats(TrainingIterationStats(iteration=0, loss=0.1, time=0.5))
      >>> training_stats.add_iteration_stats(TrainingIterationStats(iteration=1, loss=0.2, time=0.5))
      >>> training_stats.add_iteration_stats(TrainingIterationStats(iteration=2, loss=0.3, time=0.5))
      >>> training_stats.to_xarray()
      <xarray.DataArray (iterations: 3)>
      array([0.1, 0.2, 0.3])
      Coordinates:
        * iterations  (iterations) int64 0 1 2



.. py:class:: ValidationIterationScores

   A class used to represent the validation iteration scores in an organized structure.

   .. attribute:: iteration

      The iteration associated with these validation scores.

      :type: int

   .. attribute:: scores

      A list of scores per dataset, post processor parameters, and evaluation criterion.

      :type: List[List[List[float]]]

   .. note::

      The scores list is structured as follows:

      - The outer list contains the scores for each dataset.
      - The middle list contains the scores for each post processor parameter.
      - The inner list contains the scores for each evaluation criterion.
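
   The nesting order can be illustrated with a plain nested list. The numbers below are
   made-up values for two datasets, two post processor parameter sets, and two criteria;
   they are not real results.

   .. code-block:: python

      # scores[dataset][parameters][criterion], with illustrative values only
      scores = [
          [[0.80, 0.10], [0.85, 0.08]],  # dataset 0
          [[0.70, 0.20], [0.75, 0.15]],  # dataset 1
      ]

      dataset_index, parameter_index, criterion_index = 1, 0, 1
      value = scores[dataset_index][parameter_index][criterion_index]

      # All criteria for dataset 0 under parameter set 1
      criteria_for_first_dataset = scores[0][1]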


   .. py:attribute:: iteration
      :type:  int


   .. py:attribute:: scores
      :type:  List[List[List[float]]]


.. py:class:: ValidationScores

   Class representing the validation scores for a set of parameters and datasets.

   .. attribute:: parameters

      The list of parameters that are being evaluated.

      :type: List[PostProcessorParameters]

   .. attribute:: datasets

      The datasets that will be evaluated at each iteration.

      :type: List[Dataset]

   .. attribute:: evaluation_scores

      The scores that are collected on each iteration per
      `PostProcessorParameters` and `Dataset`.

      :type: EvaluationScores

   .. attribute:: scores

      A list of evaluation scores and their associated
      post-processing parameters.

      :type: List[ValidationIterationScores]

   .. method:: subscores(iteration_scores)

      Create a new ValidationScores object with a subset of the iteration scores.

   .. method:: add_iteration_scores(iteration_scores)

      Add iteration scores to the list of scores.

   .. method:: delete_after(iteration)

      Delete scores after a specified iteration.

   .. method:: validated_until()

      Get the number of iterations validated for (the maximum iteration plus one).

   .. method:: compare(existing_iteration_scores)

      Compare iteration stats provided from elsewhere to scores we have saved locally.

   .. attribute:: criteria

      The list of evaluation criteria (property).

   .. attribute:: parameter_names

      The list of parameter names (property).

   .. method:: to_xarray()

      Convert the validation scores to an xarray DataArray.

   .. method:: get_best(data, dim)

      Compute the best scores along dimension "dim" per criterion.

   .. rubric:: Notes

   The `scores` attribute is a list of `ValidationIterationScores` objects, each of which
   contains the scores for a single iteration.


   .. py:attribute:: parameters
      :type:  List[dacapo.experiments.tasks.post_processors.PostProcessorParameters]


   .. py:attribute:: datasets
      :type:  List[dacapo.experiments.datasplits.datasets.Dataset]


   .. py:attribute:: evaluation_scores
      :type:  dacapo.experiments.tasks.evaluators.EvaluationScores


   .. py:attribute:: scores
      :type:  List[dacapo.experiments.validation_iteration_scores.ValidationIterationScores]


   .. py:method:: subscores(iteration_scores: List[dacapo.experiments.validation_iteration_scores.ValidationIterationScores]) -> ValidationScores

      Create a new ValidationScores object with a subset of the iteration scores.

      :param iteration_scores: The iteration scores to include in the new ValidationScores object.

      :returns: A new ValidationScores object with the specified iteration scores.

      :raises ValueError: If the iteration scores are not in the list of scores.

      .. rubric:: Examples

      >>> validation_scores.subscores([validation_scores.scores[0]])

      .. note::

         This is useful for building a ``ValidationScores`` object that only contains the
         scores up to a certain iteration.


   .. py:method:: add_iteration_scores(iteration_scores: dacapo.experiments.validation_iteration_scores.ValidationIterationScores) -> None

      Add iteration scores to the list of scores.

      :param iteration_scores: The iteration scores to add.

      :raises ValueError: If the iteration scores are already in the list of scores.

      .. rubric:: Examples

      >>> # new_iteration_scores: a ValidationIterationScores that is not yet in the list
      >>> validation_scores.add_iteration_scores(new_iteration_scores)

      .. note::

         Use this to record the scores of a newly validated iteration.


   .. py:method:: delete_after(iteration: int) -> None

      Delete scores after a specified iteration.

      :param iteration: The iteration after which to delete the scores.

      :raises ValueError: If the iteration scores are not in the list of scores.

      .. rubric:: Examples

      >>> validation_scores.delete_after(0)

      .. note::

         Use this to discard the scores recorded after a certain iteration.


   .. py:method:: validated_until() -> int

      Get the number of iterations validated for (the maximum iteration plus one).

      :returns: The number of iterations validated for.

      :raises ValueError: If there are no scores.

      .. rubric:: Examples

      >>> validation_scores.validated_until()

      .. note::

         The returned value is the maximum validated iteration plus one, so it tells you how
         many iterations have been validated.


   .. py:method:: compare(existing_iteration_scores: List[dacapo.experiments.validation_iteration_scores.ValidationIterationScores]) -> Tuple[bool, int]

      Compare iteration stats provided from elsewhere to scores we have saved locally.
      Local scores take priority. If local scores are at a lower iteration than the
      existing ones, delete the existing ones and replace with local.
      If local iteration > existing iteration, just update existing scores with the last
      overhanging local scores.

      :param existing_iteration_scores: The existing iteration scores to compare with.

      :returns: A tuple indicating whether the local scores should replace the existing ones
                and the existing iteration number.

      :raises ValueError: If the iteration scores are not in the list of scores.

      .. rubric:: Examples

      >>> validation_scores.compare([validation_scores.scores[0]])
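
      One possible reading of the comparison rule described above is sketched below, using
      bare iteration numbers instead of score objects. ``compare_sketch`` is a hypothetical
      helper. The return values chosen for the empty case and for the replace case are
      assumptions, not confirmed DaCapo behavior.

      .. code-block:: python

         from typing import List, Tuple


         def compare_sketch(
             local_iterations: List[int], existing_iterations: List[int]
         ) -> Tuple[bool, int]:
             """Return (replace_existing, existing_iteration) for two score stores."""
             if not existing_iterations:
                 # Nothing stored elsewhere: nothing to replace, start from zero.
                 return False, 0
             existing_until = max(existing_iterations) + 1
             local_until = max(local_iterations) + 1 if local_iterations else 0
             if existing_until > local_until:
                 # Local scores are behind: replace the existing ones with local.
                 return True, 0
             # Local scores are ahead or equal: only the overhanging ones are added.
             return False, existing_until


         behind = compare_sketch([0, 1], [0, 1, 2])
         ahead = compare_sketch([0, 1, 2], [0])
         empty = compare_sketch([0], [])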



   .. py:property:: criteria
      :type: List[str]

      Get the list of evaluation criteria.

      :returns: The list of evaluation criteria.

      :raises ValueError: If there are no scores.

      .. rubric:: Examples

      >>> validation_scores.criteria

      .. note::

         Use this to see which criteria are used to evaluate the scores.


   .. py:property:: parameter_names
      :type: List[str]

      Get the list of parameter names.

      :returns: The list of parameter names.

      :raises ValueError: If there are no scores.

      .. rubric:: Examples

      >>> validation_scores.parameter_names

      .. note::

         Use this to see which parameters are used to evaluate the scores.


   .. py:method:: to_xarray() -> xarray.DataArray

      Convert the validation scores to an xarray DataArray.

      :returns: An xarray DataArray representing the validation scores.

      :raises ValueError: If there are no scores.

      .. rubric:: Examples

      >>> validation_scores.to_xarray()

      .. note::

         Use this when you want to work with the validation scores as an xarray DataArray.


   .. py:method:: get_best(data: xarray.DataArray, dim: str) -> Tuple[xarray.DataArray, xarray.DataArray]

      Compute the best scores along dimension "dim" per criterion.
      Returns both the index associated with the best value, and the
      best value in two separate arrays.

      :param data: The data array to compute the best scores from.
      :param dim: The dimension along which to compute the best scores.

      :returns: A tuple containing the index associated with the best value and the best value
                in two separate arrays.

      :raises ValueError: If the criteria are not in the data array.

      .. rubric:: Examples

      >>> validation_scores.get_best(data, "iterations")

      .. note::

         This is useful when you want to know the best scores for a given data array.
         Known limitation: the case where the criteria are not in the data array is not
         currently handled. A check that raises an error for missing criteria still needs
         to be added.
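
      The idea of returning both the best index and the best value per criterion can be
      sketched in plain Python. ``best_along`` is a hypothetical helper, and the criterion
      names, values, and the ``higher_is_better`` mapping are illustrative assumptions. The
      real method operates on an ``xarray.DataArray``.

      .. code-block:: python

         from typing import Dict, List, Tuple


         def best_along(values: List[float], higher_is_better: bool) -> Tuple[int, float]:
             """Return the index of the best value and the best value itself."""
             pick = max if higher_is_better else min
             best_index = pick(range(len(values)), key=lambda i: values[i])
             return best_index, values[best_index]


         # One score per iteration for each (hypothetical) criterion
         scores_per_iteration: Dict[str, List[float]] = {
             "f1": [0.60, 0.82, 0.79],
             "voi": [1.4, 0.9, 1.1],
         }
         higher_is_better = {"f1": True, "voi": False}

         best = {
             criterion: best_along(values, higher_is_better[criterion])
             for criterion, values in scores_per_iteration.items()
         }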