dacapo.experiments.datasplits.datasets.arrays

Package Contents

Classes

Array

Helper class that provides a standard way to create an ABC using inheritance.

ArrayConfig

Base class for array configurations. Each subclass of an Array should have a corresponding config class derived from ArrayConfig.

DummyArray

This is just a dummy array for testing.

DummyArrayConfig

This is just a dummy array config used for testing. None of the attributes have any particular meaning.

ZarrArray

This is a zarr array

ZarrArrayConfig

This config class provides the necessary configuration for a zarr array

BinarizeArray

This is a wrapper around a ZarrArray containing uint annotations.

BinarizeArrayConfig

This config class provides the necessary configuration for turning an annotated dataset into a multi-class binary classification problem.

ResampledArray

This is a wrapper around another array that up- or downsamples it to the desired voxel size.

ResampledArrayConfig

This array will up or down sample an array into the desired voxel size.

IntensitiesArray

This is a wrapper around another array that normalizes intensities to the range (0, 1) and converts them to float32.

IntensitiesArrayConfig

This config class provides the necessary configuration for normalizing the intensities of a source array to the range (0, 1) using the given min and max values.

MissingAnnotationsMask

This is a wrapper around a ZarrArray containing uint annotations.

MissingAnnotationsMaskConfig

This config class provides the necessary configuration for creating a mask of missing annotations from an annotated dataset, with one channel per label grouping.

OnesArray

This is a wrapper around another source_array that simply provides ones with the same metadata as the source_array.

OnesArrayConfig

This array reads data from the source array and then returns a np.ones_like() version.

ConcatArray

This is a wrapper around other source_arrays that concatenates them along the channel dimension.

ConcatArrayConfig

This config class provides the necessary configuration for concatenating multiple source arrays along the channel dimension.

LogicalOrArray

LogicalOrArrayConfig

This config class takes a source array and performs a logical or over the channels.

CropArray

Used to crop a larger array to a smaller array.

CropArrayConfig

This config class provides the necessary configuration for cropping an Array to a smaller ROI.

MergeInstancesArray

MergeInstancesArrayConfig

Base class for array configurations. Each subclass of an Array should have a corresponding config class derived from ArrayConfig.

DVIDArray

This is a DVID array

DVIDArrayConfig

This config class provides the necessary configuration for a DVID array

SumArray

SumArrayConfig

Base class for array configurations. Each subclass of an Array should have a corresponding config class derived from ArrayConfig.

NumpyArray

This is just a wrapper for a numpy array to make it fit the DaCapo Array interface.

class dacapo.experiments.datasplits.datasets.arrays.Array

Helper class that provides a standard way to create an ABC using inheritance.

abstract property attrs: Dict[str, Any]

Return a dictionary of metadata attributes stored on this array.

abstract property axes: List[str]

Returns the axes of this dataset as a string of characters, in the order in which they are indexed. Permitted characters are:

  • zyx for spatial dimensions

  • c for channels

  • s for samples

abstract property dims: int

Returns the number of spatial dimensions.

abstract property voxel_size: funlib.geometry.Coordinate

The size of a voxel in physical units.

abstract property roi: funlib.geometry.Roi

The total ROI of this array, in world units.

abstract property dtype: Any

The dtype of this array, in numpy dtypes

abstract property num_channels: int | None

The number of channels provided by this dataset. Should return None if the channel dimension doesn’t exist.

abstract property data: numpy.ndarray

Get a numpy like readable and writable view into this array.

abstract property writable: bool

Can we write to this Array?
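
As a rough illustration of how axes, dims, and num_channels relate to each other, here is a minimal sketch in plain Python. The helper names spatial_dims and channel_count are hypothetical and not part of DaCapo; the sketch assumes that dims counts the z, y, and x axes and that num_channels is the size of the c axis.

```python
from typing import List, Optional, Tuple

SPATIAL_AXES = ("z", "y", "x")


def spatial_dims(axes: List[str]) -> int:
    """Count the spatial axes (z, y, x) in an axes list such as ['c', 'z', 'y', 'x']."""
    return sum(1 for axis in axes if axis in SPATIAL_AXES)


def channel_count(axes: List[str], shape: Tuple[int, ...]) -> Optional[int]:
    """Return the size of the 'c' axis, or None if there is no channel axis."""
    if "c" not in axes:
        return None
    return shape[axes.index("c")]


axes = ["c", "z", "y", "x"]
shape = (3, 64, 128, 128)
print(spatial_dims(axes))          # 3
print(channel_count(axes, shape))  # 3
```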

class dacapo.experiments.datasplits.datasets.arrays.ArrayConfig

Base class for array configurations. Each subclass of an Array should have a corresponding config class derived from ArrayConfig.

name: str

verify() → Tuple[bool, str]

Check whether this is a valid Array
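
The Tuple[bool, str] return value pairs a validity flag with a human-readable message. The following is a minimal sketch of that pattern; ExampleArrayConfig and its checks are hypothetical stand-ins, not the checks DaCapo actually performs.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Tuple


@dataclass
class ExampleArrayConfig:
    """Hypothetical stand-in for an ArrayConfig subclass (not part of DaCapo)."""

    name: str
    file_name: Path

    def verify(self) -> Tuple[bool, str]:
        # Return (False, reason) on the first failed check, (True, message) otherwise.
        if not self.name:
            return False, "name must not be empty"
        if self.file_name.suffix not in (".zarr", ".n5"):
            return False, f"{self.file_name} is not a zarr or n5 container"
        return True, "No validation error"


ok, message = ExampleArrayConfig("raw", Path("data.zarr")).verify()
bad, reason = ExampleArrayConfig("raw", Path("data.txt")).verify()
```

A caller can then branch on the flag and log or raise with the message.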

class dacapo.experiments.datasplits.datasets.arrays.DummyArray(array_config)

This is just a dummy array for testing.

property attrs

Return a dictionary of metadata attributes stored on this array.

property axes

Returns the axes of this dataset as a string of characters, in the order in which they are indexed. Permitted characters are:

  • zyx for spatial dimensions

  • c for channels

  • s for samples

property dims

Returns the number of spatial dimensions.

property voxel_size

The size of a voxel in physical units.

property roi

The total ROI of this array, in world units.

property writable: bool

Can we write to this Array?

property data

Get a numpy like readable and writable view into this array.

property dtype

The dtype of this array, in numpy dtypes

property num_channels

The number of channels provided by this dataset. Should return None if the channel dimension doesn’t exist.

class dacapo.experiments.datasplits.datasets.arrays.DummyArrayConfig

This is just a dummy array config used for testing. None of the attributes have any particular meaning.

array_type

verify() → Tuple[bool, str]

Check whether this is a valid Array

class dacapo.experiments.datasplits.datasets.arrays.ZarrArray(array_config)

This is a zarr array

property attrs

Return a dictionary of metadata attributes stored on this array.

property axes

Returns the axes of this dataset as a string of characters, in the order in which they are indexed. Permitted characters are:

  • zyx for spatial dimensions

  • c for channels

  • s for samples

property dims: int

Returns the number of spatial dimensions.

property writable: bool

Can we write to this Array?

property dtype: Any

The dtype of this array, in numpy dtypes

property num_channels: int | None

The number of channels provided by this dataset. Should return None if the channel dimension doesn’t exist.

property spatial_axes: List[str]

property data: Any

Get a numpy like readable and writable view into this array.

voxel_size() → funlib.geometry.Coordinate

The size of a voxel in physical units.

roi() → funlib.geometry.Roi

The total ROI of this array, in world units.

classmethod create_from_array_identifier(array_identifier, axes, roi, num_channels, voxel_size, dtype, write_size=None, name=None, overwrite=False)

Create a new ZarrArray given an array identifier. It is assumed that this array_identifier points to a dataset that does not yet exist.

classmethod open_from_array_identifier(array_identifier, name='')

add_metadata(metadata: Dict[str, Any]) → None

class dacapo.experiments.datasplits.datasets.arrays.ZarrArrayConfig

This config class provides the necessary configuration for a zarr array

array_type
file_name: pathlib.Path
dataset: str
snap_to_grid: funlib.geometry.Coordinate | None

verify() → Tuple[bool, str]

Check whether this is a valid Array

class dacapo.experiments.datasplits.datasets.arrays.BinarizeArray(array_config)

This is a wrapper around a ZarrArray containing uint annotations. Because we often want to predict classes that are a combination of a set of labels, we wrap a ZarrArray with the BinarizeArray and provide something like groupings=[("mito", [3, 4, 5])], where 4 corresponds to mito_membrane, 5 is mito_ribos, and 3 is everything else that is part of a mitochondrion. The BinarizeArray will simply combine labels 3, 4, and 5 into a single binary channel for the class "mito". We use a single channel per class because some classes may overlap. For example, with groupings=[("mito", [3, 4, 5]), ("membrane", [4, 8, 1])], where 4 is mito_membrane, 8 is er_membrane, and 1 is plasma_membrane, you get a binary "membrane" channel that in some places overlaps with the "mito" channel, because both include the mito membrane.
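
The grouping logic described above can be sketched with plain numpy. This is an illustration of the idea only, not the BinarizeArray implementation; the label volume is made up and np.isin is used as one straightforward way to test group membership.

```python
import numpy as np

# Hypothetical label volume: each voxel holds one integer annotation id.
labels = np.array([[0, 3, 4],
                   [5, 8, 1]], dtype=np.uint8)

groupings = [("mito", [3, 4, 5]), ("membrane", [4, 8, 1])]

# One binary channel per grouping: a voxel is 1 if its id belongs to the group.
binarized = np.stack(
    [np.isin(labels, ids) for _, ids in groupings]
).astype(np.uint8)

channels = [name for name, _ in groupings]
print(channels)         # ['mito', 'membrane']
print(binarized.shape)  # (2, 2, 3)
```

The voxel labelled 4 (mito_membrane) is set in both channels, which is why one channel per class is needed.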

property attrs

Return a dictionary of metadata attributes stored on this array.

property axes

Returns the axes of this dataset as a string of characters, in the order in which they are indexed. Permitted characters are:

  • zyx for spatial dimensions

  • c for channels

  • s for samples

property dims: int

Returns the number of spatial dimensions.

property voxel_size: funlib.geometry.Coordinate

The size of a voxel in physical units.

property roi: funlib.geometry.Roi

The total ROI of this array, in world units.

property writable: bool

Can we write to this Array?

property dtype

The dtype of this array, in numpy dtypes

property num_channels: int

The number of channels provided by this dataset. Should return None if the channel dimension doesn’t exist.

property data

Get a numpy like readable and writable view into this array.

property channels

class dacapo.experiments.datasplits.datasets.arrays.BinarizeArrayConfig

This config class provides the necessary configuration for turning an annotated dataset into a multi-class binary classification problem.

array_type
source_array_config: dacapo.experiments.datasplits.datasets.arrays.array_config.ArrayConfig
groupings: List[Tuple[str, List[int]]]
background: int

class dacapo.experiments.datasplits.datasets.arrays.ResampledArray(array_config)

This is a wrapper around another array that up- or downsamples it to the desired voxel size.

property attrs

Return a dictionary of metadata attributes stored on this array.

property axes

Returns the axes of this dataset as a string of characters, in the order in which they are indexed. Permitted characters are:

  • zyx for spatial dimensions

  • c for channels

  • s for samples

property dims: int

Returns the number of spatial dimensions.

property voxel_size: funlib.geometry.Coordinate

The size of a voxel in physical units.

property roi: funlib.geometry.Roi

The total ROI of this array, in world units.

property writable: bool

Can we write to this Array?

property dtype

The dtype of this array, in numpy dtypes

property num_channels: int

The number of channels provided by this dataset. Should return None if the channel dimension doesn’t exist.

property data

Get a numpy like readable and writable view into this array.

property scale

class dacapo.experiments.datasplits.datasets.arrays.ResampledArrayConfig

This array will up or down sample an array into the desired voxel size.
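
To make the effect of per-axis upsample and downsample factors concrete, here is a simplified sketch using nearest-neighbour resampling in numpy. It is a stand-in for whatever interpolation the library performs, and the voxel size relation (new size = old size * downsample / upsample) is an assumption. The function names are hypothetical.

```python
import numpy as np


def resample_nearest(data, upsample, downsample):
    """Nearest-neighbour resampling: repeat voxels, then keep every n-th one."""
    for axis, (up, down) in enumerate(zip(upsample, downsample)):
        data = np.repeat(data, up, axis=axis)
        index = [slice(None)] * data.ndim
        index[axis] = slice(None, None, down)
        data = data[tuple(index)]
    return data


def resampled_voxel_size(voxel_size, upsample, downsample):
    """Assumed relation: new size = old size * downsample / upsample."""
    return tuple(v * d // u for v, u, d in zip(voxel_size, upsample, downsample))


data = np.arange(16).reshape(4, 4)
out = resample_nearest(data, upsample=(2, 1), downsample=(1, 2))
new_voxel_size = resampled_voxel_size((8, 8), (2, 1), (1, 2))
print(out.shape)       # (8, 2)
print(new_voxel_size)  # (4, 16)
```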

array_type
source_array_config: dacapo.experiments.datasplits.datasets.arrays.array_config.ArrayConfig
upsample: funlib.geometry.Coordinate
downsample: funlib.geometry.Coordinate
interp_order: bool

class dacapo.experiments.datasplits.datasets.arrays.IntensitiesArray(array_config)

This is a wrapper around another array that normalizes intensities to the range (0, 1) and converts them to float32. Use this if your intensities are stored as uint8 or similar and you want your model to receive floats as input.
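
A minimal numpy sketch of this kind of normalization follows. The linear mapping from [min, max] to [0, 1] is an assumption about how the min and max values are applied, and normalize_intensities is a hypothetical helper, not a DaCapo function.

```python
import numpy as np


def normalize_intensities(data, min_value, max_value):
    """Map [min_value, max_value] linearly onto [0, 1] and return float32."""
    data = data.astype(np.float32)
    return (data - min_value) / (max_value - min_value)


raw = np.array([0, 51, 255], dtype=np.uint8)
normalized = normalize_intensities(raw, min_value=0, max_value=255)
print(normalized.dtype)  # float32
```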

property attrs

Return a dictionary of metadata attributes stored on this array.

property axes

Returns the axes of this dataset as a string of characters, in the order in which they are indexed. Permitted characters are:

  • zyx for spatial dimensions

  • c for channels

  • s for samples

property dims: int

Returns the number of spatial dimensions.

property voxel_size: funlib.geometry.Coordinate

The size of a voxel in physical units.

property roi: funlib.geometry.Roi

The total ROI of this array, in world units.

property writable: bool

Can we write to this Array?

property dtype

The dtype of this array, in numpy dtypes

property num_channels: int

The number of channels provided by this dataset. Should return None if the channel dimension doesn’t exist.

property data

Get a numpy like readable and writable view into this array.

class dacapo.experiments.datasplits.datasets.arrays.IntensitiesArrayConfig

This config class provides the necessary configuration for normalizing the intensities of a source array to the range (0, 1) using the given min and max values.

array_type
source_array_config: dacapo.experiments.datasplits.datasets.arrays.array_config.ArrayConfig
min: float
max: float

class dacapo.experiments.datasplits.datasets.arrays.MissingAnnotationsMask(array_config)

This is a wrapper around a ZarrArray containing uint annotations. Complementary to the BinarizeArray class, where we convert labels into individual channels for training, we may find crops where a specific label is present but not annotated. In that case you might want to avoid training specific channels for specific training volumes. See the fibsem_tools package for the appropriate metadata format for indicating the presence of labels in your ground truth: https://github.com/janelia-cosem/fibsem-tools

property axes

Returns the axes of this dataset as a string of characters, in the order in which they are indexed. Permitted characters are:

  • zyx for spatial dimensions

  • c for channels

  • s for samples

property dims: int

Returns the number of spatial dimensions.

property voxel_size: funlib.geometry.Coordinate

The size of a voxel in physical units.

property roi: funlib.geometry.Roi

The total ROI of this array, in world units.

property writable: bool

Can we write to this Array?

property dtype

The dtype of this array, in numpy dtypes

property num_channels: int

The number of channels provided by this dataset. Should return None if the channel dimension doesn’t exist.

property data

Get a numpy like readable and writable view into this array.

property attrs

Return a dictionary of metadata attributes stored on this array.

property channels

class dacapo.experiments.datasplits.datasets.arrays.MissingAnnotationsMaskConfig

This config class provides the necessary configuration for creating a mask of missing annotations from an annotated dataset, with one channel per label grouping.

array_type
source_array_config: dacapo.experiments.datasplits.datasets.arrays.array_config.ArrayConfig
groupings: List[Tuple[str, List[int]]]

class dacapo.experiments.datasplits.datasets.arrays.OnesArray(array_config)

This is a wrapper around another source_array that simply provides ones with the same metadata as the source_array.
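
In numpy terms, the data such a wrapper exposes corresponds to np.ones_like applied to the source data. A tiny sketch with a made-up source array:

```python
import numpy as np

source = np.array([[0, 7], [3, 0]], dtype=np.uint8)

# Same shape and dtype as the source, with every voxel set to 1.
ones = np.ones_like(source)
```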

property attrs

Return a dictionary of metadata attributes stored on this array.

property source_array: dacapo.experiments.datasplits.datasets.arrays.array.Array

property axes

Returns the axes of this dataset as a string of characters, in the order in which they are indexed. Permitted characters are:

  • zyx for spatial dimensions

  • c for channels

  • s for samples

property dims

Returns the number of spatial dimensions.

property voxel_size

The size of a voxel in physical units.

property roi

The total ROI of this array, in world units.

property writable: bool

Can we write to this Array?

property data

Get a numpy like readable and writable view into this array.

property dtype

The dtype of this array, in numpy dtypes

property num_channels

The number of channels provided by this dataset. Should return None if the channel dimension doesn’t exist.

classmethod like(array: dacapo.experiments.datasplits.datasets.arrays.array.Array)

class dacapo.experiments.datasplits.datasets.arrays.OnesArrayConfig

This array reads data from the source array and then returns a np.ones_like() version.

array_type
source_array_config: dacapo.experiments.datasplits.datasets.arrays.array_config.ArrayConfig

class dacapo.experiments.datasplits.datasets.arrays.ConcatArray(array_config)

This is a wrapper around other source_arrays that concatenates them along the channel dimension.
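
The following numpy sketch illustrates combining named single-channel arrays into one multi-channel array. It is not the ConcatArray implementation. The zero-filled fallback for a channel without a source is an assumption made for illustration, and np.stack is used because the example sources have no channel axis of their own.

```python
import numpy as np

# Hypothetical per-channel source arrays, all with the same spatial shape (y, x).
source_arrays = {
    "mito": np.array([[1, 0], [0, 0]], dtype=np.uint8),
    "er": np.array([[0, 1], [0, 0]], dtype=np.uint8),
}
channels = ["mito", "er", "nucleus"]  # "nucleus" has no source array

# Assumption: channels without a source fall back to a default (zeros here).
default = np.zeros((2, 2), dtype=np.uint8)

# Stack in the order given by `channels`, creating a leading channel axis.
concatenated = np.stack(
    [source_arrays.get(name, default) for name in channels], axis=0
)
print(concatenated.shape)  # (3, 2, 2)
```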

property attrs

Return a dictionary of metadata attributes stored on this array.

property source_arrays: Dict[str, dacapo.experiments.datasplits.datasets.arrays.array.Array]

property source_array: dacapo.experiments.datasplits.datasets.arrays.array.Array

property axes

Returns the axes of this dataset as a string of characters, in the order in which they are indexed. Permitted characters are:

  • zyx for spatial dimensions

  • c for channels

  • s for samples

property dims

Returns the number of spatial dimensions.

property voxel_size

The size of a voxel in physical units.

property roi

The total ROI of this array, in world units.

property writable: bool

Can we write to this Array?

property data

Get a numpy like readable and writable view into this array.

property dtype

The dtype of this array, in numpy dtypes

property num_channels

The number of channels provided by this dataset. Should return None if the channel dimension doesn’t exist.

class dacapo.experiments.datasplits.datasets.arrays.ConcatArrayConfig

This config class provides the necessary configuration for concatenating multiple source arrays along the channel dimension.

array_type
channels: List[str]
source_array_configs: Dict[str, dacapo.experiments.datasplits.datasets.arrays.array_config.ArrayConfig]
default_config: dacapo.experiments.datasplits.datasets.arrays.array_config.ArrayConfig | None

class dacapo.experiments.datasplits.datasets.arrays.LogicalOrArray(array_config)

property axes

Returns the axes of this dataset as a string of characters, in the order in which they are indexed. Permitted characters are:

  • zyx for spatial dimensions

  • c for channels

  • s for samples

property dims: int

Returns the number of spatial dimensions.

property voxel_size: funlib.geometry.Coordinate

The size of a voxel in physical units.

property roi: funlib.geometry.Roi

The total ROI of this array, in world units.

property writable: bool

Can we write to this Array?

property dtype

The dtype of this array, in numpy dtypes

property num_channels

The number of channels provided by this dataset. Should return None if the channel dimension doesn’t exist.

property data

Get a numpy like readable and writable view into this array.

property attrs

Return a dictionary of metadata attributes stored on this array.

class dacapo.experiments.datasplits.datasets.arrays.LogicalOrArrayConfig

This config class takes a source array and performs a logical or over the channels. Useful for taking the union of multiple masks.
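
A logical or over the channel axis can be sketched in numpy as follows. The example mask and the (c, y, x) axis order are assumptions made for illustration.

```python
import numpy as np

# Hypothetical multi-channel mask with axes (c, y, x).
masks = np.array([
    [[1, 0], [0, 0]],
    [[0, 0], [0, 1]],
], dtype=np.uint8)

# Logical OR over the channel axis; the result has no channel axis.
union = np.logical_or.reduce(masks, axis=0).astype(np.uint8)
```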

array_type
source_array_config: dacapo.experiments.datasplits.datasets.arrays.array_config.ArrayConfig

class dacapo.experiments.datasplits.datasets.arrays.CropArray(array_config)

Used to crop a larger array to a smaller array.

property attrs

Return a dictionary of metadata attributes stored on this array.

property axes

Returns the axes of this dataset as a string of characters, in the order in which they are indexed. Permitted characters are:

  • zyx for spatial dimensions

  • c for channels

  • s for samples

property dims: int

Returns the number of spatial dimensions.

property voxel_size: funlib.geometry.Coordinate

The size of a voxel in physical units.

property roi: funlib.geometry.Roi

The total ROI of this array, in world units.

property writable: bool

Can we write to this Array?

property dtype

The dtype of this array, in numpy dtypes

property num_channels: int

The number of channels provided by this dataset. Should return None if the channel dimension doesn’t exist.

property data

Get a numpy like readable and writable view into this array.

property channels

class dacapo.experiments.datasplits.datasets.arrays.CropArrayConfig

This config class provides the necessary configuration for cropping an Array to a smaller ROI. Especially useful for validation volumes that may be too large for quick evaluation.
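
Cropping to a ROI given in world units amounts to converting the ROI into voxel indices. The sketch below uses plain tuples instead of funlib.geometry.Roi and assumes the ROI is aligned to the voxel grid; crop_to_roi is a hypothetical helper.

```python
import numpy as np


def crop_to_roi(data, array_offset, voxel_size, roi_offset, roi_shape):
    """Crop `data` to a world-unit ROI, assuming the ROI is voxel-aligned."""
    slices = []
    for off, vs, r_off, r_shape in zip(array_offset, voxel_size, roi_offset, roi_shape):
        start = (r_off - off) // vs  # world units -> voxel index
        stop = start + r_shape // vs
        slices.append(slice(start, stop))
    return data[tuple(slices)]


# 10 x 10 voxels of size (4, 4): covers world offset (0, 0), world shape (40, 40).
data = np.arange(100).reshape(10, 10)
crop = crop_to_roi(
    data,
    array_offset=(0, 0),
    voxel_size=(4, 4),
    roi_offset=(8, 16),
    roi_shape=(16, 8),
)
print(crop.shape)  # (4, 2)
```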

array_type
source_array_config: dacapo.experiments.datasplits.datasets.arrays.array_config.ArrayConfig
roi: funlib.geometry.Roi

class dacapo.experiments.datasplits.datasets.arrays.MergeInstancesArray(array_config)

property axes

Returns the axes of this dataset as a string of characters, in the order in which they are indexed. Permitted characters are:

  • zyx for spatial dimensions

  • c for channels

  • s for samples

property dims: int

Returns the number of spatial dimensions.

property voxel_size: funlib.geometry.Coordinate

The size of a voxel in physical units.

property roi: funlib.geometry.Roi

The total ROI of this array, in world units.

property writable: bool

Can we write to this Array?

property dtype

The dtype of this array, in numpy dtypes

property num_channels

The number of channels provided by this dataset. Should return None if the channel dimension doesn’t exist.

property data

Get a numpy like readable and writable view into this array.

property attrs

Return a dictionary of metadata attributes stored on this array.

class dacapo.experiments.datasplits.datasets.arrays.MergeInstancesArrayConfig

Base class for array configurations. Each subclass of an Array should have a corresponding config class derived from ArrayConfig.

array_type
source_array_configs: List[dacapo.experiments.datasplits.datasets.arrays.array_config.ArrayConfig]

class dacapo.experiments.datasplits.datasets.arrays.DVIDArray(array_config)

This is a DVID array

property axes

Returns the axes of this dataset as a string of characters, in the order in which they are indexed. Permitted characters are:

  • zyx for spatial dimensions

  • c for channels

  • s for samples

property dims: int

Returns the number of spatial dimensions.

property writable: bool

Can we write to this Array?

property dtype: Any

The dtype of this array, in numpy dtypes

property num_channels: int | None

The number of channels provided by this dataset. Should return None if the channel dimension doesn’t exist.

property spatial_axes: List[str]

abstract property data: Any

Get a numpy like readable and writable view into this array.

attrs()

Return a dictionary of metadata attributes stored on this array.

voxel_size() → funlib.geometry.Coordinate

The size of a voxel in physical units.

roi() → funlib.geometry.Roi

The total ROI of this array, in world units.

abstract add_metadata(metadata: Dict[str, Any]) → None

class dacapo.experiments.datasplits.datasets.arrays.DVIDArrayConfig

This config class provides the necessary configuration for a DVID array

array_type
source: Tuple[str, str, str]

verify() → Tuple[bool, str]

Check whether this is a valid Array

class dacapo.experiments.datasplits.datasets.arrays.SumArray(array_config)

property axes

Returns the axes of this dataset as a string of characters, in the order in which they are indexed. Permitted characters are:

  • zyx for spatial dimensions

  • c for channels

  • s for samples

property dims: int

Returns the number of spatial dimensions.

property voxel_size: funlib.geometry.Coordinate

The size of a voxel in physical units.

property roi: funlib.geometry.Roi

The total ROI of this array, in world units.

property writable: bool

Can we write to this Array?

property dtype

The dtype of this array, in numpy dtypes

property num_channels

The number of channels provided by this dataset. Should return None if the channel dimension doesn’t exist.

property data

Get a numpy like readable and writable view into this array.

property attrs

Return a dictionary of metadata attributes stored on this array.

class dacapo.experiments.datasplits.datasets.arrays.SumArrayConfig

Base class for array configurations. Each subclass of an Array should have a corresponding config class derived from ArrayConfig.

array_type
source_array_configs: List[dacapo.experiments.datasplits.datasets.arrays.array_config.ArrayConfig]

class dacapo.experiments.datasplits.datasets.arrays.NumpyArray(array_config)

This is just a wrapper for a numpy array to make it fit the DaCapo Array interface.

property attrs

Return a dictionary of metadata attributes stored on this array.

property axes

Returns the axes of this dataset as a string of characters, in the order in which they are indexed. Permitted characters are:

  • zyx for spatial dimensions

  • c for channels

  • s for samples

property dims

Returns the number of spatial dimensions.

property voxel_size

The size of a voxel in physical units.

property roi

The total ROI of this array, in world units.

property writable: bool

Can we write to this Array?

property data

Get a numpy like readable and writable view into this array.

property dtype

The dtype of this array, in numpy dtypes

property num_channels

The number of channels provided by this dataset. Should return None if the channel dimension doesn’t exist.

classmethod from_gp_array(array: gunpowder.Array)

classmethod from_np_array(array: numpy.ndarray, roi, voxel_size, axes)
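
When wrapping a raw numpy array, the roi, voxel_size, and axes arguments have to agree with the array's shape. The sketch below shows one way to check this before constructing the wrapper. check_np_array is a hypothetical helper that takes the ROI shape as a plain tuple in world units, and it does not reflect any validation DaCapo itself performs.

```python
import numpy as np


def check_np_array(array, roi_shape, voxel_size, axes):
    """Check that axes, ROI shape (world units) and voxel size match array.shape."""
    if len(axes) != array.ndim:
        return False
    # Sizes of the spatial axes, in voxels.
    spatial_shape = [n for n, axis in zip(array.shape, axes) if axis in ("z", "y", "x")]
    # ROI shape in world units divided by voxel size gives the expected voxel counts.
    expected = [r // v for r, v in zip(roi_shape, voxel_size)]
    return spatial_shape == expected


array = np.zeros((2, 10, 20), dtype=np.float32)
consistent = check_np_array(array, roi_shape=(40, 80), voxel_size=(4, 4), axes=["c", "y", "x"])
inconsistent = check_np_array(array, roi_shape=(40, 40), voxel_size=(4, 4), axes=["c", "y", "x"])
```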