dacapo.blockwise
Submodules
Package Contents
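Classes
DaCapoBlockwiseTask | Definition of a daisy task that is to be run in a block-wise fashion.
Functions
run_blockwise | Run a function in parallel over a large volume.
segment_blockwise | Run a segmentation function in parallel over a large volume.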
- class dacapo.blockwise.DaCapoBlockwiseTask(worker_file: str | pathlib.Path, total_roi: daisy.Roi, read_roi: daisy.Roi, write_roi: daisy.Roi, num_workers: int = 16, max_retries: int = 2, timeout=None, upstream_tasks=None, *args, **kwargs)
Definition of a daisy task that is to be run in a block-wise fashion.
- Parameters:
name (string) – The unique name of the task.
total_roi (daisy.Roi) – The region of interest (ROI) of the complete volume to process.
read_roi (daisy.Roi) – The ROI every block needs to read data from. Will be shifted over the total_roi to cover the whole volume.
write_roi (daisy.Roi) – The ROI every block writes data to. Will be shifted over the total_roi to cover the whole volume.
process_function (function) –
A function that will be called as:
process_function(block)
with block being the shifted read and write ROI for each location in the volume.
If read_write_conflict is True, the callee can assume that there are no read/write conflicts, i.e., at any given point in time the read_roi does not overlap with the write_roi of another process.
check_function (function, optional) –
A function that will be called as:
check_function(block)
This function should return True if the block was completed. This is used internally to avoid processing blocks that are already done and to check if a block was correctly processed.
If a tuple of two functions is given, the first one will be called to check if the block needs to be run, and if so, the second one will be called after it was run to check if the run succeeded.
init_callback_fn (function, optional) –
A function that Daisy will call once when the task is started. It will be called as:
init_callback_fn(context)
Where context is the daisy.Context string that can be used by the daisy clients to connect to the server.
read_write_conflict (bool, optional) – Whether the read and write ROIs are conflicting, i.e., accessing the same resource. If set to False, all blocks can run at the same time in parallel. In this case, providing a read_roi is simply a means of convenience to ensure no out-of-bound accesses and to avoid re-computation of it in each block.
fit (string, optional) –
How to handle cases where shifting blocks by the size of write_roi does not tile the total_roi. Possible options are:
"valid": Skip blocks that would lie outside of total_roi. This is the default:
    |---------------------------|     total ROI
    |rrrr|wwwwww|rrrr|                block 1
           |rrrr|wwwwww|rrrr|         block 2
                                      no further block
"overhang": Add all blocks that overlap with total_roi, even if they leave it. Client code has to take care of safe access beyond total_roi in this case:
    |---------------------------|     total ROI
    |rrrr|wwwwww|rrrr|                block 1
           |rrrr|wwwwww|rrrr|         block 2
                  |rrrr|wwwwww|rrrr|  block 3 (overhanging)
"shrink": Like "overhang", but shrink the boundary blocks' read and write ROIs such that they are guaranteed to lie within total_roi. The shrinking will preserve the context, i.e., the difference between the read ROI and write ROI stays the same:
    |---------------------------|     total ROI
    |rrrr|wwwwww|rrrr|                block 1
           |rrrr|wwwwww|rrrr|         block 2
                  |rrrr|www|rrrr|     block 3 (shrunk)
num_workers (int, optional) – The number of parallel processes to run.
max_retries (int, optional) – The maximum number of times a task will be retried if failed (either due to failed post check or application crashes or network failure)
timeout (int, optional) – Time in seconds to wait for a block to be returned from a worker. The worker is killed (and the block retried) if this time is exceeded.
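The three fit modes can be illustrated with a simplified one-dimensional sketch. This is not daisy's implementation: the function name tile_1d, the use of plain integer intervals instead of daisy.Roi objects, and the assumption of a symmetric context on both sides of the write ROI are all choices made for illustration only.

```python
def tile_1d(total_len, context, write_len, fit="valid"):
    """Tile the interval [0, total_len) with blocks along a single axis.

    Hypothetical helper for illustration only. Each block is returned as
    (read_begin, read_end, write_begin, write_end), where the read interval
    is the write interval grown by `context` on both sides.
    """
    if fit not in ("valid", "overhang", "shrink"):
        raise ValueError(f"unknown fit mode: {fit!r}")
    if write_len <= 0 or context < 0:
        raise ValueError("write_len must be positive and context non-negative")

    # Writes may only land where the surrounding context still fits
    # inside the total interval.
    writable_end = total_len - context

    blocks = []
    write_begin = context
    while write_begin < writable_end:
        write_end = write_begin + write_len
        if write_end > writable_end:
            if fit == "valid":
                break  # skip blocks that would leave the total interval
            if fit == "shrink":
                write_end = writable_end  # clip the write interval, keep the context
            # "overhang": keep the block unchanged, it extends past the end
        blocks.append(
            (write_begin - context, write_end + context, write_begin, write_end)
        )
        write_begin += write_len
    return blocks


for mode in ("valid", "overhang", "shrink"):
    print(mode, tile_1d(24, 4, 6, fit=mode))
```

With a total length of 24, a context of 4, and a write size of 6, "valid" yields two blocks, while "overhang" and "shrink" each add a third. The overhanging block reads up to position 26, past the end of the total interval, whereas the shrunk block writes only positions 16 to 20 and reads up to 24.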
- dacapo.blockwise.run_blockwise(worker_file: str | pathlib.Path, total_roi: funlib.geometry.Roi, read_roi: funlib.geometry.Roi, write_roi: funlib.geometry.Roi, num_workers: int = 16, max_retries: int = 1, timeout=None, upstream_tasks=None, *args, **kwargs)
Run a function in parallel over a large volume.
- Parameters:
worker_file (str or Path) – The path to the file containing the necessary worker functions: spawn_worker and start_worker. Optionally, the file can also contain a check_function and an init_callback_fn.
total_roi (Roi) – The ROI to process.
read_roi (Roi) – The ROI to read from for a block.
write_roi (Roi) – The ROI to write to for a block.
num_workers (int) – The number of workers to use.
max_retries (int) – The maximum number of times a task will be retried if failed (either due to failed post check or application crashes or network failure).
*args – Additional positional arguments to pass to worker_function.
**kwargs – Additional keyword arguments to pass to worker_function.
- Returns:
Bool.
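The interplay of max_retries with the completion check can be pictured with a small single-process driver. This is only a sketch of the idea: run_with_retries and make_flaky_worker are hypothetical names, nothing here calls daisy or dacapo, and it assumes that the boolean result signals overall success, which this page does not state.

```python
def run_with_retries(blocks, process, check, max_retries=1):
    """Process blocks one by one, retrying failures (hypothetical driver).

    `check(block)` is used twice: before running, to skip blocks that are
    already done, and after running, to verify that the run succeeded.
    Returns True only if every block ends up passing the check.
    """
    all_ok = True
    for block in blocks:
        if check(block):
            continue  # already completed, nothing to do
        succeeded = False
        for _attempt in range(1 + max_retries):
            try:
                process(block)
            except Exception:
                continue  # a crash counts as one failed attempt
            if check(block):
                succeeded = True
                break
        all_ok = all_ok and succeeded
    return all_ok


def make_flaky_worker():
    """Build a process/check pair where block 2 crashes on its first attempt."""
    done = set()
    attempts = {}

    def process(block):
        attempts[block] = attempts.get(block, 0) + 1
        if block == 2 and attempts[block] == 1:
            raise RuntimeError("simulated crash")
        done.add(block)

    def check(block):
        return block in done

    return process, check


process, check = make_flaky_worker()
recovered = run_with_retries([1, 2, 3], process, check, max_retries=1)

process, check = make_flaky_worker()
gave_up = run_with_retries([1, 2, 3], process, check, max_retries=0)

print(recovered, gave_up)
```

In this sketch one retry is enough to recover from the simulated crash, while max_retries=0 leaves block 2 unfinished and the overall result is False.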
- dacapo.blockwise.segment_blockwise(segment_function_file: str | pathlib.Path, context: funlib.geometry.Coordinate, total_roi: funlib.geometry.Roi, read_roi: funlib.geometry.Roi, write_roi: funlib.geometry.Roi, num_workers: int = 16, max_retries: int = 2, timeout=None, upstream_tasks=None, *args, **kwargs)
Run a segmentation function in parallel over a large volume.
- Parameters:
segment_function_file (str or Path) – The path to the file containing the necessary worker functions: spawn_worker and start_worker. Optionally, the file can also contain a check_function and an init_callback_fn.
context (Coordinate) – The context to add to the read and write ROI.
total_roi (Roi) – The ROI to process.
read_roi (Roi) – The ROI to read from for a block.
write_roi (Roi) – The ROI to write to for a block.
num_workers (int) – The number of workers to use.
max_retries (int) – The maximum number of times a task will be retried if failed (either due to failed post check or application crashes or network failure).
timeout (int) – The maximum time in seconds to wait for a worker to complete a task.
upstream_tasks (List) – List of upstream tasks.
*args – Additional positional arguments to pass to worker_function.
**kwargs – Additional keyword arguments to pass to worker_function.
- Returns:
Bool.
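As a rough illustration of what adding a context to an ROI means, the sketch below grows an ROI by a per-axis context on both sides. It uses plain tuples rather than funlib.geometry objects, grow_roi is a hypothetical helper, and how segment_blockwise applies context internally is not stated on this page, so symmetric growth is an assumption.

```python
def grow_roi(offset, shape, context):
    """Grow an ROI, given as (offset, shape) tuples, by `context` on each side.

    Hypothetical helper for illustration; all three arguments must have
    one entry per axis.
    """
    if not (len(offset) == len(shape) == len(context)):
        raise ValueError("offset, shape and context must have the same length")
    # Move the start back by the context and extend the size by twice the context.
    grown_offset = tuple(o - c for o, c in zip(offset, context))
    grown_shape = tuple(s + 2 * c for s, c in zip(shape, context))
    return grown_offset, grown_shape


# A 64-voxel cube written per block, read with 8 voxels of context per side.
write_offset, write_shape = (0, 0, 0), (64, 64, 64)
read_offset, read_shape = grow_roi(write_offset, write_shape, (8, 8, 8))
print(read_offset, read_shape)
```

Under this assumption the read ROI starts 8 voxels before the write ROI on every axis and spans 80 voxels, so each block sees its neighbours' borders while writing only its own 64-voxel interior.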