dacapo.blockwise.blockwise_task

Module Contents

Classes

DaCapoBlockwiseTask

Definition of a daisy task that is to be run in a block-wise fashion.

class dacapo.blockwise.blockwise_task.DaCapoBlockwiseTask(worker_file: str | pathlib.Path, total_roi: daisy.Roi, read_roi: daisy.Roi, write_roi: daisy.Roi, num_workers: int = 16, max_retries: int = 2, timeout=None, upstream_tasks=None, *args, **kwargs)

Definition of a daisy task that is to be run in a block-wise fashion.

Parameters:
  • name (string) – The unique name of the task.

  • total_roi (daisy.Roi) – The region of interest (ROI) of the complete volume to process.

  • read_roi (daisy.Roi) – The ROI every block needs to read data from. Will be shifted over the total_roi to cover the whole volume.

  • write_roi (daisy.Roi) – The ROI every block writes data to. Will be shifted over the total_roi to cover the whole volume.

  • process_function (function) –

    A function that will be called as:

    process_function(block)
    

    with block being the shifted read and write ROI for each location in the volume.

    If read_write_conflict is True, the callee can assume that there are no read/write concurrencies, i.e., at any given point in time the read_roi does not overlap with the write_roi of another process.

  • check_function (function, optional) –

    A function that will be called as:

    check_function(block)
    

    This function should return True if the block was completed. This is used internally to avoid processing blocks that are already done and to check if a block was correctly processed.

    If a tuple of two functions is given, the first one will be called to check if the block needs to be run, and if so, the second one will be called after it was run to check if the run succeeded.

  • init_callback_fn (function, optional) –

    A function that Daisy will call once when the task is started. It will be called as:

    init_callback_fn(context)
    

    Where context is the daisy.Context string that can be used by the daisy clients to connect to the server.

  • read_write_conflict (bool, optional) – Whether the read and write ROIs are conflicting, i.e., accessing the same resource. If set to False, all blocks can run at the same time in parallel. In this case, providing a read_roi is simply a means of convenience to ensure no out-of-bound accesses and to avoid re-computation of it in each block.

  • fit (string, optional) –

    How to handle cases where shifting blocks by the size of write_roi does not tile the total_roi. Possible options are:

    “valid”: Skip blocks that would lie outside of total_roi. This is the default:

    |---------------------------|     total ROI
    
    |rrrr|wwwwww|rrrr|                block 1
           |rrrr|wwwwww|rrrr|         block 2
                                      no further block
    

    “overhang”: Add all blocks that overlap with total_roi, even if they extend beyond it. Client code has to take care of safe access beyond total_roi in this case:

    |---------------------------|     total ROI
    
    |rrrr|wwwwww|rrrr|                block 1
           |rrrr|wwwwww|rrrr|         block 2
                  |rrrr|wwwwww|rrrr|  block 3 (overhanging)
    

    “shrink”: Like “overhang”, but shrink the boundary blocks’ read and write ROIs such that they are guaranteed to lie within total_roi. The shrinking will preserve the context, i.e., the difference between the read ROI and write ROI stays the same:

    |---------------------------|     total ROI
    
    |rrrr|wwwwww|rrrr|                block 1
           |rrrr|wwwwww|rrrr|         block 2
                  |rrrr|www|rrrr|     block 3 (shrunk)
    

  • num_workers (int, optional) – The number of parallel processes to run.

  • max_retries (int, optional) – The maximum number of times a block will be retried if it fails (due to a failed post check, an application crash, or a network failure).

  • timeout (int, optional) – Time in seconds to wait for a block to be returned from a worker. The worker is killed (and the block retried) if this time is exceeded.
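
The three fit options described above can be illustrated with a small one-dimensional sketch. It does not use daisy and is not daisy's implementation: the function name tile_1d, the use of plain (begin, end) integer tuples in place of daisy.Roi, and the rule for when an overhanging block is still kept (its write ROI must begin before the last writable position, total_len - context) are assumptions made for illustration only.

```python
def tile_1d(total_len, write_size, context, fit="valid"):
    """Tile the interval [0, total_len) with blocks in one dimension.

    Hypothetical helper, not part of daisy or dacapo. Each block is
    returned as a (read, write) pair of half-open (begin, end) tuples,
    where the read interval is the write interval grown by `context`
    on both sides.
    """
    if fit not in ("valid", "overhang", "shrink"):
        raise ValueError(f"unknown fit mode: {fit!r}")
    if write_size <= 0 or context < 0:
        raise ValueError("write_size must be positive, context non-negative")

    blocks = []
    offset = 0
    while True:
        read = (offset, offset + write_size + 2 * context)
        write = (offset + context, offset + context + write_size)

        # Assumption of this sketch: stop once no writable position is left.
        if write[0] >= total_len - context:
            break

        if read[1] > total_len:
            if fit == "valid":
                # Skip blocks that would leave the total interval.
                break
            if fit == "shrink":
                # Clip both intervals, keeping the context unchanged.
                read = (read[0], total_len)
                write = (write[0], total_len - context)
            # fit == "overhang": keep the block as it is.

        blocks.append((read, write))
        offset += write_size  # blocks are shifted by the write size

    return blocks


if __name__ == "__main__":
    for mode in ("valid", "overhang", "shrink"):
        print(mode, tile_1d(total_len=25, write_size=6, context=4, fit=mode))
```

With total_len=25, write_size=6 and context=4, this sketch yields two blocks for “valid”. Both “overhang” and “shrink” add a third block: read (12, 26) with write (16, 22) when overhanging, and read (12, 25) with write (16, 21) when shrunk. In the shrunk block the context of 4 is preserved on both sides.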