dacapo.utils.voi
================

.. py:module:: dacapo.utils.voi


Functions
---------

.. autoapisummary::

   dacapo.utils.voi.voi
   dacapo.utils.voi.split_vi
   dacapo.utils.voi.vi_tables
   dacapo.utils.voi.contingency_table
   dacapo.utils.voi.divide_columns
   dacapo.utils.voi.divide_rows
   dacapo.utils.voi.xlogx


Module Contents
---------------

.. py:function:: voi(reconstruction, groundtruth, ignore_reconstruction=[], ignore_groundtruth=[0])

   Return the conditional entropies of the variation of information metric. [1]

   Let X be a reconstruction, and Y a ground truth labelling. The variation of
   information between the two is the sum of two conditional entropies:

       VI(X, Y) = H(X|Y) + H(Y|X).

   The first one, H(X|Y), is a measure of oversegmentation; the second one,
   H(Y|X), is a measure of undersegmentation. These measures are referred to as
   the variation of information split and merge error, respectively.

   :param reconstruction: A candidate segmentation.
   :type reconstruction: np.ndarray, int type, arbitrary shape
   :param groundtruth: The ground truth segmentation.
   :type groundtruth: np.ndarray, int type, same shape as `reconstruction`
   :param ignore_reconstruction: Any points having a label in this list are
                                 ignored in the evaluation. Empty by default, so
                                 no reconstruction labels are ignored.
   :type ignore_reconstruction: list of int, optional
   :param ignore_groundtruth: Any points having a label in this list are
                              ignored in the evaluation. By default, only the
                              label 0 in the ground truth is ignored.
   :type ignore_groundtruth: list of int, optional

   :returns: **(split, merge)** -- The variation of information split and merge
             error, i.e., H(X|Y) and H(Y|X)
   :rtype: tuple of float

   :raises ValueError: If `reconstruction` and `groundtruth` have different shapes.

   .. rubric:: References

   [1] Meila, M. (2007). Comparing clusterings - an information based
   distance. Journal of Multivariate Analysis 98, 873-895.

.. py:function:: split_vi(x, y=None, ignore_x=[0], ignore_y=[0])

   Return the symmetric conditional entropies associated with the VI.

   The variation of information is defined as VI(X,Y) = H(X|Y) + H(Y|X).
   If Y is the ground-truth segmentation, then H(Y|X) can be interpreted
   as the amount of under-segmentation of Y and H(X|Y) as the amount
   of over-segmentation. In other words, a perfect over-segmentation
   will have H(Y|X)=0 and a perfect under-segmentation will have H(X|Y)=0.

   If y is None, x is assumed to be a contingency table.

   :param x: Label field (int type) or contingency table (float). `x` is
             interpreted as a contingency table (summing to 1.0) if and only if
             `y` is not provided.
   :type x: np.ndarray
   :param y: A label field to compare to `x`.
   :type y: np.ndarray of int, same shape as x, optional
   :param ignore_x: Any points having a label in this list are ignored in
                    the evaluation. Ignore 0-labeled points by default.
   :type ignore_x: list of int, optional
   :param ignore_y: Any points having a label in this list are ignored in
                    the evaluation. Ignore 0-labeled points by default.
   :type ignore_y: list of int, optional

   :returns: **sv** -- The conditional entropies of Y|X and X|Y.
   :rtype: np.ndarray of float, shape (2,)

   .. seealso:: :obj:`vi`

.. py:function:: vi_tables(x, y=None, ignore_x=[0], ignore_y=[0])

   Return probability tables used for calculating VI.

   If y is None, x is assumed to be a contingency table.

   :param x: A label field (int type), or, if `y` is not provided, a
             contingency table (sparse.csc_matrix) that may or may not sum to 1.
   :type x: np.ndarray
   :param y: A label field (int type) with the same shape as `x`. If `y` is
             not provided, `x` is treated as a contingency table.
   :type y: np.ndarray
   :param ignore_x: Rows to ignore in the contingency table.
                    These are labels that are not counted when evaluating VI.
   :type ignore_x: list of int, optional
   :param ignore_y: Columns to ignore in the contingency table.
                    These are labels that are not counted when evaluating VI.
   :type ignore_y: list of int, optional

   :returns: * **pxy** (*sparse.csc_matrix of float*) -- The normalized contingency table.
             * **px, py, hxgy, hygx, lpygx, lpxgy** (*np.ndarray of float*) -- The
               proportions of each label in `x` and `y` (`px`, `py`), the
               per-segment conditional entropies of `x` given `y` and vice-versa,
               and the per-segment conditional probability p log p.

   :raises ValueError: If `x` and `y` have different shapes.

.. py:function:: contingency_table(seg, gt, ignore_seg=[0], ignore_gt=[0], norm=True)

   Return the contingency table for all regions in matched segmentations.

   :param seg: A candidate segmentation.
   :type seg: np.ndarray, int type, arbitrary shape
   :param gt: The ground truth segmentation.
   :type gt: np.ndarray, int type, same shape as `seg`
   :param ignore_seg: Values to ignore in `seg`. Voxels in `seg` having a value
                      in this list will not contribute to the contingency table.
                      (default: [0])
   :type ignore_seg: list of int, optional
   :param ignore_gt: Values to ignore in `gt`. Voxels in `gt` having a value
                     in this list will not contribute to the contingency table.
                     (default: [0])
   :type ignore_gt: list of int, optional
   :param norm: Whether to normalize the table so that it sums to 1.
   :type norm: bool, optional

   :returns: **cont** -- A contingency table. `cont[i, j]` will equal the number
             of voxels labeled `i` in `seg` and `j` in `gt`. (Or the proportion of
             such voxels if `norm=True`.)
   :rtype: scipy.sparse.csc_matrix

   :raises ValueError: If `seg` and `gt` have different shapes.

.. py:function:: divide_columns(matrix, row, in_place=False)

   Divide each column of `matrix` by the corresponding element in `row`.

   The result is as follows: out[i, j] = matrix[i, j] / row[j]

   :param matrix: The input matrix.
   :type matrix: np.ndarray, scipy.sparse.csc_matrix or csr_matrix, shape (M, N)
   :param row: The row dividing `matrix`.
   :type row: a 1D np.ndarray, shape (N,)
   :param in_place: Do the computation in-place.
   :type in_place: bool (optional, default False)

   :returns: **out** -- The result of dividing each column of `matrix` by `row`.
   :rtype: same type as `matrix`

   :raises ValueError: If `row` contains zeros.

.. py:function:: divide_rows(matrix, column, in_place=False)

   Divide each row of `matrix` by the corresponding element in `column`.

   The result is as follows: out[i, j] = matrix[i, j] / column[i]

   :param matrix: The input matrix.
   :type matrix: np.ndarray, scipy.sparse.csc_matrix or csr_matrix, shape (M, N)
   :param column: The column dividing `matrix`.
   :type column: a 1D np.ndarray, shape (M,)
   :param in_place: Do the computation in-place.
   :type in_place: bool (optional, default False)

   :returns: **out** -- The result of the row-wise division.
   :rtype: same type as `matrix`

   :raises ValueError: If `column` contains zeros.

.. py:function:: xlogx(x, out=None, in_place=False)

   Compute x * log_2(x).

   We define 0 * log_2(0) = 0

   :param x: The input array.
   :type x: np.ndarray or scipy.sparse.csc_matrix or csr_matrix
   :param out: If provided, use this array/matrix for the result.
   :type out: same type as x (optional)
   :param in_place: Operate directly on x.
   :type in_place: bool (optional, default False)

   :returns: **y** -- Result of x * log_2(x).
   :rtype: same type as x

   :raises ValueError: If x contains negative values.
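
The definitions documented in this module, VI(X, Y) = H(X|Y) + H(Y|X) with the
convention 0 * log_2(0) = 0, can be checked by hand on a tiny example. The
following is a minimal pure-Python sketch and not the dacapo implementation:
``xlogx_scalar``, ``entropy`` and ``conditional_entropies`` are hypothetical
helpers introduced for illustration only. It assumes that labels are given as
flat lists and that no labels are ignored.

.. code-block:: python

   import math
   from collections import Counter


   def xlogx_scalar(p):
       """Return p * log2(p), with 0 * log2(0) defined as 0 (hypothetical helper)."""
       if p < 0:
           raise ValueError("p must be non-negative")
       return p * math.log2(p) if p > 0 else 0.0


   def entropy(counts, total):
       """Shannon entropy in bits of a distribution given as raw counts."""
       return -sum(xlogx_scalar(c / total) for c in counts)


   def conditional_entropies(x, y):
       """Return (H(X|Y), H(Y|X)) for two equally long label sequences."""
       if len(x) != len(y):
           raise ValueError("x and y must have the same length")
       n = len(x)
       h_xy = entropy(Counter(zip(x, y)).values(), n)  # joint entropy H(X, Y)
       h_x = entropy(Counter(x).values(), n)
       h_y = entropy(Counter(y).values(), n)
       # Chain rule: H(X|Y) = H(X, Y) - H(Y) and H(Y|X) = H(X, Y) - H(X).
       return h_xy - h_y, h_xy - h_x


   # The reconstruction splits one ground-truth segment into two equal halves.
   reconstruction = [1, 1, 2, 2]
   groundtruth = [7, 7, 7, 7]
   split, merge = conditional_entropies(reconstruction, groundtruth)

Here the reconstruction is a pure over-segmentation, so the split error H(X|Y)
is 1 bit and the merge error H(Y|X) is 0. Swapping the two arguments swaps the
two values.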