tacco.tools.distance_matrix

distance_matrix(adata, max_distance, position_key=['x', 'y'], base_distance_key=None, result_key=None, annotation_key=None, annotation_distance=None, distance_scale=None, annotation_distance_scale=None, coo_result=False, low_mem=False, verbose=1, **kw_args)

Calculates a sparse or dense distance matrix.

Parameters:
  • adata – An AnnData

  • max_distance – The maximum distance to calculate. All larger distances are unset in the result (which acts as a 0 for sparse matrices…). None and np.inf result in dense distance computation (which can be infeasible for larger datasets).

  • position_key – The .obsm key or array-like of .obs keys with the position space coordinates

  • base_distance_key – The .obsp key containing a precomputed distance matrix to update with annotation distance. If None, the distances are recomputed with the positions found in position_key. Otherwise position_key is ignored. If .obsp[base_distance_key] does not exist, the distances are also recomputed and then written to .obsp[base_distance_key].

  • result_key – The .obsp key to contain the distance matrix. If None, a csr_matrix containing the distances is returned.

  • annotation_key – The .obs key for a categorical annotation to split the data before calculating distances. If None, the distances are calculated on the full dataset.

  • annotation_distance

    Specifies the effect of annotation_key by adding a distance between two observations of different type. It can be:

    • a scalar to use for all annotation pairs

    • a DataFrame to give every annotation pair its own finite distance. If some should retain infinite distance, use np.inf, np.nan or negative values

    • None to use an infinite distance between different annotations

    • a metric to calculate a distance between the annotation profiles. This is forwarded to cdist() as the metric argument, so everything available there is also possible here, e.g. ‘h2’.

  • distance_scale – The distance scale of the relevant local neighbourhoods. If supplied, annotation_distance is scaled such that its mean between different types has the same value as this distance_scale.

  • annotation_distance_scale – A scalar to facilitate conversion between distances in type-space and position-space. This parameter directly specifies the scaling factor of annotation_distance and overrides the distance_scale setting. If None, the bare annotation distances are used. If both this parameter and distance_scale are None and annotation_distance is a metric specification, an exception is raised, as position distance and annotation distance cannot be assumed to be comparable.

  • coo_result – Whether to return the result as coo_matrix instead of a csr_matrix. This is faster as it avoids conversion at the end, but if written to an adata.obsp key, the adata cannot be subsetted anymore… Ignored for dense distances.

  • low_mem – Whether to use memory optimizations which run longer and may use the hard disk, but have the potential to reduce memory consumption by a factor of 2.

  • verbose – Level of verbosity, with 0 (no output), 1 (some output), …

  • **kw_args – Additional keyword arguments are forwarded to on-the-fly distance calculation if necessary. Depending on max_distance, this goes to sparse_distance_matrix() or dense_distance_matrix().
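
The rescaling described for distance_scale can be illustrated outside of the function itself. The following is a minimal sketch, not the library's implementation: it uses scipy.spatial.distance.cdist as a stand-in for the cdist() mentioned above (which may be a different function; ‘h2’ is not a SciPy metric), and the profiles array and the value of distance_scale are hypothetical.

```python
import numpy as np
from scipy.spatial.distance import cdist

# Hypothetical annotation profiles: one row per annotation category.
profiles = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.5, 0.5, 0.0],
])

# Pairwise distances between annotation profiles (SciPy metric as a stand-in).
annotation_distance = cdist(profiles, profiles, metric="euclidean")

# Mean distance between *different* annotations, i.e. off-diagonal entries only.
off_diagonal = ~np.eye(len(profiles), dtype=bool)
mean_between_types = annotation_distance[off_diagonal].mean()

# Rescale so that this mean equals the chosen position-space scale.
distance_scale = 50.0  # hypothetical value, in the units of the coordinates
scaled = annotation_distance * (distance_scale / mean_between_types)
```

After this step, the mean of the scaled distances between different annotations equals distance_scale, which is the property the parameter description states. Passing annotation_distance_scale instead would correspond to multiplying by a fixed factor rather than one derived from the mean.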

Returns:

Depending on result_key, returns either a sparse or dense distance matrix, or an updated input adata containing the distance matrix under adata.obsp[result_key].
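
How a distance cutoff leads to unset entries that read back as 0 can be sketched with SciPy alone. This is an illustration under assumptions, not the code path of distance_matrix: it uses scipy.spatial.cKDTree.sparse_distance_matrix directly, and the positions and the cutoff value are hypothetical.

```python
import numpy as np
from scipy.spatial import cKDTree

# Four hypothetical 2D positions; the last one is far away from the others.
positions = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [10.0, 10.0]])
max_distance = 2.5

tree = cKDTree(positions)

# Only pairs within max_distance are stored in the sparse result.
coo = tree.sparse_distance_matrix(tree, max_distance, output_type="coo_matrix")

# Converting to CSR costs an extra step but gives a format with row slicing.
csr = coo.tocsr()

# Densify only for inspection; pairs beyond the cutoff show up as 0.
dense = csr.toarray()
```

In the dense view, the entry for the first two points holds their distance, while every entry involving the distant fourth point is 0. Such a 0 means "not computed", not "zero distance", so downstream code should treat unset entries accordingly.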