tacco.tools.spectral_clustering

spectral_clustering(adata, max_size, min_size=None, affinity_key=None, result_key=None, dim=None, cut_threshold=0.7, position_key=['x', 'y'], position_scale=None, position_range=3, max_aspect_ratio=5, verbose=0)

Performs spectral clustering on an affinity matrix.

Parameters:
  • adata – An AnnData with the affinity matrix under .obsp[affinity_key], or a (sparse) affinity matrix.

  • max_size – Clustering continues until no cluster has more elements than this.

  • min_size – Clusters smaller than this number are not subclustered. If None, max_size/5 is used.

  • affinity_key – The .obsp key with the affinities. Ignored if adata is the affinity matrix.

  • result_key – The .obs key to contain the clusters. If None, a Series containing the cluster labels is returned. Ignored if adata is the affinity matrix.

  • dim – The dimensionality of the manifold. If None, it is taken from the supplied position space coordinates (if available) or inferred on the fly. dim is used to decide whether to subcluster a given cluster based on its surface-to-volume ratio.

  • cut_threshold – Every proposed subclustering requires cutting a certain amount of affinity. This number scales the decision threshold: higher values mean more cuts, lower values mean fewer cuts. The threshold itself also scales with the (1/dim)-th root of the cluster size.

  • position_key – The .obsm key or array-like of .obs keys with the position space coordinates. This is used to efficiently get small subproblems by spatial binning. If position_key or position_scale is None, hierarchical clustering is used to iteratively split the problem into smaller subproblems. Ignored if adata is the affinity matrix.

  • position_scale – The expected feature size to use for splitting the problem spatially. If position_key or position_scale is None, hierarchical clustering is used to iteratively split the problem into smaller subproblems. Ignored if adata is the affinity matrix.

  • position_range – A cluster is subclustered when its spatial size (defined as twice the standard deviation in the largest spatial PCA direction) exceeds position_scale*position_range, and it is not subclustered if its spatial size is smaller than position_scale/position_range. Ignored if adata is the affinity matrix.

  • max_aspect_ratio – A cluster is subclustered when its aspect ratio (defined as the ratio of the standard deviations in the largest and smallest spatial PCA direction) exceeds max_aspect_ratio. Ignored if adata is the affinity matrix.

  • verbose – Level of verbosity, with 0 (no output), 1 (some output), …
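
The spatial criteria described for position_range and max_aspect_ratio can be illustrated with a small standalone sketch. It is not taken from the tacco source: the helper names pca_std_2d and spatial_decision are hypothetical, the input is restricted to 2D points, the population (1/n) normalisation of the covariance is an assumption, and the order in which the three rules are checked is an assumption as well, since the parameter descriptions do not say which rule takes precedence.

```python
import math


def pca_std_2d(points):
    """Standard deviations along the largest and smallest PCA direction of 2D points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # Entries of the 2x2 covariance matrix [[a, b], [b, c]]
    # (population normalisation; an assumption of this sketch).
    a = sum((x - mx) ** 2 for x, _ in points) / n
    c = sum((y - my) ** 2 for _, y in points) / n
    b = sum((x - mx) * (y - my) for x, y in points) / n
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    mid = (a + c) / 2
    rad = math.hypot((a - c) / 2, b)
    return math.sqrt(mid + rad), math.sqrt(max(mid - rad, 0.0))


def spatial_decision(points, position_scale, position_range=3, max_aspect_ratio=5):
    """Return 'split', 'keep' or 'undecided' based on the spatial criteria alone."""
    s_max, s_min = pca_std_2d(points)
    size = 2 * s_max  # spatial size: twice the std in the largest PCA direction
    # Rule order below is an assumption, not documented behaviour.
    if size < position_scale / position_range:
        return "keep"
    if size > position_scale * position_range:
        return "split"
    aspect = s_max / s_min if s_min > 0 else math.inf
    if aspect > max_aspect_ratio:
        return "split"
    return "undecided"
```

In this sketch, "undecided" stands for the cases in which the spatial criteria do not settle the question, so that the remaining parameters (for example max_size, min_size and cut_threshold) would have to decide.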

Returns:

Depending on result_key, returns either a Series with the cluster labels or the updated input adata containing the cluster labels under adata.obs[result_key].
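
A minimal usage sketch of the two return modes follows. It uses only parameters from the signature above. The helper name cluster_both_ways, the key names "affinity" and "cluster", and the value max_size=500 are illustrative assumptions. The sketch also assumes that adata already holds an affinity matrix under .obsp["affinity"] and coordinates under the default position_key.

```python
def cluster_both_ways(adata, affinity_key="affinity", max_size=500):
    """Hypothetical helper showing both return modes of spectral_clustering."""
    # Imported lazily so that defining this helper does not require tacco.
    import tacco as tc

    # result_key=None: a Series with the cluster labels is returned.
    labels = tc.tools.spectral_clustering(
        adata, max_size=max_size, affinity_key=affinity_key, result_key=None
    )

    # result_key given: the updated adata is returned, with the labels
    # stored under adata.obs[result_key] ("cluster" is an illustrative name).
    adata = tc.tools.spectral_clustering(
        adata, max_size=max_size, affinity_key=affinity_key, result_key="cluster"
    )
    return labels, adata.obs["cluster"]
```

Leaving position_scale at None, as here, selects the hierarchical splitting described for position_key and position_scale; passing a position_scale would instead enable spatial binning.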