tacco.tools.annotate_projection

annotate_projection(adata, reference, annotation_key=None, counts_location=None, projection='bc2', deconvolution=None)

Annotates an AnnData using reference data by projection.

This is the direct interface to this annotation method. In practice, using the general wrapper annotate() is recommended because it is more flexible.

Parameters:
  • adata – An AnnData including expression data in .X.

  • reference – Reference data to get the annotation definition from.

  • annotation_key – The .obs and/or .varm key where the annotation and/or profiles are stored in the reference. If None, it is inferred from reference, if possible.

  • counts_location – A string or tuple specifying where the count matrix is stored, e.g. 'X', ('raw','X'), ('raw','obsm','my_counts_key'), ('layer','my_counts_key'), … For details see counts().

  • projection

    Projection method to use. Available are:

    • 'naive': annotations are the matrix product of the gene frequencies given by the count matrix and those in the annotation profiles

    • 'bc': like 'naive' but using the probability amplitudes instead of probabilities/frequencies

    • 'h2': identical to 'bc'

    • 'bc2': like 'bc' but squares the projection result, i.e. forms the expectation value from the overlap.

  • deconvolution

    Which method to use for deconvolution of the results based on the cross-projections of the annotation categories. If False, no deconvolution is done. If None, the best deconvolution is selected for every projection method:

    • 'linear': special deconvolution for projection='bc', which only works for amplitudes. It does not rely on (possibly slow) nnls (non-negative least squares) and instead uses the (fast) solution of a linear equation.

    • 'nnls': general deconvolution for all other methods, which works on probabilities and therefore needs nnls.
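
The projection and deconvolution options above can be illustrated with a small, standard-library-only sketch. It is not TACCO's implementation. It assumes that "probability amplitudes" means the square roots of the gene frequencies, and that the 'linear' deconvolution amounts to solving a linear equation whose matrix holds the cross-projections between annotation categories. All helper names (frequencies, amplitudes, overlap, project, deconvolve_linear_2x2) and the toy data are hypothetical, and the 'nnls' variant is not shown.

```python
import math

def frequencies(counts):
    """Normalize non-negative counts to frequencies that sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]

def amplitudes(freqs):
    """Assumption: 'probability amplitudes' are square roots of frequencies."""
    return [math.sqrt(f) for f in freqs]

def overlap(u, v):
    """Plain dot product of two equally long vectors."""
    return sum(a * b for a, b in zip(u, v))

def project(counts, profile, projection="bc2"):
    """Project one observation onto one annotation profile (illustrative only)."""
    p, q = frequencies(counts), frequencies(profile)
    if projection == "naive":
        return overlap(p, q)
    if projection in ("bc", "h2"):  # the text describes 'h2' as identical to 'bc'
        return overlap(amplitudes(p), amplitudes(q))
    if projection == "bc2":
        return overlap(amplitudes(p), amplitudes(q)) ** 2
    raise ValueError(f"unknown projection: {projection!r}")

def deconvolve_linear_2x2(projections, cross):
    """Solve cross @ weights = projections for two categories (Cramer's rule)."""
    (c_aa, c_ab), (c_ba, c_bb) = cross
    b_a, b_b = projections
    det = c_aa * c_bb - c_ab * c_ba
    if det == 0:
        raise ValueError("cross-projection matrix is singular")
    return ((b_a * c_bb - c_ab * b_b) / det,
            (c_aa * b_b - c_ba * b_a) / det)

# Toy data: three genes, two annotation categories (hypothetical values).
profile_a = [4, 1, 0]
profile_b = [0, 1, 4]
observation = [8, 2, 0]  # same gene frequencies as profile_a

naive_a = project(observation, profile_a, "naive")
bc_b = project(observation, profile_b, "bc")
bc2_b = project(observation, profile_b, "bc2")

# Deconvolution on amplitudes: build an amplitude vector that is a known
# mixture of the two profile amplitudes, project it, then undo the mixing
# caused by the non-zero overlap between the categories.
s_a = amplitudes(frequencies(profile_a))
s_b = amplitudes(frequencies(profile_b))
mixed = [0.7 * x + 0.3 * y for x, y in zip(s_a, s_b)]
cross = [[overlap(s_a, s_a), overlap(s_a, s_b)],
         [overlap(s_b, s_a), overlap(s_b, s_b)]]
raw = (overlap(mixed, s_a), overlap(mixed, s_b))
weights = deconvolve_linear_2x2(raw, cross)
```

In this toy setup the raw projections overestimate both categories because the two profiles share one gene; solving against the cross-projection matrix recovers the mixing weights that were put in.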

Returns:

Returns the annotation in a DataFrame.
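
A minimal usage sketch, assuming TACCO and its dependencies are installed and that the reference stores its annotation under the hypothetical .obs key 'cell_type'. The wrapper name annotate_by_projection is illustrative; the keyword arguments are those listed in the signature above.

```python
def annotate_by_projection(adata, reference, annotation_key="cell_type"):
    """Call annotate_projection with explicitly spelled-out options.

    `adata` and `reference` are expected to be AnnData objects; the default
    `annotation_key` is a placeholder and must match the reference at hand.
    """
    # Imported lazily so that this sketch can be loaded without TACCO installed.
    from tacco.tools import annotate_projection

    return annotate_projection(
        adata,
        reference,
        annotation_key=annotation_key,
        counts_location=None,   # documented default
        projection="bc2",       # documented default
        deconvolution=None,     # let the method pick the deconvolution
    )
```

According to the Returns section, the result is a DataFrame containing the annotation.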