tacco.tools.annotate_RCTD

annotate_RCTD(adata, reference, annotation_key, counts_location=None, conda_env=None, x_coord_name='x', y_coord_name='y', doublet=False, min_ct=0, UMI_min_sigma=None, Reference_n_max_cells=inf, Reference_min_UMI=0, RCTD_counts_MIN=0, n_cores=None, working_directory=None, verbose=True)

Annotates an AnnData using reference data by RCTD [Cable21].

This is the direct interface to this annotation method. In practice, using the general wrapper annotate() is recommended because it is more flexible.

Parameters:
  • adata – An AnnData including expression data in .X.

  • reference – Reference data to get the annotation definition from.

  • annotation_key – The .obs key where the annotation is stored in the reference. If None, it is inferred from reference, if possible.

  • counts_location – A string or tuple specifying where the count matrix is stored, e.g. 'X', ('raw','X'), ('raw','obsm','my_counts_key'), ('layer','my_counts_key'), … For details see counts().

  • conda_env – The name or path of a conda environment where RCTD is installed and importable as ‘library(RCTD)’ or ‘library(spacexr)’. If None, uses the current environment.

  • x_coord_name – Name of an .obs column to forward to RCTD as x coordinates. If not available, forwards a new 0-only column.

  • y_coord_name – Name of an .obs column to forward to RCTD as y coordinates. If not available, forwards a new 0-only column.

  • doublet – Whether to run in “doublet” mode. The alternative is “full” mode.

  • min_ct – Minimum number of cells in a group to include in the RCTD run.

  • UMI_min_sigma – By default, RCTD sets this value to 300, which is too large for some datasets and breaks RCTD. Therefore, if None, the value is chosen by the heuristic min(300,median(total_counts_per_observation)-1). See the RCTD docs for details about this parameter.

  • Reference_n_max_cells – In RCTD, this number limits the number of cells used in the reference per cell type (by default to 10000); if more cells are available, they are subsampled. For better transparency, this should be done in Python before calling RCTD. Therefore the TACCO wrapper function sets this parameter to infinity by default, but a finite limit can be passed here.

  • Reference_min_UMI – In RCTD, this number sets the minimum number of UMIs a cell must have to be used in the reference (by default 100). For transparency, this should be done in Python before calling RCTD. Therefore the TACCO wrapper function sets this parameter to 0 by default, but a different threshold can be passed here.

  • RCTD_counts_MIN – In RCTD, this number sets the minimum number of counts (by default 10) in internally selected genes that a pixel needs to have in order to be kept. For transparency, this should be done in Python before calling RCTD. Therefore the TACCO wrapper function sets this parameter to 0 by default, but a different threshold can be passed here.

  • n_cores – Number of cores to use for RCTD. If None, use all available cores.

  • working_directory – The directory in which to store all intermediate files. If None, a temporary directory is used and cleaned up at the end. This option is probably only relevant for debugging.

  • verbose – Whether to print stderr and stdout of the RCTD run.
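
The fallback described for UMI_min_sigma=None can be illustrated with a few lines of plain Python. This is only a sketch of the formula as stated in the parameter description, not TACCO's actual implementation; the function name default_umi_min_sigma and the example counts are hypothetical.

```python
from statistics import median


def default_umi_min_sigma(total_counts, rctd_default=300):
    """Sketch of the documented fallback: min(300, median(total counts) - 1).

    `total_counts` holds one total count per observation (hypothetical input).
    """
    return min(rctd_default, median(total_counts) - 1)


# Sparse dataset: the median total count is 60, so the heuristic gives 59.
sparse_totals = [40, 55, 60, 80, 120]
print(default_umi_min_sigma(sparse_totals))

# Deeply sequenced dataset: the median is far above 300, so 300 is kept.
deep_totals = [900, 1200, 1500]
print(default_umi_min_sigma(deep_totals))
```

For sparse data the heuristic lowers the value below the median total count, which is the situation where RCTD's fixed value of 300 is reported to cause problems.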

Returns:

Returns the annotation in a DataFrame.
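
The notes on Reference_n_max_cells and Reference_min_UMI recommend filtering and subsampling the reference in Python before RCTD is called. A minimal sketch of such a pre-filter is shown below, using only the standard library. The helper select_reference_cells, its arguments, and the toy data are assumptions made for illustration; they are not part of TACCO or RCTD, and the subsampling here is a simple seeded random draw that need not match what RCTD does internally.

```python
import random
from collections import defaultdict


def select_reference_cells(cell_types, total_umis, n_max_cells=10000,
                           min_umi=100, seed=0):
    """Return sorted indices of the reference cells to keep (hypothetical helper).

    Cells with fewer than `min_umi` UMIs are dropped first; afterwards each
    cell type is randomly subsampled to at most `n_max_cells` cells.
    """
    rng = random.Random(seed)  # fixed seed makes the subsampling reproducible

    # Group the indices of sufficiently deep cells by their cell type.
    by_type = defaultdict(list)
    for idx, (cell_type, umi) in enumerate(zip(cell_types, total_umis)):
        if umi >= min_umi:
            by_type[cell_type].append(idx)

    # Subsample only those cell types that exceed the per-type limit.
    keep = []
    for indices in by_type.values():
        if len(indices) > n_max_cells:
            indices = rng.sample(indices, n_max_cells)
        keep.extend(indices)
    return sorted(keep)


# Toy reference: three B cells (one too shallow) and two T cells.
cell_types = ["B", "B", "B", "T", "T"]
total_umis = [500, 50, 700, 300, 900]

keep = select_reference_cells(cell_types, total_umis, n_max_cells=1, min_umi=100)
print(keep)  # one index per cell type; the shallow cell at index 1 is never kept
```

The resulting indices could then be used to subset the reference (for an AnnData, something like reference[keep]) before passing it to annotate_RCTD with Reference_n_max_cells and Reference_min_UMI left at their wrapper defaults, so that the filtering stays visible in the Python code.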