tacco.tools.mix_in_silico
- mix_in_silico(adata, type_key=None, topic_key=None, n_samples=30000, bead_shape=0.1, bead_size=1.0, norm_cells=False, platform_log10_mean=None, platform_log10_std=0.6, seed=42, round=True, min_counts=100, capture_rate=1.0)
Given single-cell data, create an in-silico mixed dataset. The mixtures are generated by placing the cells randomly in space, placing measurement points (“beads”) randomly in space, and convolving the cells with a spatial profile around each bead, e.g. a Gaussian. Optionally, a random log-Laplace distributed rescaling is also applied per gene.
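The mixing idea described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not TACCO's implementation: the function name `mix_cells`, the uniform placement in a square box, and the Gaussian width `sigma` are all hypothetical choices made for the example.

```python
import math
import random

def mix_cells(cell_xy, cell_counts, bead_xy, sigma=1.0):
    """Return one mixed expression profile per bead.

    Each bead profile is a Gaussian-weighted sum of the cell profiles,
    with weights that depend on the cell-to-bead distance.
    """
    n_genes = len(cell_counts[0])
    mixed = []
    for bx, by in bead_xy:
        profile = [0.0] * n_genes
        for (cx, cy), counts in zip(cell_xy, cell_counts):
            d2 = (cx - bx) ** 2 + (cy - by) ** 2
            w = math.exp(-d2 / (2.0 * sigma ** 2))  # Gaussian spatial weight
            for g in range(n_genes):
                profile[g] += w * counts[g]
        mixed.append(profile)
    return mixed

# Toy data: cells and beads placed uniformly at random in a square box.
rng = random.Random(42)
n_cells, n_beads, n_genes, box = 50, 10, 3, 5.0
cell_xy = [(rng.uniform(0, box), rng.uniform(0, box)) for _ in range(n_cells)]
bead_xy = [(rng.uniform(0, box), rng.uniform(0, box)) for _ in range(n_beads)]
cell_counts = [[rng.randint(0, 20) for _ in range(n_genes)] for _ in range(n_cells)]

mixed = mix_cells(cell_xy, cell_counts, bead_xy)
```

Each entry of `mixed` is the expression profile of one bead. A bead that sits exactly on a cell and far from all others would reproduce that cell's counts almost unchanged.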
- Parameters:
adata – An AnnData with annotation in .obs.
type_key – An .obs key with categorical information to propagate through to the mixed data, e.g. cell types.
topic_key – An .obsm key with continuous information to propagate through to the mixed data, e.g. transcriptional topics.
n_samples – The number of measurement points (“beads”) which are put randomly in space. Note that depending on min_counts and the mixing parameters the number of returned measurement points is somewhat smaller than this value.
bead_shape – The shape to use for determining the contributions of cells to “beads”. Can also be a list of shapes, which saves setup time compared to separate calls. Possible values:
'gauss': weights decrease with distance like a Gaussian.
'disc': weights are constant up to some distance and then drop to 0.
number: weights decrease with distance according to a tanh profile, with the sharpness of the decrease given by this number. It can be used to interpolate between 0 (disc-like) and 1 (gauss-like).
bead_size – Scaling factor determining the effective size of the beads/profile. A value of 1 corresponds to tightly packed cells and beads of the size of a cell.
norm_cells – Whether to normalize the total counts per cell in the single cell data prior to mixing.
platform_log10_mean – log10 of the mean of the Laplace distribution for the log-Laplace distributed per-gene platform effect. The per-gene factors are available in .var['platform_effect']. If None, no platform factors are applied.
platform_log10_std – log10 of the standard deviation of the Laplace distribution for the log-Laplace distributed per-gene platform effect.
seed – The random seed to use.
round – Whether to round the resulting expression matrix to integer counts after rescaling.
min_counts – The returned adata is filtered to have at least this number of counts per observation. If None, return all observations.
capture_rate – The fraction of counts to keep from a cell with maximum coverage from the bead. If 'normalized', normalize weights per bead to sum to 1. If None, normalize the bead point spread function (PSF) to 1.
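Some of the parameters above can be illustrated in plain Python. The following is a minimal sketch, not TACCO's implementation. All function names are hypothetical. The tanh formula is an assumed smooth step that only reproduces the disc-like limit for small values. Reading platform_log10_mean and platform_log10_std as the location and standard deviation of a Laplace distribution in log10 space is likewise an assumption.

```python
import math
import random

def bead_weight(distance, bead_shape="gauss", bead_size=1.0):
    """Illustrative weight of a cell at `distance` from a bead center."""
    d = distance / bead_size  # distance in units of the bead size
    if bead_shape == "gauss":
        return math.exp(-0.5 * d * d)
    if bead_shape == "disc":
        return 1.0 if d <= 1.0 else 0.0
    # Numeric shape: hypothetical smooth step, NOT necessarily TACCO's formula.
    sharpness = float(bead_shape)
    if sharpness <= 0.0:
        return 1.0 if d <= 1.0 else 0.0  # sharp edge, i.e. a disc
    return 0.5 * (1.0 - math.tanh((d - 1.0) / sharpness))

def sample_platform_factors(n_genes, log10_mean=0.0, log10_std=0.6, seed=42):
    """Draw one log-Laplace distributed factor per gene (assumed parametrization)."""
    rng = random.Random(seed)
    scale = log10_std / math.sqrt(2.0)  # Laplace std equals sqrt(2) * scale
    factors = []
    for _ in range(n_genes):
        # The difference of two unit exponentials is a standard Laplace variate.
        laplace = rng.expovariate(1.0) - rng.expovariate(1.0)
        factors.append(10.0 ** (log10_mean + scale * laplace))
    return factors

def filter_min_counts(count_rows, min_counts=100):
    """Keep only observations with at least `min_counts` total counts."""
    if min_counts is None:
        return list(count_rows)
    return [row for row in count_rows if sum(row) >= min_counts]
```

The filter in `filter_min_counts` shows why fewer than n_samples observations can come back: beads that collect too few counts are dropped.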
- Returns:
Returns the mixed data as AnnData. If bead_shape is a list, returns a dictionary containing an AnnData per bead shape.
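A call might look like the sketch below, which uses only the parameters documented above. The wrapper name `make_mixtures` and the .obs key "cell_type" are hypothetical; `reference` is assumed to be a single-cell AnnData that carries that annotation. The import is done inside the function so that the snippet can be loaded without TACCO installed.

```python
def make_mixtures(reference, type_key="cell_type"):
    """Create in-silico mixtures from an annotated single-cell AnnData."""
    # Lazy import: TACCO is only required when the function is called.
    from tacco.tools import mix_in_silico

    return mix_in_silico(
        reference,
        type_key=type_key,   # hypothetical .obs key holding cell types
        n_samples=5000,      # upper bound on the number of returned beads
        bead_shape="gauss",  # or "disc", a number, or a list of shapes
        bead_size=1.0,
        min_counts=100,
        seed=0,
    )
```

With a single bead_shape the result is one AnnData. With a list of shapes the result is a dictionary, so the caller would iterate over its values instead.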