rapids_singlecell.pp.harmony_integrate


rapids_singlecell.pp.harmony_integrate(adata, key, *, basis='X_pca', adjusted_basis='X_pca_harmony', dtype=<class 'numpy.float64'>, **kwargs)

Integrate different experiments with Harmony. Harmony is an algorithm for integrating single-cell data from multiple experiments. This function uses a GPU-based Python port of Harmony to integrate single-cell data stored in an AnnData object. Because Harmony works by adjusting the principal components, run this function after performing PCA but before computing the neighbor graph.
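The ordering described above (PCA, then Harmony, then the neighbor graph) can be sketched as follows. This is a minimal sketch, not an official recipe: the helper name integrate_batches, the "batch" column name, and n_comps=50 are assumptions chosen for illustration, and the sketch assumes adata is already normalized and log-transformed and that a CUDA-capable GPU is available.

```python
def integrate_batches(adata, batch_key="batch", n_comps=50):
    """Run PCA, Harmony, then neighbors, in that order (hypothetical helper)."""
    # Imported lazily so this helper can be defined on machines without a GPU.
    import rapids_singlecell as rsc

    # Move the data matrix to GPU memory (assumes a CUDA-capable device).
    rsc.get.anndata_to_GPU(adata)

    # 1. PCA writes adata.obsm["X_pca"], the default `basis`.
    rsc.pp.pca(adata, n_comps=n_comps)

    # 2. Harmony reads "X_pca" and writes "X_pca_harmony"; it returns None.
    rsc.pp.harmony_integrate(adata, key=batch_key)

    # 3. Build the neighbor graph on the adjusted components, not on "X_pca".
    rsc.pp.neighbors(adata, use_rep="X_pca_harmony")
    return adata
```

Passing use_rep="X_pca_harmony" in the last step matters: without it, the neighbor graph would typically be built on the unadjusted components and the integration would have no effect downstream.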

Parameters:
adata AnnData

The annotated data matrix.

key str

The name of the column in adata.obs that differentiates among experiments/batches.

basis str (default: 'X_pca')

The name of the field in adata.obsm where the PCA table is stored. Defaults to 'X_pca', which is the default for sc.tl.pca().

adjusted_basis str (default: 'X_pca_harmony')

The name of the field in adata.obsm where the adjusted PCA table will be stored after running this function. Defaults to 'X_pca_harmony'.

dtype type (default: <class 'numpy.float64'>)

The data type to use for the Harmony computation. Using a 32-bit type may cause numerical instability.

kwargs

Any additional arguments will be passed to harmonpy_gpu.run_harmony().

Return type:

None

Returns:

Updates adata with the field adata.obsm[adjusted_basis], containing principal components adjusted by Harmony such that different experiments are integrated.
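Because the function returns None and modifies adata in place, a quick sanity check after the call can be useful. The helper below is a hypothetical sketch (check_harmony_output is not part of rapids_singlecell); it assumes the adjusted embedding has the same shape as the input PCA table, which is how Harmony ports usually behave but is not stated on this page.

```python
def check_harmony_output(adata, basis="X_pca", adjusted_basis="X_pca_harmony"):
    """Return True if the adjusted embedding exists and matches the input shape."""
    # Both the input PCA table and the adjusted table must be present in obsm.
    if basis not in adata.obsm or adjusted_basis not in adata.obsm:
        return False
    # Assumption: Harmony keeps one row per cell and one column per component.
    original_shape = tuple(adata.obsm[basis].shape)
    adjusted_shape = tuple(adata.obsm[adjusted_basis].shape)
    return original_shape == adjusted_shape
```

Calling check_harmony_output(adata) right after harmony_integrate would return False if the result was written under a different adjusted_basis than expected.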