rapids_singlecell.pp.harmony_integrate
- rapids_singlecell.pp.harmony_integrate(adata, key, *, basis='X_pca', adjusted_basis='X_pca_harmony', dtype=<class 'numpy.float64'>, **kwargs)
Use harmonypy to integrate different experiments. Harmony is an algorithm for integrating single-cell data from multiple experiments. This function uses a GPU-based Python port of Harmony to integrate single-cell data stored in an AnnData object. Because Harmony works by adjusting the principal components, run this function after performing PCA and before computing the neighbor graph.
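The ordering described above (PCA, then Harmony, then the neighbor graph) can be sketched as follows. This is a minimal illustration, not the library's own example: the wrapper name integrate_then_build_graph and its n_comps argument are hypothetical, and it assumes scanpy's sc.pp.pca and sc.pp.neighbors are used for the surrounding steps. Imports are kept inside the function so it can be defined on a machine without a GPU.

```python
def integrate_then_build_graph(adata, batch_key, n_comps=50):
    """Run PCA, Harmony, then the neighbor graph, in that order.

    Hypothetical wrapper for illustration; only harmony_integrate and its
    parameters come from this page.
    """
    # Local imports: the function can be defined even where these are missing.
    import scanpy as sc
    import rapids_singlecell as rsc

    # 1. PCA first. Harmony adjusts principal components, so they must exist.
    sc.pp.pca(adata, n_comps=n_comps)  # writes adata.obsm["X_pca"]

    # 2. Harmony reads the PCA table and writes the adjusted one.
    rsc.pp.harmony_integrate(
        adata,
        key=batch_key,
        basis="X_pca",
        adjusted_basis="X_pca_harmony",
    )

    # 3. Build the graph on the adjusted embedding rather than the raw PCA.
    sc.pp.neighbors(adata, use_rep="X_pca_harmony")
    return adata
```

Passing use_rep explicitly matters in this sketch: without it, the neighbor step would typically fall back to the unadjusted PCA table and the integration would have no effect on the graph.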
- Parameters:
- adata
AnnData
The annotated data matrix.
- key
str
The name of the column in adata.obs that differentiates among experiments/batches.
- basis
str (default: 'X_pca')
The name of the field in adata.obsm where the PCA table is stored. Defaults to 'X_pca', which is the default for sc.tl.pca().
- adjusted_basis
str (default: 'X_pca_harmony')
The name of the field in adata.obsm where the adjusted PCA table will be stored after running this function. Defaults to 'X_pca_harmony'.
- dtype
type (default: <class 'numpy.float64'>)
The data type to use for Harmony. Using a 32-bit type may cause numerical instability.
- kwargs
Any additional arguments will be passed to harmonpy_gpu.run_harmony().
- Return type:
- Returns:
Updates adata with the field adata.obsm[adjusted_basis], containing principal components adjusted by Harmony such that different experiments are integrated.
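Because the function reads a column of adata.obs and a field of adata.obsm, a small pre-flight check can catch a missing batch column or a skipped PCA step before any GPU work starts. The helper below is a hypothetical sketch and not part of rapids_singlecell; it relies only on the attribute names described in the parameters above, and the toy object is a stand-in for an AnnData used purely for illustration.

```python
from types import SimpleNamespace


def check_harmony_inputs(adata, key, basis="X_pca"):
    """Raise early if the inputs that harmony_integrate reads are missing.

    Hypothetical helper for illustration; not part of rapids_singlecell.
    """
    # For a pandas DataFrame, `in` tests column names.
    if key not in adata.obs:
        raise KeyError(f"batch column {key!r} not found in adata.obs")
    if basis not in adata.obsm:
        raise KeyError(f"embedding {basis!r} not found in adata.obsm; run PCA first")
    # With a single batch there is nothing to integrate.
    if len(set(adata.obs[key])) < 2:
        raise ValueError(f"column {key!r} contains fewer than two batches")


# Stand-in object exposing the same attribute names as AnnData.
toy = SimpleNamespace(
    obs={"batch": ["a", "a", "b"]},
    obsm={"X_pca": [[0.1, 0.2], [0.0, 0.3], [0.5, 0.1]]},
)
check_harmony_inputs(toy, key="batch")  # passes without raising
```

With a real AnnData object the same call would be made just before harmony_integrate, using the same key and basis values.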