rapids_singlecell.tl.umap

rapids_singlecell.tl.umap(adata, *, min_dist=0.5, spread=1.0, n_components=2, maxiter=None, alpha=1.0, negative_sample_rate=5, init_pos='spectral', random_state=0, a=None, b=None, copy=False, neighbors_key=None)

Embed the neighborhood graph using cuML’s implementation of UMAP.

UMAP (Uniform Manifold Approximation and Projection) is a manifold learning technique suitable for visualizing high-dimensional data. Besides tending to be faster than tSNE, it optimizes the embedding such that it best reflects the topology of the data, which we represent throughout rapids-singlecell using a neighborhood graph. tSNE, by contrast, optimizes the distribution of nearest-neighbor distances in the embedding such that these best match the distribution of distances in the high-dimensional space.

Parameters:
adata AnnData

Annotated data matrix.

min_dist float (default: 0.5)

The effective minimum distance between embedded points. Smaller values will result in a more clustered/clumped embedding where nearby points on the manifold are drawn closer together, while larger values will result in a more even dispersal of points. The value should be set relative to the spread value, which determines the scale at which embedded points will be spread out.

spread float (default: 1.0)

The effective scale of embedded points. In combination with min_dist this determines how clustered/clumped the embedded points are.

n_components int (default: 2)

The number of dimensions of the embedding.

maxiter int | None (default: None)

The number of iterations (epochs) of the optimization. Called n_epochs in the original UMAP.

alpha float (default: 1.0)

The initial learning rate for the embedding optimization.

negative_sample_rate int (default: 5)

The number of negative edge/1-simplex samples to use per positive edge/1-simplex sample in optimizing the low dimensional embedding.

init_pos Literal['spectral', 'random'] (default: 'spectral')

How to initialize the low dimensional embedding. Called init in the original UMAP. Options are:

* 'spectral': use a spectral embedding of the graph.
* 'random': assign initial embedding positions at random.

random_state int (default: 0)

The seed used by the random number generator.

a float | None (default: None)

A more specific parameter controlling the embedding. If None, it is set automatically from min_dist and spread.

b float | None (default: None)

A more specific parameter controlling the embedding. If None, it is set automatically from min_dist and spread.

copy bool (default: False)

Return a copy instead of writing to adata.

neighbors_key str | None (default: None)

If not specified, umap looks in .uns['neighbors'] for neighbors settings and in .obsp['connectivities'] for connectivities (the default storage places for pp.neighbors). If specified, umap looks in .uns[neighbors_key] for neighbors settings and in .obsp[.uns[neighbors_key]['connectivities_key']] for connectivities.
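
For the a and b parameters above, the following sketch illustrates what "set automatically from min_dist and spread" can mean. It follows the approach of the reference umap-learn implementation, which fits the curve 1 / (1 + a * x**(2*b)) to a target that equals 1 up to min_dist and decays exponentially with scale spread beyond it. Whether this function performs exactly the same fit is an assumption here. The names target_membership and fit_ab are hypothetical, and a simple refined grid search stands in for a library optimizer, so the values are approximate.

```python
import math


def target_membership(x, min_dist, spread):
    """Desired similarity at embedding distance x: flat up to min_dist, then decaying."""
    if x < min_dist:
        return 1.0
    return math.exp(-(x - min_dist) / spread)


def fit_ab(min_dist=0.5, spread=1.0, n_points=300, rounds=8, grid=21):
    """Approximate least-squares fit of 1 / (1 + a * x**(2*b)) to the target curve.

    Hypothetical helper: a grid search that is refined around the best
    point in each round, used here instead of a library optimizer.
    """
    # Sample distances on [0, 3 * spread], as the reference implementation does.
    xs = [3.0 * spread * i / (n_points - 1) for i in range(n_points)]
    ys = [target_membership(x, min_dist, spread) for x in xs]

    def sse(a, b):
        # Sum of squared errors between the candidate curve and the target.
        return sum((1.0 / (1.0 + a * x ** (2.0 * b)) - y) ** 2 for x, y in zip(xs, ys))

    a_lo, a_hi = 0.01, 10.0
    b_lo, b_hi = 0.1, 3.0
    a = b = None
    for _ in range(rounds):
        a_vals = [a_lo + (a_hi - a_lo) * i / (grid - 1) for i in range(grid)]
        b_vals = [b_lo + (b_hi - b_lo) * j / (grid - 1) for j in range(grid)]
        a, b = min(((av, bv) for av in a_vals for bv in b_vals), key=lambda ab: sse(*ab))
        # Halve the search window and centre it on the current best point.
        a_half = (a_hi - a_lo) / 4.0
        b_half = (b_hi - b_lo) / 4.0
        a_lo, a_hi = max(1e-6, a - a_half), a + a_half
        b_lo, b_hi = max(1e-6, b - b_half), b + b_half
    return a, b


a, b = fit_ab(min_dist=0.5, spread=1.0)
print(f"a = {a:.3f}, b = {b:.3f}")
```

Passing explicit a and b to umap bypasses this kind of automatic choice, which is mainly useful when an embedding must reproduce curve parameters taken from another run.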

Return type:

AnnData | None

Returns:

Depending on copy, returns or updates adata with the following fields.

X_umap adata.obsm field

UMAP coordinates of data.
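
A minimal usage sketch, assuming adata already contains a neighborhood graph (for example from pp.neighbors) and that rapids-singlecell is installed on a machine with a supported GPU. The helper embed_with_umap and the UMAP_KWARGS dictionary are hypothetical names introduced for illustration; only the keyword arguments and the X_umap field come from the documentation above.

```python
# Keyword arguments mirroring the documented defaults of rapids_singlecell.tl.umap.
UMAP_KWARGS = {
    "min_dist": 0.5,
    "spread": 1.0,
    "n_components": 2,
    "init_pos": "spectral",
    "random_state": 0,
}


def embed_with_umap(adata, neighbors_key=None, **overrides):
    """Run UMAP in place and return the coordinates (hypothetical helper).

    Assumes adata already holds the neighborhood graph that umap reads,
    and that copy is left at its default of False.
    """
    # Imported lazily so this module can be loaded on a machine without a GPU.
    import rapids_singlecell as rsc

    kwargs = {**UMAP_KWARGS, **overrides}
    rsc.tl.umap(adata, neighbors_key=neighbors_key, **kwargs)
    # With copy=False the embedding is written to adata.obsm["X_umap"].
    return adata.obsm["X_umap"]
```

A call such as embed_with_umap(adata, min_dist=0.3) would then override a single default while leaving the other arguments unchanged.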