rapids_singlecell.tl.umap

rapids_singlecell.tl.umap(adata, *, min_dist=0.5, spread=1.0, n_components=2, maxiter=None, alpha=1.0, negative_sample_rate=5, init_pos='spectral', random_state=0, a=None, b=None, copy=False, neighbors_key=None)

Embed the neighborhood graph using the cuML implementation of UMAP.

UMAP (Uniform Manifold Approximation and Projection) is a manifold learning technique suitable for visualizing high-dimensional data. Besides tending to be faster than tSNE, it optimizes the embedding such that it best reflects the topology of the data, which we represent throughout rapids-singlecell using a neighborhood graph. tSNE, by contrast, optimizes the distribution of nearest-neighbor distances in the embedding such that these best match the distribution of distances in the high-dimensional space.

Parameters:

adata AnnData: Annotated data matrix.
min_dist float (default: 0.5): The effective minimum distance between embedded points. Smaller values will result in a more clustered/clumped embedding where nearby points on the manifold are drawn closer together, while larger values will result in a more even dispersal of points. The value should be set relative to the spread value, which determines the scale at which embedded points will be spread out.
spread float (default: 1.0): The effective scale of embedded points. In combination with min_dist this determines how clustered/clumped the embedded points are.
n_components int (default: 2): The number of dimensions of the embedding.
maxiter int | None (default: None): The number of iterations (epochs) of the optimization. Called n_epochs in the original UMAP.
alpha float (default: 1.0): The initial learning rate for the embedding optimization.
negative_sample_rate int (default: 5): The number of negative edge/1-simplex samples to use per positive edge/1-simplex sample in optimizing the low dimensional embedding.
init_pos Literal['spectral', 'random'] (default: 'spectral'): How to initialize the low dimensional embedding. Called init in the original UMAP. Options are:
* ‘spectral’: use a spectral embedding of the graph.
* ‘random’: assign initial embedding positions at random.
random_state int (default: 0): The seed used by the random number generator.
a float | None (default: None): More specific parameters controlling the embedding. If None these values are set automatically as determined by min_dist and spread.
b float | None (default: None): More specific parameters controlling the embedding. If None these values are set automatically as determined by min_dist and spread.
copy bool (default: False): Return a copy instead of writing to adata.
neighbors_key str | None (default: None): If not specified, umap looks in .uns[‘neighbors’] for neighbors settings and in .obsp[‘connectivities’] for connectivities (the default storage places for pp.neighbors). If specified, umap looks in .uns[neighbors_key] for neighbors settings and in .obsp[.uns[neighbors_key][‘connectivities_key’]] for connectivities.
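
To give a feel for how a and b relate to min_dist and spread, the sketch below reproduces the curve fit used by the reference umap-learn package. It is an assumption that the cuML backend derives its values the same way, and the helper name fit_ab is hypothetical; it is not part of rapids-singlecell.

```python
import numpy as np
from scipy.optimize import curve_fit


def fit_ab(min_dist=0.5, spread=1.0):
    """Fit a and b so that 1 / (1 + a * x**(2 * b)) approximates the
    target membership curve defined by min_dist and spread.

    Mirrors the approach of umap-learn's reference implementation;
    whether cuML matches it exactly is an assumption.
    """

    def curve(x, a, b):
        return 1.0 / (1.0 + a * x ** (2 * b))

    # Sample distances from 0 up to three times the spread.
    x = np.linspace(0, spread * 3, 300)
    # Target: flat at 1 below min_dist, exponential decay beyond it.
    y = np.where(x < min_dist, 1.0, np.exp(-(x - min_dist) / spread))

    (a, b), _ = curve_fit(curve, x, y)
    return float(a), float(b)


a, b = fit_ab(min_dist=0.5, spread=1.0)
print(f"a={a:.3f}, b={b:.3f}")
```

Passing explicit a and b to umap bypasses this derivation, which is mainly useful when reproducing an embedding whose curve parameters are already known.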

Return type:

AnnData | None

Returns:

Depending on copy, returns or updates adata with the following fields.

X_umap adata.obsm field: UMAP coordinates of data.
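
A minimal usage sketch follows. It assumes the neighborhood graph has already been computed (for example with pp.neighbors) and that a CUDA-capable GPU with RAPIDS is available. The wrapper name embed_with_umap is hypothetical and only illustrates the call and where the result is stored.

```python
def embed_with_umap(adata, neighbors_key=None, **umap_kwargs):
    """Run rapids_singlecell.tl.umap and return the UMAP coordinates.

    Hypothetical convenience wrapper, not part of rapids-singlecell.
    """
    # tl.umap reads neighbors settings from .uns['neighbors'] unless
    # neighbors_key points somewhere else.
    key = "neighbors" if neighbors_key is None else neighbors_key
    if key not in adata.uns:
        raise KeyError(
            f"No neighbors settings in adata.uns[{key!r}]; "
            "compute the neighborhood graph first."
        )

    # Imported lazily because rapids_singlecell requires a GPU environment.
    import rapids_singlecell as rsc

    result = rsc.tl.umap(adata, neighbors_key=neighbors_key, **umap_kwargs)

    # With copy=True a new AnnData is returned; otherwise adata is updated
    # in place and the call returns None.
    target = adata if result is None else result
    return target.obsm["X_umap"]
```

For example, `embed_with_umap(adata, min_dist=0.3, n_components=3)` would return an array with one row per observation and three columns, taken from the X_umap field in .obsm.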
