rapids_singlecell.tl.kmeans

rapids_singlecell.tl.kmeans(adata, n_clusters=8, n_pcs=50, *, use_rep='X_pca', n_init=1, random_state=42, key_added='kmeans', copy=False, **kwargs)

KMeans is a basic but powerful clustering method that is optimized via Expectation Maximization. It starts by randomly selecting K data points in X as initial centroids and assigns every sample to its closest centroid. For each resulting cluster, the mean of its points is computed (hence the name) and becomes the new centroid. The assignment and update steps repeat until the centroids stop moving or an iteration limit is reached.
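The assign-then-update loop described above can be sketched on the CPU with NumPy. This is an illustrative sketch only, not the GPU implementation that this function calls; the names `kmeans_sketch`, `_assign`, `max_iter`, and `seed` are hypothetical and chosen for the example.

```python
import numpy as np


def _assign(X, centroids):
    """Return, for each row of X, the index of its nearest centroid."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)


def kmeans_sketch(X, n_clusters, max_iter=100, seed=42):
    """Plain k-means loop (hypothetical helper, not library code)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialisation: pick n_clusters distinct samples as starting centroids.
    centroids = X[rng.choice(X.shape[0], size=n_clusters, replace=False)]
    for _ in range(max_iter):
        # Assignment step: each sample joins its closest centroid.
        labels = _assign(X, centroids)
        # Update step: each centroid moves to the mean of its members.
        new_centroids = centroids.copy()
        for k in range(n_clusters):
            members = X[labels == k]
            if len(members) > 0:  # an empty cluster keeps its old centroid
                new_centroids[k] = members.mean(axis=0)
        converged = np.allclose(new_centroids, centroids)
        centroids = new_centroids
        if converged:
            break
    # Final labels are computed against the final centroids.
    return _assign(X, centroids), centroids
```

Calling `kmeans_sketch(X, n_clusters=2)` on a two-dimensional array returns one label per row and a `(2, n_features)` centroid array.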

Parameters:
adata AnnData

Annotated data matrix.

n_clusters int (default: 8)

Number of clusters to compute.

n_pcs int (default: 50)

Use this many PCs. If n_pcs == 0 and use_rep is None, .X is used.

use_rep str (default: 'X_pca')

Use the indicated representation. 'X' or any key for .obsm is valid. If None, the representation is chosen automatically: for .n_vars < 50, .X is used; otherwise 'X_pca' is used. If 'X_pca' is not present, it is computed with default parameters, or with n_pcs components if n_pcs is given.

n_init int (default: 1)

Number of times the KMeans algorithm is run with different initializations.

random_state float (default: 42)

Seed for the random number generator. Set a fixed value if you want results to be the same when you restart Python.

key_added str (default: 'kmeans')

adata.obs key under which to add the cluster labels.

copy bool (default: False)

Whether to copy adata or modify it in place.

**kwargs

Additional keyword arguments for KMeans.
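The use_rep and n_pcs rules listed above can be read as the following decision procedure. This is a minimal sketch under stated assumptions: `select_representation` is a hypothetical helper operating on a plain array and a dict standing in for .X and .obsm, it truncates only 'X_pca' to n_pcs columns (an assumption), and it raises an error where the documented function would compute a missing 'X_pca'.

```python
import numpy as np


def select_representation(X, obsm, use_rep="X_pca", n_pcs=50):
    """Hypothetical helper mirroring the documented rules; not library code."""
    if use_rep is None:
        # Automatic choice: n_pcs == 0 or fewer than 50 variables -> use X.
        if n_pcs == 0 or X.shape[1] < 50:
            return X
        use_rep = "X_pca"
    if use_rep == "X":
        return X
    if use_rep not in obsm:
        # The documented function computes a missing 'X_pca';
        # this sketch simply refuses instead.
        raise KeyError(f"{use_rep!r} not found in obsm")
    rep = obsm[use_rep]
    if use_rep == "X_pca" and n_pcs:
        # Assumption: only the PCA representation is cut to n_pcs columns.
        rep = rep[:, :n_pcs]
    return rep
```

For example, with a 60-column 'X_pca' entry and the default n_pcs=50, the sketch returns the first 50 columns.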

Return type:

None
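To illustrate what n_init and random_state control, the sketch below runs a small NumPy k-means several times from one seeded generator and keeps the run with the lowest inertia (sum of squared distances to the assigned centroid). Keeping the lowest-inertia run is the usual convention for KMeans implementations and is assumed here rather than taken from this page; `lloyd` and `best_of_n_init` are hypothetical names.

```python
import numpy as np


def lloyd(X, k, rng, n_iter=50):
    """One k-means run from a random start; returns (labels, inertia)."""
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = ((X[:, None] - centroids[None]) ** 2).sum(axis=2).argmin(axis=1)
        # Move each centroid to the mean of its members (keep it if empty).
        centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
    labels = ((X[:, None] - centroids[None]) ** 2).sum(axis=2).argmin(axis=1)
    inertia = float(((X - centroids[labels]) ** 2).sum())
    return labels, inertia


def best_of_n_init(X, k, n_init=1, random_state=42):
    """Run `lloyd` n_init times and keep the lowest-inertia result."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(random_state)  # fixed seed -> repeatable runs
    runs = [lloyd(X, k, rng) for _ in range(n_init)]
    return min(runs, key=lambda run: run[1])
```

Because every run draws from the same seeded generator, repeating a call with the same random_state reproduces the same labels, and raising n_init can only keep or lower the returned inertia.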