scanpy-GPU#

These functions offer accelerated near drop-in replacements for common tools provided by scanpy.

Preprocessing: pp#

Filtering of highly-variable genes, batch-effect correction, per-cell normalization.

Any transformation of the data matrix that is not a tool. Unlike tools, preprocessing steps usually don’t return an easily interpretable annotation; instead, they perform a basic transformation on the data matrix.

Basic Preprocessing#

pp.calculate_qc_metrics(adata, *[, ...])

Calculates basic QC metrics.

pp.filter_cells(adata, *, qc_var[, ...])

Filter cell outliers based on counts and numbers of genes expressed.

pp.filter_genes(adata, *[, qc_var, ...])

Filter genes based on number of cells or counts.

pp.normalize_total(adata, *[, target_sum, ...])

Normalizes the rows of the matrix so that each sums to target_sum.

pp.log1p(adata, *[, layer, obsm, inplace, copy])

Calculates the natural logarithm of one plus the sparse matrix, element-wise.

pp.highly_variable_genes(adata, *[, layer, ...])

Annotate highly variable genes.

pp.regress_out(adata, keys, *[, layer, ...])

Use linear regression to adjust for the effects of unwanted noise and variation.

pp.scale(adata, *[, zero_center, max_value, ...])

Scales the matrix to unit variance and clips values.

pp.pca(adata[, n_comps, layer, zero_center, ...])

Performs PCA using the cuml decomposition function.

pp.normalize_pearson_residuals(adata, *[, ...])

Applies analytic Pearson residual normalization, based on Lause21.

pp.flag_gene_family(adata, *, gene_family_name)

Flags a gene or gene family in .var with a boolean.

pp.filter_highly_variable(adata)

Filters the AnnData object for highly_variable genes.
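To make the arithmetic behind normalize_total, log1p, and scale concrete, here is a minimal CPU sketch in plain NumPy. It is not the library's GPU implementation. The helper names, the sample standard deviation (ddof=1), and clipping only values above max_value are assumptions made for illustration.

```python
import numpy as np


def normalize_total(counts, target_sum=1e4):
    """Rescale each row (cell) so that its counts sum to target_sum."""
    counts = np.asarray(counts, dtype=np.float64)
    totals = counts.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0  # leave all-zero rows untouched
    return counts / totals * target_sum


def scale(matrix, zero_center=True, max_value=None):
    """Scale each column (gene) to unit variance, optionally centering and clipping."""
    matrix = np.asarray(matrix, dtype=np.float64)
    mean = matrix.mean(axis=0)
    std = matrix.std(axis=0, ddof=1)  # assumption: sample standard deviation
    std[std == 0] = 1.0  # avoid division by zero for constant genes
    scaled = (matrix - mean if zero_center else matrix) / std
    if max_value is not None:
        # assumption: only values above max_value are clipped in this sketch
        scaled = np.minimum(scaled, max_value)
    return scaled


# Three cells (rows) by two genes (columns) of raw counts.
counts = np.array([[1, 3], [2, 2], [0, 8]])
normalized = normalize_total(counts, target_sum=1e4)
logged = np.log1p(normalized)  # natural log of (1 + x), element-wise
scaled = scale(logged)
clipped = scale(logged, max_value=0.5)
```

After these steps, every row of `normalized` sums to `target_sum`, and every column of `scaled` has zero mean and unit variance.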

Batch effect correction#

pp.harmony_integrate(adata, key, *[, basis, ...])

Use harmonypy to integrate different experiments.

Doublet detection#

pp.scrublet(adata[, adata_sim, batch_key, ...])

Predict doublets using Scrublet.

pp.scrublet_simulate_doublets(adata, *[, ...])

Simulate doublets by adding the counts of random observed transcriptome pairs.
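The doublet simulation described above, adding the counts of random pairs of observed transcriptomes, can be sketched in a few lines of NumPy. This is an illustration of the idea only: the function name, its parameters, and the sampling scheme (pairs drawn uniformly with replacement) are assumptions, not the library's implementation.

```python
import numpy as np


def simulate_doublets(counts, n_doublets, seed=0):
    """Build synthetic doublets by summing the counts of random cell pairs.

    Hypothetical helper for illustration. A pair may contain the same
    cell twice because indices are drawn with replacement.
    """
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts)
    n_cells = counts.shape[0]
    # Each row of `pairs` holds the indices of the two parent cells.
    pairs = rng.integers(0, n_cells, size=(n_doublets, 2))
    doublets = counts[pairs[:, 0]] + counts[pairs[:, 1]]
    return doublets, pairs


# Four observed cells (rows) by three genes (columns).
counts = np.array([[1, 0, 2], [0, 3, 1], [4, 1, 0], [2, 2, 2]])
doublets, pairs = simulate_doublets(counts, n_doublets=5, seed=0)
```

Each simulated doublet has a total count equal to the sum of its two parents' totals, which is one reason doublets tend to stand out from singlets.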

Neighbors#

pp.neighbors(adata[, n_neighbors, n_pcs, ...])

Compute a neighborhood graph of observations with cuml.
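As a rough picture of what a neighborhood graph contains, the following brute-force NumPy sketch finds the k nearest neighbors of each observation by Euclidean distance. It is a CPU illustration only. The function name is hypothetical, each point is excluded from its own neighbor list by assumption, and the GPU routine may use a different search strategy and conventions.

```python
import numpy as np


def knn_indices(points, n_neighbors):
    """Return indices and distances of each point's nearest neighbors.

    Hypothetical brute-force helper: computes all pairwise Euclidean
    distances, so it only suits small inputs.
    """
    points = np.asarray(points, dtype=np.float64)
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # assumption: a point is not its own neighbor
    order = np.argsort(dist, axis=1)[:, :n_neighbors]
    return order, np.take_along_axis(dist, order, axis=1)


# Four observations forming two well-separated pairs on a line.
points = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 0.0], [6.0, 0.0]])
indices, distances = knn_indices(points, n_neighbors=1)
```

Here each point's single nearest neighbor is the other member of its pair, at distance 1.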

Tools: tl#

The tools module (tl) provides functions for the accelerated processing of AnnData objects. For visualization, use scanpy.pl.

Embedding#

tl.umap(adata, *[, min_dist, spread, ...])

Embed the neighborhood graph using UMAP's cuml implementation.

tl.tsne(adata[, n_pcs, use_rep, perplexity, ...])

Performs t-distributed stochastic neighbor embedding (t-SNE) using the cuml library.

tl.diffmap(adata[, n_comps, neighbors_key, ...])

Diffusion maps have been proposed for visualizing single-cell data.

tl.draw_graph(adata, *[, init_pos, max_iter])

Force-directed graph drawing with cugraph's implementation of Force Atlas 2.

tl.mde(adata, *[, device, n_neighbors, ...])

Utility to run pymde.preserve_neighbors() for visualization of single-cell embeddings.

tl.embedding_density(adata[, basis, ...])

Calculate the density of cells in an embedding (per condition).

Clustering#

tl.louvain(adata[, resolution, restrict_to, ...])

Performs Louvain clustering using cuGraph.

tl.leiden(adata[, resolution, random_state, ...])

Performs Leiden clustering using cuGraph.

Marker genes#

tl.rank_genes_groups_logreg(adata, groupby, *)

Rank genes for characterizing groups.

Plotting#

For plotting, please use scanpy’s plotting API, scanpy.pl.
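As an illustration of how the functions listed on this page might be chained and then handed to scanpy for plotting, here is a hedged sketch. It assumes the package is importable as rapids_singlecell (the import name is not stated on this page), that a supported GPU is available, and that the clustering result is stored under the key "leiden". Only parameters that appear in the signatures above are passed, with arbitrary example values. Depending on the installed version, the data may first need to be moved to GPU memory, which is not shown.

```python
def run_gpu_pipeline(adata):
    """Hypothetical end-to-end sketch: preprocess, embed, cluster, plot.

    Imports are deferred so that defining this function does not
    require a GPU or the packages themselves.
    """
    import rapids_singlecell as rsc  # assumption: package import name
    import scanpy as sc

    # Preprocessing (pp): normalize, log-transform, reduce dimensions.
    rsc.pp.normalize_total(adata, target_sum=1e4)
    rsc.pp.log1p(adata)
    rsc.pp.pca(adata, n_comps=50)

    # Neighborhood graph on the PCA representation.
    rsc.pp.neighbors(adata, n_neighbors=15, n_pcs=50)

    # Tools (tl): embedding and clustering.
    rsc.tl.umap(adata, min_dist=0.5)
    rsc.tl.leiden(adata, resolution=1.0)

    # Plotting stays with scanpy; "leiden" is the assumed result key.
    sc.pl.umap(adata, color="leiden")
    return adata
```

Because the calls mirror their scanpy counterparts, switching an existing CPU workflow over is mostly a matter of swapping the module prefix for the pp and tl steps while leaving the pl calls unchanged.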