scanpy-GPU#

These functions offer accelerated near drop-in replacements for common tools provided by scanpy.

Preprocessing: `pp`#

Filtering of highly-variable genes, batch-effect correction, per-cell normalization.

Any transformation of the data matrix that is not a tool. Unlike tools, preprocessing steps usually don’t return an easily interpretable annotation; instead, they perform a basic transformation on the data matrix.

Basic Preprocessing#

`pp.calculate_qc_metrics`(adata, *[, ...])	Calculates basic QC parameters.
`pp.filter_cells`(adata, *, qc_var[, ...])	Filter cell outliers based on counts and numbers of genes expressed.
`pp.filter_genes`(adata, *[, qc_var, ...])	Filter genes based on number of cells or counts.
`pp.normalize_total`(adata, *[, target_sum, ...])	Normalizes rows in matrix so they sum to `target_sum`.
`pp.log1p`(adata, *[, layer, obsm, inplace, copy])	Calculates the natural logarithm of one plus the sparse matrix.
`pp.highly_variable_genes`(adata, *[, layer, ...])	Annotate highly variable genes.
`pp.regress_out`(adata, keys, *[, layer, ...])	Use linear regression to adjust for the effects of unwanted noise and variation.
`pp.scale`(adata, *[, zero_center, max_value, ...])	Scales matrix to unit variance and clips values.
`pp.pca`(adata[, n_comps, layer, zero_center, ...])	Performs PCA using the cuml decomposition function.
`pp.normalize_pearson_residuals`(adata, *[, ...])	Applies analytic Pearson residual normalization, based on Lause21.
`pp.flag_gene_family`(adata, *, gene_family_name)	Flags a gene or gene family in `.var` with a boolean.
`pp.filter_highly_variable`(adata)	Filters the `AnnData` object for highly_variable genes.
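
As a rough illustration of what `pp.normalize_total` followed by `pp.log1p` computes, the sketch below reproduces the arithmetic in plain Python on a tiny dense matrix. It is not the library's GPU implementation: the helper names `normalize_total_rows` and `log1p_matrix` are hypothetical, and leaving all-zero rows untouched is an assumption made here to avoid division by zero.

```python
import math

def normalize_total_rows(matrix, target_sum=1e4):
    """Rescale each row (cell) so that its entries sum to target_sum."""
    normalized = []
    for row in matrix:
        total = sum(row)
        # Assumption: rows without any counts are left as-is (no division by zero).
        factor = target_sum / total if total > 0 else 1.0
        normalized.append([value * factor for value in row])
    return normalized

def log1p_matrix(matrix):
    """Apply the natural logarithm of one plus each entry."""
    return [[math.log1p(value) for value in row] for row in matrix]

# Toy count matrix: 3 cells (rows) x 4 genes (columns).
counts = [
    [1, 0, 3, 6],
    [0, 0, 0, 0],
    [2, 2, 2, 2],
]
normalized = normalize_total_rows(counts, target_sum=100.0)
logged = log1p_matrix(normalized)
```

After normalization every non-empty row sums to `target_sum`, so cells with different sequencing depths become comparable before the log transform compresses the dynamic range.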

Batch effect correction#

`pp.harmony_integrate`(adata, key, *[, basis, ...])	Use harmonypy to integrate different experiments.

Doublet detection#

`pp.scrublet`(adata[, adata_sim, batch_key, ...])	Predict doublets using Scrublet.
`pp.scrublet_simulate_doublets`(adata, *[, ...])	Simulate doublets by adding the counts of random observed transcriptome pairs.
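
The idea behind doublet simulation, adding the counts of randomly paired observed cells, can be sketched in a few lines of plain Python. This is only an illustration of the principle, not the code behind `pp.scrublet_simulate_doublets`: the function name `simulate_doublets`, the `seed` parameter, and sampling parents with replacement are assumptions, and any further options the real function may offer are ignored.

```python
import random

def simulate_doublets(counts, n_doublets, seed=0):
    """Create synthetic doublets by summing the counts of random cell pairs."""
    rng = random.Random(seed)
    n_cells = len(counts)
    doublets, parents = [], []
    for _ in range(n_doublets):
        # Assumption: both parents are drawn uniformly, with replacement.
        i = rng.randrange(n_cells)
        j = rng.randrange(n_cells)
        # A simulated doublet is the gene-wise sum of its two parent profiles.
        doublets.append([a + b for a, b in zip(counts[i], counts[j])])
        parents.append((i, j))
    return doublets, parents

# Toy count matrix: 4 observed cells (rows) x 3 genes (columns).
counts = [
    [5, 0, 1],
    [0, 3, 2],
    [1, 1, 0],
    [0, 0, 7],
]
doublets, parents = simulate_doublets(counts, n_doublets=6)
```

Keeping the `parents` index pairs alongside the simulated profiles makes it easy to trace each synthetic doublet back to the observed cells it was built from.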

Neighbors#

`pp.neighbors`(adata[, n_neighbors, n_pcs, ...])	Compute a neighborhood graph of observations with cuml.
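
A typical ordering of the preprocessing functions listed above might look like the following sketch. It uses only keyword arguments that appear in the signatures on this page; the wrapper name `preprocess`, the parameter values, and the module alias `rsc` are illustrative assumptions (the import name is not stated here; something like `import rapids_singlecell as rsc` is assumed). The module is passed in as an argument so the snippet itself does not require a GPU to import.

```python
def preprocess(adata, rsc):
    """Run a common preprocessing sequence on an AnnData object.

    `rsc` is the imported GPU package (assumed alias); `adata` is assumed to
    already hold its data in the form the library expects.
    """
    rsc.pp.calculate_qc_metrics(adata)               # basic QC parameters
    rsc.pp.normalize_total(adata, target_sum=1e4)    # per-cell normalization
    rsc.pp.log1p(adata)                              # log(1 + x) transform
    rsc.pp.highly_variable_genes(adata)              # annotate variable genes
    rsc.pp.filter_highly_variable(adata)             # keep only those genes
    rsc.pp.scale(adata, max_value=10)                # unit variance, clipped
    rsc.pp.pca(adata, n_comps=50)                    # dimensionality reduction
    rsc.pp.neighbors(adata, n_neighbors=15, n_pcs=40)  # neighborhood graph
    return adata
```

With the package imported under the assumed alias, the call would simply be `preprocess(adata, rsc)`. The specific values (`1e4`, `10`, `50`, `15`, `40`) are common choices in scanpy-style workflows, not recommendations from this page.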

Tools: `tl`#

The `tl` module offers tools for the accelerated processing of AnnData. For visualization, use `scanpy.pl`.

Embedding#

`tl.umap`(adata, *[, min_dist, spread, ...])	Embed the neighborhood graph using UMAP's cuml implementation.
`tl.tsne`(adata[, n_pcs, use_rep, perplexity, ...])	Performs t-distributed stochastic neighbor embedding (t-SNE) using the cuml library.
`tl.diffmap`(adata[, n_comps, neighbors_key, ...])	Diffusion maps have been proposed for visualizing single-cell data.
`tl.draw_graph`(adata, *[, init_pos, max_iter])	Force-directed graph drawing with cugraph's implementation of Force Atlas 2.
`tl.mde`(adata, *[, device, n_neighbors, ...])	Utility to run `pymde.preserve_neighbors()` for visualization of single-cell embeddings.
`tl.embedding_density`(adata[, basis, ...])	Calculate the density of cells in an embedding (per condition).

Clustering#

`tl.louvain`(adata[, resolution, restrict_to, ...])	Performs Louvain clustering using cuGraph.
`tl.leiden`(adata[, resolution, random_state, ...])	Performs Leiden clustering using cuGraph.

Marker genes#

`tl.rank_genes_groups_logreg`(adata, groupby, *)	Rank genes for characterizing groups.
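
The tools above are usually chained after a neighborhood graph has been computed. The sketch below shows one plausible sequence: embed, cluster, then rank marker genes per cluster. The wrapper name `embed_and_cluster`, the module alias `rsc`, and the parameter values are illustrative assumptions, as is the expectation that `tl.leiden` stores its labels under the key `leiden` (scanpy's default behaviour).

```python
def embed_and_cluster(adata, rsc, resolution=1.0):
    """Embed, cluster, and rank marker genes on a preprocessed AnnData object.

    Assumes a neighborhood graph already exists on `adata` and that `rsc` is
    the imported GPU package (assumed alias).
    """
    rsc.tl.umap(adata, min_dist=0.5)                 # 2D embedding of the graph
    rsc.tl.leiden(adata, resolution=resolution)      # graph-based clustering
    # Assumption: the Leiden labels are stored under the key "leiden".
    rsc.tl.rank_genes_groups_logreg(adata, groupby="leiden")
    return adata
```

A higher `resolution` generally yields more, smaller clusters, so it is worth trying a few values before interpreting the ranked genes.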

Plotting#

For plotting, please use scanpy’s plotting API, `scanpy.pl`.
