Usage Principles#
Import#
import rapids_singlecell as rsc
Workflow#
The workflow of rapids-singlecell is basically the same as scanpy’s. The main difference is the speed at which rsc can analyze the data. For more information please checkout the notebooks and API documentation.
AnnData setup#
AnnData
supports GPU arrays and Sparse Matrices.
Rapids-singlecell leverages this capability to perform analyses directly on GPU-enabled AnnData
objects.
To get your AnnData
object onto the GPU you can set X
or each layers
to a GPU based matrix.
adata.X = cpx.scipy.sparse.csr_matrix(adata.X) # moves `.X` to the GPU
adata.X = adata.X.get() # moves `.X` back to the CPU
You can also use rapids_singlecell.get
to move arrays and matrices.
rsc.get.anndata_to_GPU(adata) # moves `.X` to the GPU
rsc.get.anndata_to_CPU(adata) # moves `.X` to the CPU
Preprocessing#
The preprocessing can be handled by the functions in pp
. They offer accelerated versions of functions within scanpy.pp
.
Example:
rsc.pp.highly_variable_genes(adata, n_top_genes=5000, flavor="seurat_v3", batch_key= "PatientNumber", layer = "counts")
adata = adata[:,adata.var["highly_variable"]==True]
rsc.pp.regress_out(adata,keys=["n_counts", "percent_MT"])
rsc.pp.scale(adata,max_value=10)
Tools#
The functions provided in tl
are designed to as near drop-in replacements for the functions in scanpy.tl
, but offer significantly improved performance. Consequently, you can continue to use scanpy’s plotting API.
Example:
rsc.tl.tsne(adata)
sc.pl.tsne(adata, color="leiden")
Decoupler-GPU#
dcg
offers accelerated drop in replacements for run_mlm()
and run_wsum()
Example:
import decoupler as dc
model = dc.get_progeny(organism='human', top=100)
rsc.dcg.run_mlm(mat=adata, net=net, source='source', target='target', weight='weight', verbose=True)
acts_mlm = dc.get_acts(adata, obsm_key='mlm_estimate')
sc.pl.umap(acts_mlm, color=['KLF5',"FOXA1", 'CellType'], cmap='coolwarm', vcenter=0)