rapids_singlecell.dcg.zscore

rapids_singlecell.dcg.zscore = <rapids_singlecell.decoupler_gpu._helper._Method.Method object>

Z-score (ZSCORE).

This approach computes the mean value of the molecular features for known targets, optionally subtracts the overall mean of all measured features, and normalizes the result by the standard deviation of all features and the square root of the number of targets.

This formulation was originally introduced in KSEA, which explicitly includes the subtraction of the global mean to compute the enrichment score \(ES\).

\[ES = \frac{(\mu_s-\mu_p) \times \sqrt m }{\sigma}\]

Where:

  • \(\mu_s\) is the mean of targets

  • \(\mu_p\) is the mean of all features

  • \(m\) is the number of targets

  • \(\sigma\) is the standard deviation of all features

However, in the RoKAI implementation, this global mean subtraction was omitted.

\[ES = \frac{\mu_s \times \sqrt m }{\sigma}\]
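As a rough illustration of the two formulas above, the following NumPy sketch computes \(ES\) for a single observation and a single source. It is not the library's GPU implementation: the function name `zscore_sketch` is hypothetical, network weights are ignored, and the use of the sample standard deviation (`ddof=1`) is an assumption, since the text above does not say which estimator is used.

```python
import numpy as np

def zscore_sketch(x, target_idx, flavor="RoKAI"):
    """Unweighted enrichment score for one observation and one source.

    Hypothetical helper for illustration only, not part of rapids_singlecell.
    """
    x = np.asarray(x, dtype=float)
    targets = x[np.asarray(target_idx)]
    m = targets.size                  # number of targets
    mu_s = targets.mean()             # mean of targets
    sigma = x.std(ddof=1)             # assumption: sample standard deviation
    if flavor == "KSEA":
        mu_p = x.mean()               # KSEA subtracts the mean of all features
    elif flavor == "RoKAI":
        mu_p = 0.0                    # RoKAI omits the global mean subtraction
    else:
        raise ValueError("flavor must be 'KSEA' or 'RoKAI'")
    return (mu_s - mu_p) * np.sqrt(m) / sigma

# Five measured features; the source targets the last two.
x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
es_ksea = zscore_sketch(x, [3, 4], flavor="KSEA")
es_rokai = zscore_sketch(x, [3, 4], flavor="RoKAI")
print(es_ksea, es_rokai)
```

Because the global mean of this toy vector is positive, the RoKAI score comes out larger than the KSEA score; on data centered at zero the two flavors would agree.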

A two-sided \(p_{value}\) is then calculated from the enrichment score \(ES\) using the survival function \(sf\) of the standard normal distribution.

\[p = 2 \times \mathrm{sf}\bigl(\lvert \mathrm{ES} \rvert \bigr)\]

Finally, the resulting p-values are adjusted with the Benjamini-Hochberg correction.
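The p-value and correction steps can be sketched as follows. This is a minimal example under stated assumptions, not the library's code: both function names are hypothetical, and the survival function is expressed through `math.erfc`, using the identity that for the standard normal \(2 \times \mathrm{sf}(\lvert x \rvert) = \mathrm{erfc}(\lvert x \rvert / \sqrt 2)\). Which p-values are corrected together is not specified here, so the example simply adjusts one flat vector.

```python
import math
import numpy as np

def two_sided_pvalue(es):
    """p = 2 * sf(|ES|) for the standard normal, written via erfc."""
    return math.erfc(abs(es) / math.sqrt(2.0))

def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjustment of a 1-D collection of p-values."""
    p = np.asarray(pvals, dtype=float)
    n = p.size
    order = np.argsort(p)                           # ascending p-values
    scaled = p[order] * n / np.arange(1, n + 1)     # p_(i) * n / i
    # Enforce monotonicity, starting from the largest p-value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)     # restore input order
    return adjusted

scores = [0.0, 1.96, -3.0]                          # hypothetical enrichment scores
pvals = [two_sided_pvalue(s) for s in scores]
padj = benjamini_hochberg(pvals)
print(pvals, padj)
```

A score of zero yields a p-value of 1, and the sign of the score does not matter because only its absolute value enters the calculation.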

Parameters:
data

AnnData instance, DataFrame or tuple of [matrix, samples, features].

net

DataFrame in long format. Must include source and target columns, and optionally a weight column.

tmin default: 5

Minimum number of targets per source. Sources with fewer targets will be removed.

layer

Layer key name of an anndata.AnnData instance.

raw default: False

Whether to use the .raw attribute of anndata.AnnData.

empty default: True

Whether to remove empty observations (rows) or features (columns).

bsize default: 5000

For large datasets in sparse format, this parameter controls how many observations are processed at once. Increasing this value speeds up computation but uses more memory.

verbose default: False

Whether to display progress messages and additional execution details.

pre_load default: False

Whether to pre-load the data into memory before processing.

adj_pv_gpu default: False

Whether to use GPU for adjusting p-values.

flavor

Which flavor to use when calculating the z-score, either KSEA (subtracts the mean of all features) or RoKAI (omits that subtraction).
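To make the net and tmin parameters above more concrete, here is a small pandas sketch of a long-format network and a minimum-target filter. The helper `filter_net`, the source and gene names, and the step that restricts targets to the measured features before counting are all assumptions for illustration; the library's own filtering may differ in detail.

```python
import pandas as pd

# Long-format network: one row per source-target interaction.
net = pd.DataFrame({
    "source": ["TF1", "TF1", "TF1", "TF2", "TF2"],
    "target": ["G1", "G2", "G3", "G2", "G4"],
    "weight": [1.0, 1.0, -1.0, 1.0, 1.0],   # optional column
})

# Hypothetical feature names measured in the data.
features = ["G1", "G2", "G3", "G5"]

def filter_net(net, features, tmin=5):
    """Drop sources with fewer than tmin targets (hypothetical helper).

    Assumption: only targets present among the measured features count.
    """
    kept = net[net["target"].isin(features)]
    n_targets = kept.groupby("source")["target"].transform("size")
    return kept[n_targets >= tmin].reset_index(drop=True)

filtered = filter_net(net, features, tmin=3)
print(filtered)
```

In this toy case TF1 keeps its three measured targets, while TF2 is removed because only one of its targets is measured.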

Returns:

Enrichment scores \(ES\) and, if applicable, Benjamini-Hochberg-adjusted p-values.

Example

import decoupler as dc
import rapids_singlecell as rsc

adata, net = dc.ds.toy()
rsc.dcg.zscore(adata, net, tmin=3)