rapids_singlecell.dcg.zscore

rapids_singlecell.dcg.zscore = <rapids_singlecell.decoupler_gpu._helper._Method.Method object>

Z-score (ZSCORE).

This approach computes the mean value of the molecular features for known targets, optionally subtracts the overall mean of all measured features, and normalizes the result by the standard deviation of all features and the square root of the number of targets.

This formulation was originally introduced in KSEA, which explicitly includes the subtraction of the global mean to compute the enrichment score \(ES\).

\[ES = \frac{(\mu_s-\mu_p) \times \sqrt m }{\sigma}\]

Where:

\(\mu_s\) is the mean of targets
\(\mu_p\) is the mean of all features
\(m\) is the number of targets
\(\sigma\) is the standard deviation of all features

However, in the RoKAI implementation, this global mean subtraction was omitted.

\[ES = \frac{\mu_s \times \sqrt m }{\sigma}\]
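The two formulas differ only in whether \(\mu_p\) is subtracted. The following NumPy sketch makes that concrete for a single sample and a single source. The helper `zscore_es`, the toy values, and the use of the sample standard deviation (`ddof=1`) are assumptions made for illustration, not the library's implementation; target weights are ignored, as in the formulas above.

```python
import numpy as np

def zscore_es(values, is_target, flavor="RoKAI"):
    """Illustrative helper (hypothetical): ES for one sample and one source."""
    values = np.asarray(values, dtype=float)
    is_target = np.asarray(is_target, dtype=bool)
    m = is_target.sum()                      # number of targets
    mu_s = values[is_target].mean()          # mean of the targets
    # KSEA subtracts the mean of all features; RoKAI omits that term.
    mu_p = values.mean() if flavor == "KSEA" else 0.0
    sigma = values.std(ddof=1)               # std of all features (ddof=1 is an assumption)
    return (mu_s - mu_p) * np.sqrt(m) / sigma

# Toy data: six features, three of which are targets of the source.
values = np.array([2.0, 1.0, 0.0, -1.0, -2.0, 3.0])
is_target = np.array([True, True, False, False, False, True])

es_ksea = zscore_es(values, is_target, flavor="KSEA")
es_rokai = zscore_es(values, is_target, flavor="RoKAI")
```

Because the global mean of the toy data is positive, the KSEA score comes out smaller than the RoKAI score here; on centered data the two coincide.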

A two-sided \(p_{value}\) is then calculated from the enrichment score using the survival function \(sf\) of the standard normal distribution.

\[p = 2 \times \mathrm{sf}\bigl(\lvert \mathrm{ES} \rvert \bigr)\]
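As a minimal sketch, the two-sided p-value can be reproduced with `scipy.stats.norm.sf`. The scores below are made-up values chosen for illustration.

```python
import numpy as np
from scipy.stats import norm

# Made-up enrichment scores for three sources.
es = np.array([-2.5, 0.0, 1.96])

# Two-sided p-value: twice the upper-tail probability of |ES|
# under the standard normal distribution.
pvals = 2 * norm.sf(np.abs(es))
```

A score of 0 gives a p-value of 1, and the sign of the score does not affect the result because only its absolute value enters the formula.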

Finally, the resulting \(p_{value}\)s are adjusted for multiple testing using the Benjamini-Hochberg correction.
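For readers unfamiliar with the procedure, here is a short NumPy sketch of the standard Benjamini-Hochberg adjustment. The helper `bh_adjust` is hypothetical and written for illustration; it is not the function the library calls, and which set of tests is corrected together is not specified here.

```python
import numpy as np

def bh_adjust(pvals):
    """Illustrative Benjamini-Hochberg adjustment (hypothetical helper)."""
    p = np.asarray(pvals, dtype=float)
    n = p.size
    order = np.argsort(p)                          # indices that sort p ascending
    scaled = p[order] * n / np.arange(1, n + 1)    # p_(i) * n / i
    # Enforce monotonicity, working from the largest p-value downwards.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adj = np.empty(n)
    adj[order] = np.clip(scaled, 0.0, 1.0)         # restore the original order
    return adj

padj = bh_adjust([0.01, 0.04, 0.03, 0.20])
```

Adjusted values are never smaller than the raw p-values and keep the same ordering.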

Parameters:

data: AnnData instance, DataFrame or tuple of [matrix, samples, features].
net: DataFrame in long format. Must include source and target columns, and optionally a weight column.
tmin default: 5: Minimum number of targets per source. Sources with fewer targets will be removed.
layer: Layer key name of an anndata.AnnData instance.
raw default: False: Whether to use the .raw attribute of anndata.AnnData.
empty default: True: Whether to remove empty observations (rows) or features (columns).
bsize default: 5000: For large datasets in sparse format, this parameter controls how many observations are processed at once. Increasing this value speeds up computation but uses more memory.
verbose default: False: Whether to display progress messages and additional execution details.
pre_load default: False: Whether to pre-load the data into memory before processing.
adj_pv_gpu default: False: Whether to use GPU for adjusting p-values.
flavor: Which flavor to use when calculating the z-score, either KSEA (subtracts the global mean) or RoKAI (omits the global mean subtraction).
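To illustrate the expected shape of `net` and the effect of `tmin`, here is a small pandas sketch. The source and target names are invented, and the filter is a simplification: it counts every target listed in the network, without checking which targets are actually present among the measured features.

```python
import pandas as pd

# Long-format network: one row per source-target interaction.
net = pd.DataFrame({
    "source": ["K1", "K1", "K1", "K2", "K2"],
    "target": ["s1", "s2", "s3", "s1", "s4"],
    "weight": [1.0, 1.0, -1.0, 1.0, 1.0],   # optional column
})

tmin = 3  # minimum number of targets per source

# Count distinct targets per source and keep sources that reach tmin.
n_targets = net.groupby("source")["target"].transform("nunique")
filtered = net[n_targets >= tmin]
```

Here K2 has only two targets, so it is dropped and only the three K1 rows remain.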

Returns:

Enrichment scores \(ES\) and, if applicable, Benjamini-Hochberg-adjusted \(p_{value}\)s.

Example

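import decoupler as dc
import rapids_singlecell as rsc

# Load the toy dataset and its matching network from decoupler
adata, net = dc.ds.toy()
# Run the z-score method, keeping sources with at least 3 targets
rsc.dcg.zscore(adata, net, tmin=3)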
