rapids_singlecell.dcg.waggr


rapids_singlecell.dcg.waggr = <rapids_singlecell.decoupler_gpu._helper._Method.Method object>

Weighted Aggregate (WAGGR).

This approach aggregates the molecular features \(x_i\) of one observation with the feature weights \(w_i\) of a given feature set into an enrichment score \(ES\), where \(i\) indexes the \(n\) features being aggregated.

This method can use any aggregation function, which by default is the weighted mean.

\[ES = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}\]

Another simpler option is the weighted sum.

\[ES = \sum_{i=1}^{n} w_i x_i\]
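As a minimal sketch, the two formulas above can be transcribed literally into NumPy. The helper names `wmean` and `wsum` below are illustrative stand-ins that mirror the option names mentioned in this page; they are not the library's internal code, which may handle details such as negative weights differently.

```python
import numpy as np


def wmean(x, w):
    """Weighted mean: sum(w_i * x_i) / sum(w_i), as in the first formula."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    return float(np.sum(w * x) / np.sum(w))


def wsum(x, w):
    """Weighted sum: sum(w_i * x_i), as in the second formula."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    return float(np.sum(w * x))


# Made-up feature values and weights for a single observation
x = np.array([2.0, 4.0, 6.0])
w = np.array([1.0, 1.0, 2.0])

es_mean = wmean(x, w)  # (2 + 4 + 12) / 4 = 4.5
es_sum = wsum(x, w)    # 2 + 4 + 12 = 18.0
```

The weighted mean divides by the total weight, so it stays on the scale of the input features, while the weighted sum grows with the size of the feature set.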

Alternatively, this method can take any defined function \(f\), as long as it aggregates \(x_i\) and \(w\) into a single \(ES\).

\[ES = f(w_i, x_i)\]

This functionality makes it relatively easy to implement and try new enrichment methods.
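To illustrate what such a function \(f\) might look like, here is a hypothetical weighted median written in plain NumPy. It takes `x` and `w` and returns a single float, which is the contract stated for the `fun` parameter. The name `wmedian` and the choice to use absolute weights are assumptions made for this sketch, and any further requirements the library places on custom functions are not covered here.

```python
import numpy as np


def wmedian(x, w):
    """Hypothetical aggregation: weighted median of x, using |w| as weights."""
    x = np.asarray(x, dtype=float)
    w = np.abs(np.asarray(w, dtype=float))
    order = np.argsort(x)              # sort features by value
    x_sorted, w_sorted = x[order], w[order]
    cum = np.cumsum(w_sorted)          # cumulative weight along sorted values
    # first sorted position where the cumulative weight reaches half the total
    idx = np.searchsorted(cum, 0.5 * cum[-1])
    return float(x_sorted[idx])


# Made-up values: the heavily weighted feature dominates the median
x = np.array([1.0, 2.0, 3.0, 10.0])
w = np.array([1.0, 1.0, 1.0, 5.0])
es = wmedian(x, w)  # 10.0
```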

When multiple random permutations are done (times > 1), statistical significance is assessed via empirical testing.

\[p_{value}=\frac{ES_{rand} \geq ES}{P}\]

Where:

  • \(ES_{rand}\) are the enrichment scores of the random permutations

  • \(P\) is the total number of permutations

Additionally, \(ES\) is updated to a normalized enrichment score \(NES\).

\[NES = \frac{ES - \mu(ES_{rand})}{\sigma(ES_{rand})}\]

Where:

  • \(\mu\) is the mean

  • \(\sigma\) is the standard deviation
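The empirical test and the normalization above can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: the null distribution is built by shuffling the feature values within one observation, the p-value numerator is read as the number of permutations with \(ES_{rand} \geq ES\), and `permutation_stats` is a hypothetical helper name.

```python
import numpy as np


def permutation_stats(x, w, times=200, seed=0):
    """Return (ES, NES, p-value) for one observation and one feature set."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)

    def score(values):
        # Weighted mean, as in the default aggregation
        return np.sum(w * values) / np.sum(w)

    es = score(x)
    # Assumption: each permutation shuffles the feature values of the observation
    es_rand = np.array([score(rng.permutation(x)) for _ in range(times)])

    pval = np.sum(es_rand >= es) / times          # fraction of permutations >= ES
    nes = (es - es_rand.mean()) / es_rand.std()   # population std (ddof=0), an assumption
    return float(es), float(nes), float(pval)


# Made-up observation: 5 high-valued features that all belong to the set
x = np.concatenate([np.full(5, 10.0), np.zeros(45)])
w = np.concatenate([np.ones(5), np.zeros(45)])    # zero weight = not in the set
es, nes, pval = permutation_stats(x, w)
```

With few permutations the smallest attainable non-zero p-value is \(1/P\), so the resolution of the test depends directly on `times`.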

Finally, the obtained \(p_{value}\)s are adjusted with the Benjamini-Hochberg correction.
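For reference, the Benjamini-Hochberg adjustment can be written in a few lines of NumPy. This is a generic sketch of the standard procedure; `bh_adjust` is a hypothetical name, and which p-values the library corrects together (for example per observation or across the whole result) is not specified here.

```python
import numpy as np


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values for a 1-D array of p-values."""
    p = np.asarray(pvals, dtype=float)
    n = p.size
    order = np.argsort(p)                          # ascending p-values
    scaled = p[order] * n / np.arange(1, n + 1)    # p_(k) * n / k
    # Enforce monotonicity, working from the largest p-value downwards
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)    # restore the input order
    return adjusted


padj = bh_adjust([0.01, 0.04, 0.03, 0.20])
# padj is approximately [0.04, 0.0533, 0.0533, 0.20]
```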

Parameters:
data

AnnData instance, DataFrame or tuple of [matrix, samples, features].

net

DataFrame in long format. Must include source and target columns, and optionally a weight column.

tmin default: 5

Minimum number of targets per source. Sources with fewer targets will be removed.

layer

Layer key name of an anndata.AnnData instance.

raw default: False

Whether to use the .raw attribute of anndata.AnnData.

empty default: True

Whether to remove empty observations (rows) or features (columns).

bsize default: 5000

For large datasets in sparse format, this parameter controls how many observations are processed at once. Increasing this value speeds up computation but uses more memory.

verbose default: False

Whether to display progress messages and additional execution details.

pre_load default: False

Whether to pre-load the data into memory before processing.

adj_pv_gpu default: False

Whether to use GPU for adjusting p-values.

fun

Function to compute enrichment statistic from omics readouts (x) and feature weights (w). Provided function must contain x and w arguments and output a single float. By default, ‘wmean’ and ‘wsum’ are implemented.

times

Number of random permutations to do.

seed

Random seed to use.
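As an illustration of the `net` and `tmin` parameters above, the sketch below builds a small long-format network with pandas and drops sources with fewer than `tmin` targets. The source and target names are made up, and the filter is a simplification that assumes every target is present in the data; it is not the library's own filtering code.

```python
import pandas as pd

# Long format: one row per (source, target) pair, with an optional weight
net = pd.DataFrame(
    {
        "source": ["TF1", "TF1", "TF1", "TF2", "TF2"],
        "target": ["G1", "G2", "G3", "G1", "G4"],
        "weight": [1.0, -0.5, 2.0, 1.0, 1.0],
    }
)

tmin = 3
# Number of targets for the source of each row
n_targets = net.groupby("source")["target"].transform("size")
# Keep only rows whose source has at least tmin targets
filtered = net[n_targets >= tmin]
```

Here `TF2` has only two targets, so it is removed and only the three `TF1` rows remain.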

Returns:

Enrichment scores \(ES\) and, if applicable, adjusted \(p_{value}\) by Benjamini-Hochberg.

Example

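import decoupler as dc

# Load decoupler's toy dataset: an AnnData object and a matching network
adata, net = dc.ds.toy()
# Run WAGGR, keeping only sources with at least 3 targets
dc.mt.waggr(adata, net, tmin=3)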