rapids_singlecell.dcg.waggr

rapids_singlecell.dcg.waggr = <rapids_singlecell.decoupler_gpu._helper._Method.Method object>

Weighted Aggregate (WAGGR).

This approach aggregates the molecular features \(x_i\) of one observation with the feature weights \(w_i\) of a given feature set into an enrichment score \(ES\), where \(i\) indexes the \(n\) features in the set.

This method can use any aggregation function, which by default is the weighted mean.

\[ES = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}\]

Another simpler option is the weighted sum.

\[ES = \sum_{i=1}^{n} w_i x_i\]
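The two built-in aggregations can be illustrated with a minimal plain-Python sketch. The function names `wmean` and `wsum` mirror the option names listed under the `fun` parameter, but the bodies here are illustrative only and are not taken from the library's source.

```python
def wsum(x, w):
    """Weighted sum: sum over features of w_i * x_i."""
    return sum(wi * xi for wi, xi in zip(w, x))


def wmean(x, w):
    """Weighted mean: the weighted sum divided by the total weight."""
    return wsum(x, w) / sum(w)


# Toy values for one observation and one feature set (made up for illustration).
x = [2.0, 4.0, 6.0]  # feature values
w = [1.0, 1.0, 2.0]  # feature weights

es_sum = wsum(x, w)    # 2 + 4 + 12 = 18
es_mean = wmean(x, w)  # 18 / 4 = 4.5
```

The weighted mean is scale-free with respect to the weights, whereas the weighted sum grows with both the number of features and the size of the weights.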

Alternatively, this method can also take any user-defined function \(f\), as long as it aggregates \(x_i\) and \(w\) into a single \(ES\).

\[ES = f(w_i, x_i)\]

This functionality makes it relatively easy to implement and try new enrichment methods.
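As a sketch of what such a custom function could look like, the hypothetical `wabsmean` below divides the weighted sum by the sum of absolute weights, which keeps the denominator positive when weights are signed. The name and the formula are illustrative assumptions, not part of the library. Whether a plain-Python function like this can be passed unchanged to `fun` on the GPU backend is not confirmed here.

```python
def wabsmean(x, w):
    """Hypothetical custom statistic: weighted sum over total absolute weight.

    Takes the feature values `x` and weights `w` and returns a single float,
    which is the contract described for the `fun` parameter.
    """
    numerator = sum(wi * xi for wi, xi in zip(w, x))
    denominator = sum(abs(wi) for wi in w)
    return float(numerator / denominator)


# Toy values with one negative weight (made up for illustration).
x = [1.0, -2.0, 3.0]
w = [1.0, -1.0, 2.0]

es = wabsmean(x, w)  # (1 + 2 + 6) / (1 + 1 + 2) = 2.25
```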

When multiple random permutations are done (times > 1), statistical significance is assessed via empirical testing.

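\[p_{value}=\frac{\#\{ES_{rand} \geq ES\}}{P}\]

Where:

\(\#\{ES_{rand} \geq ES\}\) is the number of random permutations whose enrichment score is at least the observed \(ES\)
\(ES_{rand}\) are the enrichment scores of the random permutations
\(P\) is the total number of permutations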
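The empirical test can be sketched in plain Python as follows. This assumes the null distribution is built by shuffling the feature values against fixed weights. The text does not state exactly what is permuted, so that choice, along with the helper name `empirical_pvalue`, is an assumption made for illustration.

```python
import random


def wmean(x, w):
    """Weighted mean of feature values x with weights w."""
    return sum(wi * xi for wi, xi in zip(w, x)) / sum(w)


def empirical_pvalue(x, w, times=1000, seed=42):
    """Return (ES, p_value), where p_value is the fraction of permutations
    whose score is at least the observed ES."""
    rng = random.Random(seed)
    es = wmean(x, w)
    shuffled = list(x)
    hits = 0
    for _ in range(times):
        rng.shuffle(shuffled)  # assumed null: permute feature values
        if wmean(shuffled, w) >= es:
            hits += 1
    return es, hits / times


# Ten features; only the first three belong to the feature set (weight 1).
x = [9.0, 8.0, 7.0, 1.0, 2.0, 1.0, 0.0, 2.0, 1.0, 0.0]
w = [1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

es, p_value = empirical_pvalue(x, w)
```

Because the three highest values all fall in the feature set, very few shuffles reach the observed score and the resulting p-value is small.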

Additionally, \(ES\) is updated to a normalized enrichment score \(NES\).

\[NES = \frac{ES - \mu(ES_{rand})}{\sigma(ES_{rand})}\]

Where:

\(\mu\) is the mean
\(\sigma\) is the standard deviation
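The normalization is a z-score of the observed score against the permutation scores. The sketch below assumes the population standard deviation. The text does not say whether the population or sample form is used, and the helper name `normalized_es` is hypothetical.

```python
import statistics


def normalized_es(es, es_rand):
    """Z-score of the observed ES against the permutation scores."""
    mu = statistics.fmean(es_rand)
    sigma = statistics.pstdev(es_rand)  # assumption: population std
    return (es - mu) / sigma


# Toy permutation scores (made up for illustration).
es_rand = [1.0, 2.0, 3.0, 4.0, 5.0]
nes = normalized_es(6.0, es_rand)  # (6 - 3) / sqrt(2)
```

If all permutation scores are identical, \(\sigma\) is zero and the ratio is undefined, so a production implementation would need to handle that case.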

Finally, the obtained p-values are adjusted using the Benjamini-Hochberg correction.
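For reference, the standard Benjamini-Hochberg step-up adjustment can be sketched in plain Python as below. This is the textbook procedure, not the library's code. Which p-values are grouped together for adjustment is not stated here, so the sketch simply adjusts one flat list.

```python
def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy p-values (made up for illustration).
adj = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
```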

Parameters:

data: AnnData instance, DataFrame, or tuple of [matrix, samples, features].
net: DataFrame in long format. Must include source and target columns, and optionally a weight column.
tmin (default: 5): Minimum number of targets per source. Sources with fewer targets are removed.
layer: Layer key name of an anndata.AnnData instance.
raw (default: False): Whether to use the .raw attribute of anndata.AnnData.
empty (default: True): Whether to remove empty observations (rows) or features (columns).
bsize (default: 5000): For large datasets in sparse format, the number of observations processed at once. Increasing this value speeds up computation but uses more memory.
verbose (default: False): Whether to display progress messages and additional execution details.
pre_load (default: False): Whether to pre-load the data into memory before processing.
adj_pv_gpu (default: False): Whether to adjust p-values on the GPU.
fun: Function to compute the enrichment statistic from omics readouts (x) and feature weights (w). A provided function must accept x and w arguments and return a single float. By default, ‘wmean’ and ‘wsum’ are implemented.
times: Number of random permutations to perform.
seed: Random seed to use.
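To make the expected shape of `net` and the effect of `tmin` concrete, the pandas sketch below builds a small long-format network and drops sources with fewer than `tmin` targets. The source and target names are made up, and the filtering shown here only approximates the documented behavior. The library may, for example, count only targets that are present in the data.

```python
import pandas as pd

# Long-format network: one row per (source, target) pair, with optional weight.
net = pd.DataFrame(
    {
        "source": ["TF1", "TF1", "TF1", "TF2", "TF2"],
        "target": ["G1", "G2", "G3", "G1", "G4"],
        "weight": [1.0, -1.0, 0.5, 1.0, 1.0],
    }
)

tmin = 3  # minimum number of targets a source must have to be kept

# Count distinct targets per source and keep sources that reach tmin.
n_targets = net.groupby("source")["target"].nunique()
keep = n_targets[n_targets >= tmin].index
filtered = net[net["source"].isin(keep)]

kept_sources = sorted(filtered["source"].unique())  # TF2 has only 2 targets
```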

Returns:

Enrichment scores \(ES\) and, if applicable, Benjamini-Hochberg-adjusted p-values.

Example

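import decoupler as dc
import rapids_singlecell as rsc

# Load decoupler's toy dataset and network
adata, net = dc.ds.toy()
# Run the GPU implementation documented on this page
rsc.dcg.waggr(adata, net, tmin=3)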
