rapids_singlecell.dcg.run_mlm


rapids_singlecell.dcg.run_mlm(mat, net, *, source='source', target='target', weight='weight', batch_size=10000, min_n=5, verbose=False, use_raw=True)

Multivariate Linear Model (MLM). MLM fits a multivariate linear model for each sample, where the observed molecular readouts in mat are the response variable and the regulator weights in net are the covariates. Target features with no associated weight for a regulator are given a weight of zero. The t-values obtained from the fitted model are the activities (mlm_estimate) of the regulators in net.
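The fit described above can be illustrated for a single sample with plain NumPy. This is a minimal sketch, not the library's GPU implementation: the helper name mlm_t_values is hypothetical, and the intercept column is an assumption of the sketch rather than something stated in this page.

```python
import numpy as np


def mlm_t_values(y, weights):
    """Return one t-value per regulator for a single sample.

    y       : (n_features,) observed readouts of the sample (response).
    weights : (n_features, n_sources) regulator weights (covariates),
              zero where a feature is not a target of a regulator.
    """
    n_features, n_sources = weights.shape
    # Assumption of this sketch: the model includes an intercept term.
    X = np.column_stack([np.ones(n_features), weights])
    # Ordinary least squares fit of y on all regulators at once.
    coef, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    dof = n_features - n_sources - 1
    sigma2 = resid @ resid / dof
    # Standard errors from the diagonal of the coefficient covariance.
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    # Drop the intercept; the remaining t-values play the role of activities.
    return (coef / se)[1:]


# Synthetic sample: regulator 0 acts positively, regulator 1 negatively.
rng = np.random.default_rng(0)
weights = rng.normal(size=(50, 2))
y = 2.0 * weights[:, 0] - 1.0 * weights[:, 1] + rng.normal(scale=0.1, size=50)
t_values = mlm_t_values(y, weights)
```

In this synthetic case the first t-value is positive and the second negative, matching the signs used to generate y.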

Parameters:
mat AnnData | DataFrame | list

List of [features, matrix], dataframe (samples x features) or an AnnData instance.

net DataFrame

Network in long format.

source str (default: 'source')

Column name in net with source nodes.

target str (default: 'target')

Column name in net with target nodes.

weight str (default: 'weight')

Column name in net with weights.
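As an illustration of the long format, each row of net holds one source, one target and one weight. The sketch below uses plain tuples instead of a DataFrame to stay dependency-free; the helper to_weight_matrix and the gene and regulator names are hypothetical. It shows how such rows expand into a features x sources matrix in which missing source-target pairs are zero.

```python
# One (source, target, weight) triple per row, as in a long-format network.
net_rows = [
    ("TF_A", "gene1", 1.0),
    ("TF_A", "gene2", -1.0),
    ("TF_B", "gene2", 0.5),
]
features = ["gene1", "gene2", "gene3"]


def to_weight_matrix(rows, features):
    """Expand long-format rows into a dense features x sources matrix."""
    sources = sorted({source for source, _, _ in rows})
    col = {source: j for j, source in enumerate(sources)}
    row = {feature: i for i, feature in enumerate(features)}
    # Start from zeros so pairs absent from the network keep weight 0.
    matrix = [[0.0] * len(sources) for _ in features]
    for source, target, weight in rows:
        if target in row:  # ignore targets that are not measured features
            matrix[row[target]][col[source]] = weight
    return sources, matrix


sources, weight_matrix = to_weight_matrix(net_rows, features)
```

Here gene3 is not a target of any regulator, so its row contains only zeros.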

batch_size int (default: 10000)

Number of samples processed in each batch. Larger values use more memory but run faster.

min_n int (default: 5)

Minimum number of targets per source. Sources with fewer targets are removed.
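The effect of min_n can be sketched on a long-format edge list. This is only an illustration with a hypothetical helper, filter_sources; it counts every target listed for a source, whereas the library may count only targets that are also present in mat, which this page does not specify.

```python
from collections import Counter


def filter_sources(edges, min_n=5):
    """Keep only edges whose source has at least min_n targets."""
    counts = Counter(source for source, _target, _weight in edges)
    return [edge for edge in edges if counts[edge[0]] >= min_n]


# TF_A has five targets, TF_B only two.
edges = [("TF_A", f"gene{i}", 1.0) for i in range(5)]
edges += [("TF_B", "gene0", 1.0), ("TF_B", "gene1", -1.0)]

kept = filter_sources(edges, min_n=5)
kept_sources = {source for source, _, _ in kept}
```

With the default min_n of 5, TF_B is dropped and only the five TF_A edges remain.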

verbose bool (default: False)

Whether to show progress.

use_raw bool (default: True)

Use the raw attribute of mat if present.

Return type:

tuple | None

Returns:

If mat is an AnnData instance, updates it with the following fields and returns None; otherwise returns them as a tuple.

estimate DataFrame

MLM scores. Stored in .obsm['mlm_estimate'] if mat is AnnData.

pvals DataFrame

Obtained p-values. Stored in .obsm['mlm_pvals'] if mat is AnnData.
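A call on an AnnData object might look like the sketch below. The wrapper score_regulators is hypothetical and exists only to keep the example importable on machines without a GPU; the keyword arguments and the .obsm keys are the ones documented here. It assumes a working rapids_singlecell installation and a net DataFrame with source, target and weight columns.

```python
def score_regulators(adata, net, min_n=5):
    """Run MLM on adata and return the stored estimates and p-values."""
    # Imported lazily: rapids_singlecell requires a CUDA-capable environment.
    import rapids_singlecell as rsc

    rsc.dcg.run_mlm(
        adata,
        net,
        source="source",
        target="target",
        weight="weight",
        min_n=min_n,
        verbose=False,
        use_raw=True,
    )
    # With an AnnData input the results are stored on the object.
    return adata.obsm["mlm_estimate"], adata.obsm["mlm_pvals"]
```

Both returned objects are indexed by sample, with one column per regulator that passed the min_n filter.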