rapids_singlecell.pp.normalize_pearson_residuals
- rapids_singlecell.pp.normalize_pearson_residuals(adata, *, theta=100, clip=None, check_values=True, layer=None, inplace=True)
Applies analytic Pearson residual normalization, based on Lause21. The residuals are based on a negative binomial offset model with overdispersion theta shared across genes. By default, residuals are clipped to sqrt(n_obs) and overdispersion theta=100 is used.
- Parameters:
- adata AnnData
AnnData object
- theta float (default: 100) The negative binomial overdispersion parameter theta for Pearson residuals. Higher values correspond to less overdispersion (var = mean + mean^2/theta), and theta=np.inf corresponds to a Poisson model.
- clip float | None (default: None) Determines if and how residuals are clipped: If None, residuals are clipped to the interval [-sqrt(n_obs), sqrt(n_obs)], where n_obs is the number of cells in the dataset (default behavior). If a scalar c is given, residuals are clipped to the interval [-c, c]. Set clip=np.inf for no clipping.
- check_values bool (default: True) If True, checks whether the counts in the selected layer are integers, as expected by this function, and issues a warning if non-integers are found. Otherwise, proceeds without checking. Setting this to False can speed up code for large datasets.
- layer str | None (default: None) Layer to use as input instead of X. If None, X is used.
- inplace bool (default: True) If True, update AnnData with results. Otherwise, return results. See below for details of what is returned.
- Return type:
cp.ndarray | None
- Returns:
If inplace=True, adata.X or the selected layer in adata.layers is updated with the normalized values. If inplace=False, the normalized matrix is returned.
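
The computation described above can be illustrated with a small CPU-only NumPy sketch. This is not the library's GPU implementation. The function name pearson_residuals_sketch is hypothetical, and the expected-count formula (cell total times gene total, divided by the grand total) is an assumption taken from the offset model in Lause21 rather than from this page. The variance formula and the default clipping bound follow the theta and clip descriptions.

```python
import numpy as np


def pearson_residuals_sketch(counts, theta=100.0, clip=None):
    """Illustrative dense NumPy version (hypothetical helper, not the library code)."""
    counts = np.asarray(counts, dtype=np.float64)
    n_obs = counts.shape[0]

    # Default clipping bound from the clip description: sqrt(n_obs).
    if clip is None:
        clip = np.sqrt(n_obs)

    # Assumed offset model: expected count = cell total * gene total / grand total.
    cell_totals = counts.sum(axis=1, keepdims=True)
    gene_totals = counts.sum(axis=0, keepdims=True)
    mu = cell_totals * gene_totals / counts.sum()

    # Negative binomial variance from the theta description: mean + mean^2 / theta.
    variance = mu + mu**2 / theta

    # Note: an all-zero gene or cell gives mu = 0 and a 0/0 here; this sketch
    # assumes such rows and columns were filtered out beforehand.
    residuals = (counts - mu) / np.sqrt(variance)

    # Clip to the symmetric interval [-clip, clip].
    return np.clip(residuals, -clip, clip)


# Toy count matrix: 4 cells (rows) x 3 genes (columns).
counts = np.array([[0, 3, 1], [2, 0, 4], [5, 1, 0], [1, 2, 2]])
residuals = pearson_residuals_sketch(counts)
```

In this sketch, passing theta=np.inf reduces the variance to the mean (the Poisson case), and passing clip=np.inf leaves the residuals unclipped, mirroring the parameter descriptions.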