rapids_singlecell.ptg.Mixscale

class rapids_singlecell.ptg.Mixscale

GPU-accelerated Mixscale for continuous perturbation-efficiency scoring.

Unlike Mixscape, which performs a binary knocked-out/non-perturbed classification with a Gaussian mixture, Mixscale assigns each cell a continuous perturbation-efficiency score. It follows Seurat’s Mixscale and pertpy’s Mixscale; the perturbation signature is computed via perturbation_signature() and the per-gene projection and z-score scoring run on the GPU.
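The two methods documented on this page are meant to be run in sequence. The following is a minimal sketch of that workflow, not a verified recipe: it assumes the class is instantiated without arguments (as pertpy's equivalent is), and the names `run_mixscale`, `"gene_target"`, and `"NT"` are hypothetical placeholders for your own wrapper, `.obs` column, and control label.

```python
def run_mixscale(adata, pert_key="gene_target", control="NT"):
    """Compute the perturbation signature, then the continuous score.

    `adata` is expected to hold unscaled log-normalized data.
    The default column and control names are hypothetical.
    """
    # Imported lazily so the sketch can be defined on a machine without a GPU.
    import rapids_singlecell as rsc

    ms = rsc.ptg.Mixscale()  # assumption: no constructor arguments

    # Step 1: writes adata.layers["X_pert"].
    ms.perturbation_signature(adata, pert_key, control)

    # Step 2: writes adata.obs["mixscale_score"] (the default new_class_name).
    ms.mixscale(adata, pert_key, control)

    return adata.obs["mixscale_score"]
```

Calling `run_mixscale(adata)` would modify `adata` in place, since both methods default to `copy=False`.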

Methods table

mixscale(adata, pert_key, control, *[, ...])

Continuous perturbation efficiency scores (Mixscale).

perturbation_signature(adata, pert_key, ...)

Calculate the perturbation signature.

Methods

mixscale

Mixscale.mixscale(adata, pert_key, control, *, new_class_name='mixscale_score', layer=None, min_de_genes=5, max_de_genes=100, logfc_threshold=0.25, de_layer=None, test_method='wilcoxon', scale=True, split_by=None, pval_cutoff=0.05, perturbation_type='KO', copy=False)

Continuous perturbation efficiency scores (Mixscale).

Unlike mixscape(), which performs a binary knocked-out/non-perturbed classification with a Gaussian mixture, this method assigns each cell a continuous perturbation-efficiency score. For each target gene, the score is the scalar projection of the cell's perturbation signature onto the perturbation direction (mean perturbed minus mean control), z-score standardized relative to the non-targeting control distribution. This is useful for CRISPRi/CRISPRa screens, where cells show a gradient of perturbation strength rather than a binary knockout. Control cells receive a score of 0.

Implements Jiang, Dalgarno et al., “Systematic reconstruction of molecular pathway signatures using scalable single-cell perturbation screens”, Nature Cell Biology (2025), following pertpy’s mixscale().
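The projection and standardization described above can be sketched in plain NumPy for a single target gene. This is a simplified CPU illustration, not the library's GPU code: the function name `projection_scores`, the unit-length normalization of the direction, and the use of the sample standard deviation are assumptions, and the optional per-gene scaling controlled by `scale` is left out.

```python
import numpy as np


def projection_scores(sig_pert, sig_ctrl):
    """Z-scored scalar projection for one target gene (illustrative only).

    sig_pert: (n_perturbed, n_de_genes) perturbation signature of perturbed cells
    sig_ctrl: (n_control, n_de_genes) perturbation signature of control cells
    """
    # Perturbation direction: mean perturbed minus mean control.
    direction = sig_pert.mean(axis=0) - sig_ctrl.mean(axis=0)
    norm = np.linalg.norm(direction)
    if norm == 0:
        raise ValueError("perturbed and control means are identical")

    # Scalar projection of every cell onto the direction
    # (assumption: the direction is normalized to unit length).
    proj_pert = sig_pert @ direction / norm
    proj_ctrl = sig_ctrl @ direction / norm

    # Standardize against the control projections.
    mu = proj_ctrl.mean()
    sd = proj_ctrl.std(ddof=1)
    if sd == 0:
        raise ValueError("control projections have zero variance")
    return (proj_pert - mu) / sd


# Synthetic data: perturbed cells are shifted by +3 in every gene.
rng = np.random.default_rng(0)
ctrl = rng.normal(size=(200, 10))
pert = rng.normal(size=(50, 10)) + 3.0
scores = projection_scores(pert, ctrl)
```

In this toy setup the perturbed cells sit far from the control distribution, so their scores are large and positive; cells that escaped perturbation would score near 0.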

Parameters:
adata AnnData

The annotated data object.

pert_key str

The column of .obs with target gene labels.

control str

Control category in pert_key.

new_class_name str (default: 'mixscale_score')

Name of the .obs column for the continuous score.

layer str | None (default: None)

Layer used for scoring. None uses .layers["X_pert"], the layer written by perturbation_signature().

min_de_genes int (default: 5)

Minimum number of differentially expressed genes required to score a target gene; target genes with fewer differentially expressed genes are skipped.

max_de_genes int (default: 100)

Maximum number of (top-ranked) differentially expressed genes used to define the perturbation direction.

logfc_threshold float (default: 0.25)

Minimum absolute log fold change for a gene to count as differentially expressed.

de_layer str | None (default: None)

Layer used for differential expression. None uses .X.

test_method str (default: 'wilcoxon')

Differential-expression test passed to rapids_singlecell.tl.rank_genes_groups().

scale bool (default: True)

Scale the per-gene sub-matrix before computing scores.

split_by str | None (default: None)

.obs column with a condition or cell-type annotation, for use when perturbations are condition-specific.

pval_cutoff float (default: 0.05)

Adjusted p-value cutoff for differentially expressed genes.

perturbation_type str (default: 'KO')

Accepted for pertpy.tl.Mixscale.mixscale API compatibility; has no effect on the continuous score.

copy bool (default: False)

Whether to return a copy of adata.

Return type:

AnnData | None

Returns:

Returns the modified copy if copy=True, otherwise writes adata.obs[new_class_name] in place and returns None. Higher absolute values indicate a stronger perturbation effect; control cells, and cells of any target gene that cannot be scored, receive 0.
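How the gene-filtering parameters (`logfc_threshold`, `pval_cutoff`, `min_de_genes`, `max_de_genes`) interact can be illustrated with a small NumPy sketch. The helper `select_de_genes` is hypothetical, and two details are assumptions rather than documented behavior: whether the thresholds are inclusive, and that "top-ranked" means ranked by adjusted p-value.

```python
import numpy as np


def select_de_genes(logfc, pvals_adj, *, logfc_threshold=0.25,
                    pval_cutoff=0.05, min_de_genes=5, max_de_genes=100):
    """Return indices of the DE genes used for scoring, or None if too few."""
    logfc = np.asarray(logfc, dtype=float)
    pvals_adj = np.asarray(pvals_adj, dtype=float)

    # Keep genes that pass both the fold-change and the p-value filter.
    passing = np.flatnonzero(
        (np.abs(logfc) >= logfc_threshold) & (pvals_adj < pval_cutoff)
    )

    # Too few DE genes: the target gene is skipped (its cells keep score 0).
    if passing.size < min_de_genes:
        return None

    # Assumption: rank by adjusted p-value and keep at most max_de_genes.
    order = np.argsort(pvals_adj[passing], kind="stable")
    return passing[order][:max_de_genes]


logfc = [1.0, -0.8, 0.1, 0.5, 2.0, -1.5, 0.3]
padj = [0.001, 0.01, 0.001, 0.2, 0.0001, 0.03, 0.04]

selected = select_de_genes(logfc, padj)                  # five genes pass
top3 = select_de_genes(logfc, padj, max_de_genes=3)      # truncated to three
skipped = select_de_genes(logfc, padj, min_de_genes=6)   # None: too few genes
```

Gene 2 fails the fold-change filter and gene 3 fails the p-value filter, so exactly five genes remain, which is just enough for the default `min_de_genes=5`.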

perturbation_signature

Mixscale.perturbation_signature(adata, pert_key, control, *, ref_selection_mode='nn', split_by=None, n_neighbors=20, use_rep=None, n_dims=15, n_pcs=None, knn_algorithm='brute', knn_kwargs=None, copy=False)

Calculate the perturbation signature.

The perturbation signature replaces each cell’s expression with the residual against comparable control cells, removing confounding variation so that what remains reflects the perturbation. The result is written to adata.layers["X_pert"]. As in the original implementation, this is intended to run on unscaled log-normalized data.
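The idea of a residual against comparable control cells can be sketched with a brute-force nearest-neighbor search in NumPy. This is an illustration under stated assumptions, not the library's implementation: the function `nn_signature` is hypothetical, the sign convention (cell minus control mean) is assumed and may be reversed in the library, and control cells are allowed to pick themselves as neighbors.

```python
import numpy as np


def nn_signature(X, rep, is_control, n_neighbors=20):
    """Residual of each cell against the mean of its nearest control cells.

    X:          (n_cells, n_genes) log-normalized expression
    rep:        (n_cells, n_dims) representation used for neighbor search
    is_control: (n_cells,) boolean mask of control cells
    """
    X = np.asarray(X, dtype=float)
    rep = np.asarray(rep, dtype=float)
    ctrl_idx = np.flatnonzero(np.asarray(is_control, dtype=bool))
    if ctrl_idx.size == 0:
        raise ValueError("no control cells available")

    # Cap k at the number of available control cells.
    k = min(n_neighbors, ctrl_idx.size)

    # Squared Euclidean distance from every cell to every control cell.
    diff = rep[:, None, :] - rep[ctrl_idx][None, :, :]
    d2 = (diff ** 2).sum(axis=2)

    # Column indices (into ctrl_idx) of the k nearest controls per cell.
    nn = np.argpartition(d2, k - 1, axis=1)[:, :k]

    # Mean expression of those controls, subtracted from each cell
    # (assumption: cell minus reference; the library may use the opposite sign).
    ref = X[ctrl_idx][nn].mean(axis=1)
    return X - ref


X = np.array([[1, 1], [1, 1], [5, 5], [5, 5], [3, 1], [5, 8]], dtype=float)
rep = np.array([[0.0], [0.1], [10.0], [10.1], [0.05], [10.05]])
is_control = np.array([True, True, True, True, False, False])

sig = nn_signature(X, rep, is_control, n_neighbors=2)
# Asking for more neighbors than there are controls falls back to all four.
sig_capped = nn_signature(X, rep, is_control, n_neighbors=20)
```

With two neighbors, each perturbed cell is compared only with the controls from its own cluster, so the baseline difference between the two clusters is removed and only the perturbation-specific shift remains.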

Parameters:
adata AnnData

The annotated data object.

pert_key str

The column of .obs with perturbation categories; must also contain control.

control str

Name of the control category in pert_key.

ref_selection_mode Literal['nn', 'split_by'] (default: 'nn')

How reference cells are selected. "nn" uses the n_neighbors nearest control cells in the chosen representation; "split_by" uses all control cells within the same split_by group.

split_by str | None (default: None)

Column of .obs used to compute the signature separately per group (e.g. biological replicate). Required for ref_selection_mode="split_by".

n_neighbors int (default: 20)

Number of control neighbors used for ref_selection_mode="nn". Capped at the number of control cells available in each split, so a split with fewer controls than n_neighbors still runs (pertpy raises an error in this case).

use_rep str | None (default: None)

Representation used for neighbor selection: "X" or any .obsm key. If None, .X is used when n_vars is below 50; otherwise "X_pca" is used (and computed if absent).

n_dims int | None (default: 15)

Number of dimensions of the representation to use. None uses all.

n_pcs int | None (default: None)

Number of principal components to compute if a PCA representation is built.

knn_algorithm str (default: 'brute')

Nearest-neighbor backend for ref_selection_mode="nn": "brute" (exact, default), or one of the approximate cuVS backends "ivfflat", "cagra", and "ivfpq", which are much faster on large datasets.

knn_kwargs dict | None (default: None)

Extra parameters for the approximate backends (e.g. n_lists / n_probes for "ivfflat").

copy bool (default: False)

Whether to return a copy of adata.

Return type:

AnnData | None

Returns:

Returns the modified copy if copy=True, otherwise writes adata.layers["X_pert"] in place and returns None.
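For comparison with the nearest-neighbor mode, `ref_selection_mode="split_by"` uses all control cells in the same group as the reference. A minimal NumPy sketch of that idea follows; the helper `split_by_signature` is hypothetical, and both the plain mean in log space and the sign convention (cell minus control mean) are assumptions that may differ from the library.

```python
import numpy as np


def split_by_signature(X, is_control, groups):
    """Residual of each cell against the control mean of its own group."""
    X = np.asarray(X, dtype=float)
    is_control = np.asarray(is_control, dtype=bool)
    groups = np.asarray(groups)

    sig = np.empty_like(X)
    for g in np.unique(groups):
        in_group = groups == g
        ctrl = in_group & is_control
        if not ctrl.any():
            raise ValueError(f"group {g!r} contains no control cells")
        # Assumption: plain mean of the group's controls, cell minus reference.
        sig[in_group] = X[in_group] - X[ctrl].mean(axis=0)
    return sig


X = np.array([[1, 2], [3, 4], [5, 5], [10, 10], [12, 10], [11, 13]], dtype=float)
is_control = np.array([True, True, False, True, True, False])
groups = np.array(["a", "a", "a", "b", "b", "b"])

sig_split = split_by_signature(X, is_control, groups)
```

Each group (for example a biological replicate) gets its own reference, so a baseline shift between groups does not leak into the signature.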