scembed.IntegrationEvaluator

class scembed.IntegrationEvaluator(adata, embedding_key, batch_key='batch', cell_type_key='cell_type', ignore_cell_types=None, output_dir=None, baseline_embedding_key='X_pca_unintegrated')

Evaluator for single-cell integration methods.

Evaluates integration quality using scIB metrics [LButtnerC+22] for benchmarking atlas-level data integration in single-cell genomics.

Methods table

compute_and_show_embeddings([key_added, ...])

Compute and visualize UMAP embedding using scanpy [WAT18].

evaluate_scib([min_max_scale, use_faiss, ...])

Evaluate integration using scIB metrics [LButtnerC+22].

get_summary_metrics()

Get summary metrics from scIB evaluation.

Methods

IntegrationEvaluator.compute_and_show_embeddings(key_added='X_umap', use_rapids=False, additional_colors=None, n_neighbors=15, **kwargs)

Compute and visualize UMAP embedding using scanpy [WAT18].

Parameters:
  • key_added (str (default: 'X_umap')) – Key in .obsm for storing UMAP embedding.

  • use_rapids (bool (default: False)) – Whether to use rapids_singlecell for acceleration.

  • additional_colors (str | list[str] | None (default: None)) – Additional keys in .obs for coloring the UMAP plot. By default, the plot is colored by cell type and batch.

  • n_neighbors (int (default: 15)) – Number of neighbors used for the k-NN graph computation.

  • kwargs (Any) – Additional keyword arguments for scanpy.pl.embedding.

Return type:

None

IntegrationEvaluator.evaluate_scib(min_max_scale=False, use_faiss=False, subsample_to=None, subsample_strategy='naive', subsample_key=None, subset_to=None, bio_conservation_metrics=None, batch_correction_metrics=None)

Evaluate integration using scIB metrics [LButtnerC+22].

Parameters:
  • min_max_scale (bool (default: False)) – Whether to apply min-max scaling to results.

  • use_faiss (bool (default: False)) – Whether to use FAISS for GPU-accelerated nearest-neighbor search.

  • subsample_to (int | None (default: None)) – If provided, subsample to this many cells before evaluation.

  • subsample_strategy (Literal['naive', 'proportional'] (default: 'naive')) – Strategy for subsampling when subsample_to is provided.

  • subsample_key (str | None (default: None)) – Key for proportional subsampling. If None, batch_key is used.

  • subset_to (tuple[str, str | int | list[str | int]] | None (default: None)) – Tuple of a key in .obs and the category, or list of categories, to subset to.

  • bio_conservation_metrics (BioConservation | None (default: None)) – BioConservation metrics configuration. If None, defaults to BioConservation().

  • batch_correction_metrics (BatchCorrection | None (default: None)) – BatchCorrection metrics configuration. If None, defaults to BatchCorrection().

Return type:

None
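
The difference between the two subsample_strategy values can be illustrated with a small standalone sketch. This page does not describe how scembed implements either strategy, so the code below is only an assumption about what the names suggest: 'naive' draws cells uniformly at random, while 'proportional' keeps each group's share of the data (with groups defined by subsample_key, or batch_key when it is None). The helper names naive_subsample and proportional_subsample are hypothetical and not part of scembed.

```python
import random
from collections import defaultdict


def naive_subsample(labels, n, seed=0):
    """Pick n indices uniformly at random, ignoring group labels."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(len(labels)), min(n, len(labels))))


def proportional_subsample(labels, n, seed=0):
    """Pick roughly n indices so that each group keeps its share of the data."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for idx, label in enumerate(labels):
        groups[label].append(idx)
    total = len(labels)
    chosen = []
    for label in sorted(groups):
        members = groups[label]
        # Round each group's quota, keeping at least one cell per group.
        quota = max(1, round(n * len(members) / total))
        chosen.extend(rng.sample(members, min(quota, len(members))))
    return sorted(chosen)


# Hypothetical batch labels: 80 cells from one batch, 20 from another.
labels = ["batch_a"] * 80 + ["batch_b"] * 20
picked = proportional_subsample(labels, n=10)
counts = {b: sum(labels[i] == b for i in picked) for b in ("batch_a", "batch_b")}
print(counts)  # {'batch_a': 8, 'batch_b': 2}
```

With a uniform draw, a small batch can end up with very few cells or none at all. A proportional draw guarantees every group is represented, which matters for batch-correction metrics that compare batches.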

IntegrationEvaluator.get_summary_metrics()

Get summary metrics from scIB evaluation.

Return type:

dict[Any, Any]

Returns:

dict – Dictionary with key metrics for logging.
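
A minimal sketch of how the three methods might be combined, using only the signatures documented on this page. It assumes scembed is installed, that adata is an AnnData object holding the embedding under embedding_key in .obsm, and that evaluate_scib must run before get_summary_metrics (the page implies this order but does not state it). The wrapper name evaluate_embedding and the value 50_000 are hypothetical.

```python
def evaluate_embedding(adata, embedding_key, output_dir=None):
    """Visualize and score one integrated embedding, returning summary metrics."""
    # Imported lazily so this helper can be defined without scembed installed.
    from scembed import IntegrationEvaluator

    evaluator = IntegrationEvaluator(
        adata,
        embedding_key=embedding_key,
        batch_key="batch",          # documented default
        cell_type_key="cell_type",  # documented default
        output_dir=output_dir,
    )

    # UMAP of the embedding, colored by cell type and batch by default.
    evaluator.compute_and_show_embeddings(n_neighbors=15)

    # scIB metrics on a subsample; 50_000 is an arbitrary illustrative value.
    evaluator.evaluate_scib(
        subsample_to=50_000,
        subsample_strategy="proportional",
    )

    # Dictionary with key metrics, e.g. for passing to an experiment logger.
    return evaluator.get_summary_metrics()
```

Both compute_and_show_embeddings and evaluate_scib return None, so get_summary_metrics is the only one of the three that hands results back to the caller.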