
Learning cellular hierarchy from scRNAseq data


komiloserdov/CATCH

 
 

NOTES FOR THIS FORK

This fork fixes a small type-related bug caused by an update to the numpy package (see commit caef6ddcc759993bf453580b2bab30942252874a).

To install this fork, use:

pip install git+https://github.com/komiloserdov/CATCH

None of the other references or links in this readme were changed.

CATCH: Cellular Analysis with Topology and Condensation Homology

Learning cellular hierarchy from scRNAseq data

Cells occupy a hierarchy of transcriptional identities which is difficult to study in an unbiased manner when perturbed by disease. To identify, characterize, and compare clusters of cells, we present CATCH, a coarse-graining framework that learns the cellular hierarchy by applying a deep cascade of manifold-intrinsic diffusion filters. CATCH includes a suite of tools, based on the connection we forge between topological data analysis and data diffusion geometry, to identify salient levels of the hierarchy, automatically characterize clusters, and rapidly compute differentially expressed genes between clusters of interest. When used in conjunction with MELD (https://github.com/KrishnaswamyLab/MELD), CATCH has been shown to identify rare populations of pathogenic cells and create robust disease signatures.

Overview of Algorithm:

The key to thoroughly identifying and characterizing populations of cells affected by disease across granularities lies in the accurate computation of the cellular hierarchy. Current hierarchical clustering approaches applied to single cell analysis enforce global granularity constraints and provide only a few salient levels at which cellular groups can be found. This not only limits the discovery of rare disease-associated populations, but also requires computationally expensive differential expression analysis tools that produce diluted signatures of disease from unrefined clusters of cells. To address these limitations, we developed a novel topologically-inspired machine learning suite of tools called Cellular Analysis with Topology and Condensation Homology (CATCH).

At the center of this framework is diffusion condensation, a recently proposed data-diffusion-based dynamic process for continuous coarse graining of data through a deep cascade of graph diffusion filters. The algorithm iteratively pulls points towards the weighted average of their graph diffusion neighbors, slowly eliminating variation until all data points converge:

[Figure: the diffusion condensation process]
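The iteration described above can be illustrated with a small toy sketch. This is not the CATCH implementation: the Gaussian kernel, the growing bandwidth, the merge tolerance, and the names `diffusion_operator` and `condense` are all assumptions made for illustration only.

```python
import numpy as np


def diffusion_operator(points, bandwidth):
    """Row-normalized Gaussian kernel; each row is a weighted-average filter."""
    sq_dists = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    kernel = np.exp(-sq_dists / bandwidth ** 2)
    return kernel / kernel.sum(axis=1, keepdims=True)


def condense(points, bandwidth=0.5, merge_tol=1e-3, growth=1.05, max_iter=1000):
    """Toy diffusion condensation (illustrative, not the CATCH API).

    Returns a list with one cluster-label array per iteration.
    """
    points = np.asarray(points, dtype=float).copy()
    labels = np.arange(len(points))  # every point starts as its own cluster
    history = [labels.copy()]
    for _ in range(max_iter):
        # Pull each point toward the weighted average of its neighbors.
        points = diffusion_operator(points, bandwidth) @ points
        # Merge the clusters of any two points that have collapsed together.
        dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
        for i, j in np.argwhere(np.triu(dists < merge_tol, k=1)):
            labels[labels == labels[j]] = labels[i]
        history.append(labels.copy())
        if len(np.unique(labels)) == 1:
            break  # all data points have converged
        # Assumption: widen the kernel so that distant groups eventually meet.
        bandwidth *= growth
    return history
```

Calling `condense` on an array of shape `(n_points, n_features)` yields the cluster labels at every iteration, so each entry of the returned list is one level of the learned hierarchy, from finest to coarsest.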

To aid in single cell analysis tasks, we build a suite of tools around this coarse graining process:

  1. Visual summarization of the condensation process through an embedding of condensation homology (i);
  2. Identification of stable granularities for downstream analysis through topological activity analysis (ii);
  3. Integration with the single-cell enrichment analysis tool MELD (Burkhardt et al. 2021) to identify disease-enriched populations of cells;
  4. Efficient comparison of clusters of cells via condensed transport to identify differentially expressed genes.
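As a rough illustration of the second item, one simple reading of "stable granularity" is a stretch of condensation iterations during which no clusters merge. The sketch below assumes only a list of cluster counts per iteration; the function name `stable_levels` and the plateau criterion are hypothetical, and CATCH's own topological activity analysis may work differently.

```python
import numpy as np


def stable_levels(cluster_counts, top_k=3):
    """Find the longest plateaus in a sequence of per-iteration cluster counts.

    Returns (activity, runs), where activity[t] is the number of clusters lost
    between iterations t and t + 1, and runs holds up to top_k tuples of
    (start_iteration, run_length, n_clusters), longest plateau first.
    """
    counts = np.asarray(cluster_counts)
    # Topological activity: how many clusters merged at each step.
    activity = counts[:-1] - counts[1:]
    runs = []
    start = 0
    for t in range(1, len(counts) + 1):
        # A plateau ends when the count changes or the sequence ends.
        if t == len(counts) or counts[t] != counts[start]:
            runs.append((start, t - start, int(counts[start])))
            start = t
    runs.sort(key=lambda run: run[1], reverse=True)
    return activity, runs[:top_k]


# Hypothetical cluster counts recorded over 13 condensation iterations.
example_counts = [8, 5, 5, 5, 5, 3, 2, 2, 2, 2, 2, 2, 1]
activity, plateaus = stable_levels(example_counts, top_k=2)
print(plateaus)  # [(6, 6, 2), (1, 4, 5)]
```

In this made-up example the two-cluster and five-cluster levels persist longest, so they would be the candidate granularities to carry into downstream analysis.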

[Figure: overview of the CATCH tool suite]

Getting Started

To install the original upstream package, use:

pip install git+https://github.com/KrishnaswamyLab/CATCH

For an overview of functionality, please see the CATCH tutorial.
