next on phyloseminar.org
Reconstructing probabilistic trees of cellular differentiation from single-cell RNA-seq data
Recent advances in single-cell methods have made tangible how individual cell profiles can reflect the imprint of ephemeral or dynamic processes. However, synthesizing this information to reconstruct dynamic biological phenomena – from data that are noisy, heterogeneous, and sparse, and from processes that may unfold asynchronously – poses a computational and statistical challenge.
We develop a full generative model and inference procedure for reconstructing a dynamic process (cellular differentiation) from many static snapshots (single-cell RNA-seq profiles), with calibrated uncertainties. Specifically, we define cell state by the latent parameterization of a distribution over gene expression space, and model these latent vectors as arising from bifurcating, self-reinforcing paths along a probabilistic tree. This necessitates the design of a new class of Bayesian tree models for data that arise from a latent branching spectrum.
In this talk, I explore how our model fills a hole in the existing literature on probabilistic trees, and what having an explicit generative model buys us in the context of reconstructing trajectories to understand cell fate decisions in differentiation.
Cellular ‘phylogenetics’ - decoding the developmental history and relationships among individual cells
Multicellular organisms develop by way of a lineage tree, a series of cell divisions that give rise to cell types, tissues, and organs. This pattern mirrors the evolutionary relationships between species, though our knowledge of the cell lineage and its determinants remains extremely fragmentary for nearly all species. This includes all vertebrates and arthropods such as Drosophila, wherein cell lineage varies between individuals. Embryos and organs are often visually inaccessible, and progenitor cells disperse by long-distance migration. We recently pioneered a new paradigm for recording cell lineage and other aspects of developmental history that has the potential to enhance our understanding of vertebrate biology. In brief, we engineer cells to stochastically introduce mutations at specific locations in the genome during development. The resulting patterns of mutations, which can be efficiently queried by massively parallel sequencing, can be used to reconstruct lineage using methods adapted from phylogenetics. We demonstrate our technique by tracing the lineage of tens of thousands of cells within individual zebrafish and Drosophila, relating the lineage of numerous emerging tissue and organ systems.
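To make the recording idea concrete, here is a toy sketch in Python. It is not the speakers' method: the function names (`simulate_lineage`, `barcode_distance`, `average_linkage_tree`), the barcode encoding, the editing probability, and the choice of average-linkage clustering on mismatch counts are all assumptions made for illustration. Cells carry a barcode of sites that start unedited (0); at each division, an unedited site may be irreversibly edited to a random nonzero state, and descendants inherit whatever edits their ancestors acquired. Leaves that share more edits are then grouped first.

```python
import random
from itertools import combinations


def simulate_lineage(n_sites=20, depth=4, edit_prob=0.15, n_states=5, seed=0):
    """Simulate a full binary lineage tree; return leaf barcodes, left to right.

    All parameters are hypothetical toy values, not taken from the abstract.
    """
    rng = random.Random(seed)
    cells = [[0] * n_sites]  # a single founder cell with an unedited barcode
    for _generation in range(depth):
        next_gen = []
        for barcode in cells:
            for _daughter in range(2):
                # Only unedited sites (0) can change; edits are irreversible.
                child = [
                    rng.randint(1, n_states)
                    if site == 0 and rng.random() < edit_prob
                    else site
                    for site in barcode
                ]
                next_gen.append(child)
        cells = next_gen
    return cells


def barcode_distance(a, b):
    """Count the sites at which two barcodes differ."""
    return sum(x != y for x, y in zip(a, b))


def average_linkage_tree(barcodes):
    """Greedy average-linkage clustering; returns a nested tuple of leaf indices."""
    clusters = [(i, [i]) for i in range(len(barcodes))]  # (subtree, member indices)
    while len(clusters) > 1:
        best = None
        for (i, (_, left)), (j, (_, right)) in combinations(enumerate(clusters), 2):
            total = sum(
                barcode_distance(barcodes[x], barcodes[y])
                for x in left
                for y in right
            )
            dist = total / (len(left) * len(right))
            if best is None or dist < best[0]:
                best = (dist, i, j)
        _, i, j = best
        (tree_a, members_a), (tree_b, members_b) = clusters[i], clusters[j]
        merged = ((tree_a, tree_b), members_a + members_b)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]


if __name__ == "__main__":
    leaves = simulate_lineage()
    print(average_linkage_tree(leaves))
```

Distance-based clustering is only one of several phylogenetic approaches that could be adapted here; parsimony or likelihood methods that model the irreversibility of edits explicitly would be natural alternatives.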