Probabilistic modeling of bifurcations in single-cell gene expression data using a Bayesian mixture of factor analyzers

Kieran Campbell, Christopher Yau

Research output: Contribution to journal › Article › peer-review

15 Citations (Scopus)

Abstract

Modeling bifurcations in single-cell transcriptomics data has become an increasingly popular field of research. Several methods have been proposed to infer bifurcation structure from such data, but all rely on heuristic, non-probabilistic inference. Here we propose the first generative, fully probabilistic model for such inference, based on a Bayesian hierarchical mixture of factor analyzers. Our model exhibits competitive performance on large datasets despite implementing full Markov chain Monte Carlo sampling, and its unique hierarchical prior structure enables automatic determination of the genes driving the bifurcation process. We additionally propose an empirical-Bayes-like extension that deals with the high levels of zero-inflation in single-cell RNA-seq data, and we quantify when such models are useful. We apply our model to both real and simulated single-cell gene expression data and compare the results to existing pseudotime methods. Finally, we discuss both the merits and weaknesses of such a unified, probabilistic approach in the context of practical bioinformatics analyses.
Original language: English
Journal: Wellcome Open Research
Volume: 2
Issue number: 19
Early online date: 15 Mar 2017
Publication status: E-pub ahead of print - 15 Mar 2017
