MOTIVATION: Existing microbiome-based disease prediction relies on the ability of machine learning methods to distinguish diseased from healthy subjects based on the observed taxa abundances across samples. Although numerous microbes have been implicated as potential biomarkers, challenges remain, owing not only to the statistical nature of microbiome data but also to a limited understanding of microbial interactions, which can be indicative of disease.
RESULTS: We propose CACONET (classification of Compositional-Aware COrrelation NETworks), a computational framework that takes as input the relative abundances of taxa across samples together with the subjects' health status, learns to classify microbial correlation networks, and extracts potential signature interactions. Using Bayesian compositional-aware correlation inference, CACONET draws a collection of posterior correlation networks and uses them for graph-level classification, thereby incorporating uncertainty in the correlation estimates. It then employs a deep learning approach to graph classification that exploits the correlation structure and achieves excellent classification performance. We test the framework on both simulated data and a large real-world dataset of microbiome samples from colorectal cancer (CRC) patients and healthy subjects, and identify potential network substructures characteristic of the CRC microbiota. CACONET is customizable and can be adapted to further improve its utility.
AVAILABILITY: CACONET is available at https://github.com/yuanwxu/corr-net-classify.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.