Statistical Theory and Methodology for the Analysis of Microbial Compositions, with Applications

Lin, Huang (2020) Statistical Theory and Methodology for the Analysis of Microbial Compositions, with Applications. Doctoral Dissertation, University of Pittsburgh. (Unpublished)
Abstract

Researchers are increasingly finding associations between the microbiome and human diseases such as obesity, inflammatory bowel disease, and HIV. Determining which microbes differ significantly between conditions, known as differential abundance (DA) analysis, and depicting the dependence structure among them are two of the most challenging and critical problems, and both have received considerable interest. It is well documented in the literature that observed microbiome data are relative abundances with excess zeros. These data are necessarily compositional; hence conventional DA methods are not appropriate, as they significantly inflate the false discovery rate (FDR), and the standard notion of correlation often yields spurious correlations. To overcome these difficulties, in this dissertation we develop a general statistical framework that can address a broad collection of problems encountered by researchers. The dissertation is organized as follows. In Chapter 1, we briefly review the literature on the parameters used to characterize microbial composition. Specifically, we describe various concepts of diversity and differential taxa abundance. In Chapter 2, we introduce an offset-based regression model called the Analysis of Composition of Microbiomes with Bias Correction (ANCOM-BC). The ANCOM-BC model not only controls the FDR at the desired level but also maintains high power. Simulations and real data analyses were conducted to compare the performance of ANCOM-BC with other commonly used algorithms. In Chapter 3, we extend ANCOM-BC to DA analysis when there are more than two ecosystems. We tested the method under a variety of alternative hypotheses. Similar simulation settings and real data were used to evaluate its performance.
Lastly, in Chapter 4, we introduce a distance-correlation-based methodology, called Distance Correlation for Microbiome (DICOM), to untangle the dependence structure among microbes within an ecosystem or across ecosystems (e.g., gut and oral microbiomes). PUBLIC HEALTH SIGNIFICANCE: This dissertation proposes a general statistical framework for studying microbial compositions. The identified differentially abundant taxa and the constructed dependence network could give medical experts more knowledge of changes in patients' microbiomes. This information could contribute to developing precision medicine for better patient care.