Bayesian variable selection model and differential co-expression network analysis for multi-omics data integration
Zhu, Li (2019) Bayesian variable selection model and differential co-expression network analysis for multi-omics data integration. Doctoral Dissertation, University of Pittsburgh. (Unpublished)
Abstract
Owing to the large accumulation of omics data sets in public repositories, innumerable studies have been designed to analyze omics data for various purposes. However, the analysis of a single data set may provide only limited information, or may suffer from small sample size and lack of reproducibility; data integration is therefore gaining increasing attention. This dissertation focuses on developing methods for variable selection in regression (Chapter 1) and clustering (Chapter 2) for multi-omics data integration, and for identification of differential co-expression networks (Chapter 3) in the transcriptomics meta-analysis setting. In Chapter 1, we propose a Bayesian indicator variable selection model that incorporates multi-layer overlapping group (MOG) structure in the regression setting. The model is motivated by the structure commonly encountered in multi-omics applications, in which a biological pathway contains tens to hundreds of genes and a gene can be mapped to multiple experimentally measured features (such as its mRNA expression, copy number variation, and methylation levels at possibly multiple sites). We evaluated the model in simulations and two breast cancer examples, and demonstrated that this approach not only enhances prediction accuracy but also improves variable selection and model interpretation, leading to deeper biological insight into the disease. In Chapter 2, we extend MOG to Gaussian mixture models for clustering, aiming to identify disease subtypes and detect subtype-predictive omics features. In Chapter 3, we present a meta-analytic framework for detecting differential co-expression networks (MetaDCN). Differential co-expression (DC) analysis, unlike conventional differential expression (DE) analysis, helps detect alterations of gene-gene correlations in case/control comparisons, which are likely to be missed in DE analysis.
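The distinction between DE and DC can be illustrated with a minimal sketch on simulated data. This is not the MetaDCN method; it only shows a gene pair whose mean expression is unchanged between groups while its correlation is lost in cases. The sample size, the correlation values, the helper `simulate_pair`, and the use of a Fisher z-transform to compare correlations are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200  # hypothetical number of samples per group


def simulate_pair(rho, n, rng):
    """Draw n samples of two genes with zero mean, unit variance, and correlation rho."""
    cov = np.array([[1.0, rho], [rho, 1.0]])
    return rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=n)


# Assumed scenario: the two genes are co-expressed in controls but not in cases.
controls = simulate_pair(0.8, n, rng)
cases = simulate_pair(0.0, n, rng)

# DE view: per-gene difference in mean expression (close to zero by construction).
mean_shift = cases.mean(axis=0) - controls.mean(axis=0)

# DC view: difference in the gene-gene Pearson correlation between groups.
r_control = np.corrcoef(controls, rowvar=False)[0, 1]
r_case = np.corrcoef(cases, rowvar=False)[0, 1]

# Fisher z-transform comparison of two independent correlations (illustrative choice).
z_stat = (np.arctanh(r_case) - np.arctanh(r_control)) / np.sqrt(2.0 / (n - 3))

print("mean shift per gene:", mean_shift)
print("correlation in controls:", r_control)
print("correlation in cases:", r_case)
print("Fisher z statistic:", z_stat)
```

In this toy setting a per-gene comparison of means has little to detect, while the correlation comparison clearly separates the two groups.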
Public health significance: The methods proposed in Chapters 1 and 2 can not only predict disease outcome or identify disease subtypes but also determine relevant biomarkers, which can potentially facilitate the design of a test assay to monitor disease progression, predict disease subtypes, and guide treatment decisions. The method developed in Chapter 3 provides a novel framework for identifying differentially co-expressed genes, helping us better understand how gene-gene interactions are altered in disease mechanisms and providing potential new molecular targets for drug development.