
Bayesian variable selection model and differential co-expression network analysis for multi-omics data integration

Zhu, Li (2019) Bayesian variable selection model and differential co-expression network analysis for multi-omics data integration. Doctoral Dissertation, University of Pittsburgh. (Unpublished)



With the large accumulation of omics data sets in public repositories, many studies have been designed to analyze omics data for various purposes. However, the analysis of a single data set may provide only limited information, or may suffer from small sample size and lack of reproducibility. Data integration is therefore gaining increasing attention. This dissertation focuses on developing methods for variable selection in regression (Chapter 1) and clustering (Chapter 2) for multi-omics data integration, and for identification of differential co-expression networks (Chapter 3) in the transcriptomics meta-analysis setting.

In Chapter 1, we propose a Bayesian indicator variable selection model that incorporates multi-layer overlapping group structure (MOG) in the regression setting. The model is motivated by a structure commonly encountered in multi-omics applications, in which a biological pathway contains tens to hundreds of genes and a gene can be mapped to multiple experimentally measured features (such as its mRNA expression, copy number variation, and methylation levels at possibly multiple sites). We evaluate the model in simulations and two breast cancer examples, and demonstrate that this approach not only enhances prediction accuracy but also improves variable selection and model interpretation, leading to deeper biological insight into the disease. In Chapter 2, we extend MOG to Gaussian mixture models for clustering, aiming to identify disease subtypes and detect subtype-predictive omics features.
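The two-layer grouping described above (pathway to genes, gene to measured features) can be pictured as a pair of nested mappings. The following is a minimal sketch only, not the dissertation's implementation: the pathway, gene, and feature names are hypothetical placeholders, and it assumes that "overlapping" means a gene may belong to more than one pathway.

```python
from collections import defaultdict

# Hypothetical layer 1: pathway -> genes. gene_2 appears in both
# pathways, so the pathway-level groups overlap (an assumption).
pathway_to_genes = {
    "pathway_A": ["gene_1", "gene_2"],
    "pathway_B": ["gene_2", "gene_3"],
}

# Hypothetical layer 2: gene -> experimentally measured features
# (mRNA expression, copy number variation, methylation sites).
gene_to_features = {
    "gene_1": ["gene_1_mRNA", "gene_1_CNV"],
    "gene_2": ["gene_2_mRNA", "gene_2_methyl_site1", "gene_2_methyl_site2"],
    "gene_3": ["gene_3_mRNA"],
}


def features_in_pathway(pathway):
    """Return every feature nested under a pathway, in gene order."""
    return [
        feature
        for gene in pathway_to_genes[pathway]
        for feature in gene_to_features[gene]
    ]


def pathways_of_features():
    """Invert the hierarchy: feature -> sorted list of pathways containing it."""
    membership = defaultdict(set)
    for pathway, genes in pathway_to_genes.items():
        for gene in genes:
            for feature in gene_to_features[gene]:
                membership[feature].add(pathway)
    return {feature: sorted(paths) for feature, paths in membership.items()}


feature_membership = pathways_of_features()
print(features_in_pathway("pathway_A"))
print(feature_membership["gene_2_mRNA"])
```

A feature that sits under a shared gene (here `gene_2_mRNA`) belongs to several pathway-level groups at once, which is the kind of overlap a group-structured selection prior has to account for.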

In Chapter 3, we present a meta-analytic framework for detecting differential co-expression networks (MetaDCN). Unlike conventional differential expression (DE) analysis, differential co-expression (DC) analysis detects alterations of gene-gene correlations in case/control comparisons, which are likely to be missed in DE analysis.
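To illustrate why a DC signal can be invisible to DE analysis, the sketch below simulates one gene pair whose mean expression is the same in cases and controls while the gene-gene correlation differs. It is not the MetaDCN procedure: the simulated data, the sample size, and the use of a Fisher z-transform test for comparing two Pearson correlations are all assumptions chosen for illustration.

```python
import numpy as np


def fisher_z_difference(x_case, y_case, x_ctrl, y_ctrl):
    """Compare Pearson correlations between two groups via Fisher's z-transform.

    Returns (r_case, r_ctrl, z), where z is approximately standard normal
    when the two population correlations are equal.
    """
    r_case = np.corrcoef(x_case, y_case)[0, 1]
    r_ctrl = np.corrcoef(x_ctrl, y_ctrl)[0, 1]
    se = np.sqrt(1.0 / (len(x_case) - 3) + 1.0 / (len(x_ctrl) - 3))
    z = (np.arctanh(r_case) - np.arctanh(r_ctrl)) / se
    return r_case, r_ctrl, z


rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible
n = 200                         # hypothetical samples per group


def simulate_pair(rho):
    """Draw n samples of two zero-mean, unit-variance genes with correlation rho."""
    cov = [[1.0, rho], [rho, 1.0]]
    return rng.multivariate_normal([0.0, 0.0], cov, size=n)


ctrl = simulate_pair(0.8)  # genes co-expressed in controls
case = simulate_pair(0.0)  # co-expression lost in cases

# DE-style view: per-gene difference in mean expression between groups.
mean_diff = case.mean(axis=0) - ctrl.mean(axis=0)

# DC-style view: difference in the gene-gene correlation between groups.
r_case, r_ctrl, z = fisher_z_difference(case[:, 0], case[:, 1],
                                        ctrl[:, 0], ctrl[:, 1])

print("mean differences (DE view):", mean_diff)
print("correlations (case, control):", r_case, r_ctrl)
print("Fisher z statistic (DC view):", z)
```

In this toy setting the per-gene mean differences stay near zero, so a DE test would find little, while the correlation difference is large.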

Public health significance: The methods proposed in Chapters 1 and 2 can not only predict disease outcome or identify disease subtypes, but also identify relevant biomarkers, which can potentially facilitate the design of a test assay to monitor disease progression, predict disease subtypes, and guide treatment decisions. The method developed in Chapter 3 provides a novel framework for identifying differentially co-expressed genes, helping us better understand how gene-gene interactions are altered in disease and providing potential new molecular targets for drug development.




Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators:
Zhu, Li (Email: liz86@pitt.edu; Pitt Username: liz86)
ETD Committee:
Committee Chair: Tseng,
Committee Member: Krafty,
Committee Member: Tang,
Committee Member: Chen,
Committee Member: Weeks, Daniel (Email: weeks@pitt.edu; ORCID: 0000-0001-9410-7228)
Date: 28 June 2019
Date Type: Publication
Defense Date: 9 April 2019
Approval Date: 28 June 2019
Submission Date: 4 April 2019
Access Restriction: 3 year -- Restrict access to University of Pittsburgh for a period of 3 years.
Number of Pages: 120
Institution: University of Pittsburgh
Schools and Programs: School of Public Health > Biostatistics
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: Bayesian variable selection, multi-omics integration
Date Deposited: 28 Jun 2019 13:35
Last Modified: 30 Jun 2022 15:20

