
Biological Network Guided Variable Selection and Outcome Prediction for High-Dimensional Multi-Omics Data

Zhou, Xueping (2023) Biological Network Guided Variable Selection and Outcome Prediction for High-Dimensional Multi-Omics Data. Doctoral Dissertation, University of Pittsburgh. (Unpublished)

Restricted to University of Pittsburgh users only until 24 August 2024.



Developing efficient feature selection and accurate outcome prediction algorithms is a major and often difficult task in analyzing high-dimensional data. This dissertation focuses on feature selection and outcome prediction guided by grouping and correlation structures, for data with potentially high-dimensional predictors, with applications to multi-omics data.
In Chapter 2, we propose a novel feature selection and prediction algorithm for binary outcomes, called local-network guided logistic regression. The method adopts a multivariate screening procedure that integrates inter-feature correlations into the feature selection process. In Chapter 3, we propose a novel supervised learning algorithm that performs feature selection and multivariate outcome prediction for data with potentially high-dimensional predictors and responses. The method incorporates known genome hierarchy grouping and correlation structures into feature selection, regression coefficient estimation, and outcome prediction under a penalized multivariate multiple linear regression model. In Chapter 4, within the framework of Fisher's discriminant analysis, we propose a binary classification method that incorporates variable selection for high-dimensional predictors. The proposed method can select both the differentially expressed genes and the differentially connected genes between biological states. In Chapter 5, we propose a novel feature selection and prediction pipeline for binary outcomes, with application to age-related macular degeneration, a progressive neurodegenerative disease and the leading cause of blindness in developed countries.
Contribution to public health: The dissertation proposes several methods for feature selection and outcome prediction using high-dimensional omics data: 1) weak signal detection for outcome classification; 2) feature selection guided by genome grouping structure and correlation for multivariate outcome prediction; and 3) differentially connected feature detection for outcome classification. All of the proposed methods are valuable tools for selecting outcome-related features, which helps to uncover health-related mechanisms and to predict disease status or other health-related outcomes of interest.




Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators: Zhou, Xueping (Email: xuz37@pitt.edu; Pitt Username: xuz37)
ETD Committee:
Committee Chair: Wei, Chen (Email Address: wei.chen@pitt.edu; Pitt Username: wei.chen)
Committee Member: Li,
Committee Member: Ding, Ying (Email Address: YINGDING@pitt.edu; Pitt Username: YINGDING)
Committee Member: Tang, Lu (Email Address: LUTANG@pitt.edu; Pitt Username: LUTANG)
Committee Member: Forno,
Date: 24 August 2023
Date Type: Publication
Defense Date: 27 July 2023
Approval Date: 24 August 2023
Submission Date: 14 June 2023
Access Restriction: 1 year -- Restrict access to University of Pittsburgh for a period of 1 year.
Number of Pages: 133
Institution: University of Pittsburgh
Schools and Programs: School of Public Health > Biostatistics
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: Association study; Biomarker discovery; Cell-type deconvolution; DNA methylation; Feature selection; Grouping structure; High-dimensional data; Multivariate regression; Prediction; Weak signal detection
Date Deposited: 24 Aug 2023 13:22
Last Modified: 24 Aug 2023 13:22

