An Integrated, Module-based Biomarker Discovery Framework

Huang, Grace T. (2014) An Integrated, Module-based Biomarker Discovery Framework. Doctoral Dissertation, University of Pittsburgh. (Unpublished)


Identification of biomarkers that contribute to complex human disorders is a principal and challenging task in computational biology. Prognostic biomarkers are useful for risk assessment of disease progression and patient stratification. Since treatment plans often hinge on patient stratification, better disease subtyping has the potential to significantly improve survival for patients. Additionally, a thorough understanding of the roles of biomarkers in cancer pathways facilitates insights into complex disease formation, and provides potential druggable targets in the pathways.
Many statistical methods have been applied toward biomarker discovery, often combining feature selection with classification methods. Traditional approaches are mainly concerned with statistical significance and fail to consider the clinical relevance of the selected biomarkers. Two additional problems impede meaningful biomarker discovery: gene multiplicity (several maximally predictive solutions exist) and instability (inconsistent gene sets from different experiments or cross-validation runs).
Motivated by a need for a more biologically informed, stable biomarker discovery method, I introduce an integrated module-based biomarker discovery framework for analyzing high-throughput genomic disease data. The proposed framework addresses the aforementioned challenges in three components. First, a recursive spectral clustering algorithm specifically tailored toward high-dimensional, heterogeneous data (ReKS) is developed to partition genes into clusters that are treated as single entities for subsequent analysis. Next, the problems of gene multiplicity and instability are addressed through a group variable selection algorithm (T-ReCS) based on local causal discovery methods. Guided by the tree-like partition created from the clustering algorithm, this algorithm selects gene clusters that are predictive of a clinical outcome. We demonstrate that the group feature selection method facilitates the discovery of biologically relevant genes through their association with a statistically predictive driver. Finally, we elucidate the biological relevance of the biomarkers by leveraging available prior information to identify regulatory relationships between genes and between clusters, and deliver the information in the form of a user-friendly web server, mirConnX.



Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators: Huang, Grace
ETD Committee:
Thesis Advisor: Benos, Panayiotis V. (Email: benos@pitt.edu; Pitt Username: BENOS)
Committee Member: Chennubhotla, Chakra S (Email: chakracs@pitt.edu; Pitt Username: CHAKRACS)
Committee Member: Bar-Joseph,
Committee Member: Kaminski, Naftali (Email: nak38@pitt.edu; Pitt Username: NAK38)
Committee Member: Tsamardinos,
Committee Chair: Cooper, Gregory (Email: gfc@pitt.edu; Pitt Username: GFC)
Date: 9 January 2014
Date Type: Publication
Defense Date: 30 September 2013
Approval Date: 9 January 2014
Submission Date: 16 December 2013
Access Restriction: 1 year -- Restrict access to University of Pittsburgh for a period of 1 year.
Number of Pages: 154
Institution: University of Pittsburgh
Schools and Programs: School of Medicine > Computational Biology
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: Biomarker Discovery, Clustering, Spectral Clustering, Variable Selection, Markov Blanket, Group Variable Selection, Regulatory Networks, miRNA
Date Deposited: 09 Jan 2014 15:59
Last Modified: 15 Nov 2016 14:16

