
Knowledge discovery with Bayesian Rule Learning methods for actionable biomedicine

Balasubramanian, Jeya Balaji (2020) Knowledge discovery with Bayesian Rule Learning methods for actionable biomedicine. Doctoral Dissertation, University of Pittsburgh. (Unpublished)



Discovery of precise biomarkers is crucial for improved clinical diagnostic, prognostic, and therapeutic decision-making. Such biomarkers help improve our understanding of the underlying physiological (and pathophysiological) processes within an individual. To discover precise biomarkers, we must take a personalized medical approach that accounts for an individual's unique clinical, genetic, omic, and environmental information. The molecular-level omic information provides an opportunity to understand complex physiological processes at an unprecedented resolution. Falling costs and improvements in high-throughput technologies, which collect omic data from an individual, have now made it feasible to include a person's omic information as a standard component of their medical record. This information can only be clinically actionable if it is understandable to a clinician and applicable in the correct medical context. Biomarker discovery from omic data is challenging for three reasons: 1) the data are high-dimensional, which increases the chance of false positive discoveries from traditional data mining methods; 2) most diseases are multifactorial, with many factors influencing the disease outcome, which makes them difficult for most data mining algorithms to model while keeping the model understandable to a clinician; and 3) traditional data mining methods discover only statistically significant biomarkers but do not account for clinical relevance, so their findings do not translate well into clinical practice.

In this dissertation, I formulate the problem of learning both statistically significant and clinically relevant biomarkers as a knowledge discovery problem. In computer science, knowledge discovery in databases is "a non-trivial process of the extraction of valid, novel, potentially useful, and ultimately understandable patterns in data". Clinical practice guidelines in decision support systems are often presented as explicit propositional logic rules because they are easy for a clinician to understand and are often actionable instructions themselves. Bayesian rule learning (BRL) is a rule-learning classifier that learns patterns as a set of probabilistic classification rules. I develop BRL search to efficiently learn from high-dimensional data. I study different BRL model representations to help obtain a robust set of rules that can encode context-specific independencies found in the data. To help efficiently model multifactorial diseases, I study various ensemble methods with BRL, collectively called Ensemble Bayesian Rule Learning (EBRL). I also develop a novel ensemble model visualization method called Bayesian Rule Ensemble Visualization tool (BREVity) to make EBRL more human-readable for a researcher or a clinician. I develop BRL with informative priors (BRLp) to enable BRL to incorporate prior domain knowledge into the model learning process, thereby further reducing the chance of discovering false positives. Finally, I develop BRL for knowledge discovery (BRL-KD) that can incorporate a clinical utility function to learn models that are clinically more relevant. Collectively, I use these BRL methods, developed for the task of biomarker discovery, as the knowledge engine of an intelligent clinical decision support system called Bayesian Rules for Actionable Informed Decisions or BRAID, a concept framework that can be deployed in clinical practice.




Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators (Name; Email; Pitt Username; ORCID):
Balasubramanian, Jeya Balaji; jeya@pitt.edu; jeya; 0000-0002-0025-8410
ETD Committee (Title; Member; Email Address; Pitt Username; ORCID):
Committee Chair; Gopalakrishnan, Vanathi; vanathi@pitt.edu; vanathi; 0000-0002-7813-4055
Committee Member; Cooper, Gregory; gfc@pitt.edu; gfc; 0000-0003-1687-6202
Committee Member; Visweswaran, Shyam; shv3@pitt.edu; shv3; 0000-0002-2079-8684
Committee Member; Reis, Steven; sreis@pitt.edu; sreis; 0000-0001-8023-0102
Date: 23 January 2020
Date Type: Publication
Defense Date: 27 August 2019
Approval Date: 23 January 2020
Submission Date: 28 October 2019
Access Restriction: 2 year -- Restrict access to University of Pittsburgh for a period of 2 years.
Number of Pages: 275
Institution: University of Pittsburgh
Schools and Programs: School of Computing and Information > Intelligent Systems Program
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: rule learning, Bayesian methods, knowledge discovery, data mining
Date Deposited: 23 Jan 2020 20:12
Last Modified: 23 Jan 2022 06:15

