Balasubramanian, Jeya Balaji
(2020)
Knowledge discovery with Bayesian Rule Learning methods for actionable biomedicine.
Doctoral Dissertation, University of Pittsburgh.
(Unpublished)
Abstract
Discovery of precise biomarkers is crucial for improved clinical diagnostic, prognostic, and therapeutic decision-making. Such biomarkers help improve our understanding of the underlying physiological (and pathophysiological) processes within an individual. To discover precise biomarkers, we must take a personalized medical approach that accounts for an individual's unique clinical, genetic, omic, and environmental information. The molecular-level omic information provides an opportunity to understand complex physiological processes at an unprecedented resolution. Falling costs and improvements in high-throughput technologies, which collect omic data from an individual, have now made it feasible to include a person's omic information as a standard component of their medical record. This information can only be clinically actionable if it is understandable to a clinician and applicable in the correct medical context. Biomarker discovery from omic data is challenging because: 1) the data are high-dimensional, which increases the chance of false positive discoveries from traditional data mining methods; 2) most diseases are multifactorial, with many factors influencing the disease outcome, which makes them difficult for most data mining algorithms to model while keeping the model understandable to a clinician; and 3) traditional data mining methods discover only statistically significant biomarkers without accounting for clinical relevance, so their findings do not translate well into clinical practice.
In this dissertation, I formulate the problem of learning both statistically significant and clinically relevant biomarkers as a knowledge discovery problem. In computer science, knowledge discovery in databases is "a non-trivial process of the extraction of valid, novel, potentially useful, and ultimately understandable patterns in data". Clinical practice guidelines in decision support systems are often presented as explicit propositional logic rules because they are easy for a clinician to understand and are often actionable instructions themselves. Bayesian rule learning (BRL) is a rule-learning classifier that learns patterns as a set of probabilistic classification rules. I develop BRL search to efficiently learn from high-dimensional data. I study different BRL model representations to help obtain a robust set of rules that can encode context-specific independencies found in the data. To help efficiently model multifactorial diseases, I study various ensemble methods with BRL, collectively called Ensemble Bayesian Rule Learning (EBRL). I also develop a novel ensemble model visualization method called Bayesian Rule Ensemble Visualization tool (BREVity) to make EBRL more human-readable for a researcher or a clinician. I develop BRL with informative priors (BRLp) to enable BRL to incorporate prior domain knowledge into the model learning process, thereby further reducing the chance of discovering false positives. Finally, I develop BRL for knowledge discovery (BRL-KD) that can incorporate a clinical utility function to learn models that are clinically more relevant. Collectively, I use these BRL methods, developed for the task of biomarker discovery, as the knowledge engine of an intelligent clinical decision support system called Bayesian Rules for Actionable Informed Decisions or BRAID, a concept framework that can be deployed in clinical practice.
Details
Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators/Authors: Balasubramanian, Jeya Balaji
Date: 23 January 2020
Date Type: Publication
Defense Date: 27 August 2019
Approval Date: 23 January 2020
Submission Date: 28 October 2019
Access Restriction: 2 year -- Restrict access to University of Pittsburgh for a period of 2 years.
Number of Pages: 275
Institution: University of Pittsburgh
Schools and Programs: School of Computing and Information > Intelligent Systems Program
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: rule learning, Bayesian methods, knowledge discovery, data mining
Date Deposited: 23 Jan 2020 20:12
Last Modified: 23 Jan 2022 06:15
URI: http://d-scholarship.pitt.edu/id/eprint/37733