A BAYESIAN APPROACH TO LEARNING DECISION TREES FOR PATIENT-SPECIFIC MODELS

Dutta-Moscato, Joyeeta (2018) A BAYESIAN APPROACH TO LEARNING DECISION TREES FOR PATIENT-SPECIFIC MODELS. Doctoral Dissertation, University of Pittsburgh. (Unpublished)


Abstract

A principal goal of precision medicine is to identify genomic factors that are predictive of outcomes in complex diseases, to provide better insight into their molecular mechanisms. Based on our current understanding, there are many genomic factors that are likely to be pathogenic in small subpopulations while being rare in the population as a whole. This research introduces a new machine learning method for discovering single nucleotide variants (SNVs), both common and rare, that in a given person are predictive of that person developing a disease or disease outcome.
The new method described in this research constructs decision tree models, uses a Bayesian score to evaluate the models, and employs a person-specific search strategy to identify SNVs that are predictive in a subpopulation whose members are similar to the person of interest. This method, called the Personalized Decision Tree Algorithm (PDTA), works by constructing a decision tree model from the data and then identifying a path in the tree that has excellent prediction for the person of interest, or constructing a new path if none of the paths in the tree have excellent prediction.
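To make the idea of scoring a tree with a Bayesian score concrete, the following is a minimal sketch of a BDeu-style score for a single candidate split on a binary outcome. It is an illustration under stated assumptions, not the PDTA implementation: the abstract does not give the exact scoring formula, and the function names (`log_marginal_likelihood`, `split_gain`), the equivalent sample size `ess`, and the example counts are all hypothetical.

```python
from math import lgamma


def log_marginal_likelihood(counts, alpha):
    """Log marginal likelihood of the outcome counts at one leaf,
    under a symmetric Dirichlet prior whose pseudo-counts sum to `alpha`.
    (Hypothetical helper for illustration.)"""
    k = len(counts)          # number of outcome states
    n = sum(counts)          # number of cases reaching this leaf
    a = alpha / k            # pseudo-count per outcome state
    score = lgamma(alpha) - lgamma(alpha + n)
    for c in counts:
        score += lgamma(a + c) - lgamma(a)
    return score


def split_gain(parent_counts, child_counts_list, ess=1.0):
    """Change in log score when one leaf is replaced by several child leaves.
    BDeu-style assumption: the equivalent sample size `ess` is divided
    equally among the leaves that replace the parent."""
    parent = log_marginal_likelihood(parent_counts, ess)
    q = len(child_counts_list)
    children = sum(
        log_marginal_likelihood(counts, ess / q) for counts in child_counts_list
    )
    return children - parent


# Made-up counts of [cases, controls] for a hypothetical SNV split.
informative = split_gain([10, 10], [[10, 0], [0, 10]])
uninformative = split_gain([10, 10], [[5, 5], [5, 5]])
print(informative > 0, uninformative < 0)
```

In this sketch a split that separates cases from controls raises the score, while a split that leaves the outcome proportions unchanged lowers it, which is the behavior a search over candidate SNVs would rely on.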
The PDTA was refined iteratively on synthetic data and was experimentally evaluated on five datasets. One of the datasets was synthetic, one was semi-synthetic, and three were biological datasets collected from patients with chronic pancreatitis that included one small genomic dataset, a whole exome dataset, and a whole exome dataset focused on patients with diabetes in chronic pancreatitis. The performance of the method was evaluated using area under the Receiver Operating Characteristic curve and F1 score, as well as the ability to retrieve known and unknown rare SNVs. The PDTA was found to be effective to varying degrees in the datasets that were evaluated, creating parsimonious genetic representations for patient-specific groups, with the potential to discover novel variants.
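The two evaluation metrics named above can be computed as in the following minimal sketch. It assumes binary labels (1 for the outcome, 0 otherwise); the function names, the 0.5 decision threshold, and the toy labels and scores are hypothetical and are not taken from the dissertation.

```python
def auroc(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive case gets the
    higher score; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def f1(labels, predictions):
    """F1 score: harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data: predicted probabilities for four hypothetical patients.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
predictions = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold
print(auroc(labels, scores))    # 0.75
print(f1(labels, predictions))  # 0.5
```

AUROC is threshold-free and summarizes ranking quality, whereas F1 depends on the chosen threshold, so reporting both gives complementary views of a model's predictions.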


Details

Item Type: University of Pittsburgh ETD
Status: Unpublished

Creators/Authors:
Creators	Email	Pitt Username	ORCID
Dutta-Moscato, Joyeeta	jod30@pitt.edu	jod30

ETD Committee:
Title	Member	Email Address	Pitt Username
Thesis Advisor	Visweswaran, Shyam	shv3@pitt.edu	shv3
Committee Member	Becich, Michael	becich@pitt.edu	becich
Committee Member	Lu, Xinghua	xinghua@pitt.edu	xinghua
Committee Member	Whitcomb, David	whitcomb@pitt.edu	whitcomb

Date: 28 August 2018
Date Type: Publication
Defense Date: 2 August 2018
Approval Date: 28 August 2018
Submission Date: 28 August 2018
Access Restriction: 1 year -- Restrict access to University of Pittsburgh for a period of 1 year.
Number of Pages: 134
Institution: University of Pittsburgh
Schools and Programs: School of Medicine > Biomedical Informatics
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: rare variant, SNP discovery, BDeu, personalized medicine, precision medicine
Date Deposited: 28 Aug 2018 19:37
Last Modified: 28 Aug 2019 05:15
URI: http://d-scholarship.pitt.edu/id/eprint/35269
