LITERATURE MINING SUSTAINS AND ENHANCES KNOWLEDGE DISCOVERY FROM OMIC STUDIES

Jordan, Rick (2016) LITERATURE MINING SUSTAINS AND ENHANCES KNOWLEDGE DISCOVERY FROM OMIC STUDIES. Doctoral Dissertation, University of Pittsburgh. (Unpublished)


Abstract

Genomic, proteomic, and other experimentally generated data from studies of biological systems that aim to discover disease biomarkers are currently analyzed without sufficient supporting evidence from the literature, owing to the complexities of automated processing. Extracting prior knowledge about markers associated with biological sample types and disease states from the literature is tedious, and little research has examined how to use this knowledge to inform the generation of classification models from ‘omic’ data. Pathway analysis is the state-of-the-art approach to better understanding the underlying biology of complex diseases such as breast and lung cancers. However, how to combine literature-mining evidence with pathway analysis evidence remains an open problem in biomedical informatics research.
This dissertation presents a novel semi-automated framework, named Knowledge Enhanced Data Analysis (KEDA), which incorporates the following components: 1) literature mining of text; 2) classification modeling; and 3) pathway analysis. This framework aids researchers in assigning literature-mining-based prior knowledge values to genes and proteins associated with disease biology. It incorporates prior knowledge into the modeling of experimental datasets, enriching the development process with current findings from the scientific community.
New knowledge is presented in the form of lists of known disease-specific biomarkers and their accompanying scores, obtained through literature mining of millions of lung and breast cancer abstracts. These scores can subsequently be used as prior knowledge values in Bayesian modeling and pathway analysis. Ranked, newly discovered biomarker-disease-biofluid relationships that identify biomarker specificity across biofluids are presented. A novel method of identifying biomarker relationships, which examines the attributes of the best-performing models, is discussed. Adding prior information to pathway analysis ultimately yields more robust evidence for pathway involvement in diseases of interest, based on statistically significant standard measures of impact factor and p-values.
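As a rough illustration of how literature-derived scores might serve as prior knowledge values, the sketch below turns per-marker abstract mention counts into a smoothed prior and combines it with a data-derived likelihood using Bayes' rule. It is not the KEDA scoring method, which this abstract does not specify. The function names, the add-one smoothing, the marker names, and all numbers are hypothetical placeholders chosen for illustration only.

```python
def literature_prior(mention_counts, pseudocount=1.0):
    """Convert per-marker mention counts into a smoothed prior distribution.

    Assumption: more mentions means higher prior weight; the pseudocount
    keeps markers with zero mentions from being ruled out entirely.
    """
    total = sum(mention_counts.values()) + pseudocount * len(mention_counts)
    return {m: (c + pseudocount) / total for m, c in mention_counts.items()}


def posterior(prior, likelihood):
    """Combine prior and likelihood per marker, then normalize (Bayes' rule)."""
    unnormalized = {m: prior[m] * likelihood[m] for m in prior}
    z = sum(unnormalized.values())
    return {m: v / z for m, v in unnormalized.items()}


# Made-up placeholder inputs, not results from the dissertation.
counts = {"MARKER_A": 9, "MARKER_B": 4, "MARKER_C": 0}
data_likelihood = {"MARKER_A": 0.2, "MARKER_B": 0.6, "MARKER_C": 0.2}

prior = literature_prior(counts)
post = posterior(prior, data_likelihood)
ranked = sorted(post, key=post.get, reverse=True)
```

In this toy setting the literature prior favors MARKER_A, while the experimental evidence favors MARKER_B; the posterior ranking reflects both sources, which is the general idea behind informing a model with prior knowledge.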
The outcome of implementing the KEDA framework is enhanced modeling and pathway analysis findings. Enhanced knowledge discovery analysis leads to new disease-specific entities and relationships that otherwise would not have been identified. Increased disease understanding, as well as identification of biomarkers for disease diagnosis, treatment, or therapy targets, should ultimately lead to validation and clinical implementation.


Details

Item Type: University of Pittsburgh ETD

Status: Unpublished

Creators/Authors:

Creators	Email	Pitt Username	ORCID
Jordan, Rick	rmj12@pitt.edu

ETD Committee:

Title	Member	Email Address	Pitt Username
Committee Member	Visweswaran, Shyam	shv3@pitt.edu	SHV3
Committee Member	Jacobson, Rebecca	rebeccaj@pitt.edu
Committee Member	Lu, Songjian	songjian@pitt.edu
Thesis Advisor	Gopalakrishnan, Vanathi	vanathi@pitt.edu

Date: 23 May 2016

Date Type: Publication

Defense Date: 2 December 2015

Approval Date: 23 May 2016

Submission Date: 20 May 2016

Access Restriction: 5 year -- Restrict access to University of Pittsburgh for a period of 5 years.

Number of Pages: 227

Institution: University of Pittsburgh

Schools and Programs: School of Medicine > Biomedical Informatics

Degree: PhD - Doctor of Philosophy

Thesis Type: Doctoral Dissertation

Refereed: Yes

Uncontrolled Keywords: literature mining, text-mining, pathway analysis, Bayesian modeling, lung cancer, breast cancer

Date Deposited: 23 May 2016 14:15

Last Modified: 23 May 2021 05:15

URI: http://d-scholarship.pitt.edu/id/eprint/28062
