Hansen, Casey E
(2022)
Classification and Representation of Biological Interactions in the Context of a Baseline Model.
Doctoral Dissertation, University of Pittsburgh.
(Unpublished)
Abstract
Machine reading tools can quickly and automatically process vast amounts of published literature, identifying and extracting the relevant information from a given paper or papers. This information can be used to build biological computational models or expand upon existing models. However, the information gleaned by machine readers is both vast and varied in quality. Machine readers must extract standardized biological interactions from inconsistent terminology and complex sentence structures, which sometimes leads to extraction errors. Here we present VIOLIN (Verifying Interactions of Likely Importance to the Network), a tool to automatically classify and judge biological interactions extracted from relevant literature. With VIOLIN, we are able to take these literature-extracted events (LEEs) and compare them to an existing biological model, determining whether a given LEE agrees with the model (corroborates), introduces new information to the model (extends), disputes the model (contradicts), or requires manual review (flagged). Each LEE is assigned four numerical values to represent its relationship to the model system (Match Score), its classification category (Kind Score), its frequency (Evidence Score), and its extraction confidence (Epistemic Value). These values are combined into a Total Score to allow for automatic filtering and classification of large sets of LEEs curated from multiple sources. We present VIOLIN in the context of five different models: melanoma, T-cell differentiation, the BDNF pathway as it relates to major depressive disorder, pancreatic cancer, and glioblastoma multiforme. These varied inputs show that VIOLIN has great utility across many biological systems, making it a powerful computational tool. We also show how VIOLIN integrates with other modeling tools as part of a larger model extensions framework.
The goal of this work is to be able to automatically extend existing biological models using the vast amounts of relevant information already available.
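The four classification categories named in the abstract can be illustrated with a minimal sketch. This is not VIOLIN's implementation: the `LEE` tuple, the `MODEL` dictionary, the element names, and the decision rule in `classify` are all hypothetical, chosen only to mirror the definitions given above (agrees, adds new information, disputes, needs manual review). The abstract does not state how the Match Score, Kind Score, or Epistemic Value are computed or combined into the Total Score, so the sketch leaves them out and counts only how often each LEE occurs, as a stand-in for the frequency behind the Evidence Score.

```python
from collections import Counter
from typing import NamedTuple, Optional


class LEE(NamedTuple):
    """Hypothetical literature-extracted event: a signed, directed interaction."""
    regulator: str
    regulated: str
    sign: Optional[str]  # "+" (activation), "-" (inhibition), or None if not extracted


# Hypothetical baseline model: (regulator, regulated) -> sign. Names are placeholders.
MODEL = {("geneA", "geneB"): "+", ("geneB", "geneC"): "-"}


def classify(lee, model):
    """Assumed decision rule mirroring the four categories in the abstract."""
    if lee.sign not in ("+", "-"):
        return "flagged"  # incomplete extraction: needs manual review
    known_sign = model.get((lee.regulator, lee.regulated))
    if known_sign is None:
        return "extends"  # interaction is not in the model
    if known_sign == lee.sign:
        return "corroborates"  # same interaction, same sign
    return "contradicts"  # same interaction, opposite sign


def summarize(lees, model):
    """Map each distinct LEE to (category, number of times it was extracted)."""
    counts = Counter(lees)
    return {lee: (classify(lee, model), n) for lee, n in counts.items()}


# Placeholder reading output, including one duplicate extraction.
lees = [
    LEE("geneA", "geneB", "+"),
    LEE("geneA", "geneB", "+"),
    LEE("geneB", "geneC", "+"),
    LEE("geneC", "geneD", "-"),
    LEE("geneD", "geneA", None),
]

result = summarize(lees, MODEL)
for lee, (category, evidence) in result.items():
    print(lee.regulator, lee.sign, lee.regulated, category, evidence)
```

In this sketch, filtering a large set of LEEs amounts to selecting entries of `result` by category or by evidence count; the dissertation's Total Score serves that purpose with its own scoring scheme.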
Details
Item Type: University of Pittsburgh ETD
Status: Unpublished
Date: 10 June 2022
Date Type: Publication
Defense Date: 10 December 2021
Approval Date: 10 June 2022
Submission Date: 4 February 2022
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Number of Pages: 149
Institution: University of Pittsburgh
Schools and Programs: Swanson School of Engineering > Bioengineering
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: Machine Learning, Computational Biology, Information Classification, Bioinformatics
Date Deposited: 10 Jun 2022 19:29
Last Modified: 19 May 2023 13:51
URI: http://d-scholarship.pitt.edu/id/eprint/42230