Clinical Text Analysis using Visual Analytics for Cancer Patient Cohort Identification

Saja, Al-alawneh (2021) Clinical Text Analysis using Visual Analytics for Cancer Patient Cohort Identification. Doctoral Dissertation, University of Pittsburgh. (Unpublished)

Due to the complex nature of cancer patients’ records and clinical notes, extracting and summarizing the data required to identify a cohort of interest is a challenge for cancer researchers. The DeepPhe pipeline was developed to support cancer cohort identification and hypothesis generation by extracting deep phenotypes from cancer patients’ electronic health records using natural language processing, rule-based, and machine learning techniques. The pipeline generated high-level summaries from individual mentions as phenotypes and visualized them using a web-based visual analytics interface, DeepPhe-Viz, to create a longitudinal representation of the cancer process.
In this study, we extended the functionality of the DeepPhe-Viz interface by combining the data extracted by the NLP pipeline with visual analytics, strengthening researchers’ ability to mine the textual documents of EMR data and uncover new and challenging insights about cancer populations.
To advance the capabilities of DeepPhe-Viz, we first implemented an interactive heatmap visualization that provided a high-level view of all the terms extracted from clinical documents. This feature enabled cancer researchers to investigate the rich contents of the full clinical document text and identify additional key variables, such as treatment, that are extracted using the DeepPhe pipeline. Second, we implemented an interactive Sankey diagram visualization that aggregated all the transitions between predefined episodes of care for a cohort of cancer patients, which represent temporal events in the clinical documents. This feature is essential for gaining a deeper understanding of the patterns and trends of cancer treatment mentioned in the EMR.
Finally, we evaluated the usability of the DeepPhe-Viz interface for identifying a cohort of patients and drilling down into patient-level details. User study results, including qualitative and quantitative feedback, indicated the usefulness and feasibility of the DeepPhe-Viz interface for cancer investigators conducting retrospective cancer studies using EMR data.



Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators:
Saja, Al-alawneh; Email: saa147@pitt.edu; Pitt Username: saa147
ETD Committee:
Committee Chair: Douglas, Gerald; Email Address: gdouglas@pitt.edu; Pitt Username: gdouglas
Thesis Advisor: Hochheiser, Harry; Email Address: harryh@pitt.edu; Pitt Username: harryh; ORCID: 0000-0001-8793-9982
Committee Member: Lee,
Committee Member: Sliverstein, Jonathan; Email Address: j.c.s@pitt.edu; Pitt Username: j.c.s
Date: 29 October 2021
Date Type: Publication
Defense Date: 22 July 2021
Approval Date: 29 October 2021
Submission Date: 17 September 2021
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Number of Pages: 147
Institution: University of Pittsburgh
Schools and Programs: School of Medicine > Biomedical Informatics
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: Natural language processing, cohort identification, visual analytics
Date Deposited: 29 Oct 2021 13:19
Last Modified: 16 Nov 2021 15:12

