Yu, Ke (2024) TOWARDS DATA-EFFICIENT LEARNING FOR MEDICINE. Doctoral Dissertation, University of Pittsburgh. (Unpublished)
This is the latest version of this item.
Abstract
Deep learning has catalyzed significant advancements in the applications of artificial intelligence (AI) in medicine, extending across various modalities and data types, including small molecules, medical images, and electronic health records (EHRs). At the core of deep learning is representation learning, an automated process that uncovers patterns in data by mapping inputs to corresponding labels. However, the considerable cost associated with labeling medical data remains a major obstacle to the further development and implementation of deep learning algorithms for healthcare tasks.
In this thesis, we introduce three data-efficient learning algorithms, designed to capitalize on the abundance of existing medical data, which are predominantly unlabeled, semi-labeled or of multi-modal formats. Our first algorithm focuses on semi-supervised drug embedding and utilizes a medical knowledge base, specifically a drug taxonomy, as supervision to regularize the embedding space. This approach enables the localization of novel molecules within the context of drugs in the taxonomy, thereby facilitating inference of their pharmacological properties through retrieval of similar drugs from the embedding space. The second algorithm addresses self-supervised representation learning for three-dimensional (3D) medical images. By exploiting the recurrent and consistent anatomical structures found across different patient images, this method promotes the learning of anatomy-specific and disease-related features within the lung. Lastly, our third algorithm develops a weakly-supervised multi-modal representation learning framework for chest X-rays (CXR) and their corresponding radiology reports. By utilizing the rich contextual details embedded in reports, including the spatial and temporal relations between diseases and anatomical structures, the algorithm learns CXR representations that demonstrate effectiveness in disease detection, localization, and interval change classification. These proposed algorithms highlight the possibilities of effectively leveraging vast amounts of existing medical data, reducing the need for labor-intensive labeling and paving the way for scalable AI applications in healthcare.
Details
Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators/Authors: Yu, Ke
Date: 22 January 2024
Date Type: Publication
Defense Date: 7 August 2023
Approval Date: 22 January 2024
Submission Date: 28 November 2023
Access Restriction: 1 year -- Restrict access to University of Pittsburgh for a period of 1 year.
Number of Pages: 169
Institution: University of Pittsburgh
Schools and Programs: School of Computing and Information > Intelligent Systems Program
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: Semi-supervised Learning, Self-supervised Learning, Weakly-supervised Learning, Positive Unlabeled Learning, Drug Embedding, Medical Imaging Analysis
Date Deposited: 22 Jan 2024 16:51
Last Modified: 22 Jan 2024 16:51
URI: http://d-scholarship.pitt.edu/id/eprint/45735