Hsu, Ying-Feng
(2016)
Efficient Process Data Warehousing.
Doctoral Dissertation, University of Pittsburgh.
(Unpublished)
Abstract
This dissertation presents a data processing architecture for efficient data warehousing from historical data sources. The work makes three primary contributions. The first is a generalized process data warehousing (PDW) architecture whose multilayer data processing steps transform raw data streams into useful information that facilitates data-driven decision making. The second is an exploration of the applicability of the proposed architecture to sparse process data. We tested the approach in a medical monitoring system, which takes physiological data and predicts the clinical setting in which the data is most likely to be seen. A set of experiments with real clinical data (from Children’s Hospital of Pittsburgh) demonstrates the high utility of the approach. The third is an exploration of the applicability of the proposed PDW architecture to redundant process data. We designed and developed a conflict-aware data fusion strategy for the efficient aggregation of historical data. We conducted a simulation-based study of the tradeoffs between the data fusion solutions and data accuracy, and also evaluated the solutions on a large-scale integrated framework (Tycho data) that includes historical data from heterogeneous sources in different subject areas. Finally, we propose and evaluate a state sequence recovery (SSR) framework that integrates the two preceding studies, on sparse and on redundant process data. The experimental results are based on several algorithms developed and tested in different simulation scenarios under both normal and exponential data distributions.
Details
Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators/Authors: Hsu, Ying-Feng
Date: 13 January 2016
Date Type: Publication
Defense Date: 13 May 2015
Approval Date: 13 January 2016
Submission Date: 4 December 2015
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Number of Pages: 220
Institution: University of Pittsburgh
Schools and Programs: School of Information Sciences > Information Science
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: Data Warehouse, Data Fusion, Time-series Data Analyzing, Pattern Recognition, Machine Learning
Date Deposited: 13 Jan 2016 16:18
Last Modified: 15 Nov 2016 14:31
URI: http://d-scholarship.pitt.edu/id/eprint/26585