Gao, Xing
(2020)
Covariate-driven factorization by thresholding for multi-block data.
Doctoral Dissertation, University of Pittsburgh.
(Unpublished)
Abstract
Multi-block data, in which multiple groups of variables from different sources are observed for a common set of subjects, are routinely collected in many areas of science. Methods for joint factorization of such multi-block data are being developed to explore the potentially joint variation structure of the data. Most existing work focuses on separating joint components, which are shared across all data blocks, from individual components, each of which is relevant to only a single data block. We instead propose to model and estimate partially-joint components, which are shared across some, but not all, data blocks. If covariates, with potential multi-block structures, are available, then the components are further modeled to be driven by the covariate information. To estimate such a covariate-driven, block-structured factor model, we propose an iterative algorithm based on thresholding, which transforms the problem of signal segmentation into a grouped variable selection problem. The proposed factorization provides accurate estimation of individual and (partially) joint structures in multi-block data, as confirmed by simulation studies. In two real multi-block data sets from genomics and image analysis, we demonstrate that the estimated block structures facilitate easy interpretation of the major factors.
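As a rough illustration of how thresholding can turn block membership into a grouped variable selection problem, the sketch below applies a generic group soft-thresholding operator to a single loading vector partitioned into blocks. It is not the algorithm of the dissertation: the function name `group_soft_threshold`, the block sizes, the loading values, and the threshold `lam` are all hypothetical and chosen only for illustration.

```python
import numpy as np

def group_soft_threshold(v, blocks, lam):
    """Shrink each block of v toward zero.

    A block whose Euclidean norm is at most lam is set exactly to zero;
    otherwise the block is scaled by (1 - lam / norm).
    """
    out = np.zeros_like(v, dtype=float)
    for idx in blocks:
        norm = np.linalg.norm(v[idx])
        if norm > lam:
            out[idx] = (1.0 - lam / norm) * v[idx]
    return out

# Hypothetical loading vector spanning three data blocks (sizes 3, 2, 3).
blocks = [np.arange(0, 3), np.arange(3, 5), np.arange(5, 8)]
v = np.array([0.9, -0.8, 0.7, 0.05, -0.02, 0.6, 0.5, -0.7])

w = group_soft_threshold(v, blocks, lam=0.2)

# Blocks that keep nonzero loadings are the ones this component acts on.
active = [k for k, idx in enumerate(blocks) if np.any(w[idx] != 0)]
print(active)  # [0, 2]
```

In this toy example the second block is zeroed out, so the component would be read as partially joint across the first and third blocks. A component surviving in every block would be fully joint, and one surviving in a single block would be individual.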
Details
Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators/Authors: Gao, Xing
Date: 8 June 2020
Date Type: Publication
Defense Date: 27 March 2020
Approval Date: 8 June 2020
Submission Date: 27 March 2020
Access Restriction: 2 year -- Restrict access to University of Pittsburgh for a period of 2 years.
Number of Pages: 68
Institution: University of Pittsburgh
Schools and Programs: Dietrich School of Arts and Sciences > Statistics
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: Data integration; Factorization; Individual and joint variation extraction; Multi-block data decomposition; Principal component analysis; Supervised data decomposition.
Date Deposited: 08 Jun 2020 16:13
Last Modified: 08 Jun 2022 05:15
URI: http://d-scholarship.pitt.edu/id/eprint/38413