
Covariate-driven factorization by thresholding for multi-block data

Gao, Xing (2020) Covariate-driven factorization by thresholding for multi-block data. Doctoral Dissertation, University of Pittsburgh. (Unpublished)

Abstract

Multi-block data, in which multiple groups of variables from different sources are observed for a common set of subjects, are routinely collected in many areas of science. Methods for joint factorization of such data are being developed to explore the variation structure that the blocks may share. Most existing work focuses on separating joint components, which are shared across all data blocks, from individual components, which are relevant to only a single data block. We instead propose to model and estimate partially joint components, which are shared across some, but not all, data blocks. If covariates, possibly with multi-block structures of their own, are available, the components are further modeled as driven by the covariate information. To estimate such a covariate-driven, block-structured factor model, we propose an iterative algorithm based on thresholding, obtained by transforming the problem of signal segmentation into a grouped variable selection problem. Simulation studies confirm that the proposed factorization accurately estimates individual and (partially) joint structures in multi-block data. In two real multi-block data sets, from genomics and image analysis, we demonstrate that the estimated block structures facilitate easy interpretation of the major factors.
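The link between thresholding and grouped variable selection can be illustrated with a small sketch. This is not the dissertation's algorithm, whose details are not given in the abstract. It shows only a generic group soft-thresholding step, under the assumption that a component's loading vector is partitioned into known blocks. The function names, the penalty value `lam`, and the toy numbers are all hypothetical.

```python
import math

def group_soft_threshold(loading, block_sizes, lam):
    """Shrink each block of a loading vector by its Euclidean norm.

    A block whose norm is at most `lam` is set to zero as a whole, which is
    what turns thresholding into a selection of blocks rather than entries.
    """
    shrunk, start = [], 0
    for size in block_sizes:
        block = loading[start:start + size]
        norm = math.sqrt(sum(v * v for v in block))
        scale = max(0.0, 1.0 - lam / norm) if norm > 0 else 0.0
        shrunk.extend(scale * v for v in block)
        start += size
    return shrunk

def active_blocks(loading, block_sizes):
    """Indices of the blocks that keep at least one nonzero entry."""
    active, start = [], 0
    for k, size in enumerate(block_sizes):
        if any(v != 0.0 for v in loading[start:start + size]):
            active.append(k)
        start += size
    return active

def classify(active, n_blocks):
    """Label a component by how many blocks it loads on."""
    if not active:
        return "null"
    if len(active) == n_blocks:
        return "joint"
    if len(active) == 1:
        return "individual"
    return "partially joint"

# Toy loading vector over three blocks of two variables each (made-up numbers).
block_sizes = [2, 2, 2]
loading = [0.9, 0.8, 0.05, -0.02, 0.7, 0.6]
shrunk = group_soft_threshold(loading, block_sizes, lam=0.3)
active = active_blocks(shrunk, block_sizes)
label = classify(active, len(block_sizes))
print(active, label)
```

In this toy case the middle block has a small norm and is zeroed out entirely, so the component loads on the first and third blocks only and is labeled partially joint. An iterative method would alternate a step of this kind with a re-estimation of scores, but that part is left out here because the abstract does not describe it.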


Details

Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators/Authors: Gao, Xing (Email: xig32@pitt.edu; Pitt Username: xig32)
ETD Committee:
Committee Chair: Iyengar, Satish (Email: ssi@pitt.edu; Pitt Username: SSI)
Committee CoChair: Jung, Sungkyu (Email: sungkyu@snu.ac.kr)
Committee Member: Li, Gen (Email: gl2521@cumc.columbia.edu)
Committee Member: Cheng, Yu (Email: yucheng@pitt.edu; Pitt Username: YUCHENG)
Committee Member: Ren, Zhao (Email: zren@pitt.edu; Pitt Username: ZREN)
Date: 8 June 2020
Date Type: Publication
Defense Date: 27 March 2020
Approval Date: 8 June 2020
Submission Date: 27 March 2020
Access Restriction: 2 year -- Restrict access to University of Pittsburgh for a period of 2 years.
Number of Pages: 68
Institution: University of Pittsburgh
Schools and Programs: Dietrich School of Arts and Sciences > Statistics
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: Data integration; Factorization; Individual and joint variation extraction; Multi-block data decomposition; Principal component analysis; Supervised data decomposition.
Date Deposited: 08 Jun 2020 16:13
Last Modified: 08 Jun 2022 05:15
URI: http://d-scholarship.pitt.edu/id/eprint/38413
