Clustering and Association Analysis for High-Dimensional Omics Studies

Li, Yujia (2022) Clustering and Association Analysis for High-Dimensional Omics Studies. Doctoral Dissertation, University of Pittsburgh. (Unpublished)
Abstract

With the rapid advancement of high-throughput technologies, a large amount of high-dimensional omics data has been generated in the public domain, which gives rise to various statistical and computational challenges in the clustering and association analysis of omics data. This dissertation focuses on the estimation of tuning parameters in cluster analysis (Chapter 2), disease subtyping (Chapter 3), and the association between gene expression and multiple phenotypes (Chapter 4) in high-dimensional omics studies. In Chapter 2, we propose a resampling framework called S4 for selecting tuning parameters in cluster analysis by measuring the similarity (i.e., stability) between the clustering results of the whole data and of subsampled data. S4 can estimate the number of clusters for $K$-means, and can estimate the number of clusters and the sparsity parameter simultaneously for sparse $K$-means. Extensive simulations and nine real applications demonstrate the superior performance of the proposed S4 method. In Chapter 3, we propose a novel outcome-guided disease subtyping framework based on a weighted joint likelihood approach. Traditionally, conventional cluster analysis (e.g., sparse $K$-means) is used to identify subgroups of patients with similar expression patterns, without considering outcome information. As a result, the identified subgroups can be irrelevant to the clinical outcome of interest. Our proposed method addresses this issue by incorporating outcome information into the cluster analysis, and it performs well in both discovery and validation data. In Chapter 4, we study the association between gene expression and multiple correlated phenotypes in complex disease. We extend two P-value combination methods, adaptive weighted Fisher’s method (AFp) and adaptive Fisher’s method (AFz), to tackle this problem. Based on extensive evaluation, AFp is recommended. A real lung disease transcriptomic application demonstrates insightful biological findings from AFp.
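As background for the P-value combination methods named in Chapter 4, the following is a minimal sketch of the classical, unweighted Fisher's method only. It is not AFp or AFz, whose adaptive weighting the abstract does not describe. The function name `fisher_combine` is a hypothetical helper introduced for illustration. The chi-square null distribution used here assumes independent P-values, whereas the phenotypes in Chapter 4 are correlated, so this should be read as a reference point rather than as the dissertation's procedure.

```python
import math


def fisher_combine(p_values):
    """Combine independent p-values with classical Fisher's method.

    Hypothetical illustrative helper; this is NOT AFp or AFz.
    Returns (statistic, combined_p_value).
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(p_values)
    # Fisher's statistic T = -2 * sum(log p_i); under the null with
    # independent p-values, T follows a chi-square with 2k degrees of freedom.
    half_t = -sum(math.log(p) for p in p_values)

    # For even degrees of freedom 2k the chi-square survival function has
    # a closed form: P(T > t) = exp(-t/2) * sum_{j=0}^{k-1} (t/2)^j / j!
    term = 1.0
    total = 1.0
    for j in range(1, k):
        term *= half_t / j
        total += term
    return 2.0 * half_t, math.exp(-half_t) * total
```

For example, `fisher_combine([0.05, 0.05])` pools two marginal P-values into a single combined P-value; with a single input the combined P-value equals that input.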