New statistical methods for complex survival data with high-dimensional covariates

Sun, Tao (2020) New statistical methods for complex survival data with high-dimensional covariates. Doctoral Dissertation, University of Pittsburgh. (Unpublished)
Abstract

Complex survival outcomes, such as multivariate and interval-censored endpoints, are becoming more commonly used in clinical trials. The revolutionary development of genetics technologies allows the generation of large-scale genetic data. This dissertation proposes new statistical methods for complex survival outcomes with high-dimensional covariates.

In the first part, to deal with bivariate interval-censored data, we propose a flexible two-parameter copula-based model with semiparametric transformation margins. We estimate the model parameters by the sieve likelihood approach and establish the asymptotic properties of the sieve estimators. We demonstrate satisfactory estimation and inference performance in simulation studies. Lastly, we apply our method to the Age-Related Eye Disease Study (AREDS) data and identify novel genetic variants associated with the progression of age-related macular degeneration (AMD). An R package, CopulaCenR, is published for analyzing bivariate censored data in a regression setting.

In the second part, we develop a novel information-ratio-based test statistic to evaluate the goodness-of-fit of copula survival models. We establish the asymptotic properties of our test statistic. The simulation studies demonstrate that our method performs well under interval and right censoring. Lastly, we evaluate our method on multiple real data sets. To the best of our knowledge, our method is the first approach that can test any parametric copula model under both interval and right censoring.

In the third part, motivated by the growing need for accurate survival prediction models that utilize rich genetic data, we develop a novel framework for constructing and evaluating a deep neural network (DNN) based survival model. Our simulation results demonstrate the high predictive power of the DNN survival model, especially in the presence of complex data structures. We also build an accurate and interpretable DNN survival prediction model for AMD progression using AREDS data.

Public health significance: This dissertation provides a comprehensive set of novel statistical and computational tools for analyzing bivariate survival outcomes with large-scale genetic data. These tools have the potential to fundamentally improve the current practice in analyzing such clinical studies, and thus to enhance the understanding of disease progression and to increase the success of individualized risk management and precision medicine.