
Assessing the impact of missing outcome data on the longitudinal clinical trial during the COVID-19 pandemic: a simulation study for sensitivity analysis using imputation methods

Wu, Shan (2022) Assessing the impact of missing outcome data on the longitudinal clinical trial during the COVID-19 pandemic: a simulation study for sensitivity analysis using imputation methods. Master's Thesis, University of Pittsburgh. (Unpublished)



Introduction: The COVID-19 pandemic has raised various challenges for clinical trials, including more missing outcome data with complex missingness patterns. Little research has assessed the impact of COVID-19-related missingness or the related mitigation strategies.
Methods: We conducted a simulation study that varied the missingness model under the missing at random (MAR) or missing completely at random (MCAR) mechanisms. First, we explored the potential impact of missingness in longitudinal outcomes under these missingness scenarios. Empirical power, Type I error rates, and standard error estimates were compared to assess the efficiency loss or bias caused by the different missingness models. Second, we compared one single imputation (SI) method and three multiple imputation (MI) methods as ways to mitigate the potential problems caused by missing data. SI with predictive mean matching and MI with a Bayesian linear model, random forests, and multilevel imputation using the PAN algorithm were applied to impute missing longitudinal outcomes either sequentially by visit time or simultaneously.
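The difference between the two missingness mechanisms named in the Methods can be illustrated with a minimal Python sketch. It is not the thesis's simulation design: the data-generating model (random intercept plus a treatment-by-time slope), the logistic missingness model, and the function names `simulate_trial`, `apply_mcar`, and `apply_mar` are all assumptions made for illustration only.

```python
import numpy as np

def simulate_trial(n_per_arm=100, n_visits=4, effect=0.0, seed=0):
    """Hypothetical two-arm longitudinal outcome with a random intercept.

    `effect` is an assumed treatment-by-time slope; effect=0.0 is the null
    setting one would use to estimate Type I error.
    Returns (arm, y) with y of shape (2 * n_per_arm, n_visits).
    """
    rng = np.random.default_rng(seed)
    arm = np.repeat([0, 1], n_per_arm)
    times = np.arange(n_visits)
    intercept = rng.normal(0.0, 1.0, size=(arm.size, 1))
    noise = rng.normal(0.0, 1.0, size=(arm.size, n_visits))
    y = intercept + effect * arm[:, None] * times[None, :] + noise
    return arm, y

def apply_mcar(y, rate, rng):
    """MCAR: every post-baseline value is dropped with the same probability."""
    out = y.copy()
    mask = rng.random(y.shape) < rate
    mask[:, 0] = False  # assumption: baseline is always observed
    out[mask] = np.nan
    return out

def apply_mar(y, rate, rng, slope=1.0):
    """MAR: the drop probability depends only on the observed baseline value.

    The intercept is set to logit(rate), so the overall missing rate is only
    roughly equal to `rate`.
    """
    out = y.copy()
    z = (y[:, 0] - y[:, 0].mean()) / y[:, 0].std()
    logit = np.log(rate / (1.0 - rate)) + slope * z
    p = 1.0 / (1.0 + np.exp(-logit))
    mask = rng.random(y.shape) < p[:, None]
    mask[:, 0] = False  # baseline stays observed, which keeps this MAR
    out[mask] = np.nan
    return out

arm, y = simulate_trial(n_per_arm=1000, seed=1)
rng = np.random.default_rng(2)
y_mcar = apply_mcar(y, 0.3, rng)
y_mar = apply_mar(y, 0.3, rng)
```

Under MCAR the dropped values are a random subset, whereas under this MAR sketch participants with higher baseline values are more likely to miss later visits. That is the kind of dependence that can affect a complete-case analysis if the outcome model does not account for it.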
Results: Analyzing observed data only (i.e., ignoring missingness) with the correct outcome model yielded unbiased estimates, with the best power and Type I error rates for the treatment-by-time interaction effect. However, power and efficiency were lower because of the missing data. In our simulation study, the proposed SI and MI methods improved the statistical power and efficiency of fixed-effect estimates in specific missingness settings, but with inflated Type I error rates. Simultaneous MI performed similarly to or better than sequential MI when longitudinal outcomes had higher missing rates, whereas sequential SI was always better than simultaneous SI. Multilevel MI produced estimates with large between-imputation variance, and thus a larger number of imputations was required.
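The link between between-imputation variance and the number of imputations can be seen from Rubin's rules, the standard way to pool MI estimates. The abstract does not state which pooling procedure was used, so the sketch below is an assumption, and `pool_rubin` is a hypothetical helper name. With m imputations, the total variance is T = W + (1 + 1/m) B, where W is the mean within-imputation variance and B is the between-imputation variance.

```python
import numpy as np

def pool_rubin(estimates, variances):
    """Pool m point estimates and their squared standard errors (Rubin's rules).

    Returns (pooled estimate, within variance W, between variance B, total T).
    Requires at least two imputations so that B is defined.
    """
    q = np.asarray(estimates, dtype=float)
    u = np.asarray(variances, dtype=float)
    m = q.size
    q_bar = q.mean()           # pooled point estimate
    w = u.mean()               # within-imputation variance
    b = q.var(ddof=1)          # between-imputation variance
    t = w + (1.0 + 1.0 / m) * b
    return q_bar, w, b, t

# Illustrative numbers only, not results from the thesis.
q_bar, w, b, t = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The B/m term is the extra uncertainty from using a finite number of imputations. When B is large relative to W, this term is non-negligible for small m, which is consistent with the observation that multilevel MI needed more imputations.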
Conclusions: Imputation does not guarantee the best performance in terms of efficiency gain; performance depends strongly on the method, the missing rate, and the imputation procedure. When SI or MI is used for efficiency gain, care should be taken because the imputation procedure can inflate false-positive findings in certain scenarios.
Public Health Significance: This study contributes to understanding the missing data problem in longitudinal clinical trials with complex missingness patterns associated with the COVID-19 pandemic.




Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators: Wu, Shan (Email: shw134@pitt.edu; Pitt Username: shw134)
ETD Committee:
Committee Chair: Kang, Chaeryon (Email: crkang@pitt.edu; Pitt Username: crkang)
Committee CoChair: Carlson, Jenna Colavincenzo (Email: jnc35@pitt.edu; Pitt Username: jnc35)
Committee Member: Erickson, Kirk I (Email: kiericks@pitt.edu; Pitt Username: kiericks)
Date: 12 May 2022
Date Type: Publication
Defense Date: 21 April 2022
Approval Date: 12 May 2022
Submission Date: 29 April 2022
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Number of Pages: 53
Institution: University of Pittsburgh
Schools and Programs: School of Public Health > Biostatistics
Degree: MS - Master of Science
Thesis Type: Master's Thesis
Refereed: Yes
Uncontrolled Keywords: longitudinal missing data, multiple imputation, power analysis
Date Deposited: 12 May 2022 14:54
Last Modified: 12 May 2022 14:54

