Olson Hunt, Megan
(2014)
A permutation-based correction for Pearson's chi-square test on data with an imputed complex outcome / A modified EM algorithm for contingency table analysis with missing data.
Doctoral Dissertation, University of Pittsburgh.
(Unpublished)
Abstract
Studies on human subjects often yield missing data, so progress in methods for handling such data is of inherent public health relevance. Here, two statistical methods are proposed for the analysis of discrete data with missing values. First, when one variable is subject to missingness, it has been noted that applying Pearson’s chi-square test to singly imputed data understates the variability due to imputation, leading to a type I error rate larger than the nominal level. This research concerns Pearson’s test on data with an imputed complex outcome, where one of the outcome's components suffers from missing values. Imputation in this context may be performed either directly, through conditional imputation of the complex outcome given covariates, or indirectly, through conditional imputation of its missing component given the covariates and the other, observed component. Although the latter imputation scheme is shown to be more efficient, an existing adjustment method cannot be extended to this scenario because the variables constituting the complex outcome are not independent. As a result, a novel permutation-based correction method for Pearson’s test is proposed. Simulation studies indicate that it provides the nominal rejection rate under the null hypothesis. Second, a modification of the expectation-maximization (EM) algorithm for the analysis of discrete data with missing values is presented. In general, the update in the M-step requires either knowing or modeling the missing-data mechanism, and misspecification of this mechanism may lead to biased estimates of model parameters. Given consistent initial estimates of the parameters (which may be obtained from an external, complete data set, or by recalling a random sample of subjects), the target function is approximated in the M-step with empirical estimates, allowing for unbiased estimation without specification or modeling of the often intangible missing-data mechanism. Simulation studies show that this modified algorithm yields consistent estimates that are potentially more efficient than the initial estimates, even under non-ignorable missingness.
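To make the first idea concrete, the following is a minimal sketch of a generic permutation reference distribution for Pearson's statistic computed on singly imputed data. It is not the dissertation's correction, whose details are not given in this record. The function names (`pearson_chi2`, `impute_once`, `permutation_pvalue`), the hot-deck imputation rule, and the choice to permute group labels while holding the imputed outcomes fixed are all assumptions made for illustration.

```python
import random


def pearson_chi2(groups, outcomes):
    """Pearson's chi-square statistic for the two-way table of paired labels."""
    rows = sorted(set(groups))
    cols = sorted(set(outcomes))
    n = len(groups)
    counts = {(r, c): 0 for r in rows for c in cols}
    for g, y in zip(groups, outcomes):
        counts[(g, y)] += 1
    row_tot = {r: sum(counts[(r, c)] for c in cols) for r in rows}
    col_tot = {c: sum(counts[(r, c)] for r in rows) for c in cols}
    stat = 0.0
    for r in rows:
        for c in cols:
            expected = row_tot[r] * col_tot[c] / n
            stat += (counts[(r, c)] - expected) ** 2 / expected
    return stat


def impute_once(outcomes, covariates, rng):
    """Single hot-deck imputation (hypothetical rule): each missing outcome
    (None) is replaced by a randomly chosen observed outcome from the same
    covariate level. Assumes every level has at least one observed outcome."""
    donors = {}
    for y, x in zip(outcomes, covariates):
        if y is not None:
            donors.setdefault(x, []).append(y)
    return [y if y is not None else rng.choice(donors[x])
            for y, x in zip(outcomes, covariates)]


def permutation_pvalue(groups, outcomes, n_perm=2000, seed=0):
    """Permutation p-value: shuffle the group labels, keep the (imputed)
    outcomes fixed, and compare each shuffled statistic with the observed one."""
    rng = random.Random(seed)
    observed = pearson_chi2(groups, outcomes)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pearson_chi2(shuffled, outcomes) >= observed - 1e-12:
            hits += 1
    # Add-one adjustment keeps the p-value strictly positive.
    return (hits + 1) / (n_perm + 1)
```

In use, one would call `impute_once` on the incomplete outcome and pass the result, together with the group labels, to `permutation_pvalue`. The reference distribution is then built from the data rather than from the chi-square approximation, which is the general motivation for a permutation approach when imputation distorts the nominal distribution of the statistic.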
Details
Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators/Authors: Olson Hunt, Megan
Date: 27 June 2014
Date Type: Publication
Defense Date: 3 April 2014
Approval Date: 27 June 2014
Submission Date: 6 April 2014
Access Restriction: 5 year -- Restrict access to University of Pittsburgh for a period of 5 years.
Number of Pages: 116
Institution: University of Pittsburgh
Schools and Programs: School of Public Health > Biostatistics
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: single imputation, discrete data, bias, consistency, efficiency, MNAR, empirical
Date Deposited: 27 Jun 2014 20:22
Last Modified: 01 May 2019 05:15
URI: http://d-scholarship.pitt.edu/id/eprint/21457