
Detection of influential observations in longitudinal multivariate mixed effects regression models

Ling, Yun (2014) Detection of influential observations in longitudinal multivariate mixed effects regression models. Doctoral Dissertation, University of Pittsburgh. (Unpublished)

Submitted Version



The purpose of this dissertation is to detect possible influential observations in longitudinal data with more than one observation per subject at each time point, that is, in multivariate longitudinal data. An influential observation is an observation that has a large effect on the parameter estimates of a given model. Influential observations are important because: (1) removal of the observation(s) from the data set can substantially change the values of the estimated parameters; (2) in multivariate longitudinal mixed effect models, influential observations can affect the population and subject-specific trajectories; (3) influential observation(s) of one response may affect the predicted effects of the other response within the same individual; (4) an influential observation may indicate an abnormal or misdiagnosed subject.

This research was motivated by ophthalmological clinical research in glaucoma. In many ophthalmology studies, both eyes are repeatedly measured. Sometimes one eye can be measured by different devices or measured for different quantities (retina thickness for different quadrants, OCT, VFI, etc.). For example, in one study considered in this dissertation, multivariate measurements (Retinal Nerve Fiber Layer (RNFL) thickness and Ganglion Cell Complex (GCC) thickness) were repeatedly measured on each eye, within each patient (cluster).

When we detect influential observations in longitudinal ophthalmology data, our trajectory model must take into account three kinds of correlation: (1) correlation among different characteristics measured at the same time point within the same eye; (2) correlation among different time points; (3) correlation between characteristics in the two eyes.

In the first part of the dissertation, we propose a multivariate conditional version of Cook's distance for multivariate mixed effect models. Some research has shown that, in mixed effect models, influential observations that have a large effect on subject-specific parameters cannot always be detected by the original Cook's distance because of large between-subject variation. Hence, in the multivariate longitudinal setting, the influential observation problem is better approached by conditioning on subjects and characteristics. Repeated simulations within this dissertation show that the multivariate conditional Cook's distance detected most (92.5%) of the influential observations, whereas the unconditional Cook's distance detected only 7.5%.
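For orientation, the classical (unconditional) Cook's distance that the paragraph above refers to can be sketched for ordinary simple linear regression. This is only a minimal illustration of the baseline idea, not the multivariate conditional version proposed in the dissertation; the function name `cooks_distance` and the toy data are assumptions made for this example.

```python
def cooks_distance(x, y):
    """Classical Cook's distance for the simple linear regression y = a + b*x.

    Illustrative sketch only: this is the ordinary single-response,
    fixed-effects version, not a conditional or mixed-model variant.
    """
    n = len(x)
    p = 2  # number of fitted coefficients (intercept and slope)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Residual variance estimate with n - p degrees of freedom
    s2 = sum(e ** 2 for e in residuals) / (n - p)
    # Leverage of each point (diagonal of the hat matrix)
    leverages = [1.0 / n + (xi - x_mean) ** 2 / sxx for xi in x]
    # D_i = e_i^2 / (p * s^2) * h_ii / (1 - h_ii)^2
    return [
        (e ** 2 / (p * s2)) * h / (1.0 - h) ** 2
        for e, h in zip(residuals, leverages)
    ]


# Toy data (made up): points lie near y = 2x except the last one
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 20.0]
distances = cooks_distance(x, y)
most_influential = max(range(len(distances)), key=distances.__getitem__)
```

In this toy data set the last point is the one with the largest distance. In a mixed effect model, by contrast, large between-subject variation can mask such a point, which is the motivation given above for conditioning on subjects and characteristics.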

In the second part of the dissertation, we extend the multivariate conditional Cook's distance to the multilevel multivariate mixed effect model. In this model, there are two levels of random effects to handle the subject-level and cluster-level correlations among different time points, and a residual covariance matrix to handle correlations among different responses. In addition, the two-level multivariate conditional Cook's distance can be decomposed into six parts, indicating the influences of the fixed effects, the first- and second-level random effects, and the co-variation between them, respectively. Examples are given to illustrate how an influential observation in one characteristic changes the effects of both characteristics.

This research has public health implications because the influence of outliers can bias the results of any longitudinal study in public health. Hence, recognizing observations which have undue influence on study results ensures that reliable conclusions can be obtained in medical and public health research settings.



Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators:
Creator | Email | Pitt Username | ORCID
Ling, Yun | yul27@pitt.edu | YUL27 |
ETD Committee:
Title | Member | Email Address | Pitt Username | ORCID
Committee Chair | Anderson, Stewart J | sja@pitt.edu | SJA |
Committee Member | Bilonick, Richard A | rab45@pitt.edu | RAB45 |
Committee Member | Bandos, Andriy | anb61@pitt.edu | ANB61 |
Committee Member | Cheng, Yu | yucheng@pitt.edu | YUCHENG |
Date: 27 June 2014
Date Type: Publication
Defense Date: 17 January 2014
Approval Date: 27 June 2014
Submission Date: 28 March 2014
Access Restriction: 5 year -- Restrict access to University of Pittsburgh for a period of 5 years.
Number of Pages: 94
Institution: University of Pittsburgh
Schools and Programs: School of Public Health > Biostatistics
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: Influential observation, outlier, Multivariate, mixed effect model
Date Deposited: 27 Jun 2014 20:17
Last Modified: 01 May 2019 05:15

