
Clustering Methodologies with Applications to Integrative Analyses of Post-mortem Tissue Studies in Schizophrenia

Wu, Qiang (2007) Clustering Methodologies with Applications to Integrative Analyses of Post-mortem Tissue Studies in Schizophrenia. Doctoral Dissertation, University of Pittsburgh. (Unpublished)



There is an enormous amount of research devoted to understanding the neurobiology of schizophrenia. Basic neurobiological studies have focused on identifying possible abnormal neurobiological markers in subjects with schizophrenia. However, because of the many possible combinations of symptoms, schizophrenia is thought clinically not to be a homogeneous disease, and this possible heterogeneity might be explained neurobiologically in various brain regions. Statistically, the interesting problem is to cluster the subjects with schizophrenia using these neurobiological markers. In attempting to combine the neurobiological measurements from multiple studies, however, several experimental specifics arise that make it difficult to develop statistical methodologies for the clustering analysis. The main difficulties are differing control subjects, effects of covariates, and the existence of missing data. We develop new parametric models to deal with these difficulties successively. First, assuming no missing data and no clusters, we construct multivariate normal models with structured means and covariance matrices to deal with the differing control subjects and the effects of covariates. We obtain several parameter estimation algorithms for these models and the asymptotic properties of the resulting estimators. Using these newly obtained results, we then develop model-based clustering algorithms to cluster the subjects with schizophrenia into two possible subpopulations while still assuming no missing data. We obtain a new, more effective clustering algorithm and show by simulations that it provides the same results faster than direct applications of some existing algorithms.
Finally, for actual data obtained from three studies conducted in the Conte Center for the Neuroscience of Mental Disorders in the Department of Psychiatry at the University of Pittsburgh, we handle the missingness by using certain regression methods to create multiply imputed data sets. The new complete-data clustering algorithm is then applied to the multiply imputed data sets. The resulting multiple clustering results are integrated to form a single clustering of the subjects with schizophrenia that represents the uncertainty due to the missingness. The results suggest the existence of two possible clusters of the subjects with schizophrenia.
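
The abstract does not say how the clusterings from the multiply imputed data sets are integrated. As a rough illustration only, the sketch below assumes one plausible scheme: align the two cluster labels across imputations (labels are arbitrary, so the same partition can appear with 0 and 1 swapped), then take a majority vote per subject. The function names and the toy label vectors are hypothetical and are not taken from the dissertation.

```python
from collections import Counter

def align_to_reference(labels, reference):
    """Flip 0/1 labels if the flipped version agrees better with the reference."""
    agree = sum(a == b for a, b in zip(labels, reference))
    if agree < len(labels) - agree:
        return [1 - a for a in labels]
    return list(labels)

def integrate_clusterings(clusterings):
    """Combine two-cluster labelings obtained from M imputed data sets.

    Assumed scheme (not the dissertation's): majority vote after label alignment.
    Returns (consensus, agreement), where agreement[i] is the fraction of
    imputations that voted for subject i's consensus label.
    """
    reference = clusterings[0]
    aligned = [align_to_reference(c, reference) for c in clusterings]
    consensus, agreement = [], []
    for votes in zip(*aligned):
        label, count = Counter(votes).most_common(1)[0]
        consensus.append(label)
        agreement.append(count / len(votes))
    return consensus, agreement

# Hypothetical labelings of 6 subjects from 3 imputed data sets.
runs = [
    [0, 0, 1, 1, 0, 1],
    [1, 1, 0, 0, 1, 1],  # labels swapped relative to run 1; differs on subject 6
    [0, 0, 1, 1, 1, 1],  # differs from run 1 on subject 5
]
consensus, agreement = integrate_clusterings(runs)
print(consensus)
print([round(a, 2) for a in agreement])
```

In this toy input, the per-subject agreement fraction is one simple way to express the uncertainty that the missing data introduce: subjects whose cluster changes across imputations receive values below 1. Using an odd number of imputations avoids ties in the vote.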




Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators | Email | Pitt Username | ORCID
Wu, Qiang | qiw8@pitt.edu | QIW8 |
ETD Committee:
Title | Member | Email Address | Pitt Username | ORCID
Committee Chair | Sampson, Allan R | asampson@stat.pitt.edu | ASAMPSON |
Committee Member | Lewis, David A | lewisda@upmc.edu | TNPLEWIS |
Committee Member | Gleser, Leon J | ljg@stat.pitt.edu | GLESER |
Committee Member | Iyengar, Satish | si@stat.pitt.edu | SSI |
Date: 27 September 2007
Date Type: Completion
Defense Date: 2 August 2007
Approval Date: 27 September 2007
Submission Date: 6 August 2007
Access Restriction: 5 year -- Restrict access to University of Pittsburgh for a period of 5 years.
Institution: University of Pittsburgh
Schools and Programs: Dietrich School of Arts and Sciences > Statistics
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: Clustering; Mixture models; Multiple imputations; Patterned covariance matrix; Schizophrenia; Structured means
Other ID: etd-08062007-164618
Date Deposited: 10 Nov 2011 19:57
Last Modified: 19 Dec 2016 14:37

