
Prediction Accuracy of SNP Epistasis Models Generated by Multifactor Dimensionality Reduction and Stepwise Penalized Logistic Regression

Perkins, Amy M. (2010) Prediction Accuracy of SNP Epistasis Models Generated by Multifactor Dimensionality Reduction and Stepwise Penalized Logistic Regression. Master's Thesis, University of Pittsburgh. (Unpublished)



Conventional statistical modeling techniques used to detect high-order interactions between SNPs run into high-dimensionality problems, because a large number of interactions must be evaluated using sparse data. Statisticians have developed novel methods, including Multifactor Dimensionality Reduction (MDR), Generalized Multifactor Dimensionality Reduction (GMDR), and stepwise Penalized Logistic Regression (stepPLR), to analyze SNP epistasis associated with the development or outcomes of genetic disease. Because published results on the performance of these three methods are inconsistent, this thesis used data from the very large GenIMS study to compare how accurately the SNP epistasis models from each method predict 90-day mortality. Comparisons were made using prediction accuracy, sensitivity, specificity, model consistency, chi-square tests, sign tests, and biological plausibility. Testing accuracies were generally higher for GMDR than for MDR, and stepPLR performed poorly because its models predicted that all subjects were alive at 90 days. Stepwise PLR, however, determined that the IL-1A SNPs IL1A_M889, rs1894399, rs1878319, and rs2856837 were each significant predictors of 90-day mortality when adjusting for the other SNPs in the model. In addition, the model included a borderline significant second-order interaction between rs28556838 and rs3783520 associated with 90-day mortality in a cohort of patients hospitalized with community-acquired pneumonia (CAP). The public health importance of this thesis is that the relative risk for CAP may be higher for a set of SNPs across different genes. The ability to predict which patients will experience a poor outcome may lead to more effective prevention strategies or treatments at earlier stages. Furthermore, identification of significant SNP interactions can expand scientific knowledge about the biological mechanisms affecting disease outcomes.
Altogether, the GMDR method yielded higher prediction accuracies than MDR, and MDR performed better than stepPLR when establishing SNP epistasis models associated with 90-day mortality in the GenIMS cohort.
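
The abstract's point about stepPLR can be illustrated with a small sketch: a model that predicts every subject alive can still post a high raw accuracy when deaths are rare, while its sensitivity is zero. The counts below (10 deaths among 100 subjects) are hypothetical and are not GenIMS data, and the function names are illustrative only. Balanced accuracy is included as one common summary for imbalanced outcomes; whether the thesis reported it is not stated in this record.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives; 1 = died within 90 days."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def summarize(y_true, y_pred):
    """Return accuracy, sensitivity, specificity, and balanced accuracy."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


# Hypothetical cohort: 10 deaths (1) and 90 survivors (0).
y_true = [1] * 10 + [0] * 90
# A degenerate model that predicts "alive" for everyone.
y_pred = [0] * 100

metrics = summarize(y_true, y_pred)
print(metrics)
```

In this made-up cohort the raw accuracy looks high only because survivors dominate; the zero sensitivity shows the model identifies none of the patients who died.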




Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators: Perkins, Amy
ETD Committee:
Committee Chair: Anderson, Stewart (email: sja@pitt.edu; Pitt username: SJA)
Committee Member: Kong, Lan (email: lkong@pitt.edu; Pitt username: LKONG)
Committee Member: Yende, Sachin (email: yendes@upmc.edu; Pitt username: SPY3)
Date: 27 September 2010
Date Type: Completion
Defense Date: 29 July 2010
Approval Date: 27 September 2010
Submission Date: 26 July 2010
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Institution: University of Pittsburgh
Schools and Programs: School of Public Health > Biostatistics
Degree: MS - Master of Science
Thesis Type: Master's Thesis
Refereed: Yes
Uncontrolled Keywords: community-acquired pneumonia; biostatistics; single nucleotide polymorphism
Other ID: etd-07262010-232602
Date Deposited: 10 Nov 2011 19:54
Last Modified: 19 Dec 2016 14:36

