Chok, Nian Shong
(2010)
Pearson's Versus Spearman's and Kendall's Correlation Coefficients for Continuous Data.
Master's Thesis, University of Pittsburgh.
(Unpublished)
Abstract
The association between two variables is often of interest in data analysis and methodological research. Pearson's, Spearman's and Kendall's correlation coefficients are the most commonly used measures of monotone association, with the latter two usually suggested for non-normally distributed data. These three correlation coefficients can be represented as differently weighted averages of the same concordance indicators. The weighting used in Pearson's correlation coefficient could be preferable for reflecting monotone association in some types of continuous and not necessarily bivariate normal data.
In this work, I investigate the intrinsic ability of Pearson's, Spearman's and Kendall's correlation coefficients to affect the statistical power of tests for monotone association in continuous data. This investigation is important in many fields including Public Health, since it can lead to guidelines that help save health research resources by reducing the number of inconclusive studies and enabling the design of powerful studies with smaller sample sizes.
The statistical power can be affected by both the structure of the employed correlation coefficient and the type of test statistic. Hence, I standardize the comparison of the intrinsic properties of the correlation coefficients by using a permutation test that is applicable to all of them. In the simulation study, I consider four types of continuous bivariate distributions composed of pairs of normal, log-normal, double exponential and t distributions. These distributions enable modeling scenarios with different degrees of violation of normality with respect to skewness and kurtosis.
As a result of the simulation study, I demonstrate that Pearson's correlation coefficient could offer a substantial improvement in statistical power even for distributions with moderate skewness or excess kurtosis. Nonetheless, because of its known sensitivity to outliers, Pearson's correlation leads to a less powerful statistical test for distributions with extreme skewness or excess kurtosis (where datasets with outliers are more likely). In conclusion, the results of my investigation indicate that Pearson's correlation coefficient could have significant advantages for continuous non-normal data that do not have obvious outliers. Thus, the shape of the distribution should not be the sole reason for not using the Pearson product moment correlation coefficient.
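The permutation test mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's code: the function names, the two-sided form of the test, the default of 999 permutations, and the log-normal example data are all assumptions made here for illustration. The idea is that the same permutation procedure is applied to each coefficient, so any difference in the resulting p-values comes from the coefficient rather than from the type of test.

```python
import math
import random


def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(v):
    """Ranks 1..n. Assumes no ties, which holds almost surely for continuous data."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0] * len(v)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r


def spearman(x, y):
    """Spearman's coefficient: Pearson's coefficient computed on the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall(x, y):
    """Kendall's tau: unweighted mean of the pairwise concordance indicators (no ties)."""
    n = len(x)
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            total += (product > 0) - (product < 0)  # sign of the product
    return total / (n * (n - 1) / 2)


def permutation_pvalue(x, y, stat, n_perm=999, rng=None):
    """Two-sided permutation p-value for any correlation statistic `stat`.

    Shuffling y breaks any association with x, so the shuffled statistics
    approximate the null distribution of `stat`. (Two-sidedness and the
    number of permutations are assumptions of this sketch.)
    """
    rng = rng or random.Random(0)
    observed = abs(stat(x, y))
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if abs(stat(x, y_perm)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical example: correlated log-normal pairs (skewed, non-normal).
    gen = random.Random(1)
    rho, n = 0.5, 30
    z1 = [gen.gauss(0, 1) for _ in range(n)]
    z2 = [rho * a + math.sqrt(1 - rho ** 2) * gen.gauss(0, 1) for a in z1]
    x = [math.exp(a) for a in z1]
    y = [math.exp(b) for b in z2]
    for name, stat in [("Pearson", pearson), ("Spearman", spearman), ("Kendall", kendall)]:
        print(name, round(stat(x, y), 3), permutation_pvalue(x, y, stat))
```

Repeating such a test over many simulated datasets and recording how often the p-value falls below the chosen significance level would give an empirical estimate of power for each coefficient, which is the kind of comparison the abstract describes.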
Details
Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators/Authors: Chok, Nian Shong
Date: 24 September 2010
Date Type: Completion
Defense Date: 26 May 2010
Approval Date: 24 September 2010
Submission Date: 9 June 2010
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Institution: University of Pittsburgh
Schools and Programs: School of Public Health > Biostatistics
Degree: MS - Master of Science
Thesis Type: Master's Thesis
Refereed: Yes
Uncontrolled Keywords: Pearson product moment correlation coefficient
Other ID: http://etd.library.pitt.edu/ETD/available/etd-06092010-123415/, etd-06092010-123415
Date Deposited: 10 Nov 2011 19:46
Last Modified: 19 Dec 2016 14:36
URI: http://d-scholarship.pitt.edu/id/eprint/8056