
Assessing Agreement Among Raters And Identifying Atypical Raters Using A Log-Linear Modeling Approach

Kastango, Kari B. (2006) Assessing Agreement Among Raters And Identifying Atypical Raters Using A Log-Linear Modeling Approach. Doctoral Dissertation, University of Pittsburgh. (Unpublished)



When an outcome is rated by several raters, ensuring consistency across raters increases the reliability of the measurement. Tanner and Young (1985) proposed a general class of log-linear models to assess agreement among K raters using a rating scale with C nominal categories. Their methodology can be used to assess pair-wise agreement among three or more raters. Rogel et al. (1996, 1998) extended this work by assessing various patterns of agreement among rater sub-groups of size K-1. These models can be used to test the assumption of rater exchangeability. Although parameters from these models can be used to identify atypical raters, no formal inferential procedures are available. I propose a formal inferential approach that can be used to test the assumption of rater exchangeability and to identify an atypical rater. The global and heterogeneous partial agreement model is fit to the data, and pair-wise comparisons of the K partial agreement parameters are made, with the p-values adjusted for multiple comparisons. The heterogeneous partial agreement parameter that is consistently involved in the statistically significant pair-wise comparisons is then singled out. The premise is that, if there is an atypical rater, at least one heterogeneous partial agreement parameter will differ from at least one of the remaining K-1 partial agreement parameters. The approach is illustrated using published data from an intestinal biopsy rating study with six raters (Rogel et al., 1998). The overall Type I error and the power of the inferential approach to correctly identify atypical raters are assessed via simulation with rater sub-groups of size 5. The Bonferroni and Sidak procedures, and Holm's step-down procedure with the Bonferroni and Sidak adjustments, are used to control the overall Type I error.
Being able to correctly identify an atypical rater, if present, and improving the consistency of ratings directly influence the reliability of the measurement and the power of the study for a given sample size. Consequently, more informative studies can be conducted of interventions (e.g., behavioral, medicinal) that may have a significant positive impact on the public's health.
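
The multiple-comparison step described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the dissertation's implementation: it does not fit the log-linear model, the function names (`bonferroni`, `sidak`, `holm`, `flag_atypical`) are hypothetical, the p-values are made up, and the flagging rule (pick the one parameter that appears in every significant pair-wise comparison) is one possible reading of the abstract.

```python
from itertools import combinations


def bonferroni(pvals):
    """Single-step Bonferroni adjustment: multiply each p-value by m."""
    m = len(pvals)
    return [min(1.0, m * p) for p in pvals]


def sidak(pvals):
    """Single-step Sidak adjustment: 1 - (1 - p)^m."""
    m = len(pvals)
    return [1.0 - (1.0 - p) ** m for p in pvals]


def holm(pvals, base="bonferroni"):
    """Holm step-down adjustment using a Bonferroni or Sidak base."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        remaining = m - rank  # hypotheses not yet stepped through
        if base == "bonferroni":
            adj = min(1.0, remaining * pvals[i])
        else:
            adj = 1.0 - (1.0 - pvals[i]) ** remaining
        # Adjusted p-values must be non-decreasing in the sorted order.
        running_max = max(running_max, adj)
        adjusted[i] = running_max
    return adjusted


def flag_atypical(k, pair_pvals, adjust=holm, alpha=0.05):
    """Return (index, significant_pairs).

    index is the single parameter shared by all significant pairs,
    or None if there is no such unique parameter.
    """
    pairs = list(combinations(range(k), 2))
    adjusted = adjust([pair_pvals[pair] for pair in pairs])
    significant = [pair for pair, p in zip(pairs, adjusted) if p < alpha]
    if not significant:
        return None, significant
    common = set(significant[0])
    for pair in significant[1:]:
        common &= set(pair)
    return (common.pop() if len(common) == 1 else None), significant


# Hypothetical raw p-values for the 6 pair-wise comparisons of K = 4
# partial agreement parameters (illustrative numbers, not study data).
example_pvals = {
    (0, 1): 0.60, (0, 2): 0.001, (0, 3): 0.70,
    (1, 2): 0.002, (1, 3): 0.80, (2, 3): 0.0005,
}
atypical, significant = flag_atypical(4, example_pvals)
print(atypical, significant)
```

With K parameters there are K(K-1)/2 pair-wise comparisons, so the single-step adjustments grow conservative quickly as K increases; the step-down variant relaxes the correction after each rejection while still controlling the overall Type I error.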




Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators: Kastango, Kari
ETD Committee (Title: Member; Email Address; Pitt Username):
Committee Chair: Stone, Roslyn
Committee Member: Mulsant, Benoit
Committee Member: Rockette, Howard; pfisher@pitt.edu; HERBST
Committee Member: Dew, Mary Amanda; dewma@upmc.edu; DEW1
Committee Member: Mazumdar, Sati; maz1@pitt.edu; MAZ1
Date: 6 June 2006
Date Type: Completion
Defense Date: 23 March 2006
Approval Date: 6 June 2006
Submission Date: 30 March 2006
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Institution: University of Pittsburgh
Schools and Programs: School of Public Health > Biostatistics
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: heterogeneity; homogeneity; nominal; reliability
Other ID: etd-03302006-125650
Date Deposited: 10 Nov 2011 19:33
Last Modified: 19 Dec 2016 14:35

