

Yin, Liqun (2013) THE ROBUSTNESS OF IRT-BASED VERTICAL SCALING METHODS TO VIOLATION OF UNIDIMENSIONALITY. Doctoral Dissertation, University of Pittsburgh. (Unpublished)



In recent years, many states have adopted vertically scaled tests based on Item Response Theory (IRT) because of their compelling features in a growth-based accountability context. However, selecting a practical and effective calibration/scaling method and properly understanding the issues raised by possible multidimensionality in the test data are critical to ensuring the accuracy and reliability of such tests. This study uses Monte Carlo simulation to investigate the robustness of various unidimensional scaling methods under different test conditions and different degrees of departure from unidimensionality in a common-item nonequivalent groups design (grades 3 to 8). The main research questions are: 1) Which of the calibration/scaling methods (concurrent calibration, semi-concurrent calibration, separate calibration with SL scaling, separate calibration with mean/sigma scaling, and pair-wise calibration) yields the least biased ability estimates in the vertical scaling context? 2) How do different degrees of multidimensionality affect the performance of these methods?
Results indicate that the calibration and scaling methods perform very differently under different test conditions, especially for the grades farthest from the base grade. Under the unidimensional condition, the five calibration and linking methods produced very similar results for grades close to the base grade, grade 5. However, for grades 7 and 8, semi-concurrent and concurrent calibrations yielded more biased results, while the results for the other three methods were comparable. Under multidimensional conditions, all five methods produced more biased results, and the bias patterns differed across methods. In general, the more severe the multidimensionality, the larger the bias. Among the five methods compared, separate calibration with SL linking was the most robust to variations in multidimensionality.




Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators:
Yin, Liqun (Email: liy15@pitt.edu; Pitt Username: LIY15)
ETD Committee:
Committee Chair: Lane, Suzanne (Email: sl@pitt.edu; Pitt Username: SL)
Committee Member: Stone, Clement (Email: cas@pitt.edu; Pitt Username: CAS)
Committee Member: Ye, Feifei (Email: feifeiye@pitt.edu; Pitt Username: FEIFEIYE)
Committee Member: Kirisci, Levent (Email: levent@pitt.edu; Pitt Username: LEVENT)
Date: 2013
Date Type: Publication
Defense Date: 12 April 2013
Approval Date: 13 May 2013
Submission Date: 1 May 2013
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Number of Pages: 181
Institution: University of Pittsburgh
Schools and Programs: School of Education > Psychology in Education
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: IRT calibration and linking, vertical scaling, item response theory, unidimensionality assumption, linking and equating
Date Deposited: 13 May 2013 18:03
Last Modified: 15 Nov 2016 14:12

