Xu, Ting
(2017)
Equating with local dependence under the anchor test design.
Doctoral Dissertation, University of Pittsburgh.
(Unpublished)
Abstract
Item response theory (IRT) models are often used in test equating. The effectiveness of IRT equating depends on how well the test data meet the IRT model assumptions. When tests are composed of testlets (i.e., groups of items sharing a common stimulus), the assumption of local item independence is likely to be violated. When examinees are nested within groups (e.g., classrooms or schools), the assumption of local person independence (i.e., independence of subjects) is unlikely to hold. Multilevel models offer the flexibility to model item and person dependence structures simultaneously.
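As a brief illustration of the assumption discussed above, local item independence can be written as a factorization of the response-pattern probability, and a testlet effect is one common way to relax it. The notation here (examinee j, item i, testlet d(i), ability \theta_j, difficulty b_i, testlet effect \gamma_{j d(i)}) is assumed for this sketch and is not taken from the dissertation; the second line shows a Rasch-type testlet form only as an example.

```latex
% Local item independence: given ability, item responses are independent.
P(U_{j1}=u_1,\dots,U_{jI}=u_I \mid \theta_j)
  = \prod_{i=1}^{I} P(U_{ji}=u_i \mid \theta_j)

% One common Rasch-type testlet formulation (assumed here): items in the
% same testlet d(i) share a person-specific effect \gamma_{j d(i)}, so they
% remain dependent when conditioning on \theta_j alone.
\operatorname{logit} P(U_{ji}=1 \mid \theta_j, \gamma_{j d(i)})
  = \theta_j - b_i - \gamma_{j d(i)},
\qquad \gamma_{j d(i)} \sim N\!\left(0, \sigma^2_{\gamma}\right)
```

Under this form, a larger \sigma^2_{\gamma} corresponds to a higher degree of LID, and \sigma^2_{\gamma} = 0 recovers the ordinary Rasch model.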
This research investigated the effectiveness of multilevel models as concurrent calibration models for test equating under the anchor test design in the presence of local dependence. The performance of multilevel models was compared to that of traditional IRT models and a testlet response theory (TRT) model through two simulation studies. Local item dependence (LID) was considered in the first study, whereas both LID and person dependence were considered in the second study.
The first study compared the performance of four concurrent calibration approaches for equating testlet-based tests: (a) modeling LID with a three-level hierarchical generalized linear model (HGLM); (b) ignoring LID and using a two-level HGLM; (c) ignoring LID and using the Rasch model; and (d) using testlet scoring and applying the graded response model (GRM). The results suggested that the two-level HGLM and the Rasch approaches were robust to violation of the local item independence assumption in terms of expected score recovery. In addition, the first three approaches provided better equating results than concurrent calibration using the GRM. The study also confirmed previous findings that the degree of LID affects the precision of person parameter estimates.
The second study compared the performance of three models (i.e., the 3PL IRT model, the 3PL TRT model, and the 3PL multilevel TRT model) as concurrent calibration models for equating testlet-based tests when examinees were nested within groups. The results showed that ignoring LID affected item parameter recovery. In the presence of both LID and person dependence, the 3PL multilevel TRT model provided the most accurate person parameter estimates, especially with a high degree of person dependence.
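To make the data structure in the second study concrete, the following is a minimal simulation sketch of testlet-based 3PL responses with examinees nested in groups. It is not the dissertation's simulation design: the function name, parameter names, default values, and generating distributions are all hypothetical, and the response function is the commonly used 3PL testlet form, assumed here for illustration.

```python
import numpy as np


def simulate_testlet_responses(n_groups=20, group_size=25, n_testlets=4,
                               items_per_testlet=5, testlet_sd=1.0,
                               icc=0.3, seed=0):
    """Simulate 0/1 responses with testlet effects and grouped examinees.

    All names and default values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n_persons = n_groups * group_size
    n_items = n_testlets * items_per_testlet

    # Person dependence: ability = group effect + within-group deviation.
    # Total ability variance is 1; `icc` is the share due to groups.
    group_effect = rng.normal(0.0, np.sqrt(icc), size=n_groups)
    within = rng.normal(0.0, np.sqrt(1.0 - icc), size=n_persons)
    theta = np.repeat(group_effect, group_size) + within

    # Hypothetical item parameters: discrimination a, difficulty b, guessing c.
    a = rng.lognormal(0.0, 0.3, size=n_items)
    b = rng.normal(0.0, 1.0, size=n_items)
    c = rng.uniform(0.10, 0.25, size=n_items)

    # Item dependence: each person has one random effect per testlet,
    # shared by all items in that testlet.
    testlet_of_item = np.repeat(np.arange(n_testlets), items_per_testlet)
    gamma = rng.normal(0.0, testlet_sd, size=(n_persons, n_testlets))

    # 3PL testlet response function (assumed form):
    # P = c + (1 - c) * logistic(a * (theta - b - gamma)).
    logit = a * (theta[:, None] - b - gamma[:, testlet_of_item])
    prob = c + (1.0 - c) / (1.0 + np.exp(-logit))

    responses = (rng.random((n_persons, n_items)) < prob).astype(int)
    return responses, theta, testlet_of_item
```

In this sketch, `testlet_sd` controls the degree of LID and `icc` the degree of person dependence; setting `testlet_sd=0` and `icc=0` yields data consistent with a standard 3PL model.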
Details
Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators/Authors: Xu, Ting
Date: 13 January 2017
Date Type: Publication
Defense Date: 20 April 2016
Approval Date: 13 January 2017
Submission Date: 8 January 2017
Access Restriction: 5 year -- Restrict access to University of Pittsburgh for a period of 5 years.
Number of Pages: 132
Institution: University of Pittsburgh
Schools and Programs: School of Education > Psychology in Education
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: Item response theory, equating, testlet, local dependence, multilevel models, hierarchical generalized linear model (HGLM)
Date Deposited: 13 Jan 2017 22:38
Last Modified: 13 Jan 2022 06:15
URI: http://d-scholarship.pitt.edu/id/eprint/30669