Zhu, Xiaowen (2009) Assessing Fit of Item Response Models for Performance Assessments using Bayesian Analysis. Doctoral Dissertation, University of Pittsburgh. (Unpublished)
Abstract
Assessing IRT model-fit and comparing different IRT models from a Bayesian perspective is gaining attention. This research evaluated the performance of Bayesian model-fit and model-comparison techniques in assessing the fit of unidimensional Graded Response (GR) models and comparing different GR models for performance assessment applications.

The study explored the general performance of the PPMC method and a variety of discrepancy measures (test-level, item-level, and pair-wise measures) in evaluating different aspects of fit for unidimensional GR models. Previous findings that the PPMC method is conservative were confirmed. In addition, PPMC was found to have adequate power in detecting different aspects of misfit when appropriate discrepancy measures were used. Pair-wise measures were more powerful than test-level and item-level measures in detecting violations of the unidimensionality and local independence assumptions, and Yen's Q3 measure appeared to perform best. The power of PPMC also increased as the degree of multidimensionality or local dependence among item responses increased. Two classical item-fit statistics were found effective for detecting item misfit due to discrepancies from GR model boundary curves.

The study also compared the relative effectiveness of three Bayesian model-comparison indices (DIC, CPO, and PPMC) for model selection. The results showed that these indices performed equally well in selecting a preferred model for an overall test. However, PPMC applications have the advantage that they can not only compare the relative fit of different models but also evaluate the absolute fit of each individual model, whereas the DIC and CPO indices only compare relative fit.

This study further applied the Bayesian model-fit and model-comparison methods to three real datasets from the QCAI performance assessment. The results indicated that these datasets were essentially unidimensional and exhibited local independence among items. A 2P GR model provided better fit than a 1P GR model, and a two-dimensional model was not preferred. These findings were consistent with previous studies, although Stone's fit statistics in the PPMC context identified fewer misfitting items than previous studies did. Limitations and directions for future research on Bayesian applications to IRT are discussed.
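To make the PPMC procedure described above concrete, the following is a minimal Python sketch (assuming NumPy) of how a posterior predictive p-value might be computed for a scalar discrepancy measure, such as the Q3 value for a single item pair. The names simulate_replicate and discrepancy are hypothetical placeholders; this is an illustrative sketch, not the dissertation's WinBUGS implementation.

import numpy as np

def yens_q3(responses, expected):
    # Yen's Q3: correlations among item residuals d_ij = u_ij - E(u_ij | theta_i).
    # responses, expected: (n_persons, n_items) arrays of observed and
    # model-implied expected item scores; returns the item-by-item Q3 matrix.
    residuals = responses - expected
    return np.corrcoef(residuals, rowvar=False)

def ppmc_pvalue(observed, posterior_draws, simulate_replicate, discrepancy):
    # Posterior predictive p-value: the proportion of posterior draws for which
    # the discrepancy of data replicated under the model meets or exceeds the
    # realized discrepancy of the observed data. Values near 0 or 1 flag misfit;
    # values near 0.5 indicate adequate fit (hence the method's conservatism).
    # simulate_replicate(draw) -> (replicated data, expected scores) and
    # discrepancy(data, expected) -> scalar are hypothetical placeholders.
    exceed = 0
    for draw in posterior_draws:
        rep_data, expected = simulate_replicate(draw)
        if discrepancy(rep_data, expected) >= discrepancy(observed, expected):
            exceed += 1
    return exceed / len(posterior_draws)

For a pair-wise check of local independence, discrepancy could return yens_q3(data, expected)[j, k] for a fixed item pair (j, k), yielding one p-value per pair.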
Details
Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators/Authors: Zhu, Xiaowen
ETD Committee:
Date: 11 December 2009
Date Type: Completion
Defense Date: 20 November 2009
Approval Date: 11 December 2009
Submission Date: 7 December 2009
Access Restriction: No restriction; release the ETD for access worldwide immediately.
Institution: University of Pittsburgh
Schools and Programs: School of Education > Psychology in Education
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: IRT model-comparison; IRT model-fit; Item-fit; Local independence; MCMC; Multidimensional models; Polytomous IRT models; PPMC; Unidimensionality; WinBUGS
Other ID: http://etd.library.pitt.edu/ETD/available/etd-12072009-163421/, etd-12072009-163421
Date Deposited: 10 Nov 2011 20:09
Last Modified: 15 Nov 2016 13:53
URI: http://d-scholarship.pitt.edu/id/eprint/10162