Differential item functioning via robust scaling
This paper proposes a new method for assessing differential item functioning (DIF) in item response theory (IRT) models. Its main virtue is that it does not require pre-specification of anchor items. The method is developed in two main steps: first, DIF is reformulated as a problem of outlier detection in IRT-based scaling; second, that problem is addressed using established methods from robust statistics. The proposal is a redescending M-estimator of IRT scaling parameters that is tuned to flag items with DIF at the desired asymptotic Type I error rate during estimation. Theoretical results guarantee that the method performs reasonably well when fewer than one-half of the items on a test exhibit DIF. Simulation studies show that the proposed method compares favorably to currently available approaches, and a real data example illustrates its application in a research context where pre-specification of anchor items is infeasible. The paper focuses on the two-parameter logistic model in two independent groups, with extensions to other settings considered in the conclusion.
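To make the outlier-detection idea concrete, the following is a minimal sketch of a generic redescending M-estimator of location (Tukey's biweight, fitted by iteratively reweighted averaging). It is not the estimator developed in the paper. The input `y` is a hypothetical vector of item-wise estimates of a shared scaling parameter, the function name `tukey_biweight_location` is invented for illustration, and the scale is taken from the normalized median absolute deviation rather than from the asymptotic standard errors that a Type I error calibration would presumably use. Items whose weight redescends to zero are flagged as candidate DIF items.

```python
import numpy as np


def tukey_biweight_location(y, k=4.685, tol=1e-10, max_iter=100):
    """Tukey biweight location estimate with zero-weight items flagged.

    Illustrative only: `y` stands in for hypothetical item-wise estimates
    of a common scaling parameter; this is not the paper's estimator.
    """
    y = np.asarray(y, dtype=float)
    mu = np.median(y)  # robust starting value
    # Assumption: normalized MAD as the scale, held fixed across iterations.
    scale = 1.4826 * np.median(np.abs(y - mu))
    if scale == 0.0:
        raise ValueError("MAD is zero; supply data with some spread")

    for _ in range(max_iter):
        u = (y - mu) / (k * scale)
        # Biweight weights: smooth inside the cutoff, exactly zero outside.
        w = np.where(np.abs(u) < 1.0, (1.0 - u**2) ** 2, 0.0)
        if w.sum() == 0.0:
            break  # no item inside the cutoff; keep the current estimate
        mu_new = np.sum(w * y) / np.sum(w)
        converged = abs(mu_new - mu) < tol
        mu = mu_new
        if converged:
            break

    # Items with zero weight at the final estimate are treated as outliers.
    flagged = np.abs((y - mu) / (k * scale)) >= 1.0
    return mu, flagged


# Hypothetical data: seven items agree on a value near 0.5, two do not.
y = [0.48, 0.52, 0.50, 0.47, 0.53, 0.51, 0.49, 2.0, -1.5]
mu, flagged = tukey_biweight_location(y)
print(mu, flagged)
```

Because the weight function redescends to exactly zero, the two discrepant items have no influence on the final estimate, which is what lets the same fit both estimate the scaling parameter and identify the outlying items. The constant 4.685 is the conventional biweight tuning value for 95% efficiency under normality; a cutoff targeting a specific Type I error rate would have to be chosen differently.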