Evaluating Equating Error in Observed-Score Equating (CT-04-03)
by Wim J. van der Linden, University of Twente, Enschede, The Netherlands

Executive Summary

The statistical process called test equating adjusts for minor differences in difficulty between two forms of the same test and ensures that test scores are comparable even though they have been derived from different test forms. The problem addressed in this research is how to define equating error when the criterion of successful equating is that one should not be able to distinguish between an equated score on a test form and a score on the test form to which it has been equated. We formulate two equivalent definitions of equating error based on this criterion. One definition focuses on the error in the equated scores; the other focuses on the error in the equating transformation.

These error definitions were used to evaluate three methods: traditional equipercentile equating, in which a test form is equated to a previous test form by identifying the test scores on the two forms that share a common percentile, and two new conditional equating methods. The methods were evaluated using two test forms assembled from a previous item pool of the Law School Admission Test (LSAT). The results showed that, under a variety of conditions, the equipercentile method tends to produce a serious degree of bias and error, whereas the new methods are practically free of error, except when the test to be equated has poorly discriminating items.
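
To make the idea of matching scores that share a common percentile concrete, the following is a minimal sketch of equipercentile equating in Python. It is an illustration only, not the procedure used in the report: the function names `percentile_ranks` and `equipercentile_equate`, the midpoint percentile-rank convention, the linear interpolation, and the toy score data are all assumptions made for this example, and the sketch omits presmoothing and the conditional methods the report proposes.

```python
import numpy as np


def percentile_ranks(scores):
    """Return the distinct scores and their midpoint percentile ranks in (0, 1).

    Assumed convention: the rank of a score is the proportion of examinees
    below it plus half the proportion of examinees exactly at it.
    """
    values, counts = np.unique(np.asarray(scores, dtype=float), return_counts=True)
    cumulative = np.cumsum(counts)
    ranks = (cumulative - 0.5 * counts) / counts.sum()
    return values, ranks


def equipercentile_equate(x_scores, y_scores, x):
    """Map a form-X score x to the form-Y score with the same percentile rank.

    Hypothetical helper: linear interpolation is used both to find the
    percentile rank of x on form X and to invert the ranks on form Y.
    """
    x_values, x_ranks = percentile_ranks(x_scores)
    y_values, y_ranks = percentile_ranks(y_scores)
    p = np.interp(x, x_values, x_ranks)          # percentile rank of x on form X
    return float(np.interp(p, y_ranks, y_values))  # form-Y score at that rank


# Toy data (made up): form Y is uniformly one point easier than form X.
form_x = [1, 2, 3, 4, 5]
form_y = [2, 3, 4, 5, 6]
equated = equipercentile_equate(form_x, form_y, 3)
```

In this toy case the two score distributions differ only by a one-point shift, so a form-X score of 3 is carried to a form-Y score of about 4. With real data the two samples would differ in shape as well as location, which is where the bias and error discussed above can arise.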

