Cross Validating Item Parameter Estimation in Adaptive Testing (CT-00-04)
Wim J. van der Linden and Cees A. W. Glas, University of Twente, Enschede, The Netherlands
In computerized adaptive testing (CAT), items are selected to optimally match the estimated ability of the examinee. Though a match between items and ability is ideal from a measurement point of view, a potential threat to the validity of a CAT procedure is its tendency to capitalize on estimation error in the values of the item parameters. If this capitalization occurs, the algorithm selects items for the examinee based on the size of the estimation errors rather than on the true values of the item parameters. In an earlier study for LSAC, these authors showed that the phenomenon does occur for calibration samples of 500–1,500 examinees and can have a strong biasing effect on the ability estimates, in particular for larger item pools.
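The mechanism behind this capitalization can be illustrated with a small, hypothetical simulation (not part of the original study): when an algorithm ranks items on parameter estimates that contain random calibration error, the "best" items are disproportionately those whose parameters happen to be overestimated. The item pool size, error standard deviation, and 2PL discrimination range below are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pool of 200 items with true 2PL discriminations a.
# Calibration on a small sample yields noisy estimates a_hat = a + error.
n_items = 200
a_true = rng.uniform(0.8, 1.6, n_items)   # true discriminations (assumed range)
error = rng.normal(0.0, 0.3, n_items)     # calibration error (assumed SD)
a_hat = a_true + error                    # estimated discriminations

# An information-maximizing selection rule effectively picks the items
# with the largest estimated discriminations.
selected = np.argsort(a_hat)[-20:]

# Capitalization on chance: the selected items show a clearly positive
# mean estimation error, while the pool-wide mean error is near zero.
print(error[selected].mean(), error.mean())
```

Under these assumptions, the selected subset's average error is substantially positive, which is exactly the bias the abstract describes.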
The goal of this study was to investigate the effects of an attempt to neutralize the consequences of capitalization on estimation error through an application of the technique of cross validation of item parameter estimation. In this technique, the calibration sample is randomly split into two parts, and the item parameters are estimated separately on each part. One set of estimates is used to select the items in the CAT procedure; the other is used to update the ability estimates during testing. Two different types of splits were studied. In addition, CAT both with and without item exposure control (the Sympson-Hetter method) was studied.
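The cross-validation scheme can be sketched as follows. This is a minimal, hypothetical illustration, not the study's actual implementation: the two independently perturbed parameter sets stand in for estimates from the two random halves of the calibration sample, item selection uses maximum Fisher information under a 2PL model, and ability is updated by a crude grid-search maximum likelihood. All pool sizes, noise levels, and the true ability value are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def p_2pl(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def info_2pl(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = p_2pl(theta, a, b)
    return a ** 2 * p * (1.0 - p)

# Hypothetical pool: two independent "calibrations" of the same items,
# mimicking estimates from the two random halves of the sample.
n_items = 100
a_true = rng.uniform(0.8, 1.6, n_items)
b_true = rng.normal(0.0, 1.0, n_items)
a_sel = a_true + rng.normal(0.0, 0.2, n_items)  # half 1: used for selection
b_sel = b_true + rng.normal(0.0, 0.2, n_items)
a_sco = a_true + rng.normal(0.0, 0.2, n_items)  # half 2: used for scoring
b_sco = b_true + rng.normal(0.0, 0.2, n_items)

theta_true, theta_hat = 0.5, 0.0
administered, responses = [], []
for _ in range(20):
    # Select the most informative unused item with the SELECTION estimates.
    info = info_2pl(theta_hat, a_sel, b_sel)
    info[administered] = -np.inf
    item = int(np.argmax(info))
    administered.append(item)
    # Simulate the response from the true parameters.
    responses.append(rng.random() < p_2pl(theta_true, a_true[item], b_true[item]))
    # Update theta by grid-search MLE with the SCORING estimates.
    grid = np.linspace(-4.0, 4.0, 161)
    loglik = sum(
        np.log(np.where(u, p_2pl(grid, a_sco[i], b_sco[i]),
                        1.0 - p_2pl(grid, a_sco[i], b_sco[i])))
        for i, u in zip(administered, responses)
    )
    theta_hat = float(grid[np.argmax(loglik)])
```

Because the errors in the selection estimates are independent of those in the scoring estimates, an item chosen for an overestimated parameter does not carry that same error into the ability update, which is the intuition behind the cross-validation remedy.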
The results showed that the proposed technique of cross validation of item parameter estimation did have a positive effect, except for a very small calibration sample (N = 250). For this sample size, the numbers of test takers in the two random halves of the sample were simply too low to guarantee good ability estimates. It is therefore concluded that cross validation can be used in practical applications provided the calibration sample is larger than 250 test takers. Also, the presence of item exposure control appeared to neutralize the effects of capitalization on estimation error. The combination of cross validation and item exposure control did not produce any effect beyond those of the two measures applied separately.