On the Use of Collateral Item Response Information to Improve Pretest Item Calibration (CT-98-13)
William Stout, Terry Ackerman, Dan Bolt, Amy Goodwin Froelich, Dan Heck, University of Illinois at Urbana-Champaign
An increasingly important issue in considering the feasibility of a computerized Law School Admission Test (LSAT) is the development of accurate statistical estimation tools that work well with the modest amounts of test taker data per test question ("item") that are typical of computerized tests. This is especially important for pretest items: currently, a pretest item is administered on an earlier exam as a nonoperational item in order to assess its characteristics, so that the item can be used effectively as an operational item on future LSAT administrations.
The LSAT currently consists of three item types: logical reasoning (LR), analytical reasoning (AR), and reading comprehension (RC). Although separate scores are not currently reported for the different item types, reporting separate scores for at least some item types could be considered for a computerized LSAT. Thus, when calibrating pretest items of one item type, collateral information from another item type could be helpful.
The project's specific goal is to investigate whether such collateral information can reduce the number of test takers per pretest item required for calibration. For the LSAT, collateral information may be of particular value in calibrating pretest items because, as previous studies have suggested, the LSAT measures two dominant abilities ("dimensions"). One dimension is defined by the AR items, and the second by the combined LR and RC items. Although in practice pretest RC items may be calibrated using only RC operational item responses, it is also possible to use test taker performance on AR items (i.e., collateral information from the AR items) to aid in the calibration of pretest RC items.
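The intuition behind collateral information can be illustrated with a minimal simulation sketch. This is not the report's actual estimation method; it simply shows, under an assumed correlation between the two dimensions, how conditioning on a correlated AR ability sharpens the estimate of the LR/RC ability that drives a pretest RC item. All numbers (the correlation, the noise level, the sample size) are illustrative assumptions, not LSAT values.

```python
import numpy as np

# Illustrative sketch: two correlated abilities, as suggested for the LSAT.
# theta1 = LR/RC dimension, theta2 = AR dimension (the collateral source).
rng = np.random.default_rng(0)
n = 5000             # simulated test takers (illustrative)
rho = 0.7            # assumed correlation between the two dimensions
sigma2 = 0.64        # error variance of the RC/LR-only ability estimate

theta = rng.multivariate_normal([0.0, 0.0],
                                [[1.0, rho], [rho, 1.0]], size=n)
theta1, theta2 = theta[:, 0], theta[:, 1]

# Noisy same-type ability estimate (e.g., from a short RC/LR operational set).
y = theta1 + rng.normal(0.0, np.sqrt(sigma2), n)

# (a) Bayes (EAP-style) estimate using only the unit-normal population prior.
est_no_collateral = y / (1.0 + sigma2)

# (b) Same estimate, but the prior is replaced by the conditional
#     distribution of theta1 given theta2: N(rho * theta2, 1 - rho**2).
prior_var = 1.0 - rho ** 2
w = (1.0 / sigma2) / (1.0 / sigma2 + 1.0 / prior_var)
est_with_collateral = w * y + (1.0 - w) * rho * theta2

mse_no = float(np.mean((est_no_collateral - theta1) ** 2))
mse_with = float(np.mean((est_with_collateral - theta1) ** 2))
print(f"MSE without collateral: {mse_no:.3f}")
print(f"MSE with AR collateral: {mse_with:.3f}")
```

More accurate ability estimates translate directly into more accurate pretest item parameter estimates for a fixed number of test takers, which is the mechanism by which collateral information could lower the per-item sample-size requirement.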
This study evaluates the practical benefit (if any) of using collateral information from one item type when statistically analyzing pretest items of another item type. The criterion for evaluating pretest item calibration accuracy was the reduction, achieved by using collateral information, in the number of test takers to whom each pretest item must be administered before its item parameter estimates reach a specified level of accuracy. Our proposed methods involve both statistical tools that presume a single dominant ability is measured by the test and tools that assume two dominant abilities, as the statistical analyses mentioned above have confirmed for the LSAT. They also involve tools that include or exclude an intermediate ability estimation stage. These methods are described in detail and evaluated through simulation studies.
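The evaluation criterion above can be sketched as a small simulation. This is a simplified stand-in for the report's studies: it uses a Rasch (one-parameter logistic) model with abilities treated as known, estimates a single pretest item's difficulty at several calibration sample sizes, and tracks how the root-mean-square error of the difficulty estimate shrinks as more test takers are added. The target accuracy, sample sizes, and replication count are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
b_true = 0.0          # true difficulty of a hypothetical pretest item
target_rmse = 0.15    # hypothetical accuracy target for calibration

def rasch_p(theta, b):
    """Rasch probability of a correct response."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

def estimate_b(theta, x):
    """MLE of difficulty b given known abilities, via bisection on the
    Rasch likelihood equation sum_i p(theta_i, b) = sum_i x_i."""
    lo, hi = -6.0, 6.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if rasch_p(theta, mid).sum() > x.sum():
            lo = mid   # predicted score too high -> item must be harder
        else:
            hi = mid
    return 0.5 * (lo + hi)

def rmse_at(n, reps=200):
    """RMSE of the difficulty estimate with n test takers per item."""
    errs = []
    for _ in range(reps):
        theta = rng.standard_normal(n)
        x = rng.random(n) < rasch_p(theta, b_true)
        errs.append(estimate_b(theta, x) - b_true)
    return float(np.sqrt(np.mean(np.square(errs))))

# The criterion: the smallest sample size whose RMSE meets the target.
rmse_by_n = {n: rmse_at(n) for n in (100, 200, 400, 800)}
for n, r in rmse_by_n.items():
    flag = "meets target" if r <= target_rmse else ""
    print(f"n = {n:4d}  RMSE = {r:.3f}  {flag}")
```

In the study's framing, a calibration method aided by collateral information would reach the target RMSE at a smaller n than an unaided method; the difference between those two sample sizes is the practical benefit being measured.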