Balanced Item Pool Assembly in Computerized Adaptive Testing (RR-07-04)
by Dmitry I. Belov
In computerized adaptive testing (CAT), test items (i.e., questions) for administration to an individual test taker are selected from a pool of items with the goal of matching the difficulty level of the test to the ability level of the test taker. In the recent literature on CAT, researchers have developed methods for designing the CAT item pool as a special set of nonoverlapping forms reflecting the skill levels of an assumed population of test takers. The input includes the original item pool called the master pool, required test form characteristics (e.g., content coverage), the assumed test-taker ability levels, and the number of nonoverlapping forms to assemble. Two problems with this approach have been identified. First, since these methods produce test forms that maximize the measurement precision (called information in the mathematical model applied here) at corresponding ability levels, the best items from the master pool are depleted. Second, since all forms are assembled simultaneously, the optimization problem is quite large and potentially intractable.
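For concreteness, the information that a form provides can be written out under an assumed model. The abstract does not name the model, so the sketch below assumes the three-parameter logistic (3PL) model. The symbols a_i, b_i, and c_i (discrimination, difficulty, and guessing parameters of item i) and F (the set of items in a form) are introduced here for illustration only.

```latex
P_i(\theta) = c_i + \frac{1 - c_i}{1 + e^{-a_i(\theta - b_i)}}, \qquad
I_i(\theta) = a_i^2 \, \frac{1 - P_i(\theta)}{P_i(\theta)}
              \left( \frac{P_i(\theta) - c_i}{1 - c_i} \right)^2, \qquad
I_F(\theta) = \sum_{i \in F} I_i(\theta).
```

Under this reading, maximizing I_F at each target ability level favors the most informative items, which is one way to see why those items are depleted first.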
To resolve both issues, this research introduces an additional input parameter: a threshold on the information of each form at the corresponding ability level. This parameter allows the large optimization problem to be subdivided into smaller subproblems, and varying it balances measurement accuracy against master pool utilization. The direct problem is then to identify the maximum number of nonoverlapping forms that satisfy the threshold. When the master pool, test assembly constraints, and information threshold are fixed, there exists an ability density that maximizes the objective of the direct problem among all possible densities. The inverse problem is to identify such a density.
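The role of the threshold in the direct problem can be illustrated with a toy sketch. This is not the algorithm developed in the report: it is a simple greedy heuristic for a single ability level, it ignores content constraints, and it assumes the threshold is a lower bound on the summed item information of a form. The function name and parameters are hypothetical.

```python
from typing import List, Sequence


def greedy_nonoverlapping_forms(
    info: Sequence[float], form_length: int, threshold: float
) -> List[List[int]]:
    """Greedily assemble disjoint forms whose summed information meets a threshold.

    info[i] is the (assumed precomputed) information of item i at one ability level.
    Returns a list of forms, each a list of item indices. The result is a feasible
    set of forms, not necessarily the maximum possible number.
    """
    if form_length <= 0:
        raise ValueError("form_length must be positive")

    # Item indices sorted from least to most informative.
    remaining = sorted(range(len(info)), key=lambda i: info[i])
    forms: List[List[int]] = []

    while True:
        start = None
        # Find the weakest run of consecutive items that still meets the threshold,
        # so that the most informative items are spent as late as possible.
        for s in range(len(remaining) - form_length + 1):
            window = remaining[s:s + form_length]
            if sum(info[i] for i in window) >= threshold:
                start = s
                break
        if start is None:
            break  # no further form can be built by this heuristic
        forms.append(remaining[start:start + form_length])
        del remaining[start:start + form_length]

    return forms
```

With item information values 1 through 6, a form length of 2, and a threshold of 7, this sketch returns three forms. Raising the threshold to 11 leaves only one, which mirrors the trade-off between measurement accuracy and pool utilization described above.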
Based on combinatorial optimization techniques, direct and inverse algorithms are developed that provide feasible solutions to the direct and inverse problems. Computer experiments with a pool of Law School Admission Test (LSAT) items and LSAT assembly constraints are presented. The direct and inverse algorithms provide testing organizations with effective means to maintain their master pools and produce CAT pools that balance measurement accuracy and item exposure.