Incorporating Content Constraints into a Multi-Stage Adaptive Testlet Design (CT-97-02)
by Lynda M. Reese, Deborah L. Schnipke, and Stephen W. Luebke
In a standard computerized adaptive test (CAT) design, test takers are first administered a test question of approximately middle difficulty. Based on their response, an attempt is made to choose subsequent items for administration that are more appropriate for their ability level. Testing proceeds until some termination criterion, such as a fixed test length or a sufficiently precise ability estimate, is achieved. In this pure form, CAT holds many theoretical advantages. Because the test taker's time is not wasted on test items that are too difficult or too easy, test length may be reduced, usually by about one half, without loss of precision.
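The selection loop described above can be illustrated with a minimal sketch. It is not the procedure used in this study. It assumes a Rasch (one-parameter) response model, a grid-search maximum-likelihood ability estimate, and a fixed test length as the termination criterion. The names `prob_correct`, `estimate_theta`, and `run_cat` are hypothetical.

```python
import math

def prob_correct(theta, b):
    """Rasch (1PL) probability of a correct response. The model is an assumption."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def estimate_theta(responses):
    """Grid-search maximum-likelihood ability estimate on [-4, 4] in steps of 0.1."""
    grid = [g / 10.0 for g in range(-40, 41)]

    def loglik(theta):
        total = 0.0
        for b, correct in responses:
            p = prob_correct(theta, b)
            total += math.log(p if correct else 1.0 - p)
        return total

    return max(grid, key=loglik)

def run_cat(pool, answer, max_items=10):
    """Administer items one at a time, each chosen nearest the current estimate.

    pool      -- list of item difficulties (hypothetical values)
    answer    -- callable taking a difficulty and returning True if answered correctly
    max_items -- fixed test length used as the termination criterion
    """
    remaining = list(pool)
    theta = 0.0  # start with an item of approximately middle difficulty
    responses = []
    while remaining and len(responses) < max_items:
        item = min(remaining, key=lambda b: abs(b - theta))
        remaining.remove(item)
        responses.append((item, answer(item)))
        theta = estimate_theta(responses)
    return theta, [b for b, _ in responses]
```

An operational CAT would typically select items by statistical information rather than by nearest difficulty, and would add the content, exposure, and item-set constraints discussed below. This sketch omits all of these.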
As large-scale, high-stakes testing programs such as the LSAT consider converting to a computerized adaptive mode of test administration, a standard CAT, as described above, is rarely practical. Most large-scale testing programs contemplating CAT must face the challenge of maintaining content-balancing requirements, which usually compromise the efficiency and precision that make CAT attractive. Other concerns about a CAT include how to deal with set-bound items (items that refer to a common stimulus) and whether to allow item review (i.e., allow test takers to change previous responses). Efficient utilization of the item pool is also a concern when developing a CAT design. Some researchers have advocated the use of testlets (or collections of items) as an alternative to individually selected and delivered items. These testlets may be pre-assembled to achieve certain content coverage requirements. Testlets may also facilitate administering set-bound items, allowing item review, and using the item pool efficiently.
This study first evaluated whether realistic content constraints could be met by carefully assembling testlets and appropriately selecting testlets for each test taker that, when combined, would meet the content requirements of the test and would be adapted to the test taker's ability level. Second, the precision of the content-balanced testlet design was compared, through data simulation, with that achieved by the current paper-and-pencil version of the test. The results revealed that constraints to control item exposure and testlet overlap, and to promote efficient pool utilization, need to be incorporated into the testlet assembly algorithm. Further refinement of the statistical constraints for testlet assembly is also necessary. However, even for this preliminary attempt at assembling content-balanced testlets, the two-stage computerized test simulated with these testlets performed quite well.
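To make the idea of combining testlets under content constraints concrete, the following is a minimal sketch of a two-stage selection step. It is not the assembly or selection algorithm used in this study. The testlet representation (a dictionary with a `level` and per-area `content` counts), the number-correct routing cutoffs, and the function names are all assumptions made for illustration.

```python
from collections import Counter

def content_counts(*testlets):
    """Total number of items per content area across the given testlets."""
    total = Counter()
    for testlet in testlets:
        total.update(testlet["content"])
    return total

def meets_requirements(counts, requirements):
    """True if every content area falls within its (minimum, maximum) bounds."""
    return all(lo <= counts.get(area, 0) <= hi
               for area, (lo, hi) in requirements.items())

def select_second_stage(first, score, candidates, cutoffs, requirements):
    """Choose a second-stage testlet after the first-stage testlet is scored.

    first        -- the first-stage testlet already administered
    score        -- number-correct score on the first stage
    candidates   -- second-stage testlets, each with a difficulty "level"
    cutoffs      -- ascending score cutoffs; level = number of cutoffs reached
    requirements -- content area -> (min, max) items for the whole test
    Returns None if no candidate satisfies the content requirements.
    """
    target_level = sum(score >= c for c in cutoffs)  # 0 is the easiest level
    feasible = [t for t in candidates
                if meets_requirements(content_counts(first, t), requirements)]
    if not feasible:
        return None
    # Among content-feasible testlets, take the one closest to the target level.
    return min(feasible, key=lambda t: abs(t["level"] - target_level))
```

Under these assumptions, content feasibility is checked before difficulty matching, so a test taker may be routed to a testlet slightly off the target level when the best-matched testlet would violate the content requirements. Constraints on item exposure and testlet overlap, which the results identify as necessary, are not modeled here.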