Outlier Detection in High-Stakes College Entrance Testing (CT-01-08)
Rob R. Meijer, University of Twente, Enschede, The Netherlands

Executive Summary

Though the development of computerized adaptive testing (CAT) has resulted in more efficient educational and psychological measurement, it has also generated new practical and theoretical problems. One theoretical problem is the identification of item score patterns (correct and incorrect responses) for particular test takers that do not conform to what would be expected based on the mathematical model being applied. An example of such an aberrant item score pattern is that of a test taker who answers many easy items incorrectly and many difficult items correctly. If a test taker's item score pattern does not fit the model, the pattern is unlikely to give valuable information about the test taker's ability, but it may point toward other behavior during the test. Explanations of aberrant item score patterns in CAT differ from those in paper-and-pencil testing. For example, in paper-and-pencil testing, answer copying may result in unexpected item scores, whereas in a CAT, direct answer copying is impossible because different test takers are administered different tests.
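
To make the idea of an aberrant item score pattern concrete, the following minimal sketch counts Guttman errors: pairs of items in which the easier item was answered incorrectly and the harder item correctly. This is a generic nonparametric illustration, not one of the person-fit statistics studied in the report. The function name, the difficulty values, and the assumption that all item difficulties are known and distinct are hypothetical choices made for the example.

```python
from typing import Sequence


def count_guttman_errors(difficulties: Sequence[float], scores: Sequence[int]) -> int:
    """Count item pairs where the easier item is wrong (0) and the harder item is right (1).

    Assumes distinct item difficulties (illustrative assumption, not from the report).
    """
    if len(difficulties) != len(scores):
        raise ValueError("difficulties and scores must have the same length")

    # Order the item scores from the easiest item to the hardest item.
    ordered = [s for _, s in sorted(zip(difficulties, scores), key=lambda pair: pair[0])]

    errors = 0
    incorrect_so_far = 0  # number of easier items answered incorrectly
    for score in ordered:
        if score == 1:
            # Each easier item that was missed forms one Guttman error with this item.
            errors += incorrect_so_far
        else:
            incorrect_so_far += 1
    return errors


# Hypothetical difficulties for five items, from easy (-2) to hard (+2).
difficulties = [-2.0, -1.0, 0.0, 1.0, 2.0]
consistent = [1, 1, 1, 0, 0]  # easy items right, hard items wrong
aberrant = [0, 0, 1, 1, 1]    # easy items wrong, hard items right

print(count_guttman_errors(difficulties, consistent))  # 0
print(count_guttman_errors(difficulties, aberrant))    # 6
```

A count of zero means the pattern is perfectly ordered by difficulty, and larger counts indicate a more unexpected pattern. In a CAT, where each test taker receives a different set of items, a raw count like this would not be directly comparable across test takers, which is one reason model-based person-fit statistics are needed.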

Several recent person-fit statistics for CAT were studied. Their sampling distributions were examined empirically, and it was shown that the statistics have sufficient power to relate different patterns of responses to different types of misfit in the test taker's behavior.

