Some New Methods to Detect Person Fit in CAT (CT-99-03)
by Rob R. Meijer and Edith M. L. A. van Krimpen-Stoop, University of Twente, Enschede, The Netherlands
The purpose of person-fit analysis is to detect persons with response patterns that do not fit the expectations from a reasonable model of response behavior. The analysis may help to reveal the operation of such undesirable influences on test takers’ behavior as guessing or knowledge of correct answers due to test preview. The occurrence of misfitting response patterns may result in inappropriate test scores and, thus, involve serious consequences for test use, for example, a high volume of classification errors in educational and job selection.
To detect response patterns that do not fit a test model, several person-fit statistics have been proposed. Nearly all of these statistics are functions of the differences between the observed and expected item scores, taken across the items answered by a single examinee. If the distribution of the person-fit statistic is known, a statistical test can be used to classify response patterns as fitting or nonfitting.
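As an illustration of a statistic built from observed-minus-expected item scores, the following minimal sketch computes a mean squared standardized residual for one examinee. It is not one of the statistics studied in this report. The two-parameter logistic model, the function names `p_correct` and `mean_squared_std_residual`, and the item parameters are all assumptions made for the example.

```python
import math

def p_correct(theta, a, b):
    """Expected item score under an assumed two-parameter logistic model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def mean_squared_std_residual(responses, theta, items):
    """Average of squared (observed - expected) item scores, each divided
    by the model variance p * (1 - p). Larger values suggest poorer fit."""
    total = 0.0
    for x, (a, b) in zip(responses, items):
        p = p_correct(theta, a, b)
        total += (x - p) ** 2 / (p * (1.0 - p))
    return total / len(responses)

# Hypothetical items as (discrimination, difficulty), ordered easy to hard.
items = [(1.0, -1.0), (1.0, 0.0), (1.0, 1.0)]
theta = 0.0  # assumed known ability, for illustration only

expected_pattern = [1, 1, 0]    # easy items right, hard item wrong
unexpected_pattern = [0, 1, 1]  # easy item wrong, hard item right

fit_value = mean_squared_std_residual(expected_pattern, theta, items)
misfit_value = mean_squared_std_residual(unexpected_pattern, theta, items)
```

In this sketch the unexpected pattern yields the larger value, because missing an easy item and answering a hard one correctly both produce large standardized residuals.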
To date, most fit statistics have been proposed for use with conventionally administered paper-and-pencil (P&P) tests. With the increasing use of computerized adaptive testing (CAT), additional research is needed to develop person-fit statistics for use in CAT. In an earlier project, several existing person-fit statistics for P&P tests were studied in a CAT environment. Results showed that the use of these person-fit statistics was problematic because their empirical distributions did not agree with their theoretical distributions. The reason for this discrepancy is that CATs are typically much shorter than P&P tests and their items are selected adaptively.
In the current project, eight new statistics based on cumulative-sum (CUSUM) procedures from Statistical Process Control theory are proposed. Four of these statistics were developed specifically to analyze person fit in a CAT environment. The power of these statistics was explored in a large simulation study. The original CUSUM procedures assume normally distributed statistics. From this assumption, boundaries can be determined to decide when a process is out of control. In the current study, the statistics were not assumed to be normally distributed; instead, their boundaries were determined using simulated data. The boundaries proved to be stable across the ability levels of the examinees. They can, therefore, be used safely in a large variety of applications. The results also showed that the statistics perform well and have detection rates comparable to those of traditional person-fit statistics for P&P tests.
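The general idea of a CUSUM procedure with simulated boundaries can be sketched as follows. This is a minimal illustration under stated assumptions, not a reproduction of the eight statistics in the report: it assumes a two-parameter logistic model, a known ability value, a fixed item set in place of adaptive item selection, and a residual of the form (observed - expected) / n. All function names and parameter values are hypothetical.

```python
import math
import random

def p_correct(theta, a, b):
    """Expected item score under an assumed two-parameter logistic model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def cusum_extremes(responses, theta, items):
    """Accumulate residuals into an upper sum (floored at 0) and a lower
    sum (capped at 0); return the largest upper and smallest lower value."""
    n = len(responses)
    c_plus = c_minus = 0.0
    max_plus = min_minus = 0.0
    for x, (a, b) in zip(responses, items):
        t = (x - p_correct(theta, a, b)) / n  # assumed residual form
        c_plus = max(0.0, c_plus + t)
        c_minus = min(0.0, c_minus + t)
        max_plus = max(max_plus, c_plus)
        min_minus = min(min_minus, c_minus)
    return max_plus, min_minus

def simulated_bounds(theta, items, alpha=0.05, reps=2000, seed=0):
    """Estimate boundaries from model-fitting simulated patterns instead of
    relying on a normality assumption."""
    rng = random.Random(seed)
    highs, lows = [], []
    for _ in range(reps):
        pattern = [1 if rng.random() < p_correct(theta, a, b) else 0
                   for a, b in items]
        hi, lo = cusum_extremes(pattern, theta, items)
        highs.append(hi)
        lows.append(lo)
    highs.sort()
    lows.sort()
    k = int(reps * alpha / 2)  # number of patterns cut off in each tail
    return highs[reps - 1 - k], lows[k]

# Hypothetical 20-item test with identical items, for illustration only.
items = [(1.0, 0.0)] * 20
theta = 0.0
upper, lower = simulated_bounds(theta, items)

# A pattern of all correct answers, e.g. after test preview.
hi, lo = cusum_extremes([1] * 20, theta, items)
flagged = hi > upper or lo < lower
```

Repeating `simulated_bounds` at several values of `theta` gives a rough check on whether the boundaries stay stable across ability levels in a given setting.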