Using Patterns of Summed Scores in Paper and Pencil Tests and CAT to Detect Misfitting Item Score Patterns (CT0204)
by Rob R. Meijer, University of Twente, Enschede, The Netherlands
Executive Summary
In computerized adaptive testing (CAT), a mathematical model called item response theory (IRT) makes it possible to select items for administration to individual test takers that are matched to their ability level. IRT also allows us to determine the probability that a test taker of a particular ability level will answer individual test items correctly. In general, a test taker will have a 50% chance of correctly answering items that are matched to his or her ability level, with easier items being answered correctly with a higher probability and more difficult items with a lower probability.
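This relationship between ability and the probability of a correct response can be sketched with the Rasch (one-parameter logistic) model, one of the simplest IRT models. The sketch below is illustrative only; the report does not specify which IRT model it uses. It shows that an item whose difficulty matches the test taker's ability is answered correctly with probability 0.5, while easier and harder items yield higher and lower probabilities.

```python
import math

def p_correct(theta, b):
    """Rasch (1PL) model: probability that a test taker with ability
    theta answers an item of difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

theta = 0.0  # a test taker of average ability

print(p_correct(theta, b=0.0))   # matched item: probability 0.5
print(p_correct(theta, b=-1.0))  # easier item: higher probability
print(p_correct(theta, b=1.0))   # harder item: lower probability
```

Ability and difficulty are on the same scale here, so the comparison `b` versus `theta` directly determines which side of 50% the probability falls on.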
Because of the ability to determine the probability that an individual test taker will correctly answer a particular test item, IRT may be applied to evaluate the item-score patterns of test takers to determine if they are responding in the manner that would be expected, given their ability level. Such investigations are commonly called person-fit analyses. Aberrant item-score patterns may indicate that the test taker has attempted to copy answers from another test taker or may indicate a problem with the test administration, such as a faulty answer key. Most person-fit analyses that may currently be found in the literature are based on item-score patterns.
In this paper, person-fit statistics based on the likelihood of the number-correct scores on subsets of items in the test are studied, for application in both paper-and-pencil (P&P) tests and CATs. Application of these statistics in CAT is possible if the item-selection algorithm selects testlets (small bundles of items) rather than individual items from the pool of items. The most significant result was that it is important to take the ability level of the test taker into account when number-correct scores on subtests are compared.
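One simple way to take ability into account when judging a number-correct score on a subtest is to standardize the observed score against its model-implied expectation. The statistic below is a hypothetical illustration of that idea under the Rasch model, not the exact index studied in the report: under IRT, the expected number-correct on a subtest and its variance follow from the per-item probabilities, so an observed score far from expectation (in standard-deviation units) flags a potentially misfitting pattern.

```python
import math

def p_correct(theta, b):
    """Rasch (1PL) probability of a correct response (illustrative model)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def subtest_residual(theta, difficulties, observed_correct):
    """Standardized residual of an observed number-correct score on a
    subtest (testlet) against its expectation under the model, given
    the test taker's ability theta. Hypothetical person-fit sketch."""
    ps = [p_correct(theta, b) for b in difficulties]
    expected = sum(ps)                          # E[number correct]
    variance = sum(p * (1.0 - p) for p in ps)   # Var[number correct]
    return (observed_correct - expected) / math.sqrt(variance)

# A test taker of average ability (theta = 0) who misses every item
# of an easy three-item testlet produces a large negative residual.
print(subtest_residual(0.0, [-1.0, -1.0, -0.5], observed_correct=0))
```

The same observed score of zero would yield a much smaller residual for a low-ability test taker, which is the point of the report's main finding: the comparison of subtest scores only makes sense relative to the test taker's estimated ability.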
