Fixed-Weight Methods of Scoring Computer-Based Adaptive Tests (CT-97-12)

by Bert F. Green, Johns Hopkins University

Executive Summary

Computer-based adaptive tests (CATs) are beginning to replace traditional paper-and-pencil tests. Several major standardized tests are now given as CATs, and the Law School Admission Council (LSAC) is investigating the feasibility and potential benefits of computerizing the Law School Admission Test (LSAT). Not only are CATs more efficient than traditional tests, but they are also more popular among test takers. A CAT gains its efficiency by tailoring the difficulty of the items given to a test taker, based on that test taker's responses to earlier items on the test. Scoring a CAT is complex; the number of items answered correctly is not an appropriate test score because the difficulty of the items must be taken into account. Indeed, most test takers answer about the same number of items correctly. Currently available CAT scoring algorithms use either maximum-likelihood or Bayesian procedures, which are statistically elegant but extremely difficult to explain to a nonstatistician. The present research project examines a simpler way of scoring a CAT.

Current methods of scoring CATs, including maximum-likelihood estimation and equated number-right scoring, estimate the score (or proficiency) of the test taker as a weighted combination of item scores. The weights for these methods depend not only on item characteristics but also on the proficiency of the test taker. A correctly answered item can weigh heavily in one test taker's score, while the same item, also correctly answered, can have very little weight in another test taker's score, depending on the other item responses given by those test takers. Such differential treatment is statistically optimal, but it is not easily explained to test takers. The methods proposed here give a fixed weight to each item, based only on item difficulty, and will be called fixed-weight scores.
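The defining property of a fixed-weight score can be illustrated with a small sketch. The scoring rule below is hypothetical (the report's actual fixed-weight formulas are not reproduced here); it demonstrates only that each item's contribution depends on the item's difficulty b and on whether it was answered correctly, never on the test taker's other responses.

```python
def fixed_weight_score(responses, difficulties):
    """Hypothetical fixed-weight scoring rule (an illustration, not
    the report's formula).

    A correct answer to an item of difficulty b contributes b + 1;
    an incorrect answer contributes b - 1. The weight attached to
    each item is thus fixed in advance by the item's difficulty,
    unlike maximum-likelihood scoring, where an item's effective
    weight also depends on the test taker's estimated proficiency.
    """
    contributions = [
        b + (1.0 if u else -1.0)
        for u, b in zip(responses, difficulties)
    ]
    return sum(contributions) / len(contributions)
```

Under such a rule, answering harder items correctly yields a higher score than answering easier ones correctly, so item difficulty is still taken into account even though every item's weight is fixed.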
Computer simulations of CATs were done in order to compare the fixed-weight scores with the more sophisticated scores (e.g., maximum-likelihood and equated number-right). Two different item pools were used: a "flat" pool with item difficulties uniformly distributed across the scale, and a "special" pool with item parameters more closely matching those found in LSAT item pools. For each of 26 different levels of proficiency, 2,500 simulated test takers responded to a 30-item CAT. Each CAT was scored by each method, and the results were compared. All scoring methods, including the new fixed-weight scoring methods, provided statistically unbiased estimates of test taker proficiency. The precision, as assessed by the root mean squared error, was somewhat poorer for the fixed-weight scores than for the statistically efficient scores provided by maximum-likelihood and equated number-right scoring: errors were about 20% larger for the fixed-weight scores. By contrast, a fixed (non-adaptive) test of the same length, scored by maximum-likelihood, has a measurement error about 60% higher than that of a CAT scored by maximum-likelihood. That is, most of the advantage of adaptive testing is retained if the simpler fixed-weight scoring system is used. Since the fixed-weight scores are very highly correlated with the maximum-likelihood and equated number-right scores, the maximum-likelihood and equated number-right scores can be considered statistically refined versions of the proposed fixed-weight scores.
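The structure of such a simulation study can be sketched in miniature. The sketch below is an assumption-laden simplification of the report's design: it uses a non-adaptive 30-item test under the one-parameter (Rasch) response model rather than a CAT, a single proficiency level rather than 26, and far fewer simulated test takers; the function names and the grid-search maximum-likelihood estimator are illustrative choices, not the report's procedures.

```python
import math
import random

def p_correct(theta, b):
    """Rasch-model probability that a test taker at proficiency
    theta answers an item of difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def simulate_responses(theta, difficulties, rng):
    """Draw a 0/1 response vector for one simulated test taker."""
    return [1 if rng.random() < p_correct(theta, b) else 0
            for b in difficulties]

def ml_score(responses, difficulties):
    """Maximum-likelihood proficiency estimate by grid search
    over theta in [-4, 4]."""
    grid = [i * 0.01 - 4.0 for i in range(801)]
    def loglik(theta):
        return sum(
            math.log(p_correct(theta, b)) if u
            else math.log(1.0 - p_correct(theta, b))
            for u, b in zip(responses, difficulties)
        )
    return max(grid, key=loglik)

def rmse(estimates, theta_true):
    """Root mean squared error of the estimates about the truth."""
    return math.sqrt(sum((e - theta_true) ** 2 for e in estimates)
                     / len(estimates))

rng = random.Random(0)
# A "flat" 30-item pool: difficulties spread evenly across the scale.
pool = [i * 0.2 - 2.9 for i in range(30)]
theta_true = 0.5
estimates = [ml_score(simulate_responses(theta_true, pool, rng), pool)
             for _ in range(200)]
print(rmse(estimates, theta_true))
```

Replacing `ml_score` with an alternative scoring rule and comparing the two RMSE values is the basic comparison the study performs, repeated across proficiency levels, item pools, and scoring methods.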