Exploring Issues of Test Taker Behavior: Insights Gained from Response-Time Analyses (CT-98-09)
by Deborah L. Schnipke and David J. Scrams, Law School Admission Council
In 1995, the Law School Admission Council (LSAC) began a five-year plan to research the advisability and feasibility of administering the Law School Admission Test (LSAT) by computer. The research plan is summarized in the Computerized LSAT Research Agenda. The Research Agenda suggests that the unobtrusive recording of item response times is one of the many advantages offered by computerized test administration. For example, analyses of item response times may lead to innovative ways to address speededness issues and scheduling problems as well as methods for identifying and investigating test taker strategies. Additionally, response times may reveal aspects of test taker proficiency that are not represented by response selections, and they may serve as an important consideration for equity and fairness reviews.
The present work is a broad review of the psychometric literature on response times. It is organized into seven major sections: scoring models, speed-accuracy relationships, strategy usage, speededness, pacing, predicting finishing times/setting time limits, and subgroup differences. These categories are neither exclusive nor exhaustive, but they serve as a general organization of the literature to date. A final section offers recommendations for future work.
Several theorists have offered scoring models that depend either wholly or partially on response-time measures. These models generally include assumptions about the expected response-time distributions, relationships between response speed and response accuracy, and the nature of items for which the models are appropriate. The assumed response-time distributions are generally consistent with empirical results, but rigorous model-fitting tests are rare. Similarly, the assumed relationship between speed and accuracy often takes a form that is difficult to test empirically under standard test administrations. Finally, the items most often used as the basis for response-time models are more similar to simple cognitive tasks (e.g., perceptual speed tasks) than to items that are found on large-scale standardized assessments such as the LSAT. The general conclusion is that although scoring models incorporating response time are not yet ready for operational use, considerable advances have been made along these lines. Unfortunately, no validity studies have been offered that address the utility of the resulting scores.
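A lognormal response-time distribution is among the most common assumptions in this literature. As a rough illustration of how such an assumption is confronted with data, the sketch below estimates lognormal parameters from a set of item response times. The data values and function names are hypothetical, not drawn from the report.

```python
import math
import statistics

def fit_lognormal(response_times):
    """Estimate lognormal parameters (mu, sigma) from item response
    times by taking the mean and standard deviation of log-times."""
    logs = [math.log(t) for t in response_times]
    mu = statistics.mean(logs)
    sigma = statistics.stdev(logs)
    return mu, sigma

# Hypothetical response times (in seconds) for a single item
times = [24.0, 31.5, 18.2, 45.0, 27.3, 38.8, 22.1, 29.9]
mu, sigma = fit_lognormal(times)
median_time = math.exp(mu)  # the median of a lognormal is exp(mu)
```

A rigorous model-fitting test of the kind the report calls for would go further, comparing the fitted distribution to the empirical one rather than stopping at parameter estimation.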
Many of the scoring models rely on an assumed relationship between speed and accuracy, but a distinction must be made between within- and across-test-taker speed-accuracy relationships. Cognitive psychologists have been primarily interested in within-test-taker relationships (i.e., the speed-accuracy tradeoff: test takers are able to increase their response speed at the expense of accuracy or vice versa). Psychometricians have been primarily interested in across-test-taker relationships (i.e., fast test takers may be either more or less accurate than their slower counterparts). Some scoring models incorporate within-test-taker assumptions and others incorporate across-test-taker assumptions. Empirical psychometric work on speed-accuracy relationships, however, has focused exclusively on across-test-taker relationships, and work along these lines tends to show that the relationship depends heavily on test context and content. Useful scoring models must be flexible enough to accommodate these content and context effects.
Research on the use of response times to identify strategy usage is compelling but requires a reasonable understanding of possible strategies and their relationship to response times. Most of this work has used simple perceptual tasks (e.g., spatial-visualization tasks) or relatively well-understood cognitive tasks (e.g., mixed-number subtraction). This work has also been limited to differentiating between two strategies. Developing the requisite understanding of strategy usage for the item types currently used on the LSAT would require considerable further research, and using response times to identify these strategies would require extending the methodologies to account for multiple strategies.
Work on the use of response times to identify strategies has spawned a promising line of research on speededness. Traditionally, speededness work has focused exclusively on the number (and distribution) of unanswered items. With the availability of item response times, researchers recognized their value as additional sources of speededness information. Theorists have introduced the concept of rapid-guessing behavior: speeded test takers may quicken their pace considerably and begin answering less accurately as time expires. Such behavior is readily identifiable by examining patterns of response times along with accuracy. This line of work has direct implications for operational use and would require little additional research to implement.
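A minimal sketch of the pattern described above: flag responses faster than some plausible solution time, then compare accuracy on flagged versus unflagged responses. Near-chance accuracy on the flagged responses, concentrated at the end of a section, is the rapid-guessing signature. The 5-second threshold and the data are illustrative assumptions, not values from the report.

```python
def flag_rapid_guesses(times, threshold=5.0):
    """Flag responses faster than a plausible reading/solution time.
    The 5-second threshold is an illustrative assumption."""
    return [t < threshold for t in times]

def accuracy(correct, flags, flagged=True):
    """Proportion correct among flagged (or unflagged) responses."""
    subset = [c for c, f in zip(correct, flags) if f == flagged]
    return sum(subset) / len(subset) if subset else float("nan")

# Hypothetical end-of-section pattern: the last four responses are fast
times = [40, 55, 38, 62, 47, 3, 2, 4, 3]
correct = [1, 1, 0, 1, 1, 0, 1, 0, 0]
flags = flag_rapid_guesses(times)
rapid_acc = accuracy(correct, flags, flagged=True)    # near chance
normal_acc = accuracy(correct, flags, flagged=False)  # much higher
```

Operational implementations in this literature tend to use mixture models over the full response-time distribution rather than a fixed threshold, but the underlying signal is the same.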
Discussions of rapid-guessing behavior have also brought attention to more general issues of test taker pacing. Although little work has been done in this area, researchers have begun to explore methodologies for identifying different pacing strategies. Preliminary work suggests that test takers engage in numerous pacing strategies. These include devoting considerable time to early items followed by either rapid guessing or failure to finish, responding rapidly to early items to allow more time for later items, maintaining constant speed throughout an examination, and allocating time according to item difficulty. Some interest has been focused on subgroup differences in pacing strategies, and work in this area suggests that some subgroups may tend to engage in less optimal pacing strategies than other subgroups.
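The pacing strategies listed above can be crudely distinguished by comparing time spent early in a test to time spent late. The sketch below does this with a simple ratio; the classification thresholds and labels are illustrative assumptions rather than established cut points.

```python
def pacing_profile(times):
    """Crude pacing indicator: ratio of mean time on the first half of
    the test to mean time on the second half. A large ratio suggests
    front-loading (slow start, rushed finish); a small ratio suggests
    rapid early responding; values near 1 suggest a constant pace.
    Thresholds are illustrative, not empirically derived."""
    half = len(times) // 2
    early = sum(times[:half]) / half
    late = sum(times[half:]) / (len(times) - half)
    ratio = early / late
    if ratio > 1.5:
        return "front-loaded"
    if ratio < 0.67:
        return "back-loaded"
    return "constant"
```

A fuller treatment would also condition on item difficulty, since allocating time by difficulty is itself one of the strategies identified in this literature.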
Some researchers have attempted to use response times to tackle the practical issues related to setting time limits. This work has generally involved predicting finishing times, but researchers have experienced only limited success. The standard approach has been to estimate total time on the basis of item response times or to predict finishing times on the basis of statistical regression. Neither approach has been overly successful, and field tests have often resulted in changes to the time limits suggested by previous research.
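The two standard approaches mentioned above can be sketched in a few lines: summing per-item mean times to estimate a total, and choosing a limit from the empirical distribution of test takers' finishing times. Both functions and the quantile value are illustrative assumptions; the report notes that neither approach has predicted field-test results well.

```python
def predict_total_time(item_mean_times):
    """Naive finishing-time estimate: sum of per-item mean response
    times. This ignores pacing adaptation, one likely reason such
    predictions have had limited success."""
    return sum(item_mean_times)

def time_limit_for_quantile(total_times, q=0.9):
    """Choose a limit so that roughly a fraction q of test takers
    finish, using a simple empirical quantile. q = 0.9 is an
    illustrative choice, not a recommended policy."""
    ordered = sorted(total_times)
    idx = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[idx]
```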
Response times may also be useful in terms of equity and fairness considerations. Some researchers have investigated subgroup differences in response times, and although the results are mixed, there appear to be only slight subgroup differences in response times. Unfortunately, most researchers have examined only mean or median response times rather than distributions of response times. Future work using full response-time distributions will likely be more rigorous. Investigating these issues within the context of a response-time model may also be advantageous.
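Comparing full distributions rather than means, as suggested above, can be done with a distribution-level statistic such as the two-sample Kolmogorov-Smirnov statistic, sketched here with hypothetical data. Two subgroups could share a mean yet differ in spread or shape, and a statistic like this would register the difference where a mean comparison would not.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum gap
    between the empirical CDFs of two response-time samples."""
    def ecdf(sample, x):
        return sum(1 for v in sample if v <= x) / len(sample)
    points = sorted(set(sample_a) | set(sample_b))
    return max(abs(ecdf(sample_a, x) - ecdf(sample_b, x))
               for x in points)
```

In practice one would pair the statistic with a significance test; the sketch shows only the distribution-level comparison itself.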
Response-time research is growing in popularity along with the computerization of test administrations. Continued growth along the lines discussed in the present work is likely. There are also other important applications of response-time research that have received little attention. A close connection to cognitive approaches may be particularly valuable. Response time has been the preferred dependent variable in cognitive psychology since its inception, and psychometric response-time researchers may gain considerably by considering the results from this more established tradition. As LSAC considers the computerization of the LSAT, issues related to the use of response times should be of continued interest.