Can We Reduce Testing Time?
Issues and Considerations Regarding Time Spent on Testing
It’s rare to find an issue on which nearly everyone agrees, but I think the desire to reduce testing time is one. If we surveyed students and teachers to ask whether they want to cut back on the time spent taking standardized tests, I confidently predict the proposition would receive near-universal support. It’s not hard to understand why. Proponents of reducing testing time are quick to point out that time devoted to testing decreases the time available for instruction.
So, why don’t we just reduce testing time? It appears to be a simple problem, but the right solution may be more complex than expected. I’ll explain by focusing on the factors that can increase or reduce time and the tradeoffs associated with efforts to decrease testing time.
What Do We Mean by Tests?
Before we can start to identify strategies to reduce testing time, it’s important to ask: what tests are we talking about? The tests that come to mind initially may be the summative tests that states administer in grades 3-8 and high school as required by the federal Every Student Succeeds Act (ESSA). These tests are typically administered at the end of the year and are used for school or student accountability purposes.
But students also take other standardized tests during the year for a variety of purposes. These tests may be termed interim or benchmark tests and are usually administered by the school or district. Students in almost all districts spend a lot more time taking interim tests than taking the end-of-year state tests.
Moreover, there are tests given to specific student groups, such as those used to determine if language learners are eligible for selected supports. Still other tests are made available as options to students, even though they may not be required by schools or districts, such as tests to determine eligibility for gifted programs or tests used for college admissions.
And these are just the ‘external’ tests – or standardized tests – typically developed and scored by third parties, such as testing companies. There are also a host of classroom tests that students encounter on a regular basis.
The wide range of tests students are asked to complete reveals that “excessive testing time” is not limited to a single test or organization. For example, even if the state department of education makes a substantial effort to reduce testing time on its annual summative tests, that effort may not make a big difference in the experience of some students. Serious attempts to reduce testing time must be characterized by a coordinated initiative to take inventory of the full range of tests currently administered, identify the highest priorities, and eliminate tests that are duplicative or of limited value.
This approach is the shortest line to the biggest gain.
What Do We Mean by Testing Time?
This question may seem so obvious it doesn’t merit attention. However, my experience with state summative tests suggests that concerns about testing time are actually motivated by at least three different factors: examination, preparation, and administration.
Examination is straightforward. It refers to the actual time a student spends completing the test.
Preparation indicates the time that teachers and students may spend in or out of class working to create conditions that are thought to maximize a student’s success on the test. In the best case, preparation is characterized by activities that are well-aligned to instructional priorities and promote desired student learning. Unfortunately, preparation may also involve a range of activities that are contrary to desired instruction, such as taking frequent practice tests or engaging in unnecessarily repetitive review sessions (e.g., testing drills). Time spent on these tasks may far exceed examination time. More troubling, excessive time devoted to testing preparation sends the wrong messages about what outcomes are valued and may weaken enthusiasm for learning.
Finally, administration refers to the range of activities that are mostly the responsibility of teachers and leaders associated with supporting the logistics of testing. This laundry list of activities may include:
- participating in administration training,
- reviewing test manuals,
- developing schedules and staffing plans,
- inventorying and organizing test materials,
- identifying and registering students who will be tested,
- determining what accommodations will be needed for individual students,
- installing and checking testing software, or
- packing tests for shipping after administration.
Obviously, the cumulative burden of these administration activities is non-trivial and can indeed interfere with time devoted to instructional priorities.
Recognizing the broader ways that testing can impede or interfere with instruction reveals that examination time is just one part of a larger issue. It may be useful to think of total testing time like air travel. For example, I recently traveled from Boston to Chicago. While the actual flight was only about 2 hours, I had to budget nearly 6 hours of total ‘door-to-door’ travel time. That time included driving to the airport, parking, getting through security, waiting at the gate, and then taking the train downtown upon arrival. Even if the flight time had been reduced by 15-20 minutes, that wouldn’t have made a substantial difference in my overall travel time.
Similarly, shaving 15-20 minutes from examination time may represent a minor decline in total testing time.
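To put rough numbers on the analogy, here is a minimal Python sketch that uses only the figures from my trip: about 2 hours in the air, nearly 6 hours door-to-door, and a hypothetical 15-20 minutes saved. The function name is my own, chosen for illustration.

```python
def percent_saved(baseline_minutes, minutes_saved):
    """Return the savings as a percentage of the chosen baseline."""
    return 100 * minutes_saved / baseline_minutes


FLIGHT_MINUTES = 2 * 60        # approximate time in the air
DOOR_TO_DOOR_MINUTES = 6 * 60  # approximate total travel time

for saved in (15, 20):
    of_flight = percent_saved(FLIGHT_MINUTES, saved)
    of_trip = percent_saved(DOOR_TO_DOOR_MINUTES, saved)
    print(f"{saved} min saved: {of_flight:.1f}% of the flight, {of_trip:.1f}% of the trip")
```

A savings that looks meaningful against the flight alone (roughly 12 to 17 percent) shrinks to roughly 4 to 6 percent of the whole trip. The same arithmetic applies whenever examination time is only one part of total testing time.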
Can We Reduce Examination Time?
Even if examination time is just one piece of the puzzle, it’s still an important piece and one that people often focus on when they voice concerns about testing time. Can it be reduced? Probably yes. But there’s always a trade-off.
First, it’s worth noting that examination time and test length are not the same thing. Before we start tinkering with the test itself, it may be valuable to determine if scheduled testing time can be reduced without adversely impacting student performance.
Every state summative test I’m familiar with is designed to be essentially untimed. Usually, developers will determine a scheduled time limit by examining the distribution of testing time during a pilot test. For example, if 95% of students without an extended time accommodation complete section one of the test in 60 minutes, that timeframe might be an attractive choice for a time limit. Most students may finish in under 40 minutes, and some may need (or take, if available) 90 minutes, but developers try to allow enough time for students to finish without scheduling so much that it becomes onerous for teachers and students.
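The percentile idea above can be sketched in a few lines of Python. This is only an illustration, not how any particular state or vendor sets limits: the function name, the nearest-rank percentile method, and the pilot data are all assumptions I made up for the example.

```python
def time_limit_from_pilot(completion_minutes, coverage_pct=95):
    """Smallest observed time by which at least coverage_pct% of students finished.

    Uses the nearest-rank percentile, computed with integer arithmetic.
    """
    if not completion_minutes:
        raise ValueError("need at least one completion time")
    ordered = sorted(completion_minutes)
    # Ceiling of coverage_pct * n / 100, without floating point.
    rank = max((coverage_pct * len(ordered) + 99) // 100, 1)
    return ordered[rank - 1]


# Hypothetical pilot data: minutes to complete one section, for students
# without an extended time accommodation.
pilot_minutes = [28, 31, 33, 34, 35, 36, 36, 37, 38, 38,
                 39, 40, 41, 43, 45, 47, 50, 54, 60, 88]

print("95% coverage:", time_limit_from_pilot(pilot_minutes))
print("50% coverage:", time_limit_from_pilot(pilot_minutes, 50))
```

With this made-up data, half the students finish well before the limit that covers 95% of them, which is the gap developers weigh when they choose a scheduled time.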
A good way to begin an inquiry about whether scheduled time should be reduced is to re-examine the distributions of completion time for different grades, content areas, student groups, and item types. Examining this information in consultation with the technical advisory committee may reveal where tweaks to testing time are potentially warranted without tinkering with the test length.
Adjusting test length by reducing the number of items on the test is, of course, another strategy to decrease testing time. Naturally, testing time will be reduced if items are removed from the test, especially if one removes or replaces time-consuming items like those that assess writing in response to a reading passage.
There are generally three approaches to reducing test length. They aren’t mutually exclusive, but each comes with a cost.
- Shorten the test by removing items
- Replace items that take longer to complete with items that take less time
- Consider an adaptive model in lieu of fixed form
Removing Test Items to Reduce Testing Time
Any reduction in test length achieved by simply removing items will reduce the test’s reliability. That reduction may not be substantial if a few multiple-choice items are removed. But doing so also won’t reduce the testing time very much. Is shaving 5 minutes off the test worth even a slight reduction in precision?
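One common way to approximate this trade-off is the Spearman-Brown formula from classical test theory, which projects reliability when a test is lengthened or shortened. The sketch below assumes the removed items are comparable to the ones that remain, which is a simplification, and the 50-item test with a reliability of 0.90 is a hypothetical example rather than data from any real program.

```python
def projected_reliability(current_reliability, current_items, new_items):
    """Spearman-Brown projection of reliability after changing test length."""
    k = new_items / current_items  # length ratio; below 1 means a shorter test
    return (k * current_reliability) / (1 + (k - 1) * current_reliability)


# Hypothetical test: 50 items with a reliability of 0.90.
for new_length in (45, 40, 25):
    projected = projected_reliability(0.90, 50, new_length)
    print(f"{new_length} items: projected reliability {projected:.3f}")
```

Under these assumptions, dropping five items costs about one hundredth in reliability, while cutting the test in half costs considerably more. Whether the small loss is worth the few minutes saved is a judgment call for the program and its technical advisors.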
Replacing Lengthier Test Components With Shorter Items
Replacing items that take longer to complete may seem like an attractive solution. More often than not, however, those time-consuming items address higher-order thinking skills. Removing them will change what the test measures and could influence the validity of interpretations, as well as comparability to prior administrations of the test. It could also produce the unintended effect of signaling that these higher-order skills are not an instructional priority. Obviously, every situation is different, and there are strategies to promote instructional priorities outside of an end-of-year state test. The point is that any endeavor to reduce testing time by removing or replacing items should be accompanied by a careful examination of the impact of these choices from a technical, practical, and/or policy perspective.
Using an Adaptive Model to Streamline Testing
The third approach, an adaptive model, refers to creating a test designed to provide items suited to the examinee’s ability. Computer Adaptive Tests (CATs) administer easier or more difficult items based on whether the examinee responded correctly to previous items. Ideally, this means fewer items are required to determine the student’s achievement level. Some states do, in fact, use CATs for their summative testing programs.
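To make the basic idea concrete, here is a deliberately simplified Python sketch. It is not how operational CATs work: real programs typically rely on statistical models of item difficulty and on content constraints, whereas this toy version uses a simple bracketing rule, a hypothetical 31-item bank with difficulties 1 through 31, and an idealized examinee who answers correctly whenever an item is at or below their ability. All names are my own.

```python
def run_adaptive_test(bank, answers_correctly, max_items=None):
    """Narrow a difficulty bracket by always giving the item nearest its midpoint.

    Returns (low, high, administered): the hardest item answered correctly,
    the easiest item answered incorrectly, and the items given, in order.
    """
    low, high = min(bank) - 1, max(bank) + 1
    administered = []
    while max_items is None or len(administered) < max_items:
        candidates = [d for d in bank if low < d < high]
        if not candidates:
            break  # no remaining item can narrow the bracket further
        midpoint = (low + high) / 2
        item = min(candidates, key=lambda d: abs(d - midpoint))
        administered.append(item)
        if answers_correctly(item):
            low = item   # next item will be harder
        else:
            high = item  # next item will be easier
    return low, high, administered


bank = list(range(1, 32))  # hypothetical item difficulties
ability = 20               # hypothetical examinee
low, high, given = run_adaptive_test(bank, lambda difficulty: difficulty <= ability)
print(f"Items given: {given}")
print(f"Ability bracketed between {low} and {high} using {len(given)} of {len(bank)} items")
```

In this idealized setup, five items are enough to locate the examinee within the 31-item bank. Real examinees respond less predictably, so actual CATs need more items than this toy example suggests.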
It is important to note, however, that developing a CAT program can be time-consuming and expensive. In the best case, it requires a well-developed item bank that can take many years to populate if the items are newly developed. Moreover, because CATs used for high-stakes summative tests must represent key content targets, their adaptivity is constrained by alignment requirements. The upshot is that many state CATs don’t adapt as much in practice as they could in theory.
Finally, consider that items such as writing prompts or performance tasks associated with higher-order thinking skills – the items that most influence testing time – are often administered outside of the CAT.
For education leaders concerned about testing time, what does it look like to move forward? I believe a thoughtful approach will be characterized by the following:
- Determine the highest priorities and uses that assessments must support
- Examine the range of tests that students take with respect to these priorities and the relative burden of each for students, teachers, and leaders (i.e., a testing inventory or audit)
- Eliminate tests that are redundant or of limited value
- Work with contractors, technical advisors, policymakers, and practitioners to examine strategies to improve efficiency with respect to test administration
- Support professional training and development to minimize ineffective or counterproductive test preparation activities
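A testing inventory of the kind described in these steps could start as something as simple as the Python sketch below. Everything in it is hypothetical: the test names, purposes, and minutes are invented for illustration, and a real audit would need far richer information than three time estimates per test.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class AssessmentEntry:
    name: str
    purpose: str
    exam_minutes: int   # time students spend taking the test
    prep_minutes: int   # time spent on test preparation
    admin_minutes: int  # time spent on administration logistics

    @property
    def total_minutes(self):
        return self.exam_minutes + self.prep_minutes + self.admin_minutes


def ranked_by_burden(inventory):
    """Order tests from the largest total time burden to the smallest."""
    return sorted(inventory, key=lambda entry: entry.total_minutes, reverse=True)


def overlapping_purposes(inventory):
    """Group tests by purpose and keep purposes served by more than one test."""
    by_purpose = defaultdict(list)
    for entry in inventory:
        by_purpose[entry.purpose].append(entry.name)
    return {purpose: names for purpose, names in by_purpose.items() if len(names) > 1}


# Hypothetical inventory for one grade level.
inventory = [
    AssessmentEntry("State summative", "accountability", 180, 300, 120),
    AssessmentEntry("Interim benchmark A", "progress monitoring", 90, 60, 45),
    AssessmentEntry("Interim benchmark B", "progress monitoring", 120, 60, 45),
]

for entry in ranked_by_burden(inventory):
    print(f"{entry.name}: {entry.total_minutes} minutes in total")
print("Possible overlap:", overlapping_purposes(inventory))
```

Tests that share a purpose are only candidates for a closer look. Deciding whether they are truly redundant still depends on the priorities identified in the first step.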
In the end, reducing testing time requires a coordinated effort across multiple organizations. These efforts must include attention to many factors beyond just test length. Moreover, it is important to consider the technical, practical, and/or policy implications of various alternatives to ensure the benefits outweigh any costs and avoid unintended negative consequences.