Establishing Scoring Rules to Make the Use of TEIs More Efficient and Effective
Technology-Enhanced Items (TEIs) are a kind of test question or task. A characteristic feature of TEIs is that, in contrast to traditional multiple-choice (MC) items, which require the selection or “bubbling” of a single option, TEIs generally require test-takers to make more than one interaction with the item.
The most interesting TEIs are simulations with game-like contexts. Picture a virtual laboratory where the goal is to isolate a specific compound, or a simulated garden where the test-taker can conduct an experiment to learn about (or be tested on) a concept in genetics.
Simpler TEIs, which make up the majority in actual testing practice, include tasks that require test-takers to order objects from least to greatest, pair objects from one list with objects in another list, or select objects from a larger set.
Technology-Enhanced Items as a Way to Improve Testing
TEIs emerged with innovations such as computerized testing and automated scoring. Test sponsors and developers have supported the use of TEIs largely because these items are perceived to improve the match between what is tested and domains of interest – the argument being that they are better measures of what test-takers know and can do.
TEIs made their debut in the certification and licensure arena back in the 1970s, and their use in K-12 operational testing programs has grown over the past decade.
There are several testing areas where TEIs are currently being used:
- The Smarter Balanced and Partnership for Assessment of Readiness for College and Careers (PARCC) assessments include TEIs.
- In 2016, the National Assessment of Educational Progress (NAEP) conducted a pilot introducing several kinds of TEIs into its mathematics tests for grades 4 and 8.
- The Programme for International Student Assessment (PISA) by the Organisation for Economic Co-operation and Development (OECD) also makes extensive use of TEIs.
Three Approaches to Scoring TEIs
Scoring MC test questions is simple – a response is credited if and only if the correct option is selected and no other option is selected. With TEIs, the best scoring rule or procedure is not so readily apparent.
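To make the contrast concrete, the MC rule just described can be written in a few lines. This is a minimal sketch; the function name `score_mc` and its parameters are illustrative assumptions, not part of any particular test administration system.

```python
def score_mc(selected, key):
    """Return 1 if the response is exactly the keyed option, else 0.

    `selected` is the collection of options the test-taker marked and
    `key` is the single correct option (both names are hypothetical).
    """
    return 1 if set(selected) == {key} else 0


print(score_mc({"B"}, key="B"))       # correct option only
print(score_mc({"B", "C"}, key="B"))  # correct option plus another one
print(score_mc(set(), key="B"))       # nothing selected
```

The rule needs no judgment calls: there is one key, and a response either matches it exactly or does not. TEIs lose that simplicity as soon as more than one interaction is involved.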
There are three general ways to score a TEI – by trained raters, by expert rubric, or by uniform rule.
Approach 1: Use trained raters for each response
Trained rater scoring requires the development of a scoring guide (also called a rubric), training and monitoring of human scorers, and the implementation of other checks to ensure consistency and fairness. This type of scoring is usually reserved for TEIs that have a “free response” component, where test-takers must produce text, speech, or free drawing to furnish an answer. The technology for trained rater scoring is not new, but putting it into practice remains one of the most expensive aspects of testing.
Approach 2: Develop a rubric for each TEI that can then be implemented by a computer
With expert rubric scoring, content experts convene to think about all the ways a test-taker might respond to a TEI. They assign the highest possible number of points to a perfect response and then decide what other responses merit partial credit, and how much. In this process, experts focus on certain features of the response. For example: Did the test-taker select all the chemicals and equipment they should have selected from the virtual supply room? Did they properly conduct the first stage of the genetics experiment before attempting the other stages? These deliberations result in a rubric, which for most TEIs can be programmed into a test administration system to produce scores for any possible response, without any subsequent human intervention.
Expert rubric scoring is most appropriate for TEIs that have many interrelated components, multiple possible “perfect” responses, or where the important outcomes are intricately bound in the sequence of the test-taker’s interactions. Simulations are the prime candidates for expert rubric scoring.
Approach 3: Develop a rule that can be applied to an entire class of TEIs
The simplest scoring procedure for TEIs – uniform rule scoring – is the least expensive, the easiest to explain to test-takers, and possibly the most promising for most of today’s TEIs. Uniform rule scoring is just like expert rubric scoring, but instead of developing a scoring rule for a single TEI, the target is an entire class of TEIs.
For example, suppose you have a test item that asks test-takers to read a specific article from an online newspaper and to select, from a list of five statements, all the conclusions that are supported by the article. Let’s say that two of the statements are supported conclusions and the other three are not. Instead of developing an expert rubric for this item, content and scoring experts might instead consider questions such as:
- How might this item differ from another item that has five options and two correct answers?
- Without knowing the details of the item, what is a defensible way to score a multiple-choice, multiple-select (MCMS) test item with five options and two correct options (i.e., key options)?
- Does the proposed scoring rule generalize to the case where only one answer is correct? How about three or four? Is it problematic to have a five-option test question where all five options are correct? What about one where zero options are correct? What refinements to the rule are needed at this point?
- Does the refined scoring rule generalize to different numbers of total options?
- What are the key assumptions about item choices that must hold for this scoring rule to be implemented? What guidelines can be provided to item authors to ensure that a uniform scoring rule can be applied to any MCMS items that they develop?
- What expectations do test-takers bring to MCMS items? What are the implications for developing clear test directions?
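One candidate answer to these questions can be sketched in code. The rule below is a hypothetical illustration, not a rule prescribed in this post. It treats every option as a yes/no decision, then subtracts what a blanket strategy (select everything, or select nothing) would earn, so that such strategies score zero. The function name `score_mcms` and its parameters are assumptions made for the example.

```python
def score_mcms(selected, options, key):
    """Score a multiple-select response under one hypothetical uniform rule.

    Selecting a key option and leaving a non-key option blank each count
    as one correct decision. The number of correct decisions that the
    better blanket strategy would earn is subtracted, floored at zero.
    """
    options, key, selected = set(options), set(key), set(selected)
    if not key <= options or not selected <= options:
        raise ValueError("key and selected must be drawn from options")
    n, k = len(options), len(key)
    if k == 0 or k == n:
        # With no key options, or with every option keyed, a blanket
        # strategy is a perfect response, so the rule cannot discriminate.
        raise ValueError("item needs at least one key and one non-key option")
    correct_decisions = len(selected & key) + len((options - selected) - key)
    baseline = max(k, n - k)  # better of "select all" and "select none"
    return max(0, correct_decisions - baseline)


options = ["A", "B", "C", "D", "E"]
key = ["A", "C"]  # two supported conclusions out of five statements
print(score_mcms(["A", "C"], options, key))  # 2: both keys, nothing else
print(score_mcms(["A"], options, key))       # 1: one key found
print(score_mcms([], options, key))          # 0: select nothing
print(score_mcms(options, options, key))     # 0: select everything
```

Under this assumed rule, the five-option, two-key item is worth at most two points. Items with zero or five key options are rejected outright, which is one way to turn the edge-case questions above into a guideline for item authors.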
The result of this exercise, if successful, is a uniform rule for scoring all MCMS items, not just the one under consideration. Developing a uniform rule for scoring MCMS TEIs has several lasting benefits that outweigh the upfront effort:
Benefit 1: A scoring solution for many TEIs
The rule generalizes to any TEI where the underlying task is the same as the one for which the rule was developed. In the case of MCMS items, a properly formulated rule would apply whenever the underlying task is to select some things from a larger set of things. It doesn’t matter if the selection is by ticking boxes, by rearranging the direction of arrows in a diagram, or by dragging and dropping objects.
Benefit 2: A principled approach to weighting TEIs
A uniform scoring rule readily provides the maximum number of scoring levels (“points”, if you will) that MCMS items with a given combination of options and key options can logically support. And since today’s test administration systems collect extensive raw data, the effect of, say, scoring using a three-point rubric can be evaluated against scoring with a (reduced) two-point rubric, in terms of item statistics, test reliability, and score information across the range of student ability.
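As a rough illustration of how a rule yields the number of score levels, the sketch below enumerates every possible response to an item and collects the distinct scores. It assumes a hypothetical rule (correct yes/no decisions minus what the better blanket strategy would earn, floored at zero); the function names and the `cut` parameter are likewise assumptions for the example.

```python
def rule_score(keys_selected, nonkeys_selected, n_options, n_keys):
    """Hypothetical rule: correct decisions minus the best blanket strategy."""
    n_nonkeys = n_options - n_keys
    correct = keys_selected + (n_nonkeys - nonkeys_selected)
    return max(0, correct - max(n_keys, n_nonkeys))


def score_levels(n_options, n_keys):
    """Return the sorted distinct scores over all possible responses."""
    levels = set()
    for a in range(n_keys + 1):                   # key options selected
        for b in range(n_options - n_keys + 1):   # non-key options selected
            levels.add(rule_score(a, b, n_options, n_keys))
    return sorted(levels)


def collapse(score, cut):
    """Reduce the rubric: scores at or above `cut` become 1, others 0."""
    return 1 if score >= cut else 0


for n_keys in range(1, 5):
    print(n_keys, score_levels(5, n_keys))
# 1 [0, 1]
# 2 [0, 1, 2]
# 3 [0, 1, 2]
# 4 [0, 1]
```

Under this assumed rule, a five-option item supports at most the levels 0, 1, and 2, and only when it has two or three key options. Applying `collapse` with `cut=2` to the same raw responses gives the reduced rubric, so both versions can be compared on item statistics from the same data.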
Benefit 3: Clarity for test-takers
Familiarity with general rules for scoring TEIs will help test-takers develop a good sense for them over time, just as they have for MC items. Test-takers will understand, for example, that MCMS items might count for more than an MC item, that selecting a correct option counts as much as leaving an incorrect option blank, and that these items cannot be gamed. In other words, having uniform scoring rules means test-taker attention is more likely to be focused on working through the item as intended.
Psychometric Research Supports Uniform Scoring Rules for TEIs
To get the most from technology-enhanced items, it helps to approach the question of scoring them in a general, systematic way. Research shows that scoring rules that offer partial credit and that generalize to a class of TEIs – not just isolated items – bear fruit in terms of cost savings and information about what test-takers know and can do.
Thus, this post answers an important question about whether test-takers are getting the most out of TEIs: maybe not. The good news is that taking greater advantage of what TEIs have to offer is within reach.