Evaluating Interim Assessments Against the Criteria for Balanced Assessment Systems
Interim assessments may have a role in balanced assessment systems, but that role is not conferred by title. It is conferred by logic and evidence tied to particular purposes and uses.
To explain, let me take you back a decade. Marianne Perie, Brian Gong, and I published an article in which we attempted to clarify the definition and uses of interim assessments and argued that they can be an important piece of a balanced assessment system (which we refer to in the paper as a comprehensive assessment system) that includes formative, interim, and summative assessments (Perie, Marion, & Gong, 2009).
Unfortunately, the title of our well-cited paper, Moving towards a comprehensive assessment system: A framework for considering interim assessments, suggested that interim assessments were a required component of comprehensive assessment systems. That implication was not our intent and that’s not what the text of the paper says.
Parsing Out the Different Types of Assessments
One of the main reasons for writing the 2009 article was to help provide some conceptual and linguistic clarity around the plethora of assessments being called anything from formative to diagnostic to benchmark (see Calvary Diggs’ recent post for an update on the language challenges).
We were especially interested in separating formative from other assessment types. There was, and continues to be, a large body of research documenting the effectiveness of specific types of formative assessment approaches that are quite different from the commercial assessments being marketed as “formative.”
We are pleased our paper appears to have supported that distinction but remain concerned that we might have inadvertently suggested balanced assessment systems must be composed of formative, interim, and summative assessments, like selections from a prix fixe menu.
So, How Do Interim Assessments Fit Within a Balanced System of Assessment?
The Center maintains a strong interest in interim assessments, which is why this year’s Reidy Interactive Lecture Series (RILS) focused on helping states and districts improve how they select, use, and evaluate interim assessments.
In our paper, we defined interim assessments as:
Assessments administered during instruction to evaluate students’ knowledge and skills relative to a specific set of academic goals in order to inform policymaker or educator decisions at the classroom, school, or district level. The specific interim assessment designs are driven by the purposes and intended uses, but the results of any interim assessment must be reported in a manner allowing aggregation across students, occasions, or concepts (emphasis added, Perie, et al., 2009, p.6).
I highlighted the importance of purposes and intended uses from our 2009 definition above because the lack of specificity and lack of evidence regarding intended uses remains one of the biggest concerns about the proliferation of commercial interim assessments. The Center’s Chris Domaleski discussed this challenge in his closing remarks at RILS, and his comments have been eloquently summarized here.
Further, in this paper and in a recent Centerline post, Brian Gong persuasively articulated why specifying the intended uses of an interim assessment is critical to evaluating claims about the assessment results.
The limited evidence tied to stated purposes and uses gets at the heart of my concerns about the fit of interim assessments within balanced assessment systems. To be fair, we have similar concerns about over-promises with large-scale summative assessments, but that’s a topic for another post.
Addressing Our Concerns About the Role of Interim Assessments Within Balanced Assessment Systems
Well-designed and thoughtfully selected interim assessments may support the balance of district and state assessment systems. My concerns about interim assessments stem in part from the prominent place they occupy in the education assessment space. These concerns led my colleagues and me to question whether commercial interim assessments could play a useful role in balanced systems of assessment (Marion, Thompson, Evans, Martineau, & Dadey).
In this recent paper (A Tricky Balance), we reviewed the key criteria—coherence, comprehensiveness, continuity, efficiency, and utility—that define a balanced assessment system (NRC, 2001). All are important, but coherence sheds light on whether or how interim assessments fit within a balanced assessment system. We discussed both vertical and horizontal coherence, noting an assessment system is vertically coherent when there is compatibility among the models of student learning underlying the system’s various assessments (NRC, 2006). Horizontal coherence is the alignment among curriculum, instruction, and assessment with the goal of helping students develop proficiency in a content domain (NRC, 2006).
Both vertical and horizontal coherence are necessary for assessment systems to be balanced, but both are difficult to achieve when commercial interim assessments are included in the mix of district assessments. Even if such assessments are based on an explicit model of learning (and it is not clear that most are), it is highly unlikely that the same model of learning will be found in each district where the assessments are used. Therefore, the interim assessment results could send mixed or even incorrect signals about what students have learned relative to how they were expected to develop in a content domain.
Horizontal coherence is just as important because the assessment must be well aligned with the enacted curriculum and instruction to support grounded interpretations of student learning. Horizontal coherence is necessary both to support interventions for specific students and to evaluate or monitor larger-scale programs.
Interim Assessments Need a Defined Role That Supports Educators
My focus on coherence and specificity of claims reveals my pessimism about the role of interim assessments in balanced systems of assessment. What would it take to change my view? First, building on Chris Domaleski’s latest post, assessment providers and users need to clearly articulate their intended uses and provide evidence to support those uses. To illustrate with one purpose, instruction, I previously named five features necessary for innovative assessments to support instructional uses.
Brian Gong went further by arguing that users and providers need to describe the specific instructional approaches prior to identifying the assessment features necessary to support such uses. For example, determining what students have learned in a unit just taught requires a different assessment design than understanding their preparation for an upcoming unit – even though both would qualify as instructional uses.
Beyond meeting coherence requirements and having evidence to support specific uses, interim assessments must have a defined role in the system that does not lead to unintended negative consequences such as the delegitimizing of teachers’ assessment knowledge and interpretations. In other words, interim assessments should help build teachers’ assessment literacy rather than be treated as the official record of student learning when considerable information is already available from classroom assessments.
Finally, interim assessments need to balance two conflicting demands: providing a 30,000-foot comparative view and providing customized information that meets each school’s and district’s needs. Clearly, to satisfy many purposes, the custom approach needs to be prioritized, but that is not how most commercial interim assessments operate.
Therefore, I return to where I began: commercial interim assessments have a limited role, at best, in balanced systems of assessment, and any role must be supported by positive evidence that outweighs negative consequences.