Trying to Serve Multiple Uses with Through Year Assessments
Assumptions, Claims, and the Evidence Needed to Support Them
Assumptions lead to claims about the types of inferences and uses that an assessment system is intended to support. Strong assumptions require strong evidence. This axiom holds true for educational measurement and most other scientific endeavors. The current interest in through year assessments appears to be based on very strong assumptions about the multiple purposes through year assessments may serve. Is there evidence to support these assumptions? The Center for Assessment is hosting a virtual convening on November 15th and 16th to explore these questions.
What is a Through Year Assessment?
“Through year” or “through course” assessment systems (we use “through year” at the Center) are rapidly proliferating. At least ten states and their associated assessment companies are in various stages of exploring, designing, and/or developing these systems. Florida is the most recent entrant into the world of through year assessment, using what the Governor referred to as a “progress monitoring” system.
Nathan Dadey and Brian Gong (in press) defined a through year assessment as one that is:
- Administered through multiple, distinct administrations across a school year, and
- Meant to support both (a) the production and use of a summative determination, and (b) one additional goal.
In other words, a through year assessment system involves a distributed design that is meant to support some specific goal(s), in addition to the creation of a summative determination of student proficiency on state content standards, as required by current federal law.
Current Examples of Through Year Assessment Systems
Many people picture the administration of three or four independent assessments per year as the prototypical through year design, consistent with the typical approach used by districts administering commercial interim assessments. However, there are many design variations, including:
- Georgia’s Navvy system, which uses diagnostic classification modeling (DCM) on a nearly weekly basis,
- North Carolina’s and Nebraska’s “delayed multi-stage adaptive” approaches,
- Louisiana’s curriculum-embedded assessments, and
- Kansas’ progress-to-proficiency model.
No doubt, additional designs will emerge in the near future.
Understanding the Additional Goal of Through Year Assessment
Crafting a successful through year assessment system design involves determining what goals are to be prioritized and acknowledging that not all goals can be met within a single system.
We must understand the goal(s) behind a through year system in order to fairly understand and evaluate the design. We have observed several common goals expressed by state and district leaders for pursuing through year assessment.
- State leaders hope to reduce the burden of a single test administration. The summative assessment can be spread out so that the end-of-year test does not have to be as long as it would be if everything were assessed at once. This goal was the motivation behind PARCC’s original through year design. That design was never operationalized, however, because the participating states were concerned about the assessment’s potential influence on curricular sequencing in their states and about the time required for multiple administrations.
- State leaders hope to provide “instructionally useful” information to educators and students throughout the year to enhance the value proposition of state tests beyond accountability. The design could include a flexible schedule that lets districts administer through year system components across the year to better align with local scope and sequence plans.
- State leaders hope to build an assessment system more coherent than the current constellation of an end-of-year state summative assessment paired with the wide variety of interim assessments used in most states. Through year systems represent an attempt, in a “loose-coupling” sense, to help districts create more balanced systems of assessments than is currently the case.
Evidence Needed to Support the Assumptions and Claims Behind the Assessment System
Designers first need to clearly describe the “problem” they hope to solve or the specific goals they hope to address. This is the first step in designing an approach for collecting evidence and evaluating whether that evidence is sufficient to support the claims and assumptions.
The formal way that we, in educational measurement, lay out the claims and evidence associated with a testing program is through what Michael Kane called an interpretation and use argument (IUA). The IUA is then evaluated with a validity argument. Basically, an IUA requires outlining the various claims for (and against) the proposed score interpretations and uses and then collecting evidence to evaluate the veracity of these claims.
Brian Gong, in two recent posts, outlined some of the challenges associated with some of the proposed claims for through year systems, such as those related to the comparability of inferences we make about student achievement from assessments given throughout the school year to those we would make from a single, end-of-year test. Will Lorié took a creative angle to examine similar issues in his Educating Pablo post.
Claims of Instructional Utility
There are many claims and potential inferences that must be evaluated in any through year design. However, I am most interested in the claim and assumption that the results generated from components of the summative assessment throughout the year will productively support instruction and learning. I have written previously about what it might take for assessments to help improve teaching, and I have identified at least five design features that are essential if assessment results are to guide instruction.
Very briefly, those claiming that the components of the system are intended to influence instruction through the year should first use a theory of action or other heuristic to specify how the assessment results are supposed to improve instruction. For example, teachers should be able to gain insights at a small enough grain size to target the skills and knowledge students have not yet grasped and/or to identify what students need to learn next to maximize their progress.
Then they should design a program of research to evaluate those intended uses of the assessment results (it’s OK to start small). Importantly, a validity evaluation of the purported instructional uses of through year components must include a critical examination of the potential unintended negative consequences that might result from trying to use the same assessments for both instructional and accountability purposes.
Of course, the brief description offered here is at a very high level. More details are in a forthcoming paper and at our conference.
A Call to Action and an Invitation
Those advocating through year systems have a responsibility to produce evidence to support their major assumptions, and we all have a responsibility to evaluate that evidence fairly.
I have not yet seen the evidence to support many of the assumptions associated with through year designs. Perhaps that is because these programs are too new to generate the necessary evidence.
Again, please join us on November 15-16, 2021 to learn from one another about the opportunities and challenges associated with through year assessment. For more information and to register, please visit this page.