An Education Innovator’s Dilemma


The Challenge in Trying to Be Innovative Under ESSA’s Innovative Assessment Demonstration Authority

I was an early supporter and promoter of the Innovative Assessment Demonstration Authority (IADA) under the Every Student Succeeds Act (ESSA), but I now have serious doubts about the viability of the IADA and its ability to support deep and meaningful educational reform. I remain firmly committed to the need to shift the locus of assessment and accountability from the statehouse to the schoolhouse, but I am not sure if the IADA is the right vehicle for doing so. I outline below some of the key statutory requirements of the IADA and then explain in this post why two of these requirements—comparability and scaling statewide—pose considerable threats to innovation and the success of the IADA. However, as a “glass-half-full” kind of guy, I will use my next post to offer some suggestions for moving forward both in the near- and long-term.

Breaking Down the Innovative Assessment Demonstration Authority

There is no question that ESSA offers more assessment and accountability flexibility than the previous federal law, No Child Left Behind (NCLB). Like NCLB, ESSA requires all of a state's students in a given grade and subject to take essentially the same test, though ESSA expands the definition of the "same" test to include computer-adaptive testing. However, a key opportunity afforded by the IADA allows the U.S. Department of Education (USED) to authorize states to try out different assessment approaches in a subset of districts. With this flexibility comes what USED has termed "safeguards" or "guardrails," including: 

  • Assessment quality: The system must be composed of high-quality assessments that support the calculation of valid and reliable scores, provide useful information to relevant stakeholders, and can eventually meet the ESSA assessment requirements as evaluated through the USED's peer review process. 
  • Comparability: The state must produce student-level annual determinations of student mastery of the state’s grade-level content standards that are comparable across local school districts and to the statewide assessment results. 
  • Scaling statewide: The state must have a logical plan to scale up (that is, implement) the innovative assessment system statewide, and must do so in five to seven years.
  • Demographically representative: The state must make progress toward achieving high-quality and consistent implementation across demographically diverse school districts.

Meeting any of these safeguard requirements is difficult for an innovative system, but comparability and scaling statewide are the most challenging and therefore pose the biggest threats to the success of the IADA.

Concerns with the Requirement to Achieve Comparability As Defined by the IADA

The comparability requirements outlined in the USED’s regulations are:

“(ii) Generate results, including annual summative determinations as defined in paragraph (b)(7) of this section, that are valid, reliable, and comparable, for all students and for each subgroup of students…” (USED, 2016, p. 88968).

The regulations do not indicate how comparable is comparable enough, but based on our experience with New Hampshire's IADA project, Performance Assessment of Competency Education, it appears that having similar percentages of students classified as proficient or higher on both the innovative and state assessments will meet this requirement. 

The Center for Assessment convened a panel of leading comparability experts to provide comments and suggestions to the USED before these regulations were finalized. The expert panel noted there are many legitimate reasons for non-comparability, particularly: 

  1. To measure the state-defined learning targets more efficiently (e.g., reduced testing time);
  2. To measure the learning targets more flexibly (e.g., when students are ready to demonstrate “mastery”);
  3. To measure the learning targets more deeply; or 
  4. To measure targets more completely (e.g., listening, speaking, extended research, scientific investigations).

Requiring high levels of comparability with the state test scores might limit these opportunities. States can create approaches to meet these comparability requirements, but if the state's goal is strong comparability, it runs the risk of giving up on innovation. In fact, Dr. Robert Brennan, one of the world's leading comparability experts, noted, "Perfect agreement would be an indication of failure."

Challenges with the IADA Requirement for Statewide Implementation

There is a clear tension between statewide implementation and innovation in the IADA. Scaling up any successful educational innovation in meaningful ways is a considerable challenge, but trying to implement a truly innovative system statewide in seven years is virtually impossible. Consider state efforts to implement new content standards, such as the Common Core State Standards, in all schools with fidelity. We are more than eight years into this reform and, setting aside political issues, even in states enthusiastically supporting the new standards, most would agree we are still a long way from deep and sustainable implementation.

Some might say, "States change their assessments regularly anyhow, so why can't they just switch to the innovative assessment system at the end of the demonstration authority?" They can, but the literature is clear about the ineffectiveness of top-down reforms, especially those expecting fundamental changes in teaching and learning. Most of the states pursuing the IADA are interested not simply in implementing a new assessment program, but in taking the opportunity to re-orient teaching and learning. They are capitalizing on performance-based and other local assessments, as in the case of New Hampshire, or, like Louisiana, are focused on more tightly connecting curriculum and assessment. Teachers and school leaders must be true partners in such ambitious reforms for there to be any hope of success. 

Therefore, unless changes are made to the law and/or regulations, this scaling requirement will be the Achilles' heel of the IADA, or it will force states to pursue much more modest reforms. If that's the case, we must ask, "What's the point?" 

I have not given up hope completely and, in my next post, I offer suggestions for some potential design options to thread this needle, as well as recommendations for policy to support real innovation.
