Innovative Assessment Demonstration Authority

Discussions with State Leaders Approved to Implement Innovative Assessment Demonstration Authority Pilots

Five states have been approved to implement innovative assessment pilots as part of the Innovative Assessment Demonstration Authority (IADA) under the Every Student Succeeds Act (ESSA). As part of our 2021 Reidy Interactive Lecture Series (RILS) on assessment innovation, we thought it was critical to hear directly from these IADA leaders.

Louisiana and New Hampshire were the first to be approved for initial implementation in 2018-2019, with North Carolina and Georgia following in the 2019-2020 school year, and Massachusetts in 2020-2021 (see this summary for an overview of all five programs). Unfortunately, we were unable to interview the leaders from one of Georgia’s two pilot programs (NAAVY) and we chose not to interview the NH leaders because their program is “paused.”

We structured our interviews with the state leaders by asking these three guiding questions:

What have been the main challenges related to IADA implementation?
How has your state planned to address challenges related to statewide scaling, psychometric scaling, and/or comparability?
What advice would you give to other states interested in applying for the Demonstration Authority pilot under Section 1204?

Innovative Assessment Demonstration Authority Requirements

We briefly recap key IADA requirements and then provide highlights from the interviews.

The IADA allows the U.S. Department of Education (USED) to authorize up to seven states to try out different assessment approaches in a subset of districts for up to seven years as long as state pilots can meet specific requirements including:

Assessment Quality: The innovative system must be composed of high-quality assessments that support valid and reliable annual determinations of student proficiency and meet accountability reporting requirements (e.g., growth and subgroup reporting). The innovative assessment system must eventually meet the ESSA assessment requirements as evaluated through the USED’s peer review process.
Comparability: The innovative system must produce student-level annual determinations of proficiency on the state’s grade-level content standards that are comparable to the statewide assessment results.
Scaling Statewide: The state must have a logical plan to scale up the innovative system statewide within or after the five-year timeline, although there is no requirement that the state scale statewide under IADA.
Demographically Representative: The innovative system must include pilot districts/schools that demographically represent the diversity of students present in the state.

We highlight key responses below organized by the seven major themes that emerged from the interviews. Each section includes a link to a short video from the interviews.

Tensions Between Instructional and Accountability Uses

Most of the innovative initiatives were conceived to provide instructionally useful information throughout the year in addition to data for accountability. Several participants mentioned the challenges of trying to meet these potentially conflicting purposes in a single system. For example, Allison Timberlake, Georgia’s Deputy Superintendent, asked, “How do we give real-time formative data for teachers while at the same time rolling that up and making high-stakes determinations?” Our colleague, Brian Gong, illustrated these challenges in two recent posts. Similarly, Tammy Howard, North Carolina’s Director of Accountability, noted that their model, built on the NC Check-In interim assessment system, has been promoted as a classroom-level assessment to support instructional purposes, but they are concerned the accountability uses might distort instructional uses.

Watch: Tensions Over Multiple, Intended Uses: Informing Instruction and Summative Accountability

Scaling the Innovation Statewide

In our experience, scaling the innovation statewide is one of the biggest technical and policy challenges facing IADA state leaders.

Michael Huneke, from Marietta City Schools in Georgia, noted the number of “road shows” the core GMAP districts were conducting to recruit new districts into their pilot. The downside is the amount of effort required, but the upside is the grassroots support generated. He was pleased that they are approaching 10% of the Georgia student population, but he knows they have a long way to go.

Louisiana’s pilot is based on using assessments tied explicitly to the curriculum students are taught. This works in Louisiana because so many schools use the same open-access curriculum, but that does not mean they all implement it the same way. Louisiana’s project leaders are adapting the innovative assessment system to another popular curriculum to aid in scaling efforts. However, in addition to COVID, the constant battering of hurricanes over the past several years has dampened school leaders’ interest in trying new things.

The scaling approach used by Massachusetts is similar to the design innovation being discussed at RILS this year. Instead of creating a design early in the project and then scaling it to an increasing number of districts, Massachusetts leaders are using a “Little Bets” approach whereby they are continuing to test and refine their proposed design during the early years of the project until they feel like they have something worth scaling.

Watch: Scaling the Innovation Statewide

Psychometric Challenges

Producing “comparable annual determinations” is a key IADA requirement. Each of the five approved IADA systems either is a through-year design that gathers information over the course of the school year (LA, NH, NC, GA) or has multiple components administered at the end of the year (MA). In either case, the multiple measures must be combined in the most valid way possible. While aggregating the multiple data sources sounds straightforward, it can tie psychometricians in knots, as noted recently by Brian Gong.

Further, any system relying on information gleaned from assessments prior to the end of year must consider how to deal with missing data due to absences, student mobility, or other reasons. Kinge Mbella, North Carolina’s senior psychometrician, reported they adjusted their design to make their summative assessment “more flexible” to allow students to receive a valid score even if they have not participated in any of the interims.

Watch: Psychometric Challenges

Comparability Challenges

IADA requires comparability at the level of annual determinations between the state’s legacy assessment system and the innovative assessment system. However, comparability is a multi-dimensional requirement involving more than achievement-level comparisons (e.g., Lyons & Marion, 2016; Evans & Lyons, 2017). It involves comparability among the learning targets, accessibility provisions, and many other aspects of test design.

The more innovative the IADA system, the more challenging it is to meet IADA comparability requirements. As Garron Gianopulos, an NWEA psychometrician supporting the GA GMAP system, asked, “How innovative can we get with the scoring given comparability requirements?”

Both North Carolina and Massachusetts have innovative systems that rely on the state’s traditional end-of-year assessment to establish comparability between results from the innovative and existing state assessments. Georgia’s GMAP has yet to establish its psychometric scaling and comparability plan, but it will likely need to take the same approach. Louisiana is taking a somewhat different approach by embedding a set of state assessment passages and items into the innovative system in order to evaluate comparability between the systems.

Watch: Comparability Challenges

Advice to States

The IADA leaders had plenty of advice for other state leaders thinking about applying for this flexibility. Allison Timberlake offered simple yet critical advice: “Consider why you are applying, what kind of assessment system you want to build, and what goals you want the system to meet.” She went on to note that states can innovate without the IADA and do so free of technical constraints such as comparability, a point echoed by several other leaders.

Many of the IADA leaders emphasized the importance of getting widespread buy-in before and throughout the Authority period. The North Carolina team noted that state leaders need to think broadly and collect input from top policy leaders, teachers, parents, and community members in addition to the typical outreach to school and district leaders. Kinge Mbella, North Carolina’s lead psychometrician, cautioned that IADA designers need to “decide ahead about what to do with stakeholder input because it may be contradictory or not possible psychometrically. You have to be careful not to overpromise so you can manage expectations if it has to be comparable to the current system.”

All of the state participants warned potential state IADA applicants to be ready to take on a substantial amount of extra work without any additional funding. They noted that states must allocate appropriate staff and financial resources if the program is to have a chance of succeeding. Both Chanda Johnson and Sam Ribnick, the leads in Louisiana and Massachusetts, respectively, noted that it is important to allocate at least one full-time staff member to lead, organize, and take on the day-to-day responsibility for the innovative system. This is tough to do without extra funding but must be part of the initial planning process.

Additionally, Sam Ribnick pointed out that if your IADA requires support from testing companies, you must learn to write requests for proposals in ways that push assessment professionals to be innovative, and you must have the money to support these testing contracts.

Watch: Advice to States

Advice to the U.S. Department of Education

While not one of our initial questions, most of our interviewees had advice for the U.S. Department of Education and other federal policymakers. Several participants discussed how the accountability requirements, which are the same under IADA as the regular system, significantly constrain the innovative design.

Allison Timberlake pointed out that states “can’t be innovative with an assessment system when it has to be tied to high stakes uses.” Sam Ribnick emphasized this point: “You can’t have it both ways. States can’t meet every peer review bar and also be innovative.”

Tammy Howard and other state leaders pointed out that it would have been helpful to have a formal sharing structure with other states, something like a community of practice among IADA states, especially those doing something similar, such as Georgia (GMAP) and North Carolina. In addition to convenings, these state representatives would have welcomed technical assistance from USED. These leaders also suggested that such a community of practice could have better negotiated with USED about potentially relaxing certain requirements.

Watch: Advice to the U.S. Department of Education

Tensions, Trade-Offs, and Innovation in Systemic Reform

State leaders offered important insight regarding the systemic nature of this kind of reform. Sam Ribnick noted that this assessment innovation is supporting a major Deeper Learning initiative in Massachusetts, called Kaleidoscope.

Allison Timberlake emphasized that the innovative assessment design has implications for teaching and learning because educators make different instructional decisions based on how students are assessed. “It is not just which test they like better, but how should the standards be interpreted and implemented in our state.” She went on to make a simple but often overlooked point about assessment design: “It is good to think about teaching and learning when designing an assessment system.”

On a related note, Tammy Howard urged potential innovators to “consider the impacts on your current system.” Educators and policymakers appreciate stability, and North Carolina’s current system goes back to the 1990s. Innovators need to understand this potential disruption, although many argue that the system needs to be disrupted.

Watch: Tensions, Trade-Offs, and Innovation

Our Concluding Thoughts

The Center has been closely involved in IADA since its inception in 2015. We and our Center colleagues have written extensively about the IADA, including these recent posts by Carla and Scott. We are privileged to have learned from these committed and insightful educational leaders who have dealt with the challenges of the pandemic in addition to the work associated with innovation.

In spite of the challenges we have described throughout this post, we are thankful these leaders are digging into this work because they are devoted to improving teaching and learning in their states and districts. Chanda Johnson said it best: “It’s the work I’ve done that feels like it’s going to have the biggest difference for the most kids, and that’s really exciting.”