Rebuilding School Accountability One Step at a Time

Jun 16, 2021

The Process of Restarting Accountability Systems Will be Incremental, Not Immediate

As the COVID-19 pandemic continues to subside in the United States, it may be reasonable to expect some education activities to resume with relatively little delay or difficulty. But the impact on other aspects of the education system is likely more extensive, requiring a period of rebuilding before they can fully resume. Such is the case with state school accountability systems.

Why School Accountability Activities Will Resume Incrementally Rather Than Immediately  

Accountability systems developed for the Every Student Succeeds Act (ESSA) rely on a collection of multi-year data, which are fully or partially missing due to pandemic-related disruptions. For example, every state includes either academic improvement (trend), growth (cohort), or both in their accountability system. These indicators rely on state assessment data from the prior year (or multiple years) to gauge change in the current year. Moreover, there are many other components of school accountability systems that rely on data from previous years, such as progress to proficiency for English language learners, test score banking for end-of-course tests, and multi-year averaging for small schools.  

Consider, too, that the ESSA criteria for determining which schools are designated for Targeted Support and Improvement (TSI) include identifying “consistently underperforming” subgroups using a method that is “informed by all indicators.” In short, when states are missing annual test data from Spring 2020, and there are likely gaps in the quantity and/or quality of test data available from Spring 2021, legacy school accountability systems cannot resume all at once.

How Can States Develop a Plan to Rebuild Accountability?  

As with any school accountability design initiative, there isn’t a one-size-fits-all solution to restarting that will work for every state. State approaches to implementing ESSA accountability requirements differ widely, as will the impact of pandemic-related disruptions. Additionally, no solution is immune from the challenges described in the previous section. The ideas described in this post are motivated by a sense that we should be more modest (but strategic) about what can be accomplished in the short term and more ambitious about what can be achieved in the long term.  

A good place to start may be establishing some principles to inform the way states frame solutions. In this post, I suggest three ideas to guide accountability rebuilding plans.  

Streamline Systems

School accountability systems in 2022 are unlikely to be able to restore every feature of the legacy model. Therefore, states may wish to focus on a streamlined system that addresses a limited number of high-priority areas, such as differentiating between the schools most urgently in need of support versus all other schools.  

A streamlined system might abandon composite scores in 2021-2022, as well as the full range of ratings that go with them (e.g., letter grades or stars). Instead, the design priority would be placed on identifying and supporting the schools with the highest needs. In this manner, the state can focus on the information and criteria best suited to serve high-priority areas instead of trying to make distinctions that are likely less credible and less consequential.

For example, many state accountability models produce a range of school designations from lowest-performing to highest-performing and everything in between. Even in the best of cases, it’s very challenging to meaningfully differentiate performance along the full scale. Suppose a state system produces a composite score for each school that can theoretically range from 1 to 100. How confident can one really be that there is a substantial difference between two schools earning 82 and 83, respectively? By contrast, one can be far more confident that schools with scores of 50 and 100 truly differ.
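One way to make this intuition concrete is to compare the gap between two composite scores with the uncertainty in that gap. The sketch below is illustrative only: the standard error of 5 points per school is a hypothetical value chosen for the example, not a figure from any state system, and the function name is likewise an assumption.

```python
import math

def gap_in_standard_errors(score_a, score_b, se_per_school=5.0):
    """Return the score gap divided by the standard error of the difference.

    se_per_school is a HYPOTHETICAL standard error for each school's
    composite score; the two schools' errors are assumed independent.
    """
    se_difference = math.sqrt(2) * se_per_school
    return abs(score_a - score_b) / se_difference

# The two comparisons from the text.
close_pair = gap_in_standard_errors(82, 83)
far_pair = gap_in_standard_errors(50, 100)

print(f"82 vs 83:  {close_pair:.2f} standard errors apart")
print(f"50 vs 100: {far_pair:.2f} standard errors apart")
```

Under these assumptions, the one-point gap is a small fraction of a standard error, while the 50-point gap spans several. A real system would need its own estimate of score precision.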

Fit Classification Methods to Support Strategy 

Even with streamlined systems that narrow the focus of classification, it is important to establish criteria for producing school designations that are well-aligned to the support strategies.

To unpack this principle, it may be useful to start by distinguishing among supports with respect to 1) availability and 2) consequences. Availability simply refers to whether the support is widely accessible or scarce. Consequences, for present purposes, refers to whether the support leads to restrictions or a loss of flexibility at the district or school level. A more charged way to describe this second distinction is ‘punitive’ versus ‘non-punitive.’

The following table provides examples of supports that might be classified into these categories.

Distinguishing among these support types is useful when considering how to structure accountability decisions. Typically, accountability systems are designed to maximize confidence in assigning an accountability classification that triggers supports. That is, we want to be very sure that we don’t say a school needs support unless it does. If we’re wrong, we can think of that as a false positive or a Type I error. The problem, however, with minimizing false positives is that some schools that really need support will not be identified.  

However, a focus on Type I error may not always be appropriate in light of post-pandemic realities (with thanks to my colleague Damian Betebenner, who has helped shape my thinking on this topic). In many cases, it is preferable to guard against a false negative, or Type II error: a failure to classify a school as needing support when it does.

I believe an emphasis on minimizing Type II error is appropriate when dealing with widely available supports that are non-restrictive. Accordingly, leaders may want to design a system that makes it easy to enter and hard to exit any accountability classification tied to these supports.

If the support is scarce, but still non-restrictive, there may be a need for another approach. In these cases, the classification criteria should focus on two primary questions: 1) how many schools can be served and 2) which schools most urgently need and will benefit from the support? For instance, if there are enough instructional coaches to serve half the schools, the bar for entry would be different than if the state could only support 5% of the schools. The design emphasis should be on designating a feasible number of schools, chosen to maximize the benefit each receives.
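The capacity question can be sketched as a simple ranking exercise. The example below is a minimal illustration, not a recommended method: the school names and the "need index" values are made up, and a single index is assumed to stand in for both urgency of need and expected benefit, which a real design would have to define separately.

```python
import math

def select_for_scarce_support(need_by_school, capacity_share):
    """Pick the highest-need schools that fit within available capacity.

    need_by_school: dict of school name -> HYPOTHETICAL need index
                    (higher means more urgent).
    capacity_share: fraction of all schools the state can serve (0 to 1).
    """
    # Small tolerance guards against floating-point error in the product.
    slots = math.floor(len(need_by_school) * capacity_share + 1e-9)
    ranked = sorted(need_by_school, key=need_by_school.get, reverse=True)
    return ranked[:slots]

# Twenty made-up schools with distinct, arbitrary need values.
schools = {f"School {i:02d}": (i * 37) % 100 for i in range(1, 21)}

half = select_for_scarce_support(schools, 0.50)  # coaches for half the schools
few = select_for_scarce_support(schools, 0.05)   # coaches for only 5%

print(len(half), "schools served at 50% capacity")
print(len(few), "school(s) served at 5% capacity:", few)
```

The same ranking produces very different entry bars depending on capacity, which is the point of the paragraph above: the threshold follows from what can be delivered rather than from a fixed cut score.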

Finally, with regard to any support that is restrictive, the classification criteria should reflect the relatively high cost of a Type I error. In fact, there is a persuasive case to be made that suspending the most restrictive consequences altogether is appropriate, at least until a more robust set of multi-year evidence is available to inform such decisions. For any restrictive supports that remain, the standard of evidence for a school to be newly classified should be high, and the exit standard for schools already receiving them should be relatively low.
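The "easy to enter, hard to exit" and "hard to enter, easy to exit" ideas can be expressed as separate entry and exit cut points. The sketch below assumes a hypothetical 0-100 "evidence of need" scale and invented thresholds; neither comes from any state's system, and the scarce, capacity-driven case is left out because it depends on a count of schools rather than a fixed cut.

```python
# HYPOTHETICAL cut points on a 0-100 "evidence of need" scale.
RULES = {
    # Easy to enter (low entry bar), hard to exit (need must fall far).
    "widely_available_non_restrictive": {"enter": 40, "exit": 20},
    # Hard to enter (high entry bar), easy to exit (small improvement suffices).
    "restrictive": {"enter": 80, "exit": 75},
}

def is_identified(support_type, currently_identified, need_score):
    """Return True if the school should hold the classification next year."""
    rule = RULES[support_type]
    if currently_identified:
        # Already identified: remain until need drops below the exit cut.
        return need_score >= rule["exit"]
    # Not yet identified: enter only at or above the entry cut.
    return need_score >= rule["enter"]

# A school with moderate evidence of need (score 50) under each rule.
for support in RULES:
    new = is_identified(support, currently_identified=False, need_score=50)
    old = is_identified(support, currently_identified=True, need_score=50)
    print(f"{support}: newly identified={new}, stays identified={old}")
```

With these assumed cut points, the same school enters and remains in the widely available support but neither enters nor remains in the restrictive one, mirroring the different error priorities described above.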

The following table summarizes the relationship between classification standards and support type.

It’s unlikely that uniform aggregation rules or thresholds will be well-suited to operationalize these variable decision rules. Therefore, states may wish to explore classification strategies based on performance profiles instead of composite scores or rankings, especially for transitional accountability systems. 
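To illustrate why a profile can behave differently from a composite, consider the minimal sketch below. The indicator names, values, and cut points are all hypothetical, and the "any indicator below a floor" rule is only one simple example of a profile-based rule, not a proposed standard.

```python
def composite_flag(indicators, cut=50):
    """Flag a school when the simple average of its indicators is below cut."""
    return sum(indicators.values()) / len(indicators) < cut

def profile_flag(indicators, floor=30):
    """Flag a school when any single indicator falls below floor."""
    return any(value < floor for value in indicators.values())

# A made-up school: strong overall, but very weak on one indicator.
school = {
    "achievement": 70,
    "growth": 65,
    "graduation": 80,
    "el_progress": 20,
}

print("Flagged by composite:", composite_flag(school))
print("Flagged by profile:  ", profile_flag(school))
```

In this example the average masks the weak indicator, so the composite rule does not flag the school while the profile rule does. Profile rules can also be written per support type, which is harder to do with a single aggregated score.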

Plan for the Longer-Term 

The first two principles, streamlining the system and fitting classification methods to support strategy, primarily address stop-gap solutions for the near-term challenges associated with missing data. However, it is also important to lay the groundwork for a state’s longer-term accountability vision during the transitional period.   

The pandemic disruptions provide a unique opportunity to ‘build back better.’ That process should start by piloting indicators and approaches that may be candidates for inclusion in the next-generation model. For example, states may wish to explore innovations in measuring academic growth, a broader set of indicators to promote college and career readiness, or new approaches to measuring trans-academic skills.

States that plan to explore these or other ideas should act quickly to establish their longer-term accountability vision and ensure evidence collected in 2021-2022 is appropriate to inform design decisions in subsequent years. This work should be grounded in a theory of action, as my Center colleagues detail in other works.  

A Bright Outlook 

Many are looking forward to the 2021-2022 academic year with optimism. Indeed, after more than a year of significant disruptions in education, it feels good to consider a ‘return to normal’ – or even some better version of normal.  

I believe recent events have created a unique opportunity to create new and improved systems to support schools. While the process will not be quick or easy, investing in careful planning now can lay the groundwork to realize longer-term benefits.