The Best Things Come in Threes

Dec 17, 2019

Evaluating the Utility of Educational Accountability Systems: Focusing on the Link Between Accountability Identification and Improvement Using a Three-Step Approach

Depending on your slant, you probably have a favorite among the many sayings about things that come in threes. Some people focus on the belief that tragedies occur in threes. I prefer to focus on how some of the best things come in threes:

  1. Freud’s id, ego, and superego
  2. The three books in Tolkien’s Lord of the Rings series
  3. The 90s hit teen pop sensation Hanson 

Okay, some might argue with the three Hanson brothers – how about The Jimi Hendrix Experience instead? 

A colleague of mine recently stated at a meeting that there is a certain power to speaking in threes (Guarino, 2018): “Get your point across in 3 messages, and make them short, precise, and ‘sticky.’ We tend to remember things in threes.” 

I’ve written a few blog posts on the use and value of program evaluation when examining accountability systems, but many of these describe multi-step, multi-layered processes.

In honor of three key ideas, I’d like to highlight a three-step approach to evaluating the utility of accountability systems from a 2017 paper my colleagues and I wrote for the Council of Chief State School Officers (CCSSO) (D’Brot, Lyons, & Landl, 2017). We recommend that states evaluate their accountability and support systems by examining:

  1. The reliability of accountability scores and designations; 
  2. The impact of design decisions on intended outcomes; and 
  3. The link between accountability data and changed behavior. 

My thinking has evolved since writing this 2017 paper, but the main ideas persist. Effective use of accountability systems and their data requires us to be clear about the stories we are trying to tell using data. These data stories help connect the dots from accountability data to activities associated with improvement strategies. I have provided brief descriptions below of the three approaches from the paper.

Approach 1: Determining Reliability in Accountability Systems

What do we mean by reliability in accountability systems? Evaluating the reliability of accountability scores and school designations begins with understanding the impact of measurement and sampling issues on system indicators and ends with confirming that measures (i.e., data elements) are functioning as intended. Evaluating the reliability of components across the system can determine the impact the component indicators have on the overall system. These indicators and system results can be compared to historical or projected data to determine the volatility and consistency of school classifications. 
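To make the idea of classification consistency a little more concrete, here is a minimal Python sketch of one way to compare designations across two years. It is an illustration only, not the method from the paper: the school names, the designation labels, and the function name `classification_consistency` are all made up for this example, and a real analysis would also need to account for measurement and sampling error.

```python
from collections import Counter


def classification_consistency(prior, current):
    """Compare school designations across two years.

    Returns the share of schools whose designation did not change,
    plus a count of every (prior, current) designation pair.
    Only schools that appear in both years are compared.
    """
    shared = sorted(set(prior) & set(current))
    if not shared:
        raise ValueError("no schools appear in both years")
    transitions = Counter((prior[s], current[s]) for s in shared)
    unchanged = sum(n for (a, b), n in transitions.items() if a == b)
    return unchanged / len(shared), transitions


# Hypothetical designations for five made-up schools (illustrative labels).
year1 = {"A": "CSI", "B": "TSI", "C": "Not identified",
         "D": "Not identified", "E": "CSI"}
year2 = {"A": "CSI", "B": "Not identified", "C": "Not identified",
         "D": "TSI", "E": "CSI"}

rate, transitions = classification_consistency(year1, year2)
print(f"Share of schools with the same designation: {rate:.2f}")
for (before, after), count in sorted(transitions.items()):
    print(f"{before} -> {after}: {count}")
```

A low share of unchanged designations does not say whether schools truly changed or the system is simply volatile, which is why this kind of check is a starting point for the part-to-whole reliability work rather than a conclusion.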

I’m actually pretty excited about some work that I’ve undertaken with the U.S. Department of Education and the American Institutes for Research (AIR) State Support Network. It focuses on a whole series of recommended actions states can take to evaluate system reliability, really attempting to build a case for reliability from part to whole (think of it like nesting dolls of consistency and reliability). I’ll be writing about it once it is publicly released.

Approach 2: How Design Decisions Impact Intended Outcomes

We must first have a clear understanding of the interrelationships and dependencies of our systems in order to make claims about utility and impact. That is, what kind of intended (or unintended) triggers or consequences in the identification process lead to other activities or requirements? This determination can be accomplished by examining how exit criteria are defined, how criteria relate to observed changes over time, and how leading indicators are linked to long-term information typically used as outcomes. 

This information supports a stronger understanding of the relationship between accountability designations and outcome changes. Evidence of this connection is critical for determining whether the accountability system is supporting the state’s theory of action. Collecting evidence that supports system claims is key to ensuring the system is valid for its intended use (see my previous CenterLine post for detailed suggestions on how to approach this). Ultimately, we want to ensure that results are valid based on our design, desired signals, and intended uses.

Approach 3: The Link between Accountability Data and Changes in Behavior

Assessing utility and impact is the most challenging aspect of accountability system design. We need to establish a link between accountability data, local behaviors, and outcome improvement to substantiate claims about system impact or utility. 

Well-defined systems of improvement are critical to this effort and should be based on relevant evidence-based practices. However, it is critical to understand the degree to which practices and strategies are appropriate and useful in local contexts. Three ways (see? another set of three!) to accomplish this are to:

  1. Leverage existence proofs of success, or those cases for which you already have evidence of success; 
  2. Learn from those context-specific instances to identify the outcomes that are dependent on context and those that may be (more) context independent; and 
  3. Monitor the application of those strategies to high-needs schools aligned to state education agency (SEA) identification plans. 

The lessons learned from these cases can then be applied to other situations as the accountability system matures and is able to identify low-performing schools in the next cycle. More detailed recommendations are made in the 2017 paper.

Bringing It Together: Using These Three Steps to Inform System Improvements

Taken together, these steps help build an argument that either confirms that system connections are coherent or reveals that the system needs adjustments. As we continue to learn how state accountability systems function in conjunction with their support systems, we can determine how dramatic system adjustments need to be to promote better practice. Those adjustments may be to measures, indicators and their weights, triggers for support, or improvement strategy implementation. We should monitor the results of these adjustments by examining changes in our capacity to do (guess what?) three things:

  1. Triage schools for support using a combination of accountability and low-stakes data;
  2. Build capacity to engage in needs assessments and implement practice with fidelity; and 
  3. Recognize and share practices that lead to sustainable improvement. 

By developing a comprehensive evaluation strategy for both the identification and utility associated with an accountability system, we can leverage existing or forge new partnerships to determine the impact and efficiency of educational improvement efforts.


D’Brot, J., Lyons, S., & Landl, E. (2017). State systems of identification and support under ESSA: Evaluating identification methods and results in an accountability system. Washington, DC: Council of Chief State School Officers. 

Guarino, H. (2018). Communicating about assessment results. Presentation to the Technical Issues in Large Scale Assessment State Collaborative. October 16, 2018: Boston, MA.