Threats to the Validity of Accountability Systems, Part 1:

Dec 06, 2023

Simplicity vs. Complexity

Those of us who work in K-12 education have been doing a lot of soul-searching about accountability: what makes it work well, where it stumbles, and how it can improve. One of the ways we see it stumble is that key aspects of accountability design are often off-balance. In this post—and two more to follow—we’ll outline how these imbalances can threaten the validity of accountability systems, and we’ll help you strike the right balance. In this first post, we’ll focus on designing a system that properly balances simplicity with complexity. 

First, though, a quick thought about how one key thing ties this three-part series together: program evaluation.  

You might ask: Isn’t a well-balanced accountability system more of a design issue? Yes, it is. But you also need to know if your system works as you envisioned. For that, you need a good program evaluation plan. Program evaluation works best when the system is well-designed using a solid theory of action.

That’s why you’ll see all these elements crop up as we examine three key areas: (1) simplicity vs. complexity, (2) precision vs. actionability, and (3) formative vs. summative feedback. You’ll also see why getting the right balance is crucial at every stage, from conception through evaluation and revision.

In a recent blog post, my colleague Chris Domaleski provided an overview of the tradeoffs between simplicity and complexity in accountability design and suggested some principles to guide decision-making. In this blog post, I seek to apply these principles in the context of a particular challenge: determining the scope of accountability systems. How broadly can we cast the net of outcomes we want to promote? After all, if we try to build a system that does everything well, it may not do anything very well. In particular, our pursuit of breadth or precision could obscure the messages we’re trying to communicate. 

Let’s dive into the balance of simplicity vs. complexity. When designing a system with this in mind, it’s important to imagine first what you want to be able to evaluate when the system is operational. Going in, what do you want the big takeaways to be? I saw this unfold in my own life recently when I started coaching girls’ soccer.  

Establishing a Plan of Action: Coaching

This year, I started coaching girls’ under-8 and under-10 soccer. At the beginning of the season, I thought about what their biggest takeaways should be, and I discussed it with their parents. Many of the parents thought their kids should be practicing technical skills and ball handling. At this age, I thought they should prioritize positioning, understanding the game’s structure, and practicing game-time decisions like passing, pressing, and defending (all while trying to have fun).

You can imagine that if I took the approach some of the parents advocated, I’d want to evaluate the outcomes of practice very differently. A few parents were annoyed that the girls weren’t doing toe taps and foot skills (they can work on that on their own with a little homework). In response, I focused on describing the intended outcomes of what they should be learning and being clear about that message. I could do that because I had a clear plan of action.

Theories of Action and Accountability Systems

One way to ensure you don’t lose the message in the noise is by relying on a theory of action. Generally, a theory of action is a hypothesis about how a system produces desired outcomes. At a minimum, a theory of action should have three parts:  

  • Desired outcomes
  • Resources or inputs
  • Mechanisms or activities that are expected to produce the desired outputs (the scores from the system) and outcomes (the impactful changes in practice or behaviors).
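
For readers who like to see structure written down, the three parts above can be sketched as a small data structure. This is only an illustrative sketch: the class name TheoryOfAction, its field names, and both helper methods are hypothetical, and the example entries borrow from the coaching story (the resource entry is an assumption added for illustration).

```python
from dataclasses import dataclass, field


@dataclass
class TheoryOfAction:
    """Hypothetical container for the three minimum parts of a theory of action."""
    desired_outcomes: list[str] = field(default_factory=list)
    resources: list[str] = field(default_factory=list)
    mechanisms: list[str] = field(default_factory=list)

    def missing_parts(self) -> list[str]:
        """Return the names of any of the three parts that are still empty."""
        parts = {
            "desired_outcomes": self.desired_outcomes,
            "resources": self.resources,
            "mechanisms": self.mechanisms,
        }
        return [name for name, items in parts.items() if not items]

    def component_count(self) -> int:
        """Total number of components that would need to be tracked."""
        return len(self.desired_outcomes) + len(self.resources) + len(self.mechanisms)


# Example entries loosely based on the coaching story.
coaching = TheoryOfAction(
    desired_outcomes=["positioning", "understanding the game's structure"],
    resources=["weekly practice time"],  # assumed for illustration
    mechanisms=["practicing passing, pressing, and defending"],
)
print(coaching.missing_parts())    # an empty list means all three parts are present
print(coaching.component_count())  # grows as the theory becomes more complex
```

The component count is a rough stand-in for the tracking burden: every entry added to any of the three lists is one more thing to monitor and measure.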

The more complex a theory of action becomes, the more accurately it may reflect the processes that will produce the intended outcomes. But increased complexity also means that we need to increase the number of components we are trying to track and measure. While theories of action are important, they are not enough to actually get a program off the ground. 

Theories of action are often nested within larger systems or other theories of action. In the case of accountability systems, you can think of the information produced by the system as an output. Accountability system outputs may include results of individual indicators, school rankings, school grades, and/or identification labels such as CSI, TSI, and ATSI designations.  

The outputs of a well-designed accountability system should holistically reflect a robust definition of a high-quality school. Defining school quality too narrowly can lead to a narrowing of the curriculum, an over-emphasis on test results, or other unintended consequences. Defining school quality too broadly can confuse districts and schools about which areas to prioritize. How a state or system defines school quality should shape its theory of action. Ultimately, the state’s definition of school quality should inform the measures and indicators it includes in its accountability system. These indicators, in turn, become signals of the school outcomes it values.  

The number of measures a state selects for its accountability system to determine “quality” will eventually reach a tipping point. Initially, adding measures to an accountability system can promote a deeper and more comprehensive understanding of school quality.  

However, each additional measure adds an element of complexity and potential error. This error will be magnified if there are too many transformations (e.g., standardizing, norming, multiplying, or weighting measures), making interpretation more difficult. The comprehensive set of measures should, in theory, provide a more holistic snapshot of school quality across a range of essential characteristics. This is consistent with one of the principles Chris wrote about: avoiding superfluous improvements.
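
To make the interpretation problem concrete, here is a minimal sketch of one common transformation chain: standardize each measure, weight it, then add the results into a composite. It is not drawn from any particular state's system. The measure names, weights, and values are all hypothetical, and the functions standardize and composite are illustrative helpers.

```python
from statistics import mean, pstdev


def standardize(values):
    """Convert raw values to z-scores (mean 0, population standard deviation 1)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("cannot standardize a measure with no variation")
    return [(v - mu) / sigma for v in values]


def composite(measures, weights):
    """Weighted sum of standardized measures, one composite score per school.

    measures: dict mapping measure name -> list of raw values (same school order)
    weights:  dict mapping measure name -> weight
    """
    z = {name: standardize(vals) for name, vals in measures.items()}
    n_schools = len(next(iter(measures.values())))
    return [
        sum(weights[name] * z[name][i] for name in measures)
        for i in range(n_schools)
    ]


# Hypothetical values for three schools.
measures = {
    "proficiency_pct": [40.0, 55.0, 70.0],
    "attendance_pct": [90.0, 95.0, 92.0],
}
weights = {"proficiency_pct": 0.6, "attendance_pct": 0.4}

scores = composite(measures, weights)
print(scores)
```

After just two transformations, each school's composite is a unitless number centered on zero. A reader can no longer tell from the score alone whether it reflects proficiency, attendance, or the weights chosen, which is the kind of interpretive distance that grows with every added step.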

Losing the Message in the Noise

When complexity is taken too far, the system can become so convoluted that users can no longer meaningfully interpret its results. Additionally, too many measures applied across several quality domains may inhibit schools and districts from using their local measures. This is because schools and districts (1) may not have the time and space to address both the accountability system indicators and their local data, or (2) may be distracted or disincentivized from interrogating their own local context and progress-monitoring efforts.

For example, some states incorporate several career- and college-readiness indicators into their statewide accountability systems. Doing so has at least two potential downsides: (1) local systems may prioritize actions that improve their ratings at the cost of ignoring other actions that may be more valuable to local communities and students, and (2) each additional indicator de-emphasizes the other indicators in the system. In other words, “when everything is important, then nothing is.”  
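
The second downside is simple arithmetic, sketched below under the assumption that every indicator carries equal weight (real systems often weight indicators unequally). The function names indicator_weight and composite_shift are hypothetical.

```python
def indicator_weight(n_indicators: int) -> float:
    """Share of the composite carried by each indicator under equal weighting."""
    if n_indicators < 1:
        raise ValueError("need at least one indicator")
    return 1.0 / n_indicators


def composite_shift(n_indicators: int, indicator_gain: float) -> float:
    """Change in an equally weighted composite when one indicator moves by indicator_gain points."""
    return indicator_weight(n_indicators) * indicator_gain


# How much a 10-point gain on one indicator moves the composite
# as the number of indicators grows.
for n in (4, 8, 12):
    print(n, f"{indicator_weight(n):.1%}", composite_shift(n, 10.0))
```

Going from 4 to 8 equally weighted indicators cuts each indicator's share from 25% to 12.5%, so the same 10-point improvement moves the overall score half as much.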

Striking the Right Balance

Consider the balance between the simplicity and complexity of an accountability system as a potential threat to its validity. If it is too simple, the system may not yield accurate or credible results. If it is too complex, the message gets lost in the transformations or the sheer number of measures. To simplify (pun intended) Albert Einstein’s words: Everything should be made as simple as possible, but not simpler (p. 165, para. 2).

The same is true for accountability systems. Evaluating school quality is a complex task. Evaluating the system that attempts to evaluate school quality can get even more complicated. By focusing on the core message as intended by your theory of action, you can better determine whether your system is as simple as it can be while still providing results that are credible, accurate, and fair enough to stay true to the intent of your message.

