Threats to the Validity of Accountability Systems, Part 3: Supporting Formative vs. Summative Feedback

Dec 20, 2023

In this last of three blogs, we want to discuss a third potential threat to the validity of accountability systems. Unlike the first two, this threat is less about the design of the system and more about how the state uses and supports the information its system produces. It’s about striking the right balance between formative and summative feedback. An underlying assumption here is that accountability systems are intended to drive continuous improvement in student learning. 

Given states’ limited capacity to support evaluation (e.g., money, expertise, human resources), they often must make difficult choices about how they design, implement, and evaluate their accountability systems. A key question when evaluating accountability systems is a little meta: do accountability system users focus more on the summative evaluation it provides (i.e., ratings and results), or on the formative efforts such as school improvement planning, evidence-based intervention selection, and progress monitoring? 

In other words, what’s more valuable: the summative evaluation ratings that accountability systems produce, or the forward-looking formative feedback that often comes after you receive your summative ratings? 

Before answering, we should distinguish between formative and summative evaluations. This distinction comes into play in designing the theory of action, examining the accountability system’s measurement capabilities, and using accountability information to drive improvement. 

Read the first two parts of this series on threats to accountability: Part 1 explored the balance of simplicity vs. complexity in accountability systems. Part 2 discussed the balance of precision vs. actionability.

Formative Evaluation

Formative evaluation is often central to continuous improvement efforts. The focus is on collecting, reviewing, and acting on information in short cycles. Those who participate in formative evaluation are often most interested in examining the extent to which resources are allocated to address specific problems of practice and in monitoring the implementation of system inputs and outputs. 

Questions that inform a formative evaluation might include the following: 

  • Did schools conduct a systematic data review and root cause analysis?
  • Did they develop a school improvement plan to address the root causes of low performance (i.e., leading indicators of longer-term outcomes)?
  • How are they identifying students who need additional support? What are they doing to support these students? How are they supporting students who are on track and/or exceeding performance standards?
  • Are they providing tutoring and/or after-school programming services to reach targeted students? What types of services are being provided? To what extent are the targeted students receiving these services, and at what level?
  • Are programs being implemented with fidelity? Are students showing up? Are they engaged? How do we know?

A formative evaluation focuses on “learning how to improve” rather than on “what works, for whom, and under what conditions.” For that reason, the information in a formative evaluation should be collected and reviewed much more frequently (e.g., monthly or quarterly) than in a summative evaluation, often through formalized plan-do-study-act (PDSA) cycles. Data collection should include both quantitative and qualitative information; the data are often collected less formally and systematically, but more frequently. 

A formative evaluation aims to monitor school progress and the classroom-based actions that occur daily, and to ensure they’re coherent with the outcomes in the state’s theory of action. Using data that can be collected relatively frequently and quickly, school staff can examine variation in information such as teacher and student attendance, engagement, and program implementation (e.g., how well students performed on a unit of study).

Summative Evaluation

Summative evaluation is much more systematic. It is a retrospective judgment about the extent to which a program (1) worked as intended, (2) was implemented with fidelity, and (3) influenced changes in short- and long-term outcomes. Typical questions addressed through a summative evaluation include: 

  • For whom did the program work?
  • Under what conditions did the program produce the most substantial impacts?

Often, a summative evaluation relies on the results of an accountability system (e.g., standardized test performance, graduation rates, rates of chronic absenteeism) to determine whether the theory of action and/or improvement plan is moving the needle on intended outcomes. In other words, in a summative evaluation, accountability system results may serve as the long-term outcomes used to judge the accountability system’s impact on school improvement.

Balancing Formative and Summative Evaluation

While states, districts, and schools need to balance formative and summative evaluation methods, it is important to be clear about the purpose and use of each method and its associated information. 

Using formative evaluation data will help schools, districts, and states engage in more effective progress monitoring and course correction throughout the year. Formative evaluation can lead to shorter-cycle reviews that focus more closely on improvement and growth. Using summative evaluation data can help corroborate whether the formative information leads to improvement against goals or performance targets. 

Using only summative evaluation methods and data can leave you guessing until the end of a longer cycle or the end of the year. It’s the combination of—and balance between—formative and summative evaluation information that can lead to the greatest understanding of program progress and impact. 

Building in Formative Evaluation to Balance Out the Summative

This section is intended more for those who use the results of accountability systems than for those who develop the systems. Even so, system designers and implementers can use these recommendations to better inform their communications and modeling efforts. Below, I offer three practical steps for thinking about how formative evaluation can make the data in an accountability system more actionable, and potentially help cut through the noise of traditionally complex accountability systems: 

  1. Understand the accountability system’s theory of action. What are the signals we should attend to? What are the major outcomes of value? Recognizing that accountability results are inherently lagging, what are some of the more malleable leading indicators that can signal progress between one report card cycle and the next? 
  2. Connect earlier evidence (measures and metrics) to each indicator in the accountability system. While this is sometimes challenging, we should strive to establish coherent connections to the indicators often included in accountability systems, like performance and progress on state tests, school conditions for learning, and graduation rates. Evidence may include direct observations, counts, completed tasks, or attendance at a training. In other cases, the evidence is a little more challenging to capture, like surveys, interviews, or document reviews, but still worthwhile.
  3. Once evidence is identified, collect the available data and determine its quality to support progress monitoring. Incremental progress is critical to monitor because it allows us to stay the course if an intervention or program is working, or to course-correct early if it isn’t. The right kind of evidence can also build confidence that intermediate activities will lead to intended outcomes. After all, perfect implementation fidelity is often theoretical and impractical. Monitoring progress can help distinguish the steps that are critical from those that are nice-to-haves. 

In summary, this three-part blog series explored how we can think about the design, development, implementation, and use of accountability systems and their information by discussing three key tensions: (1) simplicity vs. complexity; (2) precision vs. actionability; and (3) summative vs. formative evaluation feedback. This third installment highlighted how these tensions come to a head as we consider whether accountability systems are useful in driving improvement. 

I invite you to share your thoughts on how we can use accountability systems and their results to drive continuous improvement, where they might be coming up short, and how we can do better.
