Is Your Accountability System Working the Way It Should?
And How Do You Know?
This is the third in a series of posts by our 2024 summer interns, based on the assessment and accountability projects they designed with their Center mentors. Andrew T. Krist, a doctoral student at the University of Alabama, worked with the Center’s executive director, Scott Marion.
All states have developed and operationalized school accountability systems to meet the requirements of the Every Student Succeeds Act (ESSA). These accountability systems must be approved by the U.S. Department of Education. Unfortunately, once they are designed and approved, few states undertake comprehensive evaluations to investigate whether the systems are working as intended. My internship project contributed to part of one such evaluation.
The Center has been working with the New Hampshire Department of Education to evaluate its school accountability system. The evaluation is guided by four research questions:
- Does the accountability system identify the “right” schools for federal support designations?
- To what extent are resources differentially distributed to identified or non-identified schools based on information from the accountability system, and are these resources adequate for establishing improvement programs?
- Does the accountability system produce useful information for improving schools?
- Do identified schools improve, and do they do so at a faster rate than non-identified schools?
I worked on research questions 1 and 4, but in this blog, I focus only on question 1. Reframed, this research question asks whether schools that are identified for support should be identified, and, conversely, whether schools that are not identified shouldn’t be.
One way to answer this question is by considering how consistently the accountability system identifies schools for Comprehensive Support and Improvement (CSI). A thought experiment can help us understand consistency. Imagine if we were to run the entire accountability process again (assuming we were blind to the first run); what is the probability that we would see the same schools identified?

The Center for Assessment has a long history of helping states evaluate the reliability of their accountability systems. However, New Hampshire uses a profile approach for identifying schools, which differs from the approaches evaluated in previous Center research.
A Different Way of Identifying Schools for Support
A profile approach uses a set of indicators, such as growth or achievement, to make decisions about school performance. This differs from a more widely used approach, in which multiple indicators are combined into a single weighted index. Schools falling into the bottom 5 percent of this weighted index are then identified.
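To make the contrast concrete, here is a minimal sketch of the composite-index logic. The indicator names, scores, and weights are hypothetical, invented for illustration rather than drawn from any state's approved plan.

```python
import pandas as pd

# Hypothetical school-level indicator scores. The names, values, and
# weights below are illustrative only, not from any actual state system.
schools = pd.DataFrame({
    "school_id":   ["A", "B", "C", "D", "E"],
    "achievement": [0.40, 0.75, 0.55, 0.30, 0.90],
    "growth":      [0.50, 0.60, 0.45, 0.35, 0.80],
})

# Combine the indicators into a single weighted index ...
weights = {"achievement": 0.6, "growth": 0.4}
schools["index"] = sum(schools[col] * w for col, w in weights.items())

# ... then flag the schools at or below the 5th percentile of that index.
cutoff = schools["index"].quantile(0.05)
schools["identified"] = schools["index"] <= cutoff
```

In this approach, every design decision is carried by the weights; the profile approach, described next, replaces the single index with explicit combinations of indicator levels.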
New Hampshire’s system, summarized in Table 1, uses four school-level indicators for identification. Each indicator is divided into levels, and interest holders have defined combinations of these levels into “steps.” Each step represents a unique level of performance. Schools in lower steps are considered the lowest performing schools, and the bottom 5 percent of schools, based on their step location, are identified for CSI.
Table 1. Steps in New Hampshire’s system, defined as combinations of indicator levels

| Step | Achievement | Growth | English Language Proficiency (ELP) | Equity |
| --- | --- | --- | --- | --- |
| 1 | Level 1 | Level 1 | Level 1 | Level 1 |
| 2 | Level 1 | Level 1 | Level 2 | Level 1 |
| 3 | Level 1 | Level 1 | Level 1 | Level 2 |
| 4 | Level 1 | Level 1 | Level 2 | Level 2 |
| 5 | Level 2 | Level 1 | Level 2 | Level 2 |
| 6 | Level 2 | Level 2 | Level 2 | Level 2 |
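A minimal sketch of this step logic follows; the dictionary mirrors Table 1, but the function and key names are mine, not the state's.

```python
# Each step in Table 1 is a specific combination of indicator levels;
# lower steps represent lower performance.
STEP_PROFILES = {
    1: {"achievement": 1, "growth": 1, "elp": 1, "equity": 1},
    2: {"achievement": 1, "growth": 1, "elp": 2, "equity": 1},
    3: {"achievement": 1, "growth": 1, "elp": 1, "equity": 2},
    4: {"achievement": 1, "growth": 1, "elp": 2, "equity": 2},
    5: {"achievement": 2, "growth": 1, "elp": 2, "equity": 2},
    6: {"achievement": 2, "growth": 2, "elp": 2, "equity": 2},
}

def assign_step(levels: dict) -> int | None:
    """Return the step whose profile matches a school's indicator levels."""
    for step, profile in STEP_PROFILES.items():
        if profile == levels:
            return step
    return None  # level combinations beyond the steps shown in Table 1

# A school at Level 1 on everything except ELP lands on step 2.
print(assign_step({"achievement": 1, "growth": 1, "elp": 2, "equity": 1}))
```

From there, identification works much like the composite sketch above, except that schools are ranked by step location rather than by an index score before the bottom 5 percent are flagged.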
Profile-based systems offer advantages for states; for example, interest holders can read the decision rules directly from a table like Table 1 rather than infer them from the weights of a composite index. However, little work has been done to evaluate these types of systems.
Building on the work of the Center in this area, I relied on simulations to quantify consistency. In the simulations, I created many replications for each school, where each replication is a simulated, alternative version of that school’s indicator scores.
Why simulation? It allows us to quantify consistency by modeling the other indicator values a school might have received. Instead of Level 4 on the achievement indicator, a school might have received Level 3 or even Level 2 had its situation been slightly different. Importantly, some schools are more likely to see these kinds of changes than others (e.g., schools with small numbers of students, or schools with scores close to the indicator cutoffs). We can then compare many simulated replications to the observed results to quantify consistency.
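Concretely, consistency can be summarized as the share of replications that reproduce each school's observed designation. Here is a minimal sketch with fabricated data; real replications would come from the simulation itself.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Fabricated for illustration: rows are schools, columns are simulated
# replications; True means identified for CSI in that replication.
n_schools, n_reps = 240, 1000
replications = rng.random((n_schools, n_reps)) < 0.05

# In a real analysis, this comes from the actual accountability run.
observed = rng.random(n_schools) < 0.05

# Per-school agreement with the observed designation, averaged
# separately over identified and non-identified schools.
agreement = (replications == observed[:, None]).mean(axis=1)
print("identified schools:    ", agreement[observed].mean())
print("non-identified schools:", agreement[~observed].mean())
```

Because the data here are pure noise, identified schools agree with their replications only about 5 percent of the time; a reliable system should produce rates far above that baseline.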
Simulating New Hampshire’s Profile-Based System
While researchers have simulated data for accountability systems based on an overall or composite score, no one has developed and implemented methods for simulating data from profile-based systems like the one in New Hampshire.
Using a parametric statistical model to simulate data in the New Hampshire system would require specific assumptions about the probability distributions of student and school measures. Those assumptions may not hold given the nature of the present data, which would undermine the credibility of any parametric simulation.
Because of this, I chose a non-parametric approach in which I repeatedly sampled students at random, with replacement, to create many data sets (a technique known as bootstrapping). This method preserves the actual relationships among the measures at both the student and school levels while providing the variation needed to compare replications.
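A minimal sketch of that bootstrap, assuming a student-level table with a school_id column and a placeholder run_accountability() function standing in for the state's full business rules (both names are mine):

```python
import numpy as np
import pandas as pd

def bootstrap_identifications(students: pd.DataFrame,
                              run_accountability,
                              n_reps: int = 1000,
                              seed: int = 42) -> pd.DataFrame:
    """Resample students with replacement within each school, then re-run
    the identification rules on each bootstrapped data set."""
    rng = np.random.default_rng(seed)
    columns = []
    for rep in range(n_reps):
        # Draw each school's students from its own roster, with
        # replacement, preserving student-level relationships.
        replicate = students.groupby("school_id").sample(
            frac=1.0, replace=True, random_state=int(rng.integers(2**32))
        )
        # Placeholder: applies indicator, step, and CSI rules and returns
        # a boolean Series of designations indexed by school_id.
        columns.append(run_accountability(replicate).rename(rep))
    return pd.concat(columns, axis=1)  # one row per school, one column per rep
```

The resulting school-by-replication matrix can then be compared with the observed designations exactly as in the earlier consistency sketch.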
Results of the Simulations

The results of our study provided a clear message.
- New Hampshire’s system was very consistent (about 98 percent in 2023) for non-identified schools: schools not identified in the real data were almost always not identified in the simulated data.
- The system was less consistent (about 67 percent in 2023) for identified schools: schools identified in the real data kept that designation in only about two-thirds of the simulated replications.
The latter result is due mainly to the 5 percent identification target, which yields very few identified schools. In 2018, the system identified only 12 schools for CSI; with a group that small, a single inconsistent school lowers the group’s consistency rate by roughly one in twelve, or about 8 percentage points. Conversely, among the other 95 percent of schools that are not identified, there is far more room for error: one or two schools can flip and the results remain highly consistent.
These initial results are promising. An important component of an accountability evaluation is having confidence that schools identified for support and improvement are the “right” schools. Investigating the other three research questions will help the state understand how well the accountability system serves its intended purposes and supports school improvement.