Building Innovative Assessments in an Era of Accountability

How Federal Rules ‘Suffocated’ Georgia’s Testing Pilot

In 2018, the Georgia legislature established a pilot program that allowed districts to develop innovative assessment systems. The purpose of this legislation was to spur assessment improvement and empower local education leaders to create assessments that best meet the needs of their communities, rather than continuing to rely on a single, statewide assessment. 

To pursue federal flexibility for the state pilot, Georgia sought and won permission to participate in the U.S. Department of Education’s Innovative Assessment Demonstration Authority (IADA). That program allowed two consortia of Georgia districts to develop and pilot through-year assessments.

After five years of participating in the IADA, Georgia withdrew in February. Our journey showcases some key difficulties on the road to creating new, innovative assessments. I’d like to share some of the lessons we learned and advocate for changes in federal assessment policy that can better support teaching, learning, and assessment innovation.

Federal Rules Hobbled Assessment Innovation

When Georgia withdrew from the federal pilot, Georgia’s School Superintendent Richard Woods said that the district consortia’s “innovation has been suffocated by rigid federal requirements—namely, the requirement in federal law to use these assessments for high-stakes accountability purposes. True assessment innovation cannot occur if the underlying uses of the test are rooted in hyper-accountability.”

Marianne Perie, Georgia’s IADA technical assistance provider, also noted federal restrictions, writing in a blog post that “the intent [of IADA] was promising, as it gave states a chance to try out new innovations, such as performance tasks, student portfolios, and through-year assessments. Unfortunately, the restrictions that still existed under ESSA hindered states’ abilities to push innovation very far.”

These sentiments accurately summarize our primary takeaway from five years of pursuing assessment innovation in Georgia. Despite the consortia’s admirable ideals and their goals to develop instructionally focused assessments, existing federal laws and regulations—particularly those pertaining to the accountability uses of the assessments—constrained their work and ultimately led us to leave the pilot.

We Can’t Use One Test for Both Accountability and Instruction

Two purposes of assessment dominate current education policy conversations: accountability and instruction. Many design, technical, and administrative decisions must be made with each purpose in mind, while also meeting practical requirements for student testing time and financial costs. It is challenging enough to develop a test that is technically sound for one of those purposes, let alone for two very different purposes. A test’s summative, high-stakes accountability uses will almost always conflict with the formative goals of tests used to support classroom instruction.

As Center President Scott Marion wrote in a recent blog post, “We have a long history of trying to combine instruction and accountability, and accountability always wins.” Given the lack of evidence, or even a logical theory of action, to support the possibility that assessments could serve both purposes, I remain unconvinced that it is feasible for one assessment to do so. (For more on this, see several Center resources on through-year assessment: its recent white paper, its virtual convening, and this blog post I wrote with Scott.)

We Need a System of Assessments

While a single assessment cannot serve both accountability and instructional purposes, a system of assessments can. A cohesive, balanced system of assessments can provide both formative information to support teachers’ instruction in real time, and an end-of-year summative component to support state and federal accountability. 

Each component can be built around the design, technical, and administrative decisions best suited to its purpose and its intended users. By protecting assessments intended to support instruction from accountability use, we increase the likelihood they can actually be used that way. We also keep high-stakes summative assessments in their appropriate lane: as a tool to monitor and support schools without being a constant presence in classrooms.

Assessment Innovation Will Falter Without Policy Innovation

With advances in technology and infrastructure, as well as new test development and psychometric methods, we can and should innovate to improve our assessments. But many people point to the rigidity and limitations of current assessments as the problem, ignoring the real cause: accountability requirements. While assessment innovations are occurring, large-scale innovations will be stifled without innovations in policy.

Federal testing requirements should focus on accountability and serve four main purposes: 1) provide a high-quality, annual statewide measure of student achievement of the state’s academic content standards, with consistent standards of achievement and proficiency across years; 2) measure the achievement of student groups to ensure equitable access to a high-quality education; 3) measure school and district performance to support improvement efforts; and 4) support the determination of statewide policy and funding priorities. Federal policy can also ensure, through its peer-review process, that the assessments states develop and administer to fulfill such high-stakes purposes are high-quality, technically sound, and can, in fact, serve their intended purpose.

The additional requirements currently in federal law, however, are not needed to serve these purposes. Modifying federal law to focus on assessment for accountability determinations at group levels can ensure the technical quality of those decisions while leaving student-level feedback to instructionally focused formative assessments.

Streamlining federal law in this way would also allow for shortened, less intrusive accountability tests, which would free up time and space for formative assessments that fulfill instructional purposes in a low-stakes setting. Protecting formative assessment from accountability requirements also allows these assessments to meet technical quality requirements appropriate for their innovative design, without the restriction of the technical quality requirements needed for high-stakes summative tests.

We need to shift some of the excessive focus on accountability to instruction, and change federal assessment policy to better support student learning. I offer one comprehensive solution and one partial solution.

  1. Modify the requirements that stifle assessment innovation and restrict instructional creativity.

Reconsider requirements to test all students every year on the full depth and breadth of the state content standards. Consider reducing the scale of requirements for current assessments (for example, subscore reporting). We could require testing in fewer grades and/or allow content sampling approaches (such as a NAEP-like model). 

Assessments for accountability need not waste time and resources trying to simultaneously provide instructionally focused information. By reducing the footprint of high-stakes testing, we free up instructional time and remove some of the accountability-driven pressure that has negatively impacted teaching and learning (test prep time, narrowing of curriculum, teacher burnout and turnover, etc.). Local assessments can then be used to support through-year instructional goals.

  2. Expand the IADA to include accountability.

Let states innovate not just the assessment instruments themselves, but also how and when we assess students and how we hold ourselves accountable. An innovative assessment requires an innovative accountability solution, and without this, innovative tests will not find a place within a balanced system of assessments. Such a solution would provide an opportunity to explore multiple innovative approaches across states before implementing federal assessment policy reform.

Georgia’s innovative assessment consortia were on the right track. They wanted to develop instructionally focused assessments that their teachers supported, understood, and could use to inform their daily work with students. After all, that is the goal of educational assessment: to inform teaching and learning. It’s time to rethink federal policy so that we can better meet both assessment purposes: accountability and instruction.