Sizing Up the Next Generation of Large-Scale State Assessment and Accountability

Feb 27, 2020

Part 2: The Role of Education Theory, Public Support, and Political Policy in Shaping the Next Dominant Pattern in State Assessment and Accountability

This is the second in a three-part series on the future of large-scale state assessment and accountability. Of course, it is impossible to know the future, but forecasts for educational assessment can be informed by examining what has shaped state assessment and accountability in the past. In Part 1, I looked at the role played by emerging operational capacities and the desire for efficiency – specifically computer-based assessment.  In this post, I will explore changes in state assessment and accountability bubbling up from changes in educational theory, public values, and political policy.

From around 2010 to now, we have witnessed the shift of almost every state assessment program from paper-based to computer-based testing, with many states also incorporating computer-adaptive testing. I have characterized this transition from paper-based to computer-based testing as being largely concerned with efficiency; computer-based testing has primarily enabled exactly what paper-based testing did, but with more efficiency, flexibility, and exactness, and perhaps at less expense.

That shift in testing brought about a dramatic change in the companies delivering state assessments.

The Constants Around Paper-Based State Testing

Despite the shift to computer-based testing, much about paper-based state testing programs has remained largely unchanged since the 1990s, such as the types of constructs measured, how they are measured, and, importantly, why they are measured and how the results are used. 

Those measurements have served two key purposes:

  • assessing students against state content and performance standards at the end of the year, and
  • providing declarations of student proficiency for incorporation into state test-based accountability systems.

Those accountability systems, in turn, are built on two foundations:

  • an underlying value of equity (“No child left behind”), and
  • a theory of action holding that such equity is brought about by the combination of common content standards, standards-based assessment, and school accountability.

My argument is that the next generation of state assessment and accountability will not be directed purely by a desire for increased efficiency, which in part brought about the shift to computer-based assessment. Nor will it be fueled solely or even primarily by additional technological capacities in educational measurement, such as those forecast by Bunderson et al. I think the future of state assessment and accountability will be shaped primarily by shifts in the underlying values and theories of action of states, educators, and policymakers.

You may argue that new item types enabled by computer-based testing and the increased rigor brought about by the Common Core State Standards and other college-readiness standards represent fundamental changes to state assessment programs. I view those changes, however, as incremental changes within the current paradigm driving state assessment. An example of a fundamental shift in the focus of state assessment is the transition to the standards movement in the late 1980s and 1990s, which was the starting point of my previous post.

Lake Wobegon and The Transition to Standards-Based State Assessment

The first embodiment of modern state assessment was the commercial norm-referenced test (NRT). Following the passage of the Elementary and Secondary Education Act (ESEA) in 1965, states (as contrasted with schools or districts) used NRTs to meet the requirements for Title I program evaluation mandated by the federal government in 1974. Three companies dominated the NRT market: Harcourt, Riverside, and CTB/McGraw-Hill.

The standards-based educational reform movement started in the 1980s as states moved away from norm-referenced tests and toward common content standards and student expectations. Individual states began moving to standards-based content frameworks and assessments, and several states were implementing test-based state accountability systems by the 1990s.

Widespread debate on norm-referenced tests was stimulated by the publication of J.J. Cannell’s “Lake Wobegon” articles, in which Cannell pointed out that, across the nation, most elementary students were reported as having scored above the 50th percentile on NRTs; that is, as in the fictional Lake Wobegon, all the children are above average.

The move to standards-based assessments for states was solidified by the passage of the Improving America’s Schools Act (IASA, 1994), which required all states to implement an assessment based on the state’s content standards in reading/language arts and mathematics. Since each state had its own content standards—as well as ideas about what the test should look like—custom assessments had to be developed to meet each state’s demands.  IASA included other requirements that were difficult for companies specializing in NRTs to meet, at least based on their historical practice, such as being appropriate for all students and providing accommodations.  

Assessment providers had to find ways to develop custom assessments profitably, many of which involved scoring essays, constructed-response math items, and even performance events and portfolios. Meeting IASA requirements also introduced other assessment needs that were quite different from those that NRTs required and fulfilled:

  • secure administration protocols
  • assessment of all students, including those who required accommodations or alternate assessments
  • different psychometric procedures inherent in standards-based as contrasted with norms-based assessments (including the development of new test forms annually and the equating of test forms from year to year)

In addition to the assessment demands, states and their testing contractors had to be able to comply with federal accountability requirements, including complex tracking and reporting of each student.

Like the shift to computer-based assessment in the 2010s, the shift to standards-based, custom assessment had a profound impact on testing companies. Riverside did not market products or services to meet the new market. Harcourt and CTB/McGraw-Hill pivoted to provide custom assessments (while still marketing NRTs). Several new companies geared toward the custom state assessment market rose to prominence: Advanced Systems (later renamed Measured Progress), DRC, Questar, ETS, and Measurement Incorporated. Harcourt was acquired by Pearson (2008). State assessments through 2010 were dominated by these six companies. As discussed in the prior article, the movement to computer-administered state assessments heralded the decline of some of these companies and the rise of several others.

The Next Generation of State Assessment is Upon Us

As the 2020s begin, we are on the cusp of another fundamental shift in state assessment: a shift supported by technological advances but driven by education theory, public support, and political policy.

A handful of states are taking advantage of the Innovative Assessment Demonstration Authority option under the Every Student Succeeds Act to test a variety of locally centered approaches to state assessment:

  • New Hampshire is exploring an assessment aligned with a competency-based approach to education, 
  • Louisiana is piloting an assessment based on texts read by students as part of the local curriculum, and
  • Georgia is attempting to implement a through-course model of state assessment.

New assessments being developed to measure student achievement of the Next Generation Science Standards are redefining what we mean by an assessment item and reintroducing matrix sampling to state assessment. Interest in performance assessments, which waned in the 1990s, is beginning to grow again. In addition to those changes in assessment, there is also a desire to expand state accountability beyond the longstanding focus on test-based performance in English language arts and mathematics.

In Part 1, I showed how advances in technology, combined with the value placed on doing the same things more efficiently, led to a large change in the 2010s in the landscape of state assessment and in which companies were helping states make those changes.

In this Part 2, I have argued that the change from norm-referenced tests to standards-based assessment in the 1990s was the result of several other factors:

  • popular educational theory and policy (standards-based education reform),
  • federal government requirements for state assessment and accountability systems, and
  • a theory of action holding that those ends were best served by state summative assessment (in a few content areas) to identify schools that could then receive state support to improve, which would systemically improve equity for economically disadvantaged students and students in other lower-performing groups.

These factors—educational theory, public support, educational policy, and theories of action (especially for accountability)—remain potent forces fueling change and, supported by continuing advances in technology, will likely shape the future of state assessment and accountability as they have shaped its past.

That future is the topic of Part 3, so stay tuned!

Note: I gratefully acknowledge those who have helped shape my thinking on this topic, chiefly my colleagues at the Center for Assessment, especially Charlie DePascale, with whom I’ve shared many hours of engaging conversation. However, the opinions expressed are mine and are not intended to represent the Center or my associates.