Rethinking Assessment on a Shrinking, Shifting Landscape

Jul 30, 2025

Key changes are making it harder to tell the story of student learning

Every few years, we hear calls to “future-proof” our education systems (in the ways we develop assessment systems, use AI in assessment, or train teachers, for instance). But we don’t need a future-proof system. We need a system that’s honest about the present—one that acknowledges the demographic shifts, structural tensions and data blind spots that already exist. Only then can we responsibly build toward tomorrow.

This blog is a reflection on the session I presented at the 2025 National Conference on Student Assessment. I focused on how assessment system design challenges, emerging technologies and changes in student enrollment are colliding in ways that force us to rethink not just our metrics, but our values.

The System Is Shrinking—and Fragmenting

Let’s start with the numbers. Public school enrollment is declining in most states, a trend driven by sustained drops in birth rates (CDC, 2023) and an aging population (Census Bureau, 2020). The National Center for Education Statistics projects continued declines through 2031, approaching 4% nationally, with rural states in the Northeast and Midwest hit hardest.

But that’s only part of the story. More students are disappearing from traditional accountability and assessment systems: private school choice programs and microschools are expanding, and charter school attendance is growing.

Our state data systems weren’t built for this. They rely on traditional public K-12 structures: consistent grade-level enrollment, annual test participation, and stable student-school assignments used for accountability and resource distribution. But those assumptions are being challenged. And when they break down, our ability to tell the story of student learning and system performance breaks down.

We’re Losing Data in the Name of Flexibility

Test participation is dropping. Federal oversight is thinning. Many new instructional models (virtual academies, microschools) operate outside of state purview. Meanwhile, states face increasing demands to develop more instructionally aligned, efficient, and accessible assessments—often branded as “innovations”—that respond to both federal policy expectations and local usability needs. And they’re often being asked to do so with fewer resources and less centralized authority.

All of these dynamics have produced a kind of shadow over the system; key parts of the student experience are falling outside the light of our current measurement systems. The result? Fragmented feedback loops, lost trend lines, and a growing disconnection between what we say we value and what we actually measure.

Innovation Without Coherence Is Just Noise

Contextually appropriate tools meet the moment if they’re integrated into the broader system with purpose and integrity. States are embracing through-year assessments and interim-based designs with the hope of better aligning instruction and assessment. That’s admirable. But many of these systems were conceived in a world where test participation was stable, enrollment predictable, and comparability achievable. The challenge arises when we retrofit these assessments for purposes they weren’t built to serve—overlooking the intent, constraints, and validity arguments of their original design.

To be clear: traditional systems face these same pressures. But what makes new systems particularly vulnerable is the assumption that their innovative nature shields them from those same structural challenges. If anything, the risk is greater. I would argue that we haven’t outgrown traditional systems so much as we’ve outpaced the conditions that allowed them—and their replacements—to function as intended. Innovation without coherence doesn’t correct the system; it just decorates it.

It’s a bit like hanging something heavy on drywall without finding the studs: That big TV might look okay at first, but without the right supports, the setup will soon collapse. Repurposing assessments without revisiting their foundational assumptions can produce similarly shaky results. The surface alignment looks good, but the structure underneath can’t support the weight of real decisions. 

Now, throw in generative AI. While it offers promise for item creation, scoring support, and individualized feedback, it also risks embedding bias, obscuring decision-making, and further distancing practitioners from the heart of measurement: interpretation.

If we’re not careful, we’ll build faster systems that tell us less. Worse, we may inadvertently reify outdated ideas about fairness and success through more sophisticated tools.

The “Who” Is Changing. The “What” and “Why” Must Too

Our youngest learners are increasingly multilingual, mobile (often moving between districts or schools due to housing instability, immigration, or family employment patterns), and shaped by diverse educational pathways. According to the most recent data from the WIDA Consortium (as presented at the National Conference on Student Assessment, 2025), the multilingual population continues to grow among early learners, with nearly a quarter of all students projected to be multilingual learners. This growth challenges everything from item accessibility to subgroup reporting to growth model stability.

We must revisit the business rules that underpin our systems, particularly full-academic-year definitions, which typically require enrollment for most of the school year and therefore tend not to accommodate mid-year transfers, hybrid attendance, or students in alternative learning models. Likewise, subgroup minimums and participation thresholds weren’t written for a system in motion.

Instead, we might advocate for more flexible enrollment criteria that reflect instructional exposure rather than seat time: for example, defining participation eligibility by a minimum number of instructional days rather than fixed calendar dates. Likewise, subgroup reporting rules could shift toward rolling multi-year aggregates to ensure visibility without sacrificing reliability. The sketch below illustrates both ideas.
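
To make that concrete, here is a minimal sketch in Python of what both rules might look like. Everything in it is illustrative: the record fields, the 140-day exposure threshold, the minimum n of 10, and the three-year window are invented for the example, not drawn from any state’s actual business rules.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class EnrollmentRecord:
    student_id: str
    entry_date: date
    exit_date: date
    instructional_days_attended: int  # days of instruction actually received

# Traditional rule: "full academic year" means enrolled from a fixed count
# date through the test window. A mid-year transfer fails this outright.
def fay_calendar(rec: EnrollmentRecord, count_date: date, test_date: date) -> bool:
    return rec.entry_date <= count_date and rec.exit_date >= test_date

# Exposure-based alternative: eligibility keyed to instructional days
# received anywhere in the system, regardless of calendar dates.
MIN_INSTRUCTIONAL_DAYS = 140  # illustrative threshold, not a real policy value

def fay_exposure(rec: EnrollmentRecord) -> bool:
    return rec.instructional_days_attended >= MIN_INSTRUCTIONAL_DAYS

# Rolling multi-year subgroup reporting: pool the most recent years so small
# subgroups clear a minimum-n threshold without disappearing from reports.
MIN_N = 10  # illustrative reporting minimum

def rolling_subgroup_mean(scores_by_year: dict[int, list[float]],
                          current_year: int, window: int = 3) -> float | None:
    pooled = [score
              for year in range(current_year - window + 1, current_year + 1)
              for score in scores_by_year.get(year, [])]
    return sum(pooled) / len(pooled) if len(pooled) >= MIN_N else None

# A student who transferred in November but logged 145 instructional days:
rec = EnrollmentRecord("S1", date(2024, 11, 4), date(2025, 5, 30), 145)
print(fay_calendar(rec, date(2024, 10, 1), date(2025, 4, 15)))  # False
print(fay_exposure(rec))                                        # True
```

The point of the contrast: the exposure-based rule keeps the mid-year transfer visible, while the calendar-based rule silently drops that student from the accountability picture.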

Some states already do this. And as mobility increases, portable student records and common growth metrics, such as shared growth scales, could help smooth the attribution of learning across schools and systems; one simple way credit might be apportioned is sketched below.
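
As a sketch of what shared attribution could look like: if growth is reported on a common scale, credit could be split by each school’s share of a student’s instructional days. The proportional weighting here is an assumption chosen for simplicity, not an established attribution model.

```python
def attribute_growth(growth_on_common_scale: float,
                     days_by_school: dict[str, int]) -> dict[str, float]:
    """Split one student's growth (reported on a shared scale) across the
    schools that served them, in proportion to instructional days at each."""
    total_days = sum(days_by_school.values())
    if total_days == 0:
        return {school: 0.0 for school in days_by_school}
    return {school: growth_on_common_scale * days / total_days
            for school, days in days_by_school.items()}

# A student who gained 12 points on the shared scale and split the year
# 60/120 days between two schools: School A is credited 4 points, School B 8.
print(attribute_growth(12.0, {"School A": 60, "School B": 120}))
```

Day-proportional weighting is only one possible choice; a production model would also need to handle overlapping enrollments, missing records, and measurement error, which is precisely where portable records and shared scales earn their keep.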

More fundamentally, we must ask: Are we measuring what matters to our communities? Or are we just optimizing for what’s easy to quantify?

A Civic Approach to Measurement

This is where the work of the NCME Civil Rights in Measurement Task Force resonates deeply. It reminds us that measurement isn’t just a technical act—it’s a civic one. It reveals who we see, what we value, and which stories we choose to tell (or not).

I raised these questions in my presentation, and they have stuck with me as something we must consider as our assessment and accountability systems evolve. They are both present- and future-focused questions:

  • Who’s missing from our data?
  • How do we design assessments that recognize community and cultural differences?
  • What assumptions must we release to move forward with integrity?

These questions aren’t merely philosophical. They’re practical, and they force us to articulate our guiding principles (see my blog) and operationalize design criteria.

Building Systems That Recognize Change

If you were building a new assessment system today, knowing your student population will be significantly smaller, 40% more linguistically diverse, and increasingly outside the public system, would it look the same as the one you have now?

If the answer is no, it might be time to stop waiting for the future and start designing for it.

Photo by Allison Shelley/The Verbatim Agency for EDUimages
