CenterLine

Validity

Tags: Through Year Assessment, State Testing, Validity, Accountability, Innovative Assessment, Assessment

Trying to Serve Multiple Uses with Through Year Assessments

Assumptions, Claims, and the Evidence Needed to Support Them

Assumptions lead to claims about the types of inferences and uses an assessment system is intended to support, and strong assumptions require strong evidence. This axiom holds for educational measurement as it does for most other scientific endeavors. The current interest in through year assessments appears to rest on very strong assumptions about the multiple purposes these assessments may serve. Is there evidence to support those assumptions?


Tags: School Disruption, Remote Testing, Comparability, Validity, COVID-19 Response, Assessment

Test Score Meaning Under Remote Test Administration

Part 2: Mode or Accommodation? A Framework for Thinking about Remote Administration

This is the second in a series of three posts on planning for the examination of the validity of scores collected through remote test administration. In the first post, Michelle Boyer and Leslie Keng laid out the reasons why states should be concerned about the effect of remote testing on the comparability of score meaning. In the third post in this series, we will discuss specific challenges to score interpretations for remotely administered tests.


Tags: School Disruptions, Remote Testing, Comparability, Validity, Assessment, COVID-19 Response

Test Score Meaning Under Remote Test Administration

Part 1: Why Validity Is Threatened Under Remote Administration Conditions

This is the first in a series of three posts on planning for the examination of the validity of scores collected through remote test administration. Next up is a discussion of a framework for the types of analyses that will be useful for answering three questions: to what degree are scores comparable between remotely tested students and students tested in the classroom, what might be done to adjust scores if they are not comparable, and under what conditions can data be collected to support those analyses?
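One simple starting point for the kind of comparability analysis described above is a standardized mean difference (an effect size) between scores from remotely tested and classroom-tested students. The sketch below is illustrative only; the function name and all score data are hypothetical, not drawn from the post.

```python
# Illustrative sketch: a standardized mean difference (Cohen's d with a
# pooled standard deviation) comparing hypothetical scale scores from
# classroom-tested and remotely tested students.

from statistics import mean, stdev

def standardized_mean_difference(group1, group2):
    """Cohen's d: mean difference divided by the pooled sample SD."""
    n1, n2 = len(group1), len(group2)
    s1, s2 = stdev(group1), stdev(group2)
    pooled_sd = (((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)) ** 0.5
    return (mean(group1) - mean(group2)) / pooled_sd

# Hypothetical scale scores for two small groups of students.
classroom = [500, 512, 488, 530, 495, 507, 521, 498]
remote = [495, 505, 480, 522, 490, 500, 515, 492]

d = standardized_mean_difference(classroom, remote)
print(round(d, 2))
```

In practice, an observed difference like this could reflect mode effects, but it could also reflect differences in who tested remotely, which is why the post emphasizes the conditions under which data are collected.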


Tags: Educational Assessment, Comparability, Validity

Comparability of Scores on the Same Test

In 2018, the Center was honored to be invited by the National Academy of Education to contribute two chapters to the book Comparability of Large-Scale Educational Assessments, which was released earlier this year. The Center’s chapters addressed the foundational issues surrounding the comparability of individual and aggregate group scores when students ostensibly take the same test.


Tags: Educational Assessment, Validity, Claims, NGSS Assessment, Use of Assessment Results

Recommendations to Support the Validity of Claims in NGSS Assessment

Part 2: A New Framework for Organizing and Evaluating Claims

This is the second in a series of posts by our 2020 summer interns and their mentors based on their project and the assessment and accountability issues they addressed this summer. Sandy Student, from the University of Colorado Boulder, and Brian Gong get things started with a two-part series describing their work analyzing the validity arguments for states’ large-scale Next Generation Science Standards (NGSS) assessments.


Tags: Educational Assessment, Validity, Claims, NGSS Assessment, Use of Assessment Results

Recommendations to Support the Validity of Claims in NGSS Assessment

Part 1: Designing Assessments to Support Claims

Once again this year, we are pleased to share posts on CenterLine by our summer interns and their mentors. These posts are based on the project they undertook and the assessment and accountability issues they addressed this summer.


Tags: Educational Assessment, Validity, Reliability, Fairness

2020 Summer Internships: A Little More Certainty in an Uncertain Future

Center Staff and Interns Will Address Pressing Issues in Educational Assessment and Accountability

Although so much about the future seems uncertain, we are excited this month to bring a little normalcy into our world by addressing key questions and challenges in educational assessment and accountability through our 2020 summer internship program. This summer, the Center welcomes four advanced doctoral students who will work with the Center’s professionals on projects that will have direct implications for state and national educational policy. Each intern will work with a Center mentor on one major project throughout the summer.


Tags: Next-Generation Assessment, Educational Assessment, Validity, Accountability

The Next Generation of State Assessment and Accountability

Part 3: Changes to the Current Accountability Model That May Fundamentally Reshape Educational Assessment and Accountability

This is the final installment in a three-part series on the future of large-scale state assessment and accountability. Of course, it is impossible to know the future, but forecasts for educational assessment can be informed by examining what has shaped state assessment and accountability in the past. 


Tags: Educational Assessment, Next-Generation Assessment, Validity, Assessment

Sizing Up the Next Generation of Large-Scale State Assessment and Accountability

Part 2: The Role of Education Theory, Public Support, and Political Policy in Shaping the Next Dominant Pattern in State Assessment and Accountability

This is the second in a three-part series on the future of large-scale state assessment and accountability. Of course, it is impossible to know the future, but forecasts for educational assessment can be informed by examining what has shaped state assessment and accountability in the past.


Tags: Educational Assessment, Scoring, Automated Scoring, Validity, Test Score Reliability, Assessment

Understanding and Mitigating Rater Inaccuracies in Educational Assessment Scoring

Rater Monitoring with Inter-Rater Reliability May Not Be Enough for Next-Generation Assessments

Testing experts know a lot about scoring students’ written responses to assessment items. Raters are trained under strict protocols to apply scoring rules accurately and consistently. To verify that raters did their job well, we use a few basic score quality measures centered on how well two or more raters agree. These measures of agreement, called inter-rater reliability (IRR) statistics, are widely used, perhaps in part because they are easy to understand and apply.
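To make the idea of IRR statistics concrete, here is a minimal sketch computing two common agreement measures, percent exact agreement and Cohen's kappa, for two raters scoring the same responses. The function names and rater data are hypothetical, not from the post; kappa corrects raw agreement for the agreement expected by chance.

```python
# Illustrative sketch: two basic inter-rater reliability (IRR) statistics
# for hypothetical scores from two raters on the same ten responses.

from collections import Counter

def percent_agreement(r1, r2):
    """Proportion of responses on which the two raters gave the same score."""
    return sum(a == b for a, b in zip(r1, r2)) / len(r1)

def cohens_kappa(r1, r2):
    """Observed agreement corrected for agreement expected by chance,
    based on each rater's marginal score distribution."""
    n = len(r1)
    p_obs = percent_agreement(r1, r2)
    c1, c2 = Counter(r1), Counter(r2)
    p_exp = sum((c1[k] / n) * (c2[k] / n) for k in set(r1) | set(r2))
    return (p_obs - p_exp) / (1 - p_exp)

# Hypothetical scores on a 0-3 rubric from two raters.
rater_a = [2, 3, 1, 2, 0, 3, 2, 1, 2, 3]
rater_b = [2, 3, 1, 1, 0, 3, 2, 2, 2, 3]

print(round(percent_agreement(rater_a, rater_b), 2))  # 0.8
print(round(cohens_kappa(rater_a, rater_b), 2))       # 0.71
```

As the post goes on to argue, high agreement alone does not guarantee accuracy: two raters can agree consistently while both drifting away from the scoring rubric.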
