


Understanding and Mitigating Rater Inaccuracies in Educational Assessment Scoring

Rater Monitoring with Inter-Rater Reliability May Not Be Enough for Next-Generation Assessments

Testing experts know a great deal about how to score students’ written responses to assessment items. Raters are trained under strict protocols to apply scoring rules accurately and consistently. To verify that raters have done their job well, we rely on a few basic score quality measures that center on how well two or more raters agree. These measures of agreement are called inter-rater reliability (IRR) statistics, and they are widely used, perhaps in part because they are easy to understand and apply.
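
To make the idea concrete, here is a minimal sketch of two common IRR statistics, exact agreement and Cohen’s kappa, computed for a pair of hypothetical raters scoring the same ten responses on a 0–4 rubric. The scores and function names are illustrative, not drawn from any actual scoring program.

```python
from collections import Counter

def exact_agreement(scores_a, scores_b):
    """Proportion of responses on which two raters assigned the same score."""
    matches = sum(a == b for a, b in zip(scores_a, scores_b))
    return matches / len(scores_a)

def cohens_kappa(scores_a, scores_b):
    """Chance-corrected agreement between two raters (Cohen's kappa)."""
    n = len(scores_a)
    p_o = exact_agreement(scores_a, scores_b)
    counts_a = Counter(scores_a)
    counts_b = Counter(scores_b)
    # Expected agreement if each rater assigned scores independently,
    # at the rates actually observed for that rater.
    p_e = sum((counts_a[c] / n) * (counts_b[c] / n) for c in counts_a)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical scores from two raters on ten responses (0-4 rubric).
rater_1 = [3, 2, 4, 1, 0, 3, 2, 2, 4, 1]
rater_2 = [3, 2, 3, 1, 0, 3, 2, 1, 4, 1]

print(f"Exact agreement: {exact_agreement(rater_1, rater_2):.2f}")  # 0.80
print(f"Cohen's kappa:   {cohens_kappa(rater_1, rater_2):.2f}")     # 0.75
```

The appeal of such statistics is clear from the sketch: each reduces rater performance to a single, easily interpreted number based only on how often raters agree.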
