What I Learned About Creating Effective Test Score Reports From the Great Ron Hambleton
Sadly, one of the giants in the field of educational measurement has recently passed away – Dr. Ron Hambleton. The fact that Ron was one of the most prodigious and acclaimed scholars in our field is undisputed. A list of his works is astonishing in its breadth and influence, and his accolades are unmatched. Despite this, he wasn’t a distant ‘ivory tower’ professor. Communicating in ways that reached broad audiences was one of his passions – one that came through in his works, including books that an entire generation of measurement students studied. I was one of those students.
As my career in measurement progressed, I had a chance to meet Ron. Later, I had the honor of serving with him on state technical advisory committees. I was always more than a little starstruck participating in meetings with him.
Given his natural gift for communication, it should come as no surprise that Ron began to focus more and more of his energy on helping states and programs improve score reports. He often lamented that score reports were the first thing stakeholders cared about but often the last thing measurement folks thought about. “Lead with the reports!” he urged.
As an homage to one of the measurement greats, and because Ron’s advice is as important now as it ever was, I offer a few lessons I learned from Ron over the years about creating effective score reports. Although Ron’s advice extended to a wide range of reports for various purposes and audiences, here I will focus on individual score reports intended for audiences such as teachers or parents.
Four Best Practices for Creating Effective Score Reports:
Say What You Mean
An effective score report should include very clear statements of the intended interpretations. Be sure to provide that message in clear and explicit language rather than filling the page with data and hoping that users will understand the intended message.
Good score reports should include prominent text that provides the most important takeaway. At the overall level, that may include a performance level (e.g., proficient, advanced) and a clear description of the knowledge, skills, and abilities that provide meaning to that classification.
It’s trickier at the subscore level. Subscores are the source of much debate among measurement wonks – a topic that has been addressed in other CenterLine posts. There are different points of view about which among multiple approaches to use or whether to report subscores at all. But whatever the approach, the principle remains the same: state the intended interpretation clearly. Do the results suggest the student is not meeting expectations for fractions, but exceeding expectations on modeling? Say that. Perhaps the outcomes indicate that the student is stronger with literary text than with informational text? State it plainly. Whatever interpretation can be supported, provide a succinct, uncomplicated description.
After all, there is no inherent meaning in three-digit numbers.
Be Honest About Uncertainty
Every reported test score is an estimate that contains error. Score reports should communicate that error plainly and clearly. This principle is well-grounded in professional standards, such as the Standards for Educational and Psychological Testing (2014), but applying that principle well is another matter.
A common approach is to display error bars around the reported score, but users may not always know what those bars mean or how to interpret them. Another approach is to bury some obscure text about measurement error in a footnote or even a separate document, such as a score interpretation guide. I can still hear Ron objecting to that practice!
As a corollary to the previous point, reports should contain clear language about the impact of the error on interpretation. For example, if the error associated with a subscore is so large that the most defensible interpretation is that we don’t know if the student has met expectations for that skill or not, the report should communicate that uncertainty. There’s nothing wrong with adding a category such as ‘uncertain’ to indicate that any other interpretation is not defensible.
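To make the idea concrete, here is a minimal sketch of how an "uncertain" reporting category might be derived from a score's error band. The scores, standard errors of measurement (SEMs), and cut score below are hypothetical illustration values, not drawn from any real assessment, and the z = 1.96 band is just one conventional choice for an approximate 95% interval.

```python
# Sketch of the 'uncertain' category: if the ~95% error band around a
# subscore straddles the cut score, no pass/fail claim is defensible.
# All values here are hypothetical illustrations.

def classify_subscore(score: float, sem: float, cut: float, z: float = 1.96) -> str:
    """Classify a subscore relative to a cut score, reporting
    'uncertain' when the error band straddles the cut."""
    lower = score - z * sem
    upper = score + z * sem
    if lower >= cut:
        return "meets expectations"
    if upper < cut:
        return "does not meet expectations"
    return "uncertain"

# Large SEMs are common for subscores built from only a few items.
print(classify_subscore(score=462, sem=4, cut=450))   # whole band above the cut
print(classify_subscore(score=452, sem=12, cut=450))  # band straddles the cut
print(classify_subscore(score=420, sem=10, cut=450))  # whole band below the cut
```

The point of the sketch is simply that the report logic itself can refuse to over-claim: when the band contains the cut, the honest statement is "we don't know," exactly as described above.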
Create Visually Appealing and Informative Reports
It’s been rewarding to see how advances in paper and digital reports have improved practice. I can remember when score reports featured dot matrix printing and graphs that were created with text characters. We’ve come a long way.
In TAC meetings, Ron would be quick to praise a report that deployed great visuals combined with clear text to enhance the message. Great visuals aren’t too fancy and are designed to deliver a specific message. They also always contain clear titles and labels. Separate graphs may be preferred over a single cluttered graph.
Not only can visuals help clarify interpretations, but they can also streamline the report and make it more appealing to users. If users find a dull, text-heavy document off-putting, they are likely less inclined to give it much attention.
The shift from paper- to web-based reporting paves the way for a promising future for effective test score reporting, including the possibility of digital displays that offer interactive or dynamic features, especially for summary reports. Technology is a great vehicle to enhance reporting and Ron was always eager to embrace its potential to improve practice.
Conduct a User Review
Last, but certainly not least, Ron regularly urged states to have actual test users review the reports before they were finalized. Nothing is a suitable substitute for having teachers, parents, or other key stakeholders examine draft reports and provide feedback to inform the final product.
Gathering feedback on reports isn’t a wish-list activity. The right prompt for a teacher or parent isn’t, “Tell me everything you’d like to see on this report.” That approach will inevitably lead to a cluttered report with a lot of information, some of which is difficult to meaningfully interpret. Better prompts are, “What inference do you draw from this report?” and, “What information did you use to make that interpretation?” In this way, user reviews are more like cognitive labs – they provide space for users to think aloud about what inferences they are making and how they are making them. In doing so, states can better evaluate whether the reports are helping teachers and parents help students.
My focus on Ron’s advice about reporting shouldn’t diminish the immense range and depth of his work over his celebrated career. He could hold forth on intensely technical topics, offer incisive commentary about policy matters, or do deep dives on operational practices. Frankly, I can’t think of any area his expertise didn’t reach. But for the purpose of this post, I chose to highlight his advice about reporting because I think it sheds light on his unwavering optimism that tests should be a useful tool to help improve outcomes for students.
I’ll miss you, Ron, and every time the topic of reporting comes up in TAC meetings, I’ll try to channel your wisdom.