This is the fourth in a series of posts by our 2020 summer interns and their mentors, based on their projects and the assessment and accountability issues they addressed this summer. Maura O’Riordan, from the University of Massachusetts Amherst, worked with Chris Domaleski to develop guidance to help states better understand the impact of assessment accommodations on the meaning and interpretation of test scores.
When tests are administered under accommodated conditions, it is essential that the results can be interpreted and used as intended. For large-scale K-12 testing, that expectation (as addressed in critical element 5.3 of the United States Department of Education peer review guidance) requires states to provide evidence that the available accommodations allow for meaningful interpretations of the scores as compared to the scores when the assessment is administered under standard (unaccommodated) conditions.
This task is necessary but can be daunting in practice. To help address this problem of practice, we developed a framework to provide guidance to states on the selection, documentation, and use of accommodations on large-scale assessments. The framework is intended to bridge the gap between the best practices identified in the literature and the practicality of demonstrating evidence in support of the use of accommodations.
One important point we noted in the literature is that the accommodations allowed on an assessment depend in large part on the construct being measured. For example, allowing the use of a calculator on an assessment that is intended to measure computational skills would alter the construct being measured. Evidence in support of accommodation use needs to relate specifically to the construct being measured by the assessment of interest and the students being assessed. Also, accommodations vary widely and research in the area is often specific to grade levels or disability types.
Typically, accommodations are provided to two groups of students: students with disabilities and English learners. The students in these two groups have varying needs, and no one-size-fits-all accommodation plan exists for them. Because of this diversity, claims about the appropriateness of accommodations should hold for all students taking the assessment with an accommodation, as well as for all accommodations or types of accommodations. Additionally, the interaction between the specific needs of the student and the specific accommodation(s) they are allowed must be considered.
Description of the Framework
Our framework incorporates themes that we saw as prevalent in our review of the literature and allows for the evidence to be customized based on what an assessment is intended to measure. There are three sections of the framework: implementation, impact, and interpretation. Spanning the three sections is a fourth idea, interaction, which is intended to demonstrate that each of the three sections must address all appropriate examinee groups and accommodations, and as necessary, the interaction among those groups and accommodations.
A visual representation of the framework is presented in Figure 1. The claims we provide as examples are not exhaustive, and the evidence given to support each claim is illustrative. We acknowledge there is more than one way to support some claims.
Implementation encompasses the broad idea that appropriate accommodations are available and are used as intended.
- One supporting claim specifies that there is a method for determining which accommodations are allowable based on the construct the test is intended to measure, and that this method minimizes the influence of factors unrelated to that construct (Elliot et al., 2002; Laitusis et al., 2012; Philips, 1994). Possible evidence to demonstrate this claim may come from a description of the construct being measured and a rationale for why the accommodations do not alter that construct.
- Another supporting claim clarifies that appropriate accommodations are available and implemented as intended. Resources such as written testing protocols or training materials can help support this claim. Moreover, analyses of type and frequency of accommodation for appropriate examinee groups, feedback from test administrators, and reports from assessment monitoring can further bolster the quality of evidence.
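As a rough illustration of the "type and frequency" analysis mentioned above, the following minimal sketch tallies how often each accommodation was used within each examinee group. The records, group labels, and accommodation names are all hypothetical; a real analysis would draw on a state's own administration data.

```python
from collections import Counter

# Hypothetical administration records: (examinee group, accommodation used).
# Every label below is illustrative, not taken from any actual assessment.
records = [
    ("students with disabilities", "extended time"),
    ("students with disabilities", "read aloud"),
    ("students with disabilities", "extended time"),
    ("English learners", "bilingual glossary"),
    ("English learners", "extended time"),
]

def accommodation_frequencies(records):
    """Count uses of each accommodation within each examinee group."""
    return Counter(records)

counts = accommodation_frequencies(records)

# Print one row per (group, accommodation) pair, sorted for readability.
for (group, accommodation), n in sorted(counts.items()):
    print(f"{group:28s} {accommodation:20s} {n}")
```

A table like this can show at a glance whether an accommodation is used rarely, or only by one group, which may help decide where more detailed evidence is needed.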
The second section of the framework addresses impact. This section examines the relationship between the accommodations that were selected and outcomes for examinees.
- For example, a key claim is that the accommodations remove barriers for examinees to demonstrate their knowledge, skills, and abilities. This claim could be supported via expert judgment or cognitive labs.
- A second claim is that accommodations do not alter the construct being assessed. Differential boost studies, discussed by Sireci et al. (2005), or a comparison of data from field tests for examinees with and without accommodations can provide supporting evidence.
- Still another claim specifies that accommodations should not alter the difficulty of the content, as discussed by Faulkner-Bond and Soland (2020). This claim could be demonstrated via differential item functioning (DIF) studies or other approaches.
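To make the differential boost idea more concrete, here is a minimal sketch of the core comparison. The usual logic is that an accommodation should raise scores more for the students it is intended for than for students who do not need it. The function name and the score lists are hypothetical, and an actual study would also require an appropriate design and a significance test for the group-by-condition interaction.

```python
from statistics import mean

def differential_boost(target_standard, target_accommodated,
                       comparison_standard, comparison_accommodated):
    """Return (target boost, comparison boost, difference between them).

    Each argument is a list of scores for one group under one condition.
    A boost is the mean accommodated score minus the mean standard score.
    """
    boost_target = mean(target_accommodated) - mean(target_standard)
    boost_comparison = mean(comparison_accommodated) - mean(comparison_standard)
    return boost_target, boost_comparison, boost_target - boost_comparison

# Hypothetical scores, for illustration only.
target_standard = [10, 12, 14]          # intended group, standard conditions
target_accommodated = [15, 17, 19]      # intended group, with accommodation
comparison_standard = [20, 22, 24]      # comparison group, standard conditions
comparison_accommodated = [21, 23, 25]  # comparison group, with accommodation

boost_target, boost_comparison, difference = differential_boost(
    target_standard, target_accommodated,
    comparison_standard, comparison_accommodated,
)
print(boost_target, boost_comparison, difference)
```

In this made-up example the intended group gains more than the comparison group, which is the pattern a differential boost study looks for. Similar gains in both groups would call for a closer look at whether the accommodation changes what is being measured.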
The third section of the framework addresses interpretation. Interpretation refers to the use of the scores for their intended purpose.
- A central claim in support of interpretation is that the measures obtained under accommodated conditions elicit the target knowledge and skills. This outcome can be demonstrated through review of items or task specifications.
- A second claim is that scores for examinees using accommodations are sufficiently precise, which can be addressed through measures of reliability and standard error of measurement, as well as studies of classification accuracy.
- In most cases, it’s also important to support the claim that scores for examinees using accommodations can be meaningfully compared to those from examinees not receiving accommodations. Approaches for providing supporting evidence may include differential item functioning (DIF) studies for accommodated/non-accommodated examinees or tests of measurement invariance across the two groups.
- Sireci and O’Riordan (2020) discuss the idea that it may be more relevant to demonstrate that the intended score inferences are precise or supported than to focus on the scores themselves. This relates to a fourth claim: that applicable score inferences for examinees using accommodations are supported.
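The precision claim above can be illustrated with a small sketch that estimates reliability and the standard error of measurement separately for accommodated and non-accommodated examinees. It assumes classical test theory, uses coefficient alpha as the reliability estimate, and computes SEM as the score standard deviation times the square root of one minus reliability. The item responses and function names are hypothetical, and operational programs may rely on other reliability estimates or on conditional standard errors.

```python
from math import sqrt
from statistics import stdev, variance

def cronbach_alpha(item_scores):
    """Coefficient alpha; item_scores has one row per examinee, one column per item."""
    k = len(item_scores[0])
    item_variances = [variance(column) for column in zip(*item_scores)]
    total_variance = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

def standard_error_of_measurement(item_scores):
    """SEM = SD of total scores * sqrt(1 - reliability), with alpha as reliability."""
    totals = [sum(row) for row in item_scores]
    return stdev(totals) * sqrt(1 - cronbach_alpha(item_scores))

# Hypothetical right/wrong (1/0) responses on a four-item test.
groups = {
    "accommodated": [
        [1, 1, 1, 0], [1, 1, 0, 0], [1, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0],
    ],
    "standard": [
        [1, 1, 1, 1], [1, 1, 0, 1], [1, 0, 0, 0],
        [0, 0, 0, 0], [1, 1, 1, 0], [0, 1, 0, 0],
    ],
}

for name, scores in groups.items():
    alpha = cronbach_alpha(scores)
    sem = standard_error_of_measurement(scores)
    print(f"{name}: alpha = {alpha:.2f}, SEM = {sem:.2f}")
```

Comparable reliability and SEM values across the two groups would be one piece of evidence that scores obtained under accommodated conditions are sufficiently precise. With samples this small the numbers are purely illustrative.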
The fourth component of the framework is termed interaction. Rather than standing alone, it spans the previous three sections (implementation, impact, and interpretation): the claims and evidence in each must address all appropriate examinee groups and, as necessary, the interactions among those groups. Additionally, consideration of the different accommodations provided is important for many of these claims. The evidence may require a more detailed look than accommodated versus unaccommodated, delving into the specific accommodations allowed on that assessment for specific groups or conditions.
We hope this framework will make the process of providing evidence in support of the use of accommodations on large-scale assessments a less daunting task for states. We think that identifying the claims being made about the use of accommodations on specific tests will make the necessary evidence clearer and allow for stronger arguments in support of the selected accommodations, thus supporting the validity argument for the test more clearly.