One of my most salient “aha” moments in teaching and assessing occurred a few years back while grading my students’ portfolios. This work is submitted in three stages, scaffolded throughout the semester, and demonstrates my students’ developing mastery in planning, instructing, evaluating, and reflecting on their work in the classroom. It is a major undertaking for my students to write their portfolios and equally complicated to evaluate and score them in a valid and reliable way. On this occasion, students had submitted the first part of their portfolios and received the scored rubric with feedback. Part two was coming up, and students were encouraged to revise part one for partial credit along with their submission of part two. Everything was going along splendidly until the second submission. Despite the feedback students received on part one, their work showed no marked improvement.
I was dumbfounded. How could this be? I provided a range of scores for each category of work: ‘below’ (1), ‘approaching’ (2), ‘meeting’ (3), and ‘exceeding’ (4) expectations. As a class, we reviewed the expectations for their work. Writing guides were provided to specifically indicate expectations during the writing process. And ample time was provided between the receipt of feedback and the opportunity for resubmission. Yet their work was simply not improving between the first and second submissions. I was baffled and frustrated until it hit me: the rubric did not provide the right type of feedback.
Research indicates that, when properly constructed, rubrics provide valuable feedback, engage students in critical thinking, and offer support for developing writers (Andrade, 2000; Mertler, 2001).
The descriptors are the vertical categories or attributes of student work along the left-hand column of the rubric that delineate each of the major expectations, while the degrees are the horizontal columns along the top that distinguish between performance at each level. For example, in a paper about child development, a descriptor might state: “Summary of the observed child.” It is the degree that describes what that summary should include to merit a score of ‘below,’ ‘approaching,’ ‘meets,’ or ‘exceeds’ expectations. For example, a ‘meets expectation’ score may state: “The description of the observed child includes educational setting and description of the child’s developmental status in a clear, detailed manner, free of interpretation or jargon, rich in relevant details, and consistent with the key points in the paper.” In this degree, my students can clearly see the requirements for an acceptable summary of the observed child. So what was the missing link between my students’ work and their stagnant scores?
In my case, the degrees were not detailed enough to offer suggestions for improvement from one submission to the next. Specifically, while students knew they had to provide a summary of the observed child (descriptor), they were confused about the level of detail required for a ‘meets expectation’ (3) versus an ‘exceeds expectation’ (4). Herein lies the potential of the rubric beyond a simple rating scale (1, 2, 3, or 4).
Based on student feedback, I was able to revise my rubric to provide qualitatively and quantitatively rich feedback to distinguish between the performance at each level of work, which significantly improved my students’ next submission.
Before I began using rubrics extensively, I relied upon my own expertise to grade students’ work, providing comments in the margins for areas that could use additional support or pointing out flaws in arguments. What I found was not simply a great deal of bias – I’m looking at you, halo effect – but inconsistency in the scores themselves, which directly undermines the reliability of scoring (Nisbett & Wilson, 1977). Through the use of rubrics, I can measure how well my students are mastering course concepts in a valid and reliable way. And while the rubrics continue to evolve, their connection to course outcomes and ability to provide feedback to my students remains the focal point.
Below are a few of the key tenets I use for creating valid and reliable rubrics. I add to this list regularly but these are some of the basics:
- Meaningful. The concepts you are evaluating should be meaningful and specific. If you want students to report on the physical description of the schools in which they are observing students, say so!
- Format Specific. If it’s a paper, don’t forget to add a descriptor for grammar. If it’s a presentation, don’t forget to add a section for proper presentation etiquette (e.g., eye contact, pacing, logical progression).
- Quantity. For example, how many pieces of evidence must be cited between a paper that is a 3 (‘meets expectation’) and a 4 (‘exceeds expectation’)?
- Quality. Specifically, how rich and detailed must each piece of evidence be in order to distinguish between the aforementioned 3 and 4?
After many years of teaching the same class, even if the assignments change dramatically, you have collected an arsenal of outstanding student work. I always share exemplary work with my students once they have completed their initial planning, so that their goals and objectives remain their own and their colleagues’ exceptional work serves as a model for conceptual understanding.
These are just a few of the many considerations I make when writing rubrics in an effort to improve validity, or the ability to meaningfully assess the objective of each assignment. But every domain and every piece of work is different, and I’m curious:
- In what ways do you provide feedback for your students?
- What suggestions do you have for writing better scoring mechanisms?
- What assessments work in your classroom that you would suggest trying?
Andrade, H. G. (2000). Using rubrics to promote thinking and learning. Educational Leadership, 57(5), 13-19.
Mertler, C. A. (2001). Designing scoring rubrics for your classroom. Practical Assessment, Research & Evaluation, 7(25), 1-10.
Nisbett, R. E., & Wilson, T. D. (1977). The halo effect: Evidence for unconscious alteration of judgments. Journal of Personality and Social Psychology, 35(4), 250.