Peter Yeates, M.B.B.S., M.Clin.Ed., of the University of Manchester, United Kingdom, and colleagues conducted a study to examine whether observing the performances of postgraduate year 1 physicians influences the scores that raters give to subsequent performances.
"The usefulness of performance assessments within medical education is limited by high interrater score variability, which neither rater training nor changes in scale format have successfully ameliorated. Several factors may explain raters' score variability, including a tendency of raters to make assessments by comparing against other recently viewed learners, rather than by using an absolute standard of competence. This has the potential to result in biased judgments," according to background information in the article.
The study consisted of an internet-based randomized experiment using videos of Mini Clinical Evaluation Exercise (Mini-CEX) assessments of postgraduate year 1 trainees interviewing new internal medicine patients. Participants were 41 attending physicians from England and Wales who were experienced with the Mini-CEX. Of these, 20 first watched and scored 3 good trainee performances, and 21 first watched and scored 3 poor performances. All participants then watched and scored the same 3 borderline video performances. The study was completed between July and November 2011.
The researchers found that attending physicians exposed to videos of good medical trainee performances rated subsequent borderline performances lower than those who had been exposed to poor performances, consistent with a contrast bias. The implication is that a rater may be unconsciously influenced by the performance of the previous trainee, rather than objectively assessing each individual in isolation.
"With the movement toward competency-based models of education, assessment has largely shifted to a system that relies on judgments of performance compared with a fixed standard at which competence is achieved (criterion referencing). Although this makes conceptual sense (with its inherent ability to reassure both the profession and the public that an acceptable standard has been reached), the findings in this study, which are consistent with contrast bias, suggest that raters may not be capable of reliably judging in this way."