Reliability, validity, and feasibility of the Zwisch scale for the assessment of intraoperative performance.

George BC, Teitelbaum EN, Meyerson SL, Schuller MC, DaRosa DA, Petrusa ER, Petito LC, Fryer JP. J Surg Educ. 2014 Nov-Dec;71(6):90-96.





Existing methods for evaluating resident operative performance interrupt the workflow of the attending physician, are resource intensive, and are often completed well after the end of the procedure in question. These limitations lead to low faculty compliance and potentially significant recall bias. In this study, we deployed a smartphone-based system, the Procedural Autonomy and Supervisions System, to facilitate assessment of resident performance according to the Zwisch scale with minimal workflow disruption. We aimed to demonstrate that this is a reliable, valid, and feasible method of measuring resident operative autonomy.


Before implementation, general surgery residents and faculty underwent frame-of-reference training to the Zwisch scale. Immediately after any operation in which a resident participated, the system automatically sent a text message prompting the attending physician to rate the resident's level of operative autonomy according to the 4-level Zwisch scale. Of these procedures, 8 were videotaped and independently rated by 2 additional surgeons. The Zwisch ratings of the 3 raters were compared using an intraclass correlation coefficient. Videotaped procedures were also scored using 2 alternative operating room (OR) performance assessment instruments (Operative Performance Rating System and Ottawa Surgical Competency OR Evaluation), against which the item correlations were calculated.
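The interrater comparison described above can be illustrated with a short sketch. The abstract does not state which form of the intraclass correlation coefficient was used, so the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1), is assumed here. The function name `icc_2_1` and the example ratings are hypothetical and are not data from the study.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a list of rows: one row per procedure, one column per rater.

    Assumption: two-way random effects, absolute agreement, single rater.
    The abstract does not say which ICC form the authors computed.
    """
    n = len(ratings)        # number of rated procedures
    k = len(ratings[0])     # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(col) / n for col in zip(*ratings)]

    # Mean squares from a two-way ANOVA without replication.
    ms_rows = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_cols = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    ss_err = sum(
        (x - row_means[i] - col_means[j] + grand) ** 2
        for i, row in enumerate(ratings)
        for j, x in enumerate(row)
    )
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Hypothetical Zwisch levels (coded 1 to 4) for 8 procedures and 3 raters.
# These values are invented for illustration only.
example = [
    [1, 1, 1], [2, 2, 2], [2, 2, 3], [3, 3, 3],
    [3, 3, 3], [4, 4, 4], [1, 2, 1], [4, 4, 3],
]
print(round(icc_2_1(example), 2))
```

Treating the 4 Zwisch levels as numeric scores is itself a simplification made for this sketch; an ordinal agreement measure would be a reasonable alternative.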


Between December 2012 and June 2013, 27 faculty used the smartphone system to complete 1490 operative performance assessments on 31 residents. During this period, faculty completed evaluations for 92% of all operations performed with general surgery residents. The Zwisch scores were shown to correlate with postgraduate year (PGY) levels based on sequential pairwise chi-squared tests: PGY 1 vs PGY 2 (χ² = 106.9, df = 3, p < 0.001); PGY 2 vs PGY 3 (χ² = 22.2, df = 3, p < 0.001); and PGY 3 vs PGY 4 (χ² = 56.4, df = 3, p < 0.001). PGY 4 and PGY 5 scores were not significantly different (χ² = 4.5, df = 3, p = 0.21). For the 8 operations reviewed for interrater reliability, the intraclass correlation coefficient was 0.90 (95% CI: 0.72-0.98, p < 0.01). Correlation of Procedural Autonomy and Supervisions System ratings with both Operative Performance Rating System items (each r > 0.90, all p's < 0.01) and Ottawa Surgical Competency OR Evaluation items (each r > 0.86, all p's < 0.01) was high.
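The pairwise tests above compare two PGY levels across the 4 Zwisch levels, which gives a 2 x 4 table and therefore (2 - 1)(4 - 1) = 3 degrees of freedom. The sketch below shows a Pearson chi-squared calculation of that shape and converts a statistic with df = 3 into a p-value. The function names and the counts are hypothetical (the abstract reports only the test statistics, not the underlying tables), and whether the authors used exactly this procedure is an assumption.

```python
import math


def chi2_statistic(table):
    """Pearson chi-squared statistic and degrees of freedom for an r x c count table.

    Assumes every row and column total is nonzero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def chi2_sf_df3(x):
    """Upper-tail p-value for a chi-squared statistic with exactly 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


# Hypothetical counts: rows are two PGY levels, columns are the 4 Zwisch levels.
# These numbers are invented for illustration and are not from the study.
hypothetical = [
    [40, 30, 20, 10],
    [10, 20, 30, 40],
]
stat, df = chi2_statistic(hypothetical)
print(stat, df)

# Reported PGY 4 vs PGY 5 statistic from the abstract: chi-squared = 4.5, df = 3.
print(round(chi2_sf_df3(4.5), 2))  # 0.21, matching the reported p-value
```

The closed-form tail probability holds only for df = 3; a general-purpose statistics library would be the usual choice for other table sizes.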


The Zwisch scale can be used to make reliable and valid measurements of faculty guidance and resident autonomy. Our data also suggest that Zwisch ratings may be used to infer resident operative performance. Deployed on an automated smartphone-based system, the scale can feasibly be used to record evaluations for most operations performed by residents. This information can be used to counsel individual residents, modify programmatic curricula, and potentially inform national training guidelines.

PubMed ID 25192794
