J. Aalberg¹, R. Miller¹, D. Gowie¹, C. Divino¹; ¹Mount Sinai School of Medicine, New York, NY, USA
Introduction:
As hospital payments become more closely tied to performance and quality, physicians and healthcare leaders have grown increasingly interested in risk-adjustment. This statistical methodology allows for a direct comparison of adverse event rates and costs by taking into account differences in the relative health of the populations being served. While risk-adjustment has become commonplace at the national level, such as in the National Surgical Quality Improvement Program or as part of agreements between Accountable Care Organizations and payers, there exist no robust implementations of such a system at a truly local level. Current work has been limited by the use of multi-institutional data, which increases the influence of unmeasured confounders, and by the lack of a subsequent reliability-adjustment to correct for sample size.
Methods:
Data were collected for all patients undergoing surgery in six closely related surgical departments at a large tertiary care center between 2013 and 2016. Risk- and reliability-adjusted scores were created for the occurrence of any postoperative complication and for ten specific adverse events, using multivariate logistic and hierarchical models with Empirical Bayes adjustment for physician caseload. Estimates of the number of patients needed for an average provider to attain an adequate level of reliability were then created. Lastly, performance quintiles based on raw and adjusted scores were compared to examine the effect of adjustment on the identification of high- and low-performing physicians.
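The reliability adjustment and caseload estimate described above can be illustrated with a minimal sketch. It is not the model used in this study: it assumes the common signal-to-noise definition of reliability (between-surgeon variance divided by between-surgeon variance plus sampling variance) and a simple binomial approximation for the sampling variance. The function names, the variance inputs, and the linear shrinkage toward the overall rate are all hypothetical choices made for illustration.

```python
import math


def reliability(n_cases, between_var, overall_rate):
    """Assumed signal-to-noise reliability for a surgeon with n_cases."""
    # Binomial approximation of the sampling (within-surgeon) variance.
    within_var = overall_rate * (1.0 - overall_rate) / n_cases
    return between_var / (between_var + within_var)


def shrunken_rate(observed_rate, n_cases, between_var, overall_rate):
    """Empirical Bayes style estimate: pull low-volume surgeons to the mean."""
    r = reliability(n_cases, between_var, overall_rate)
    return r * observed_rate + (1.0 - r) * overall_rate


def cases_needed(target_reliability, between_var, overall_rate):
    """Smallest caseload reaching the target reliability (0 < target < 1)."""
    # Solve r = b / (b + p(1-p)/n) for n, giving n = r/(1-r) * p(1-p)/b.
    p_var = overall_rate * (1.0 - overall_rate)
    n = target_reliability / (1.0 - target_reliability) * p_var / between_var
    # Subtract a tiny epsilon so floating-point noise does not add a case.
    return math.ceil(n - 1e-9)
```

Under these assumptions, rarer events and smaller between-surgeon variance both raise the caseload required, which is consistent in direction with specific complications needing more cases than a composite outcome.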
Results:
‘Any complication’ demonstrated sufficient theoretical statistical reliability with only 161 cases. Anastomotic leak, surgical site infection, and mortality demonstrated high reliability with 357, 563, and 1,057 cases, respectively. Analysis of outer quintiles showed that risk- and reliability-adjustment did not markedly change the physicians identified as positive outliers, but 35% to 85% of negative outliers were only identified as such after full adjustment.
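The outer-quintile comparison can be sketched in the same spirit. The code below is an illustration only, not the analysis performed here: it assumes that a higher score means worse performance, so the top quintile holds the negative outliers, and that quintiles are assigned by simple rank. The function names are hypothetical.

```python
def quintile_labels(scores):
    """Assign quintile 1 (lowest scores) through 5 (highest) by rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    labels = [0] * len(scores)
    for rank, idx in enumerate(order):
        labels[idx] = rank * 5 // len(scores) + 1
    return labels


def newly_flagged_fraction(raw, adjusted, quintile=5):
    """Share of surgeons in the adjusted quintile who were not in it raw."""
    raw_q = quintile_labels(raw)
    adj_q = quintile_labels(adjusted)
    flagged = [i for i, q in enumerate(adj_q) if q == quintile]
    if not flagged:
        return 0.0
    new = [i for i in flagged if raw_q[i] != quintile]
    return len(new) / len(flagged)
```

Calling `newly_flagged_fraction(raw, adjusted, quintile=1)` would give the corresponding figure for the opposite end of the distribution.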
Conclusion:
The use of risk-adjustment to categorize surgeons is indicated by low levels of concordance between raw and adjusted scores. Importantly, adjustment is most crucial when identifying poor-performing outliers, the physicians who would most benefit from increased attention by a quality improvement department. In addition, by examining the impact of adjustment on a variety of postoperative events, we demonstrate that risk-adjustment is most practical when examining overall complication rates. To create reliable estimates of performance for this outcome, only 6 to 12 months of data are needed for each surgeon, allowing for routine distribution of such scores. These results demonstrate that it is feasible to translate the risk-adjustment methodologies refined on the national scale to a single institution. Our experience should encourage others interested in data-driven analyses of internal surgical performance to consider risk-adjustment as an additional tool.