J. Guo1, K. S. Gillani1, C. Chai1, J. Walters2, M. Brett1, S. Sorek1, J. Hausner1, S. DiRusso1,2 1New York College Of Osteopathic Medicine, Old Westbury, NY, USA 2St. Barnabas Hospital, Surgery, Bronx, NY, USA
Introduction: Laparoscopic cholecystectomy (LC) is a common surgical procedure for gallstone disease. Achieving the critical view of safety (CVS) is crucial for successful outcomes, but misidentification of anatomical landmarks can lead to complications such as common bile duct injury (CBDI). The use of artificial intelligence (AI) tools, such as JARVIS, to guide surgical decision-making during LC is an emerging area of interest. This study aims to evaluate JARVIS’s utility in identifying critical landmarks, compared with the expertise of attending surgeons, and to gain insight into surgeons' decision-making processes during this critical step.
Methods: Twenty-five intraoperative photos of the CVS were each analyzed by JARVIS, which produced a numerical score, and by eight attending surgeons, who rated them on a 6-point scale. Based on each image alone, surgeons also decided whether or not to clip the cystic duct and artery, and explained their reasoning if they chose not to. An average agreement between surgeons of >70% and a JARVIS value of >70% were used as cut-offs to represent a positive response to clipping. Congruence between JARVIS analysis and surgeon evaluations was then assessed, and statistical analyses were performed to detect outliers.
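The cut-off rule described in the Methods can be sketched in a few lines of Python. This is an illustrative reading, not the study's actual analysis code: it assumes that "average agreement" means the fraction of surgeons who voted to clip, that the JARVIS score is expressed as a fraction between 0 and 1, and that an image counts as congruent when both sides fall on the same side of the 70% cut-off. All function names and the example data are hypothetical.

```python
from typing import List, Sequence, Tuple

THRESHOLD = 0.70  # cut-off from the Methods, applied here as a strict ">"


def surgeons_positive(votes: Sequence[bool], threshold: float = THRESHOLD) -> bool:
    """True if the fraction of surgeons voting to clip exceeds the cut-off."""
    return sum(votes) / len(votes) > threshold


def jarvis_positive(score: float, threshold: float = THRESHOLD) -> bool:
    """True if the JARVIS score (assumed to lie in 0..1) exceeds the cut-off."""
    return score > threshold


def is_congruent(votes: Sequence[bool], score: float) -> bool:
    """Congruent when JARVIS and the surgeon panel give the same response."""
    return surgeons_positive(votes) == jarvis_positive(score)


def congruence_rate(images: List[Tuple[Sequence[bool], float]]) -> float:
    """Fraction of images on which JARVIS and the surgeons are congruent."""
    matches = sum(is_congruent(votes, score) for votes, score in images)
    return matches / len(images)


# Hypothetical data for illustration only (not taken from the study):
# each entry is (eight surgeon clip/no-clip votes, JARVIS score).
example_images = [
    ([True] * 8, 0.85),                 # both positive
    ([True] * 6 + [False] * 2, 0.40),   # surgeons positive, JARVIS negative
    ([True] * 5 + [False] * 3, 0.55),   # both negative
    ([False] * 8, 0.90),                # surgeons negative, JARVIS positive
]

print(congruence_rate(example_images))  # 0.5
```

Under this reading, a panel of eight surgeons gives a positive response only when at least six vote to clip, since 5/8 is 62.5% and 6/8 is 75%.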
Results: JARVIS demonstrated congruence with surgeon analysis on 17 of the 25 images (68%). Congruence was higher on images where surgeons agreed it was safe to clip the cystic duct and artery (71.43%) than on images where surgeons agreed not to clip (66.67%). Visualization of the raw data also showed that surgeon respondents agreed unanimously on six images. Of those six, only one was deemed safe to clip by all surgeons.
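The abstract reports percentages for the two subgroups but not the underlying image counts. As a consistency check, the sketch below searches for integer splits of the 25 images that reproduce the reported figures to two decimal places. The split it finds is an inference from the reported numbers, not a count stated by the authors, and the function name and rounding tolerance are assumptions made for this illustration.

```python
from typing import List, Tuple

TOTAL_IMAGES = 25   # from the Methods
CONGRUENT = 17      # from the Results


def consistent_splits(
    total: int,
    congruent: int,
    pct_clip: float,
    pct_no_clip: float,
    tol: float = 0.005,  # half a unit in the second decimal place
) -> List[Tuple[int, int, int, int]]:
    """Return (congruent_clip, n_clip, congruent_no_clip, n_no_clip) tuples
    whose subgroup percentages match the reported values within `tol`."""
    results = []
    for n_clip in range(1, total):
        n_no_clip = total - n_clip
        for k_clip in range(n_clip + 1):
            k_no_clip = congruent - k_clip
            if not 0 <= k_no_clip <= n_no_clip:
                continue
            clip_ok = abs(100 * k_clip / n_clip - pct_clip) <= tol
            no_clip_ok = abs(100 * k_no_clip / n_no_clip - pct_no_clip) <= tol
            if clip_ok and no_clip_ok:
                results.append((k_clip, n_clip, k_no_clip, n_no_clip))
    return results


splits = consistent_splits(TOTAL_IMAGES, CONGRUENT, 71.43, 66.67)
print(splits)  # [(5, 7, 12, 18)]
```

The only split consistent with the reported values is 5 of 7 "clip" images and 12 of 18 "do not clip" images, which sums to the reported 17 of 25.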
Conclusion: The study challenges JARVIS’s potential as an adjunct to human interpretation in identifying critical landmarks during LC. There were inconsistencies between JARVIS analysis and surgeon evaluations, and using a cut-off of 70% did not significantly improve agreement. Additionally, patterns of disagreement amongst the surgeons reflect a degree of surgical intuition used intraoperatively. The explanations surgeons gave for not clipping demonstrate how each surgeon differs in their perception of a dissected structure and how elements outside of the CVS, such as camera angle and placement of tools, are factored into their decision. Nevertheless, the ability of AI tools to replicate human judgment or to aid trainees in developing surgical intuition is an important future consideration. Further studies with larger sample sizes and with surgeons at different levels of training are planned to gain deeper insights into surgeons' decision-making processes.