J. N. Whitrock2, C. G. Pratt2, M. M. Carter2, A. D. Price2, R. C. Chae2, R. M. Van Haren1,2, R. C. Quillin1,2, C. F. Justiniano1,2, L. S. Silski1, S. A. Shah1,2 1University of Cincinnati, Surgery, Cincinnati, OH, USA 2University of Cincinnati, Cincinnati Research on Outcomes and Safety in Surgery (CROSS) Research Group, Cincinnati, OH, USA
Introduction: As artificial intelligence (AI) software becomes more commonplace, it is inevitable that medical students (MS) will use this resource to assist in writing various aspects of their residency application, including the personal statement. This study sought to evaluate the ability of residency application reviewers to discern AI-generated personal statements from MS-written statements and to understand reviewers’ perceptions of this practice. We hypothesized that AI-generated statements would be identifiable.
Methods: Three personal statements were generated using AI by a novice AI user, and three were written by MS who had matched into surgery residency in previous years. All six personal statements were distributed, in random order, to surgical faculty and residents involved in the application review process at our institution. Participants were instructed to independently read all statements and attempt to identify which were AI-generated versus MS-written. In addition, participants completed a survey that explored their opinions regarding AI utilization in writing personal statements.
Results: Of the 30 participants in this study, 50% were faculty (n=15) and 50% were residents (n=15). Experience, defined as the number of years a participant had taken part in the application-reading process, ranged from 0 to 20 years (median 2 years, IQR 1-6.25 years). AI-generated personal statements were identified correctly 59% of the time, and only 3 participants identified all of the AI-generated personal statements correctly. AI-generated personal statements were labeled as the best personal statement 60% of the time (Figure) and the worst personal statement only 43.3% of the time. When asked whether AI should be allowed in the application writing process, 66.7% (n=20) of participants said no, 30% (n=9) said yes, and 3.3% (n=1) were neutral. When asked whether use of AI would impact their opinion of the applicant, 80% (n=24) said yes and 20% (n=6) said no. When responses to these questions and the ability to correctly identify AI-generated personal statements were evaluated by faculty/resident status and by years of experience, no differences or trends were noted (p>0.05).
Conclusion: This is the first study to show that surgical faculty and residents cannot reliably distinguish between AI-generated and MS-written personal statements. This study also found that AI-generated personal statements were more likely to be rated as the best, while statements written by MS who successfully matched into general surgery residency were more likely to be labeled as the worst. These novel findings highlight the pressing need to reach a consensus regarding the use of AI as a tool in the residency application process.