11.15 A Rule-Based Natural Language Processing Pipeline for Anesthesia Classification from EHR Notes

A. J. Nastasi1, S. Bozkurt1, M. Manjrekar1, C. Curtin3,5, T. Hernandez-Boussard1,2,3,4  1Stanford University, Center for Biomedical Informatics Research, Palo Alto, CA, USA; 2Stanford University, Department of Biomedical Data Science, Palo Alto, CA, USA; 3Stanford University, Department of Surgery, Palo Alto, CA, USA; 4Stanford University, Department of Medicine, Palo Alto, CA, USA; 5VA Palo Alto Healthcare Systems, Department of Surgery, Palo Alto, CA, USA

Introduction:
Because different types of anesthesia vary by physiologic mechanism, they have different associations with outcomes such as postoperative pain and delirium. Despite this, strong evidence and clear guidelines are lacking, and in clinical practice anesthesia is often chosen based on personal preference or simple availability rather than its effects on important surgical outcomes, indicating a strong need for research into the best anesthesia type for a given clinical situation. However, because anesthesia types have no associated structured codes, leveraging the immense amount of data in electronic health records (EHRs) to study the outcomes associated with an anesthesia type typically requires time-consuming manual chart review, limiting our ability to study and understand the effects of anesthesia type on postoperative outcomes. To address this methodologic challenge, we hypothesized that a simple, rule-based natural language processing (NLP) pipeline could automatically classify the anesthesia used for a surgery using only free text from EHR notes.

Methods:
A rule-based NLP pipeline was developed in Python 3.6 to determine the type of anesthesia (general, regional, local) from a clinical note associated with an operation. The pipeline first pre-processed the operative notes (lowercasing, removing punctuation, etc.). Then, to extract the context of interest, the text between the anesthesia-type header and the next header was extracted, if present. If no such header was found, other parts of the report were included by checking for the presence of other headers, sentence delimiters, and the word "anesthesia" itself. For classification, extracted contexts were matched via dictionary mapping of target terms (e.g., "GET") to their relevant anesthesia type (e.g., general anesthesia), based on a versatile lexicon built with clinical and domain knowledge. The pipeline was first tested on a sample of 100 post-operative notes from EHRs at an academic medical center, annotated by a clinician. The classifier was then improved and re-assessed over several iterations until satisfactory performance was achieved, as measured by recall, precision, and F1 score.
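The extract-then-match steps above can be sketched as follows. The mini-lexicon, header regex, and function names here are illustrative assumptions; the study's actual lexicon was larger and built with clinical and domain knowledge, and for simplicity this sketch extracts the section before stripping punctuation:

```python
import re

# Illustrative mini-lexicon (assumed): target term -> anesthesia type.
LEXICON = {
    "get": "general",                 # e.g., "GET" mapped to general anesthesia
    "general anesthesia": "general",
    "spinal": "regional",
    "nerve block": "regional",
    "local anesthesia": "local",
    "lidocaine": "local",
}

def extract_context(note: str) -> str:
    """Text between an 'anesthesia' header and the next header, if present;
    otherwise fall back to sentences mentioning 'anesthesia'."""
    match = re.search(r"anesthesia:\s*(.*?)(?=\n[a-z ]+:|\Z)", note, re.S)
    if match:
        return match.group(1)
    return " ".join(s for s in re.split(r"[.\n]", note) if "anesthesia" in s)

def classify(note: str) -> set:
    """Classify the anesthesia type(s) mentioned in an operative note."""
    context = extract_context(note.lower())
    context = re.sub(r"[^\w\s]", " ", context)  # strip punctuation for matching
    return {atype for term, atype in LEXICON.items()
            if re.search(r"\b" + re.escape(term) + r"\b", context)}
```

Because the pipeline was refined over several annotation-and-review iterations, the real lexicon and header-handling rules are considerably richer than this sketch.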

Results:
On the 100 annotated validation notes, the classifier perfectly classified regional anesthesia and nearly perfectly classified general anesthesia, with F1 scores of 1.00 and 0.99, respectively (Table). Local anesthesia was classified with a recall, precision, and F1 score of 0.89, 0.93, and 0.91, respectively. Overall, the classifier had a recall, precision, and F1 score of 0.96, 0.98, and 0.97, respectively.

Conclusion:
Our rule-based NLP classifier successfully classified unstructured free text of clinical EHR notes by the type of anesthesia administered. This enables efficient study of the effects of anesthesia types and combinations on post-operative outcomes, as well as the development of evidence-based anesthesia guidelines.