C. E. Hunter1, Z. M. Saenz1, D. Nunez1, L. Timsina2, B. W. Gray1 1Indiana University School Of Medicine, Division Of Pediatric Surgery, Department Of Surgery, Indianapolis, IN, USA 2Indiana University School Of Medicine, Center For Outcomes Research In Surgery, Department Of Surgery, Indianapolis, IN, USA
Introduction:
The Congenital Diaphragmatic Hernia Study Group (CDHSG) registry is a vital multi-institutional tool for tracking outcomes of patients with congenital diaphragmatic hernia (CDH) in order to improve prognosis and patient care. The CDHSG asks surgeons to categorize each patient’s diaphragmatic defect size as grade A, B, C, or D based on published guidelines. A reliable grading system for these defects is important for individual patient prognosis and for clinical research, because the reported defect size has been correlated with patient outcomes such as survival. However, the inter-rater reliability of this system has not been evaluated. The goal of this study was to evaluate the inter-rater reliability of the CDHSG grading system.
Methods:
Operative notes were collected from patients who underwent surgical repair of a unilateral CDH at a single institution between 2010 and 2016. Forty-six operative notes were cropped to include only the information necessary to grade the hernia defect A-D according to the CDHSG grading system guidelines. The defects were graded by 9 pediatric surgeons of differing experience levels, and inter-rater reliability was determined by calculating Cohen’s kappa (κ). The following cutoffs were used to interpret κ: ≤ 0 – no agreement, 0.01-0.20 – none to slight agreement, 0.21-0.40 – fair agreement, 0.41-0.60 – moderate agreement, 0.61-0.80 – good agreement, 0.81-1.00 – very good to perfect agreement. Data were also collected on intraoperative findings (liver up vs. down, extracorporeal membrane oxygenation (ECMO) status, need for patch repair) and patient outcomes (length of stay, survival).
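As an illustration of the statistic named above, the following is a minimal sketch of Cohen’s κ for two raters, together with the interpretation cutoffs listed in the Methods. It is not the authors’ analysis code: the function names and the example grades are hypothetical, and because the abstract does not say how the nine raters were pooled for an overall value, the sketch covers only the pairwise case. Values that fall in the small gaps between the published bands (for example, 0.205) are assigned to the next higher band here, which is an assumption.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who graded the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must grade the same non-empty set of cases")
    n = len(rater_a)
    # Observed agreement: share of cases given the same grade by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal grade frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[g] * counts_b[g] for g in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical grade throughout
    return (observed - expected) / (1.0 - expected)


# Upper bounds of the interpretation bands quoted in the Methods.
KAPPA_BANDS = [
    (0.20, "none to slight agreement"),
    (0.40, "fair agreement"),
    (0.60, "moderate agreement"),
    (0.80, "good agreement"),
    (1.00, "very good to perfect agreement"),
]


def interpret_kappa(kappa):
    """Map a kappa value to the verbal label used in the abstract."""
    if kappa <= 0:
        return "no agreement"
    for upper, label in KAPPA_BANDS:
        if kappa <= upper:
            return label
    return KAPPA_BANDS[-1][1]


# Hypothetical grades for eight cases (not study data).
rater_1 = list("AABBCCDD")
rater_2 = list("AABBCDDD")
kappa = cohen_kappa(rater_1, rater_2)
print(f"kappa = {kappa:.3f}: {interpret_kappa(kappa)}")
```

In this toy example the two raters disagree on one of eight cases, so observed agreement is 0.875 against a chance agreement of 0.25.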
Results:
Overall, there was 57.49% agreement across all raters, corresponding to a fair level of agreement (κ=0.395, p<0.001). Between any two raters, agreement ranged from no agreement (21.74% agreement, κ=-0.027) to good agreement (82.61% agreement, κ=0.7543). All 9 surgeons agreed on only 2 of the 46 patients, both of whom were assigned an “A” grade. Four patients received three different grades: 3 received grades A, B, and C, and 1 received grades B, C, and D. No patient was given all four grades. Overall survival was 87% (n=40). Inter-rater agreement was similar across operative findings and outcomes (p>0.05): survival yes/no (κ=0.3690, κ=0.3518), need for ECMO yes/no (κ=0.3323, κ=0.3362), patch repair yes/no (κ=0.2050, κ=0.1916), and liver up/down (κ=0.2941, κ=0.4404).
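The descriptive counts reported above (the range of pairwise percent agreement, the number of patients graded unanimously, and the number who received three distinct grades) can be tallied as in the sketch below. The data layout, function names, and grades are hypothetical and serve only to show the bookkeeping; with 9 raters there would be 36 rater pairs to compare.

```python
from itertools import combinations


def percent_agreement(grades_a, grades_b):
    """Percentage of patients given the same grade by two raters."""
    matches = sum(a == b for a, b in zip(grades_a, grades_b))
    return 100.0 * matches / len(grades_a)


def pairwise_agreement_range(grades_by_rater):
    """Lowest and highest percent agreement over all rater pairs."""
    values = [
        percent_agreement(grades_by_rater[r1], grades_by_rater[r2])
        for r1, r2 in combinations(sorted(grades_by_rater), 2)
    ]
    return min(values), max(values)


def count_patients_with_n_grades(grades_by_rater, n_distinct):
    """Number of patients who received exactly n_distinct different grades."""
    per_patient = zip(*grades_by_rater.values())
    return sum(len(set(grades)) == n_distinct for grades in per_patient)


# Hypothetical grades: three raters, five patients (not study data).
grades = {
    "rater_1": ["A", "B", "C", "D", "A"],
    "rater_2": ["A", "B", "B", "D", "A"],
    "rater_3": ["A", "C", "D", "D", "B"],
}

low, high = pairwise_agreement_range(grades)
print(f"pairwise agreement: {low:.2f}% to {high:.2f}%")
print("unanimous patients:", count_patients_with_n_grades(grades, 1))
print("patients with three grades:", count_patients_with_n_grades(grades, 3))
```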
Conclusion:
This single-institution study shows that, although agreement under the CDHSG grading system exceeds chance, pediatric surgeons reach only a fair level of agreement when grading the severity of hernia defects. Patient outcomes and intraoperative findings did not significantly affect levels of agreement. Future research will examine the intra-rater reliability of this system and will work toward a more reliable system for grading the severity of CDH defects.