Enhancing ICD-Code-Based Case Definition for Heart Failure Using Electronic Medical Record Data.
J Card Fail. 2020 Apr 15.
Authors: Xu Y, Lee S, Martin E, D’Souza AG, Doktorchik CTA, Jiang J, Lee S, Eastwood CA, Fine N, Hemmelgarn B, Todd K, Quan H
BACKGROUND: Surveillance and outcome studies for heart failure (HF) require accurate identification of HF patients. Algorithms based on International Classification of Diseases (ICD) codes to identify HF from administrative data are inadequate due to their relatively low sensitivity. Detailed clinical information from electronic medical records (EMRs) is potentially useful for improving ICD algorithms. This study aimed to enhance the ICD algorithm for HF definition by incorporating comprehensive information from EMRs.
METHODS: The study included 2,106 inpatients in Calgary, Alberta, Canada. Medical chart review served as the reference gold standard for evaluating the developed algorithms. The baseline, referred to as the ICD algorithm, used the ICD codes commonly applied to define HF. The performance of several algorithms using free-text discharge summaries from a population-based EMR was compared with that of the ICD algorithm. These algorithms included a keyword search algorithm looking for HF-specific terms, a machine-learning-based heart-failure-concept (HFC) algorithm, an algorithm based on EMR structured data, and combined algorithms (e.g., the combined ICD and HFC algorithm).
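As a rough illustration of the keyword search and combined algorithms described above, the following Python sketch flags a record as HF when a discharge summary mentions an HF term or when an HF ICD code is present. The term list, the function names, the restriction to the ICD-10 I50 category, and the OR rule for combining flags are all assumptions made for illustration; the abstract does not specify the study's keywords, code list, or combination rule.

```python
import re

# Hypothetical HF-specific terms; the study's actual keyword list is not given.
HF_TERMS = ["heart failure", "cardiac failure", "CHF"]
HF_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(term) for term in HF_TERMS) + r")\b",
    re.IGNORECASE,
)


def keyword_flag(summary: str) -> bool:
    """Return True if the free-text summary mentions any HF term."""
    return HF_PATTERN.search(summary) is not None


def icd_flag(codes: list[str]) -> bool:
    """Return True if any code falls in ICD-10 category I50 (heart failure).

    Assumption: only I50.x is checked here; the study's code list may differ.
    """
    return any(code.upper().startswith("I50") for code in codes)


def combined_flag(summary: str, codes: list[str]) -> bool:
    """Assumed combination rule: flag HF if either source flags it (logical OR)."""
    return keyword_flag(summary) or icd_flag(codes)


if __name__ == "__main__":
    print(combined_flag("Admitted with congestive heart failure.", []))  # True
    print(combined_flag("Admitted for hip fracture.", ["I50.9"]))        # True
    print(combined_flag("Admitted for hip fracture.", ["S72.0"]))        # False
```

This sketch does not handle negated mentions such as "no evidence of heart failure", so a plain keyword match of this kind would flag those summaries as HF.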
RESULTS: Of 2,106 patients, 296 (14.1%) were HF patients as determined by chart review. The ICD algorithm had a positive predictive value (PPV) of 92.4% but low sensitivity (57.4%). The EMR keyword search algorithm achieved higher sensitivity (65.5%) than the ICD algorithm, but with lower PPV (77.6%). The HFC algorithm achieved higher sensitivity (80.0%) than both the ICD and keyword algorithms while maintaining a reasonable PPV (88.9%). An even higher sensitivity (83.3%) was reached by combining the HFC and ICD algorithms, with a lower PPV (83.3%). The structured EMR data algorithm reached a sensitivity of 78% and a PPV of 54.2%. The combined EMR structured data and ICD algorithm had higher sensitivity (82.4%), but PPV remained low (54.8%). Specificity ranged from 87.5% to 99.2% across all algorithms.
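The metrics reported above follow directly from a 2x2 comparison of each algorithm against chart review. The Python sketch below computes them from confusion-matrix counts. The counts shown for the ICD algorithm are not reported in the abstract; they are back-calculated here as one set of integers consistent with the reported 296 HF patients, 2,106 total patients, 57.4% sensitivity, and 92.4% PPV, and should be read as illustrative only.

```python
def metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Compute validation metrics against a reference standard."""
    return {
        "sensitivity": tp / (tp + fn),  # share of true HF cases detected
        "ppv": tp / (tp + fp),          # share of flagged cases that are true HF
        "specificity": tn / (tn + fp),  # share of non-HF cases correctly not flagged
    }


# Illustrative, back-calculated counts for the ICD algorithm (an assumption,
# not figures reported by the authors).
tp, fn = 170, 126   # 296 chart-confirmed HF patients
fp, tn = 14, 1796   # 1,810 patients without HF

icd_metrics = metrics(tp, fp, fn, tn)
rounded = {name: round(value * 100, 1) for name, value in icd_metrics.items()}

if __name__ == "__main__":
    print(rounded)  # {'sensitivity': 57.4, 'ppv': 92.4, 'specificity': 99.2}
```

Under these assumed counts the specificity works out to 99.2%, which matches the upper end of the reported specificity range.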
CONCLUSION: Applying natural language processing and machine learning to the discharge summaries in inpatient EMR data can improve the capture of HF cases compared with the widely used ICD algorithm. The HFC algorithm is straightforward to use, making it easy to apply for HF case identification.
PMID: 32304875 [PubMed – as supplied by publisher]