Patient data, including demographics, drug administrations, and cardiology information, were extracted from MedStar clinical systems and prepared for analysis. We then performed data visualization, statistical modeling, and other analyses to identify the factors most predictive of cardiotoxic events. Our ultimate aim was to develop a framework for clinical decision support and precision medicine. Figure 1 shows our overall analysis workflow.
Clinical EHR Systems
To address the critical need to use EHR data to better understand trastuzumab-related cardiotoxicity in breast cancer patients, we identified patients with available diagnosis, lab, demographic, and cardiology information. This required a cohort discovery strategy to identify data sources residing in disparate systems across the MedStar Health network. Our initial data source was ARIA, the oncology EHR system that contains patient demographics, diagnosis, lab results, drug orders, and clinical notes, among other data elements. The multi-modality image management system Xcelera was used as a data source for echocardiogram data from MWHC.
Clinical Data Extraction, Filtering, Integration, and Cleaning
We investigated patients diagnosed with breast cancer who were treated with trastuzumab and had valid echocardiogram data from MWHC available for analysis. Figure 2 illustrates our data extraction and filtering process.
To identify the patients in our cohort, we executed queries against ARIA using the ICD-9 diagnosis codes for female breast cancer (174.0, 174.1, 174.2, 174.3, 174.4, 174.5, 174.6, 174.8, and 174.9), identifying a set of 11,560 patients for further consideration. Next, we queried the drug administration tables for these patients and determined that 702 of them received trastuzumab at a MedStar facility.
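The two filtering steps above can be sketched as a pair of set operations. This is a minimal illustration only, not the queries actually run against ARIA: the function name `select_cohort` and the `(mrn, code)` and `(mrn, drug_name)` record layouts are hypothetical, and only the ICD-9 codes are taken from the text.

```python
# ICD-9 diagnosis codes for female breast cancer, as listed in the text.
BREAST_CANCER_ICD9 = {
    "174.0", "174.1", "174.2", "174.3", "174.4",
    "174.5", "174.6", "174.8", "174.9",
}


def select_cohort(diagnoses, administrations, drug="trastuzumab"):
    """Return the MRNs that have a qualifying diagnosis AND received the drug.

    diagnoses:       iterable of (mrn, icd9_code) pairs (hypothetical layout)
    administrations: iterable of (mrn, drug_name) pairs (hypothetical layout)
    """
    # Step 1: patients with any female breast cancer diagnosis code.
    diagnosed = {mrn for mrn, code in diagnoses if code in BREAST_CANCER_ICD9}
    # Step 2: patients with at least one administration of the drug
    # (case-insensitive match on the drug name is an assumption).
    treated = {mrn for mrn, name in administrations if name.lower() == drug}
    # Cohort: patients satisfying both criteria.
    return diagnosed & treated
```

In this sketch a patient enters the cohort only if both criteria hold, which mirrors the narrowing from 11,560 diagnosed patients to 702 treated patients.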
Using medical record numbers (MRNs) from these 702 patients, we queried the MWHC Xcelera system for LVEF, left ventricular dimensions and mass, and parameters of diastolic function. Of these, 307 patients had an MRN associated with MWHC, and we were able to obtain echocardiogram data for 160 of them.
Next, we identified a baseline LVEF measurement for each patient from an echocardiogram acquired within a period of two years prior to the first administration of chemotherapy. This required merging the drug administration information with data from echocardiograms and formulating temporal queries to ensure that the LVEF measurements occurred within a two-year window prior to trastuzumab administration. Of the 160 patients, we were able to identify 95 patients with valid baseline LVEF measurements and additional measurements after trastuzumab administration.
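The temporal query described above can be illustrated with a short sketch. Several details here are assumptions rather than the study's actual logic: the two-year window is taken as 730 days, the most recent qualifying echocardiogram is used when several fall inside the window, and an echocardiogram on the day of the first dose counts as neither baseline nor follow-up. The function names and the `(echo_date, lvef)` tuple layout are hypothetical.

```python
from datetime import date, timedelta

# Assumption: "two years" is approximated as 730 days.
BASELINE_WINDOW = timedelta(days=730)


def baseline_lvef(first_dose, echos):
    """Return the baseline LVEF, or None if no echo falls in the window.

    first_dose: date of the first trastuzumab administration
    echos:      list of (echo_date, lvef) tuples (hypothetical layout)
    """
    window_start = first_dose - BASELINE_WINDOW
    eligible = [(d, v) for d, v in echos if window_start <= d < first_dose]
    if not eligible:
        return None
    # Assumption: the echo closest to the first dose serves as baseline.
    return max(eligible, key=lambda echo: echo[0])[1]


def has_follow_up(first_dose, echos):
    """True if at least one echo was acquired after the first dose."""
    return any(d > first_dose for d, _ in echos)
```

Under these assumptions, a patient would be retained only when `baseline_lvef` returns a value and `has_follow_up` is true, which corresponds to the requirement of a valid baseline plus additional measurements after administration.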
A patient was determined to have a cardiac event if the LVEF dropped below 50% and by more than 10% from baseline, or if the LVEF dropped by more than 16% from baseline. This is consistent with clinical guidelines [11]. Using these guidelines, we identified 21 patients with cardiotoxic events.
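The event definition above reduces to a two-branch rule, sketched below. One point is an assumption: the text does not say whether the 10% and 16% drops are absolute percentage points of LVEF or relative to baseline, and this sketch treats them as absolute points. The function names are hypothetical.

```python
def is_cardiotoxic_event(baseline, follow_up):
    """Apply the two-branch LVEF rule to a single follow-up measurement.

    baseline, follow_up: LVEF values in percent (e.g. 60 for 60%).
    Assumption: drops are measured in absolute percentage points.
    """
    drop = baseline - follow_up
    # Branch 1: LVEF falls below 50 and by more than 10 points from baseline.
    below_threshold_with_drop = follow_up < 50 and drop > 10
    # Branch 2: LVEF falls by more than 16 points, whatever the final value.
    large_drop = drop > 16
    return below_threshold_with_drop or large_drop


def had_cardiotoxic_event(baseline, follow_ups):
    """True if any follow-up measurement meets the event definition."""
    return any(is_cardiotoxic_event(baseline, v) for v in follow_ups)
```

For example, a fall from 60 to 48 satisfies the first branch, while a fall from 65 to 52 satisfies neither branch because the final value stays at or above 50 and the drop does not exceed 16 points.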
We then produced a consolidated file containing the study data for downstream analysis. This required de-identification of PHI, including patient MRNs and procedure dates, and the calculation of derived variables such as age at baseline and time to cardiotoxic event. Table 1 compiles descriptive statistics about our patient cohort.
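One way to picture this consolidation step is shown below: derived variables are computed first, and the identifying fields are then left out of the output. This is a sketch under stated assumptions, not the procedure used in the study. The record keys, the `P0001`-style study identifiers, and the choice to measure time to event in days from the baseline date are all hypothetical.

```python
from datetime import date


def deidentify(records):
    """Replace MRNs and dates with study IDs and derived variables.

    records: list of dicts with keys "mrn", "birth_date", "baseline_date",
             and "event_date" (None if no event); hypothetical layout.
    """
    study_ids = {}
    output = []
    for rec in records:
        # Assign a sequential study ID the first time each MRN is seen.
        sid = study_ids.setdefault(rec["mrn"], f"P{len(study_ids) + 1:04d}")
        birth, base, event = rec["birth_date"], rec["baseline_date"], rec["event_date"]
        # Age in completed years at the baseline date.
        age = base.year - birth.year - ((base.month, base.day) < (birth.month, birth.day))
        # Assumption: time to event is counted in days from the baseline date.
        days_to_event = (event - base).days if event is not None else None
        # Only the study ID and derived values are written out; no MRN or dates.
        output.append({"study_id": sid, "age_at_baseline": age, "days_to_event": days_to_event})
    return output
```

Computing the derived variables before dropping the dates is the design point here: once the dates are removed, age at baseline and time to event can no longer be recovered.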