In this retrospective cohort study, we developed and validated a machine learning model for detecting VAP 24 h before diagnosis. The final model performed well (AUC: 84.4%, sensitivity: 74.3%, specificity: 70.7%), as an AUC value between 0.75 and 0.92 indicates good diagnostic capability(16). Compared with the CPIS-based model, our machine learning model improved the AUC by almost 25% and the sensitivity and specificity by almost 14% and 15%, respectively. Our predictive model can provide risk stratification for VAP within independently defined patient groups. Prevention guidelines have been developed to allow higher-risk patients to benefit from more aggressive strategies or adjuvant therapy (semirecumbent position, oral hygiene). Moreover, a longer prediction lead time increases the likelihood that a patient can benefit from early intervention.
The CPIS is a method used to diagnose VAP in a timely manner. It comprises several clinical indicators that describe VAP; therefore, the score can serve as a reference to help physicians provide better and faster treatment for patients. According to the proposed CPIS-based model for the early detection of VAP, a threshold of 6 clearly separated patients with and without pulmonary infection(17). However, in our MIMIC-III cohort, a CPIS threshold of 6 did not perform as well as reported in the reference article; in fact, the CPIS-based model exhibited its worst performance at this threshold. We therefore tested different score thresholds to determine the best performance; Figure S2 shows that the CPIS-based model performed best when the score was equal to or greater than 3.
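The threshold search described above can be illustrated with a minimal sketch. It assumes that "best performance" is judged by Youden's J (sensitivity + specificity - 1), which is an assumption made for illustration and not necessarily the criterion used in this study. The function names and the toy scores and labels are hypothetical and synthetic; they are not MIMIC-III data.

```python
from typing import Iterable, Sequence, Tuple


def sens_spec(scores: Sequence[int], labels: Sequence[int],
              threshold: int) -> Tuple[float, float]:
    """Sensitivity and specificity when score >= threshold is called VAP."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    spec = tn / (tn + fp) if (tn + fp) else 0.0
    return sens, spec


def best_threshold(scores: Sequence[int], labels: Sequence[int],
                   candidates: Iterable[int]) -> int:
    """Candidate threshold with the highest Youden's J (assumed criterion)."""
    def youden_j(t: int) -> float:
        sens, spec = sens_spec(scores, labels, t)
        return sens + spec - 1.0
    return max(candidates, key=youden_j)


# Synthetic toy data for illustration only: CPIS-like scores and VAP labels.
toy_scores = [3, 4, 5, 6, 3, 1, 2, 2, 4, 1]
toy_labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

# Sweep every possible CPIS value (0 to 12) as a cutoff.
chosen = best_threshold(toy_scores, toy_labels, range(0, 13))
```

In this toy example a cutoff of 6 flags only the single highest-scoring patient, so sensitivity collapses, which mirrors the kind of behaviour that motivates testing lower thresholds.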
The typical manifestations of acute respiratory distress syndrome (ARDS) include increased pulmonary vascular permeability, pulmonary edema and alveolar trapping, which lead to refractory hypoxia and decreased pulmonary compliance (18). The optimal mechanical ventilation strategy for these patients is to decrease tidal volume and increase positive end-expiratory pressure, which is associated with the highest PaO2/FiO2 ratio(19–21). The relationship between ARDS and the subsequent development of VAP is complex. In mechanically ventilated patients, cyclic stretch of lung cells induces acidification of the milieu, which promotes bacterial growth(22). Injurious mechanical ventilation may prompt the lungs to release cytokines(23, 24). In addition, alveolar macrophages and neutrophils exhibit reduced bacterial phagocytosis and killing, thereby impairing lung and systemic antibacterial defenses(23, 25).
We found that the APACHE III and SOFA scores contributed greatly to the final predictive model. The APACHE scoring system is used to describe the severity of illness and predict the outcome of critically ill patients. APACHE II and III are widely employed scores in the ICU(26, 27), and the overall goodness-of-fit of the two predictive models was similar. Compared with APACHE II, APACHE III expanded the acute physiology score (APS) by adding six parameters: blood urea nitrogen (BUN), total bilirubin, blood glucose, albumin (ALB), arterial CO2 partial pressure (PaCO2) and urine output. These six parameters are more responsive to clinical practice(28, 29). APACHE II was better at predicting risk for surgical patients and patients with gastrointestinal disease(28), while the APACHE III score was a good predictor of internal medical conditions and nosocomial pneumonia(29, 30).
In our study, the control group included patients who had received mechanical ventilation for 24 h rather than 48 h. The reasons are as follows: as VAP predictors, we selected the worst values of body temperature, PaO2/FiO2 ratio, and WBC during the initial 24 h after ventilation and the worst values of the APACHE III and SOFA scores in the first 24 h after admission to the ICU. If we had restricted the control group to patients with 48 h of mechanical ventilation, some non-VAP patients would have been missed. The purpose of the model is to predict whether VAP will occur in patients receiving mechanical ventilation, so this grouping is more consistent with our original intention and with clinical reality. Additionally, some references support this grouping scheme(31).
The major limitation of this study was the annotation of VAP sessions. We annotated VAP sessions according to the VAP definition, i.e., ventilation sessions that lasted over 48 h and in which pneumonia occurred after 48 h of ventilation. With this strategy, we could not only identify VAP sessions but also query the recorded time in the chart event table of the MIMIC-III database. The limitation of this annotation procedure was a high false negative rate, because nurses may have recorded fewer VAP diagnoses in the chart event table than actually occurred. Another protocol for VAP annotation is to use the ICD-9 codes in MIMIC-III. However, in the MIMIC-III cohort, the diagnostic information is not linked to the exact diagnosis time, so with that protocol we would have been unable to query the precise PaO2/FiO2 ratio, WBC, and body temperature variables.
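The annotation rule above can be sketched as follows. This is an illustrative reading of the definition only: the `VentSession` class and its field names are hypothetical and do not correspond to the MIMIC-III schema, and a session with no charted pneumonia time is treated as non-VAP, which is exactly where the false negatives described above would arise.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# 48 h cutoff taken from the VAP definition in the text.
VAP_WINDOW = timedelta(hours=48)


@dataclass
class VentSession:
    """Hypothetical record of one ventilation session (not the MIMIC-III schema)."""
    start: datetime
    end: datetime
    pneumonia_charted: Optional[datetime] = None  # None if never charted


def is_vap_session(session: VentSession) -> bool:
    """True if ventilation lasted over 48 h and pneumonia was charted after 48 h."""
    if session.end - session.start <= VAP_WINDOW:
        return False  # session too short to meet the definition
    if session.pneumonia_charted is None:
        return False  # uncharted diagnoses are missed (false negatives)
    return session.pneumonia_charted - session.start > VAP_WINDOW


# Synthetic example sessions, for illustration only.
t0 = datetime(2020, 1, 1, 8, 0)
late_pneumonia = VentSession(t0, t0 + timedelta(hours=96), t0 + timedelta(hours=60))
early_pneumonia = VentSession(t0, t0 + timedelta(hours=96), t0 + timedelta(hours=30))
short_session = VentSession(t0, t0 + timedelta(hours=24), None)
uncharted = VentSession(t0, t0 + timedelta(hours=96), None)
```

Because the charted time is kept alongside the label, predictor values can later be queried relative to it, which is the advantage of this strategy over ICD-9 codes that carry no timestamp.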
In future work, to overcome the annotation limitation, we need to define a protocol that collects not only the VAP diagnosis but also the charting time of that diagnosis. External validation and prospective interventional or outcome studies using this prediction model are also envisioned.