Demographic data
A total of 367 patients were included in this study. The median age was 74 years (range, 11-100 years), and 247 patients (67.5%) were male. Compared with patients without COVID-19, patients with COVID-19 were older (79 vs. 70 years, P<0.001) and had longer hospital stays (13 vs. 11 days, P<0.001). Most of the patients (>50%) had underlying diseases, including hypertension (50.5%), diabetes (35.2%), cardiovascular and cerebrovascular diseases (21.6%), chronic kidney disease (12.6%), and chronic lung disease (8.5%). Patients with COVID-19 were more likely to have diabetes (39.9% vs. 29.8%, P=0.032). All patients with pneumonia received antibiotics, corticosteroids, antiviral agents, antifungal agents, or mechanical ventilation, alone or in combination, during hospitalization. Patients with COVID-19 were more likely to be treated with antiviral drugs (47.7% vs. 11.2%) and corticosteroids (74.1% vs. 36.5%) (P<0.001). Detailed information is given in Table 2.
Outcomes
In this study, the mortality rate of patients with COVID-19 combined with SBI was 40.0% (n = 80), as shown in Table 2. Among them, 40 patients (50.0%) died of bacterial infection, 27 (33.8%) died of fungal infection, and 13 (16.2%) died of bacterial and fungal coinfection. Mortality was higher in patients with COVID-19 than in those without COVID-19 (40.0% vs. 29.9%, P=0.046).
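The mortality comparison above can be illustrated with a 2x2 test. The section does not state which test produced the P value, so the following is only a minimal sketch assuming a Pearson chi-square test without continuity correction. The counts are reconstructed from the reported figures (80 deaths at 40.0% implies 200 patients with COVID-19, leaving 167 of the 367 patients without COVID-19, of whom 29.9% is about 50 deaths); they are assumptions, not raw study data, and the function name `chi_square_2x2` is hypothetical.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Reconstructed counts (assumption): deaths and survivors with COVID-19,
# then deaths and survivors without COVID-19.
chi2, p_value = chi_square_2x2(80, 120, 50, 117)
print(f"chi2 = {chi2:.2f}, P = {p_value:.3f}")
```

Under these assumptions the P value falls just below 0.05, consistent with the reported significance; a different test or slightly different counts would shift it.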
Pathogen profiles
Candida spp. (175 cases, 91.4%) and Aspergillus spp. (16 cases, 8.3%) were the most common fungi isolated from sputum samples of 192 patients. There was no significant difference in the incidence of Candida or Aspergillus infection between patients with and without COVID-19 (Table 3).
Bacterial pathogens were isolated from 211 patients. The main strains were A. baumannii (93 cases, 44.1%), K. pneumoniae (83 cases, 39.3%), P. aeruginosa (42 cases, 19.9%), and S. aureus (14 cases, 6.6%). The isolation rate of A. baumannii was higher in patients with COVID-19 than in those without COVID-19 (51.4% vs. 36.3%, P=0.027). There was no significant difference in the incidence of S. aureus infection between patients with and without COVID-19 (6.4% vs. 6.8%, P=0.898). The isolation rate of multidrug-resistant bacteria was higher in COVID-19 patients. The detection rate of carbapenem-resistant A. baumannii (CRAB) was higher in COVID-19 patients than in those without COVID-19 (94.6% vs. 86.5%, P=0.011). In the COVID-19 patients, the resistance rates of CRAB to ceftazidime, ciprofloxacin, gentamicin, and piperacillin-tazobactam were all above 90%. Fortunately, most of the isolated CRAB were sensitive to tigecycline (90.2%). The isolation rates of carbapenem-resistant K. pneumoniae (CRKP) and carbapenem-resistant P. aeruginosa (CRPA) were similar between the two groups of patients (26.8% vs. 26.2% and 28.6% vs. 38.0%, respectively). Detailed information is shown in Tables 4-6.
Clinical laboratory test results
As shown in Table 7, white blood cells, neutrophils, lymphocytes, interleukin-1, interleukin-6, interleukin-8, and TNF were lower in patients with COVID-19 than in patients without COVID-19 (P<0.05).
Construction of the prediction models
Because the DT and CatBoost models were neither accurate nor stable, they were excluded at the outset. After 10 iterations of each model, boxplots (Fig. 2) were used to compare the stability and performance of the remaining five models. The AUC values of the five models were all above 0.6, and the median values were all above 0.7. The XGBoost model exhibited the narrowest distribution in the boxplot (excluding one outlier with an AUC value of 0.6310), with a median AUC value of 0.7761, indicating that it was more stable than the other models. The LR and RF models ranked second and third in stability. Although SVM achieved the highest AUC value (0.9464), its distribution was wide, indicating relatively poor stability. Although the differences in AUC values among the models were not significant, the XGBoost and LightGBM models had advantages in computational efficiency, handling of complex relationships, availability of feature importance information, and tunable parameter range (data not shown). Therefore, we selected the XGBoost and LightGBM models for BO to ensure better prediction performance in practice.
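The stability comparison described above rests on two quantities: the AUC of each run and the spread of AUC values across repeated runs. The following is a minimal sketch of both, using only the Python standard library. The function names, the toy labels and scores, and the per-model AUC lists are hypothetical placeholders for illustration; they are not values from this study.

```python
from statistics import median


def auc_score(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute AUC")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stability_summary(aucs):
    """Median and range of AUC values from repeated runs (what a boxplot conveys)."""
    ordered = sorted(aucs)
    return {"median": median(ordered), "range": ordered[-1] - ordered[0]}


# Toy example: 1 = positive outcome, 0 = negative outcome, with predicted scores.
toy_auc = auc_score([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])

# Made-up AUC values from repeated runs of two hypothetical models.
runs = {
    "model_a": [0.74, 0.76, 0.78, 0.77, 0.75],
    "model_b": [0.62, 0.95, 0.70, 0.88, 0.79],
}
summaries = {name: stability_summary(values) for name, values in runs.items()}
# The model with the narrowest AUC range is treated as the most stable.
most_stable = min(summaries, key=lambda name: summaries[name]["range"])
print(toy_auc, summaries, most_stable)
```

In this toy setting a model can have the highest single AUC and still be the less stable choice, which mirrors the reasoning applied to SVM above.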
Given that LightGBM and XGBoost had a large parameter space for optimization, we combined each model with the TPE method for hyperparameter optimization. For both the LightGBM and XGBoost models, accuracy was higher after BO than before optimization (Fig. 3). The AUC value of XGBoost increased from 0.8960 to 0.9493 (Fig. 4a), and the AUC value of LightGBM increased from 0.8972 to 0.9699 (Fig. 4b). Finally, the prediction accuracy of both models reached up to 90% on the validation data (Fig. 3).
The SHAP summary plots based on the optimal parameter combinations identified the key risk factors dominating the predictive model (Fig. 5). Each point represented a sample, and the color of the point represented the magnitude of the feature value, with red indicating high feature values and blue indicating low feature values. Taking ventilator treatment as an example, a large number of red samples were clustered in the area with negative SHAP values, which meant that if the patient received ventilator treatment (marked as 1), the SHAP values tended to be low. The magnitude of the SHAP value indicated the degree of influence on the prediction result: the greater the absolute SHAP value, the greater the impact of the variable on the patient’s survival outcome. Fig. 6 shows the changes in variable importance for each established model before and after optimization using the 49 indicators (variables). The greater the importance value of a variable, the greater its impact on patient survival. Accordingly, intensive care unit admission, days of hospital stay, ventilator use, carbapenem use, lymphocyte count, AST value, A. baumannii infection, and Candida infection had important effects on patient survival.
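To make the sign and magnitude of SHAP values concrete, the following is a minimal sketch that computes exact Shapley values for a toy two-feature model by enumerating feature subsets against a single baseline. The model, its coefficients, the baseline, and the function names are hypothetical and chosen only for illustration; they are not the fitted XGBoost or LightGBM models, and SHAP software typically uses faster model-specific algorithms rather than brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline)."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                # Features in the coalition take the sample's values; the rest take the baseline.
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def toy_model(features):
    """Hypothetical survival score: ventilator use lowers it, lymphocyte count raises it."""
    ventilator, lymphocyte = features
    return 0.5 - 0.3 * ventilator + 0.1 * lymphocyte


sample = [1, 2.0]      # ventilator used (1), lymphocyte count 2.0 (arbitrary units)
reference = [0, 1.0]   # baseline: no ventilator, lymphocyte count 1.0
phi = shapley_values(toy_model, sample, reference)
print(phi)
```

In this toy case the ventilator feature receives a negative value with the larger absolute magnitude, so it pushes the prediction down more strongly than the lymphocyte feature pushes it up; the values sum to the difference between the prediction for the sample and the prediction for the baseline.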