Patient Characteristics: In this retrospective study, a total of 3349 encounters from 3225 unique patients admitted to the MICU at Emory University Hospital (Atlanta, GA) were selected from the derivation data for initial phenotyping. The cohort spans a wide range of demographic variables, such as age (mean: 62.3 ± 15.3 years), sex (male: 53.1%), and race (Caucasian: 42.2%, African American: 47.4%). To validate our phenotyping algorithm, 867 encounters from N = 848 unique and diverse patients were selected from the MICU of Grady Memorial Hospital (Atlanta, GA). Characteristics of these cohorts are described in Tables 1 and 2. To validate multi-ICU generalization, we used SICU patients from Emory [1128 encounters (N = 1112)] and Grady [466 encounters (N = 465)] hospitals who required intubation beyond 48 hours after completion of their surgeries. Characteristics of these SICU patients are available in supplemental Tables E2 and E3.
Phenotyping Results: Our clustering algorithm for deriving enriched ARF phenotypes characterized by various risk profiles yielded four clusters on the derivation data, with a silhouette score of 0.418, a Calinski-Harabasz score (variance ratio criterion) of 3555.87, and a Davies-Bouldin score of 0.79. The optimal number of clusters was chosen as the one that combined the highest silhouette score, the highest Calinski-Harabasz score, and the lowest Davies-Bouldin score. The patient distributions are shown in Figure 2 using a 2-D Uniform Manifold Approximation and Projection (UMAP), which depicts the resulting clusters and the variation of important clinical features across them. SHAP values were used to identify the important features that distinguished one cluster from another; SHAP plots are available in Figure E3 in the online data supplement. Subsequently, critical care physician experts helped interpret and characterize the four clusters as phenotypes based on their characteristics.
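To illustrate the selection rule described above, the following minimal sketch scores candidate cluster counts with the three indices using scikit-learn. The choice of KMeans, the synthetic two-dimensional data, the rank-sum rule for combining the indices, and the helper names `score_cluster_counts` and `pick_k` are assumptions made for illustration; the text above does not specify these details of the study's pipeline.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)


def score_cluster_counts(X, candidate_ks, seed=0):
    """Return {k: (silhouette, calinski_harabasz, davies_bouldin)}."""
    scores = {}
    for k in candidate_ks:
        # KMeans is an assumption; any algorithm that yields labels would do.
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(X)
        scores[k] = (
            silhouette_score(X, labels),         # higher is better
            calinski_harabasz_score(X, labels),  # higher is better
            davies_bouldin_score(X, labels),     # lower is better
        )
    return scores


def pick_k(scores):
    """Pick the k with the lowest summed rank across the three indices."""
    ks = sorted(scores)
    total_rank = {k: 0 for k in ks}
    for idx, higher_is_better in ((0, True), (1, True), (2, False)):
        ordered = sorted(ks, key=lambda k: scores[k][idx], reverse=higher_is_better)
        for rank, k in enumerate(ordered):
            total_rank[k] += rank
    return min(ks, key=lambda k: total_rank[k])


# Synthetic stand-in for the patient feature matrix: four well-separated groups.
centers = np.array([[0, 0], [10, 0], [0, 10], [10, 10]])
X, _ = make_blobs(n_samples=400, centers=centers, cluster_std=0.8, random_state=0)
scores = score_cluster_counts(X, range(2, 8))
best_k = pick_k(scores)
print(best_k, scores[best_k])
```

On real clinical data the three indices need not agree, which is why some explicit combination rule (here, a simple rank sum) is needed.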
Table 1 summarizes the clinical and demographic variables for each of the four derived ARF phenotypes along with their mortality outcomes. The first phenotype (N=825 patients) comprises ARF patients with multiple laboratory abnormalities, including the highest median levels of creatinine (median: 3.47, IQR: 1.89–5.74 mg/dL), blood urea nitrogen (BUN) (median: 56, IQR: 34–80.25 mg/dL) and B-type natriuretic peptide (BNP) (median: 750.5, IQR: 251.25–1775.5 pg/mL). Based on these characteristics, we named it phenotype A (MOD-1): severe multiple organ dysfunction (MOD) with a high likelihood of kidney injury and heart failure. The second phenotype (N=689 patients) consists of patients with severe hypoxia and clinical characteristics suggestive of non-radiographic features of severe ARDS (low partial pressure of oxygen (PaO2) to fraction of inspired oxygen (FiO2) ratio [P/F ratio] [median: 123, IQR: 90–185 mmHg] and high FiO2 [mean: 0.8, IQR: 0.6–1]) and has the highest mortality (51%). We called it phenotype B (severe hypoxemic respiratory failure). The third phenotype (N=959 patients) consists of patients with no evidence of organ failure other than mild hypoxia (median P/F ratio: 240 [95% CI: 185, 317.7]) and with normal lactic acid levels (median: 1.42 mmol/L). We called it phenotype C (mild hypoxia). The fourth phenotype (N=806) consists of ARF patients with the highest total bilirubin (median: 1.2, IQR: 0.6–4.1 mg/dL), the highest D-dimer levels, the lowest platelets (median: 118, IQR: 53–202.8 ×10³/µL) and the highest lactic acid (median: 2.35, IQR: 1.43–4.67 mmol/L), suggesting multi-system organ dysfunction. We therefore named it phenotype D (MOD-2): severe MOD with a high likelihood of hepatic injury, coagulopathy and lactic acidosis. From Table 1, we observed that phenotype B has the highest mortality (51%), followed by phenotype D (49.6%) and phenotype A (40.9%). The relatively healthier phenotype C had a mortality of 21.4%.
Phenotype D is also characterized by the highest proportion of patients with septic shock (651 patient encounters, 79.5%), whereas C has the lowest proportion of septic-shock patients (450 encounters, 45.3%). Thus, our phenotypes differed not only in their patterns of organ injury in patients with sepsis-induced ARF, but also in their mortality rates and septic-shock distributions. To gain further insight into the MOD profiles of the phenotypes, all six individual SOFA component scores were analyzed over the pre-intubation window for each phenotype, using the maximum as the aggregation method. Supplemental Table E4 presents SOFA score-maps for the Emory MICU data, where the findings align with our phenotype characterizations.
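The maximum aggregation mentioned above can be sketched as follows. This is a minimal illustration only: the record layout, the helper name `max_sofa_components`, the example values, and the treatment of a missing component as 0 are assumptions, not details taken from the study.

```python
# The six SOFA organ systems; the key names are illustrative.
SOFA_COMPONENTS = (
    "respiratory", "coagulation", "liver", "cardiovascular", "cns", "renal",
)


def max_sofa_components(measurements):
    """Aggregate time-stamped component scores by taking each one's maximum.

    `measurements` is a list of dicts, one per charting time inside the
    pre-intubation window. Missing components are treated as 0 (assumption).
    """
    worst = {name: 0 for name in SOFA_COMPONENTS}
    for row in measurements:
        for name in SOFA_COMPONENTS:
            value = row.get(name)
            if value is not None:
                worst[name] = max(worst[name], value)
    return worst


# Hypothetical pre-intubation window for one encounter.
window = [
    {"respiratory": 2, "renal": 1, "cns": 0},
    {"respiratory": 3, "renal": 3, "coagulation": 1},
    {"respiratory": 3, "liver": 0, "cardiovascular": 2},
]
profile = max_sofa_components(window)
print(profile)
```

Averaging these per-encounter profiles within each phenotype would give one row of a score-map like the one referenced above.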
Boxplots were drawn to illustrate the variability in certain prominent features, namely creatinine (renal), total bilirubin (hepatic), P/F ratio and FiO2 (respiratory), BNP (cardiac), and platelets (coagulopathy), as shown in Figure 3(a)-(f). We also calculated the age-adjusted Charlson Comorbidity Index based on admission diagnosis ICD-9 codes; the values for phenotypes A to D were 2.3 (95% CI: 2.21, 2.4), 1.94 (95% CI: 1.84, 2.05), 1.99 (95% CI: 1.9, 2.07) and 2.06 (95% CI: 1.97, 2.16), respectively.
Table 2: Summary of patient characteristics of the validation cohort (Grady MICU) and its phenotypes.
| Parameters | Whole cohort | A | B | C | D | p-value |
| --- | --- | --- | --- | --- | --- | --- |
| count (%) | 867 (100) | 214 (24.7) | 49 (5.7) | 404 (46.6) | 200 (23.1) | - |
| Mortality* | 294, 34.67% | 80, 38.28% | 34, 69.39% | 83, 20.75% | 97, 48.5% | - |
| Age, mean(std) | 59.6 (15.1) | 61.6 (13.9) | 64.2 (13.1) | 59.0 (16.1) | 57.7 (14.3) | 0.007 |
| Males, count (%) | 531 (61.2) | 128 (59.8) | 26 (53.1) | 257 (63.6) | 120 (60.0) | - |
| Race: African American or Black, count (%) | 670 (77.3) | 171 (79.9) | 35 (71.4) | 307 (76.0) | 157 (78.5) | 0.273 |
| Race: Caucasian or White, count (%) | 126 (14.5) | 22 (10.3) | 9 (18.4) | 69 (17.1) | 26 (13.0) | |
| Ethnicity: Hispanic, count (%) | 39 (4.5) | 18 (8.4) | 2 (4.1) | 9 (2.2) | 10 (5.0) | 0.029 |
| Ethnicity: Non-Hispanic, count (%) | 818 (94.3) | 194 (90.7) | 46 (93.9) | 389 (96.3) | 189 (94.5) | |
| P/F ratio, m(IQR) | 267.5 [197.2,343.6] | 300.6 [242.5,387.5] | 104.0 [88.0,154.5] | 256.7 [200.0,340.0] | 272.0 [185.9,327.6] | <0.001 |
| S/F ratio, m(IQR) | 245.0 [200.0,250.0] | 250.0 [242.5,250.0] | 100.0 [94.8,106.6] | 245.0 [227.5,250.0] | 245.0 [198.0,250.0] | <0.001 |
| FiO2, m(IQR) | 0.4 [0.4,0.5] | 0.4 [0.4,0.4] | 1.0 [0.9,1.0] | 0.4 [0.4,0.4] | 0.4 [0.4,0.5] | <0.001 |
| PaO2, m(IQR) | 110.0 [87.0,141.0] | 131.0 [99.0,159.0] | 91.0 [75.0,108.0] | 107.0 [85.0,136.0] | 111.5 [87.0,138.0] | <0.001 |
| PaCO2, m(IQR) | 36.0 [31.0,41.0] | 34.0 [30.0,39.0] | 35.0 [30.0,45.0] | 38.0 [33.5,43.0] | 33.0 [29.0,37.0] | <0.001 |
| MAP, m(IQR) | 86.0 [78.0,96.0] | 82.1 [75.0,89.9] | 83.0 [77.5,92.0] | 92.8 [85.0,103.1] | 79.0 [74.0,86.0] | <0.001 |
| Creatinine, m(IQR) | 1.4 [0.9,2.8] | 4.1 [1.7,6.9] | 1.6 [1.1,3.1] | 1.1 [0.8,1.6] | 1.4 [0.8,2.4] | <0.001 |
| Bilirubin total, m(IQR) | 0.7 [0.5,1.4] | 0.6 [0.4,1.1] | 0.9 [0.5,2.5] | 0.7 [0.5,1.2] | 1.2 [0.6,3.7] | <0.001 |
| Albumin, m(IQR) | 3.0 [2.5,3.6] | 2.9 [2.5,3.3] | 3.0 [2.6,3.5] | 3.5 [3.1,4.0] | 2.2 [1.9,2.6] | <0.001 |
| Lactic acid, m(IQR) | 2.3 [1.7,3.7] | 2.1 [1.6,3.2] | 2.6 [1.9,4.3] | 2.2 [1.7,3.3] | 3.0 [2.0,5.0] | <0.001 |
| D-dimer, m(IQR) | 5220.0 [2041.0,15974.0] | 5631.0 [2370.5,21648.0] | 4898.0 [2668.0,10731.5] | 3953.0 [1551.0,7646.0] | 8468.0 [2529.4,22770.5] | <0.001 |
| Platelets, m(IQR) | 188.0 [119.0,260.5] | 178.5 [110.8,255.5] | 147.0 [97.0,235.0] | 213.5 [150.8,276.2] | 154.0 [83.5,232.5] | <0.001 |
| Hemoglobin, m(IQR) | 10.9 [8.7,12.9] | 9.4 [7.9,11.4] | 11.0 [8.9,12.7] | 12.2 [10.9,14.0] | 9.0 [7.7,10.4] | <0.001 |
| BNP, m(IQR) | 269.0 [105.0,873.5] | 590.0 [247.0,1501.0] | 222.0 [77.0,653.0] | 213.0 [92.0,736.0] | 215.5 [103.8,675.5] | <0.001 |
| BUN, m(IQR) | 27.5 [16.0,51.0] | 62.0 [40.0,92.0] | 34.0 [20.0,52.0] | 19.5 [13.0,31.6] | 23.0 [15.0,41.0] | <0.001 |
| SOFA max total, m(IQR) | 6.0 [4.0,9.0] | 8.0 [6.0,10.0] | 7.0 [5.0,10.0] | 5.0 [3.0,7.0] | 8.0 [5.0,10.0] | <0.001 |
| GCS total score, m(IQR) | 14.0 [11.0,15.0] | 14.0 [10.0,15.0] | 14.0 [12.0,15.0] | 14.0 [10.0,15.0] | 14.2 [12.0,15.0] | 0.117 |
| PEEP, m(IQR) | 8.0 [5.0,8.0] | 8.0 [5.0,8.0] | 10.0 [8.0,10.0] | 8.0 [5.0,8.0] | 8.0 [5.0,8.0] | <0.001 |
For clinical variables, this table lists the medians and interquartile ranges (IQR: Q1-Q3) for each phenotype as well as for the whole cohort. The p-value is also provided for each variable to indicate the statistical significance of the differences among the phenotypes. To evaluate statistical significance, the Kruskal-Wallis test was performed for continuous variables and the Chi-squared test was used for categorical variables. *Mortality was computed with respect to patients (not encounters). Abbreviations used: count: total encounters, mean: average, std: standard deviation, m: median, IQR: interquartile range, PaO2: partial pressure of oxygen, SpO2: peripheral oxygen saturation level, FiO2: fraction of inspired oxygen, P/F: PaO2/FiO2 ratio, S/F: SpO2/FiO2 ratio, PaCO2: partial pressure of carbon dioxide in arterial blood, MAP: mean arterial blood pressure, Resp.: respiration, BNP: B-type natriuretic peptide, BUN: blood urea nitrogen, SOFA: sequential organ failure assessment, GCS: Glasgow coma scale. Measurement units: P/F ratio, PaO2, PaCO2, and MAP: mmHg; S/F ratio and FiO2: unitless; creatinine and bilirubin total: mg/dL; albumin: g/dL; lactic acid: mmol/L; D-dimer: ng/mL; platelets: ×10³/µL; hemoglobin: g/dL; BNP: pg/mL; BUN: mg/dL.
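The two significance tests named in the table footnote can be sketched with SciPy as follows. The numbers are hypothetical values invented for illustration, not study data, and the variable names are assumptions.

```python
from scipy.stats import chi2_contingency, kruskal

# Continuous variable: hypothetical creatinine-like values for three groups.
group_a = [3.2, 3.8, 4.1, 4.9, 6.0]
group_b = [1.4, 1.5, 1.6, 1.7, 1.9]
group_c = [0.8, 0.9, 1.0, 1.1, 1.2]
h_stat, p_continuous = kruskal(group_a, group_b, group_c)

# Categorical variable: rows are category present/absent, columns are groups.
contingency = [
    [30, 12, 9],
    [20, 38, 41],
]
chi2, p_categorical, dof, expected = chi2_contingency(contingency)

print(f"Kruskal-Wallis H={h_stat:.2f}, p={p_continuous:.4f}")
print(f"Chi-squared={chi2:.2f}, dof={dof}, p={p_categorical:.6f}")
```

The Kruskal-Wallis test compares rank distributions and so makes no normality assumption, which suits skewed laboratory values reported as medians with IQRs.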
External and Multi-specialty Validation of Sepsis-induced ARF Phenotypes: To validate our phenotyping algorithm, we used an external MICU cohort from Grady Memorial Hospital, Atlanta, GA. Our methodology involved training a supervised learning (logistic regression) classifier on the derivation dataset to predict the phenotype assigned to each encounter. We then applied the trained model to the validation dataset to determine the phenotype for each patient encounter. The phenotype validation results are summarized in Table 2. These results indicate that most features across the four phenotypes remain consistent in the validation dataset, supporting the reliability and generalizability of our phenotyping approach.
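The label-transfer step described above can be sketched as follows: a logistic regression classifier is fitted on derivation features with cluster-derived phenotype labels and then applied to an external cohort. The synthetic data, the standardization step, and all variable names are assumptions for illustration; the study's features and preprocessing are not reproduced here.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-ins: four groups play the role of the four phenotypes.
centers = np.array([[0, 0], [8, 0], [0, 8], [8, 8]])
X_derivation, cluster_ids = make_blobs(
    n_samples=400, centers=centers, cluster_std=1.0, random_state=0
)
X_validation, true_ids = make_blobs(
    n_samples=120, centers=centers, cluster_std=1.0, random_state=1
)

phenotype_names = np.array(["A", "B", "C", "D"])

# Fit on the derivation cohort using the cluster assignments as targets.
classifier = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
classifier.fit(X_derivation, phenotype_names[cluster_ids])

# Assign a phenotype to every encounter in the external cohort.
predicted = classifier.predict(X_validation)
agreement = float(np.mean(predicted == phenotype_names[true_ids]))
print(f"agreement with the generating groups: {agreement:.3f}")
```

In a real external cohort there are no ground-truth phenotypes, so consistency is judged as in Table 2, by comparing feature distributions of the predicted groups with those of the derivation phenotypes.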
For further analysis of the phenotypes, Figure E1 in the online data supplement shows radar diagrams illustrating the average variation of all clinical feature values across the four phenotypes of the derivation and validation data, with all features normalized to the range 0-1. The radar diagrams in Figure E2 in the online data supplement present the distributions of demographic variables and mortality outcomes across the phenotypes of the derivation and validation data. Our phenotyping approach was also validated on the SICU cohorts of both Emory and Grady hospitals; these phenotyping results are listed in supplemental Tables E2 and E3. We also analyzed the aggregated pre-intubation individual SOFA scores for these datasets, and the results are listed in Supplemental Tables E5-E7. They are consistent with the earlier results obtained from the derivation data.
Short-term Survival Analysis: The trajectory of short-term outcomes can provide better differentiation among phenotypes. Over a 28-day period, the average ventilator-free days (VFD) were 10.4, 8.6, 15.4 and 8.5 for phenotypes A to D of the derivation set, respectively. To evaluate the survival probability of patients in each phenotype, we plotted Kaplan-Meier curves [17] for the 28-day period following intubation, as shown in Figure 4. The analysis was performed for the derivation and validation datasets; the survival traces of phenotype D of Emory SICU (N=12) and phenotype B of Grady SICU (N=5) were omitted because of their small sample sizes. We observed that the mortality trends across phenotypes were consistent for the MICU and SICU of both centers (p-value for trend < 0.001), with phenotype C having the best survival, followed by A, and phenotypes B and D having the poorest survival in both centers. This suggests that our phenotyping approach generalizes in identifying the least and the most critical phenotypes in terms of short-term survival for ARF patients with different demographic characteristics.
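For readers unfamiliar with the estimator, a Kaplan-Meier curve over a 28-day horizon can be computed with the product-limit formula as in the minimal sketch below. The function name `kaplan_meier`, the input layout, and the example follow-up times are hypothetical; censoring follow-up at the horizon is an assumption about how a 28-day analysis would typically be set up.

```python
def kaplan_meier(durations, events, horizon=28):
    """Product-limit survival estimate.

    durations: days from intubation to death or last follow-up.
    events: 1 if death was observed, 0 if censored.
    Returns [(day, survival probability)] at each death time <= horizon.
    """
    # Follow-up beyond the horizon is censored at the horizon (assumption).
    data = [
        (min(t, horizon), bool(e) and t <= horizon)
        for t, e in zip(durations, events)
    ]
    curve = []
    survival = 1.0
    at_risk = len(data)
    for day in sorted({t for t, _ in data}):
        deaths = sum(1 for t, died in data if t == day and died)
        leaving = sum(1 for t, _ in data if t == day)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((day, survival))
        # Deaths and censored patients both leave the risk set after this day.
        at_risk -= leaving
    return curve


# Hypothetical follow-up for eight encounters in one phenotype.
durations = [3, 5, 5, 12, 28, 30, 40, 9]
events = [1, 1, 0, 1, 0, 0, 1, 0]
curve = kaplan_meier(durations, events)
print(curve)
```

Each step multiplies the running survival by the fraction of at-risk patients who survive that day, so censored patients contribute to the risk set until they leave without being counted as deaths.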
Exploratory Analyses of Clinical Differences among the Phenotypes: We also performed exploratory analyses to examine whether the phenotypes would demonstrate different outcomes or clinical patterns in relation to high-PEEP (PEEP ≥ 10) treatment. The proportion of patients who received PEEP ≥ 10 on mechanical ventilation was 16.7% in A, 49.6% in B, 24% in C, and 16.2% in D. We estimated the effect of high PEEP (PEEP ≥ 10) on 28-day mortality using propensity score matching, with laboratory values and demographics as confounding variables. The average treatment effects (ATE) with 95% confidence intervals were 0.04 (-0.08, 0.16) for A, -0.05 (-0.13, 0.02) for B, 0.07 (-0.01, 0.15) for C, and -0.07 (-0.19, 0.05) for D. A negative ATE suggests lower mortality in the treated group. We also plotted Kaplan-Meier curves comparing patients who received high PEEP with those who did not within each phenotype.
In phenotype B with severe hypoxemic respiratory failure, higher PEEP (≥10) was associated with significantly better survival than lower PEEP (<10), but the opposite association was seen in phenotype C. Between the two MOD phenotypes, higher PEEP appeared beneficial for D but not for A; however, the PEEP strategy was not significantly associated with survival in either of these phenotypes (Figure 5). We must emphasize that this analysis was purely exploratory and was carried out to examine the feasibility of further research on the treatment effects of various therapies.
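The propensity-score matching analysis described above can be illustrated with the following minimal sketch. It uses one-to-one nearest-neighbor matching with replacement on a logistic-regression propensity score; this matching scheme, the function name `psm_ate`, and the simulated data are all assumptions, since the text does not specify the exact estimator, and the confidence intervals are not reproduced here.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression


def psm_ate(covariates, treated, outcome):
    """Average treatment effect via nearest-neighbor propensity matching."""
    covariates = np.asarray(covariates, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    outcome = np.asarray(outcome, dtype=float)

    # Step 1: propensity score = estimated P(treated | covariates).
    model = LogisticRegression(max_iter=1000).fit(covariates, treated)
    score = model.predict_proba(covariates)[:, 1]

    # Step 2: match every unit to the closest-scoring unit in the other arm.
    t_idx = np.flatnonzero(treated)
    c_idx = np.flatnonzero(~treated)
    gap = np.abs(score[t_idx, None] - score[None, c_idx])
    control_match = c_idx[gap.argmin(axis=1)]  # one control per treated unit
    treated_match = t_idx[gap.argmin(axis=0)]  # one treated per control unit

    # Step 3: average the treated-minus-control outcome over all units.
    effect_for_treated = outcome[t_idx] - outcome[control_match]
    effect_for_control = outcome[treated_match] - outcome[c_idx]
    return float(np.concatenate([effect_for_treated, effect_for_control]).mean())


# Simulated cohort: sicker patients are more likely to be treated and to die,
# while treatment lowers the probability of death by 0.2 (hypothetical).
rng = np.random.default_rng(0)
n = 2000
covariates = rng.normal(size=(n, 2))
severity = covariates[:, 0]
treated = rng.random(n) < 1 / (1 + np.exp(-severity))
death_prob = np.clip(0.5 + 0.1 * severity - 0.2 * treated, 0, 1)
died = rng.random(n) < death_prob

ate = psm_ate(covariates, treated, died)
print(f"estimated ATE on mortality: {ate:.3f}")
```

As in the text, a negative estimate corresponds to lower mortality in the treated group; matching on the propensity score is what removes the confounding by severity that a raw comparison of the two arms would carry.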
Validation of the phenotypes against the Hyper/Hypo Inflammatory Phenotypes: We further investigated how the phenotypes derived above compare with the previously described binary phenotypes, namely the hyperinflammatory and hypoinflammatory phenotypes.[13,14] Comparing the reported clinical values, our results suggested that patients in phenotypes A (MOD-1) and D (MOD-2) were most likely associated with hyperinflammation, characterized by high values of total bilirubin (mean A:1.4, D:4.8 mg/dL) and creatinine (mean A:4.3, D:2 mg/dL), and low values of platelet count (mean A:191, D:147 ×10³/µL), bicarbonate (mean A:22.4, D:20.7 mmol/L), PaCO2 (mean A:38.6, D:34.2 mmHg) and hemoglobin (mean A:9.2, D:9 g/dL), consistent with the values of these markers in the hyperinflammatory subphenotype in previous works [13,14]. Phenotype B, with severe hypoxemic respiratory failure, demonstrated features that were consistent with neither the hyper- nor the hypo-inflammatory phenotype, suggesting that it could either consist of a mix of both or represent a completely novel phenotype. In contrast, patients in C were associated with hypoinflammatory characteristics, with the lowest values of total bilirubin (mean:1.1 mg/dL) and creatinine (mean:1.4 mg/dL), and the highest values of platelet count (mean:231 ×10³/µL), bicarbonate (mean:26 mmol/L), PaCO2 (mean:42.5 mmHg) and hemoglobin (mean:11.6 g/dL). Our patient population and included variables differed from those of the earlier works. For example, we did not use biologically derived features such as interleukin-6/8 and intercellular adhesion molecule 1. Hence, this characterization of hyper/hypo-inflammatory subgroups in our identified sepsis-induced ARF phenotypes needs further investigation.
Practice Variance During COVID-19: When the derivation strategy was evaluated independently on the 2020-2021 data, the results were consistent with those of the pre-COVID-19 years, without significant variance in the distribution of the phenotypes. Relevant details on the sensitivity analyses are available in Figure E6 in the online data supplement.