Study cohort and data collection
A total of 577 patients diagnosed with HBV-ACLF who were admitted to the First Affiliated Hospital of Nanchang University and Jiangxi Provincial People's Hospital between January 2016 and October 2022 were retrospectively included. Of these, 513 patients from the First Affiliated Hospital of Nanchang University were randomly divided, at a 6:4 ratio, into a training group of 308 patients and a testing group of 205 patients. An additional 64 patients from Jiangxi Provincial People's Hospital served as an external validation group. HBV-ACLF was diagnosed according to the Asia–Pacific Association for the Study of the Liver criteria, which require hepatitis B surface antigen positivity for at least 6 months, a serum total bilirubin (TBIL) level ≥ 5 mg/dL, and either an international normalised ratio (INR) ≥ 1.5 or prothrombin activity < 40%, complicated within 4 weeks by ascites and/or hepatic encephalopathy (HE). The inclusion criteria were as follows: 1) age ≥ 16 years; 2) diagnosis of HBV-ACLF; and 3) availability of clinical information. The exclusion criteria were as follows: 1) HBV-ACLF coexisting with other chronic liver diseases; 2) liver tumours or other malignancies; 3) severe chronic extra-hepatic diseases; 4) previous liver transplantation; and 5) human immunodeficiency virus infection or use of immunosuppressive drugs. This retrospective study was approved by the Ethics Committees of the First Affiliated Hospital of Nanchang University and Jiangxi Provincial People's Hospital, and the requirement for informed consent was waived.
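The 6:4 split of the derivation cohort can be sketched as follows. This is a minimal illustration on simulated data, using scikit-learn's `train_test_split` as an analogue; the study does not specify its randomisation implementation, and the feature and outcome values here are entirely synthetic.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n_patients = 513                          # derivation cohort size reported above
X = rng.normal(size=(n_patients, 33))     # 33 simulated clinical features
y = rng.integers(0, 2, size=n_patients)   # simulated binary outcome

# Random 6:4 split into training and testing groups (308 vs. 205 patients)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=308, random_state=42
)
print(len(X_train), len(X_test))  # 308 205
```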
Data preparation, feature selection, and model training
The workflow for the development of the ML models is illustrated in Fig. 1. Thirty-three characteristics of patients with HBV-ACLF were retrospectively collected for analysis: gender, age, weight, hypertension, diabetes, cirrhosis, portal hypertension, HE, hepatorenal syndrome (HS), gastrointestinal bleeding (GIB), ascites, primary peritonitis, electrolyte disorders, number of artificial liver treatments, hormonal treatments, and the following admission laboratory tests: white blood cell (WBC) count, haemoglobin (HB) level, platelet (PLT) count, alanine aminotransferase (ALT) level, aspartate aminotransferase (AST) level, TBIL level, direct bilirubin level, albumin (ALB) level, serum creatinine (SCR) level, glucose level, triglyceride (TG) level, total cholesterol (TC) level, prothrombin time (PT), prothrombin activity (PTA), INR, fibrinogen level, D-dimer level, and hepatitis B virus DNA (HBV-DNA) level.
ML refers to statistical models with which a computer system performs tasks without explicit instructions, relying instead on patterns inferred from data[17]. Generally, ML algorithms are divided into two categories: supervised and unsupervised learning. Supervised learning involves constructing a mathematical model from a dataset, known as the training data, which contains inputs and desired outputs, referred to as supervision signals. In this study, the selected features were applied to various supervised ML models, including the least absolute shrinkage and selection operator (LASSO), support vector machine (SVM), K-nearest neighbour (KNN), logistic regression (LR), decision tree (DT), and random forest (RF). In contrast, unsupervised learning uses a dataset containing only inputs, with the system recognising patterns or structures in the data, such as groupings or clusters.
SVM
SVM is a classical classification algorithm designed to address binary or multi-class problems. Its core objective is to identify an optimal hyperplane in the feature space that maximises the margin between classes. SVM performs well with small samples, although its efficiency can be compromised when dealing with a large number of variables. The “e1071” package in the R programming language was used to implement the SVM predictive model.
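Although the study used the “e1071” R package, the maximal-margin idea can be illustrated with a short Python sketch on simulated two-class data, using scikit-learn's `SVC` as an analogue:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Two well-separated synthetic classes standing in for patient features
X0 = rng.normal(loc=-2.0, size=(50, 2))
X1 = rng.normal(loc=+2.0, size=(50, 2))
X = np.vstack([X0, X1])
y = np.array([0] * 50 + [1] * 50)

# A linear SVM finds the hyperplane that maximises the margin between classes
clf = SVC(kernel="linear", C=1.0).fit(X, y)
print(clf.predict([[-2.0, -2.0], [2.0, 2.0]]))  # [0 1]
```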
DT and RF
The “rpart” package in the R programming language was used for the DT analysis. DT is a classification and regression method named for its structure, in which the rules and decisions resemble the trunk and branches of a tree. In a DT, the predictor variable is represented as the root node and the prediction result as a leaf node, with the path between them constituting the decision rule. The algorithm identifies the optimal variables and combinations to classify the data correctly. RF is an ensemble of DTs that enhances the generalisation ability, accuracy, and stability of the model while reducing overfitting, thereby improving predictive performance.
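The relationship between a single tree and a forest of trees can be sketched in Python on simulated data; scikit-learn's `DecisionTreeClassifier` and `RandomForestClassifier` are used here as illustrative analogues of the R implementations named above:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # a simple rule the trees can learn

# A single shallow tree: root-to-leaf paths encode the decision rules
tree = DecisionTreeClassifier(max_depth=3, random_state=1).fit(X, y)

# A forest aggregates many trees, which typically stabilises predictions
forest = RandomForestClassifier(n_estimators=100, random_state=1).fit(X, y)

print(tree.score(X, y), forest.score(X, y))
```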
KNN
The “knn” package in the R programming language was used to implement the KNN algorithm. KNN measures the similarity between samples using a distance metric: for a given sample to be classified, the algorithm calculates the distance between it and all training points, identifies the k nearest neighbours, and assigns the sample to the class most common among them.
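The distance-based classification step can be sketched as follows, using simulated data and scikit-learn's `KNeighborsClassifier` as an analogue of the R implementation:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(2)
# Two compact synthetic clusters
X = np.vstack([rng.normal(-1.5, 0.5, size=(30, 2)),
               rng.normal(+1.5, 0.5, size=(30, 2))])
y = np.array([0] * 30 + [1] * 30)

# k = 5 nearest neighbours under the (default) Euclidean distance metric;
# each query point takes the majority class among its 5 closest training points
knn = KNeighborsClassifier(n_neighbors=5).fit(X, y)
print(knn.predict([[-1.5, -1.5], [1.5, 1.5]]))  # [0 1]
```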
LR
The “rms” package in the R programming language was used for the LR analysis. LR predicts binary outcomes based on a weighted combination of candidate independent variables. The regression model used an L2 penalty with the newton-cg solver. This model served as a baseline for quantitatively evaluating improvements in the performance measures.
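An L2-penalised logistic regression with the newton-cg solver, as described above, corresponds directly to scikit-learn's `LogisticRegression`; a minimal sketch on simulated data (the variable names and data are illustrative only):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
X = rng.normal(size=(300, 4))
# Outcome driven by a weighted combination of the first two features plus noise
logits = 1.5 * X[:, 0] - 1.0 * X[:, 1]
y = (logits + rng.normal(scale=0.5, size=300) > 0).astype(int)

# L2-penalised logistic regression fitted with the newton-cg solver
lr = LogisticRegression(penalty="l2", solver="newton-cg").fit(X, y)
print(lr.score(X, y))
```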
LASSO
The “glmnet” package in R was used to perform LASSO regression on the candidate variables. In LASSO regression, the absolute values of the feature coefficients gradually decrease as the penalty parameter lambda increases, eventually shrinking to zero. As lambda increases, the cross-validated error initially decreases; beyond a certain point, however, it gradually increases. The optimal lambda value corresponds to the minimum cross-validated error.
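The shrinkage behaviour described above can be sketched in Python with scikit-learn's `Lasso`, whose `alpha` parameter plays the role of glmnet's lambda. The data are simulated, with signal in only the first two features, so the count of non-zero coefficients should fall as the penalty grows:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(4)
X = rng.normal(size=(200, 10))
# Only the first two features carry signal; LASSO should zero out the rest
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(scale=0.5, size=200)

# As the penalty grows, coefficients shrink and are eliminated one by one
counts = []
for alpha in (0.01, 0.5, 5.0):
    coef = Lasso(alpha=alpha).fit(X, y).coef_
    counts.append(int(np.count_nonzero(np.abs(coef) > 1e-8)))
print(counts)  # number of surviving features at each penalty level
```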
Statistical analysis
Data were analysed using SPSS (version 21.0; IBM Corp., Armonk, NY, USA) and R (version 4.3.1). Categorical variables are expressed as frequencies, and comparisons between groups were made using the chi-square test. Normally distributed continuous variables are presented as the mean ± standard deviation and were compared using the t-test. Non-normally distributed variables and those with heterogeneous variance were compared using nonparametric tests, and one-way analysis of variance (ANOVA) was employed for comparisons among multiple groups. Differences were considered statistically significant at p < 0.05.
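The two main two-group comparisons (chi-square for categorical variables, t-test for normally distributed ones) can be sketched with SciPy on simulated, purely illustrative data; the study itself used SPSS and R for these analyses:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

# t-test: a normally distributed lab value compared between two groups
group_a = rng.normal(loc=10.0, scale=2.0, size=80)
group_b = rng.normal(loc=14.0, scale=2.0, size=60)
t_stat, p_norm = stats.ttest_ind(group_a, group_b)
print(p_norm < 0.05)  # significant at the p < 0.05 threshold

# chi-square test: a categorical variable (e.g. a complication, yes/no)
table = np.array([[30, 50],   # yes/no counts in group A
                  [45, 15]])  # yes/no counts in group B
chi2, p_cat, dof, expected = stats.chi2_contingency(table)
print(p_cat < 0.05)
```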