Study cohort and data collection
A total of 577 patients diagnosed with HBV-ACLF who were admitted to the First Affiliated Hospital of Nanchang University and Jiangxi Provincial People's Hospital between January 2016 and October 2022 were retrospectively included. Of these, 513 patients from the First Affiliated Hospital of Nanchang University were randomly divided, at a 6:4 ratio, into a training group of 308 patients and a testing group of 205 patients. An additional 64 patients from Jiangxi Provincial People's Hospital served as an external validation group. HBV-ACLF was diagnosed according to the Asia–Pacific Association for the Study of the Liver criteria, which require hepatitis B surface antigen positivity for at least 6 months, a serum total bilirubin (TBIL) level ≥ 5 mg/dL, and either an international normalised ratio (INR) ≥ 1.5 or prothrombin activity < 40%, complicated within 4 weeks by ascites and/or hepatic encephalopathy (HE). The inclusion criteria were as follows: 1) age ≥ 16 years; 2) diagnosis of HBV-ACLF; and 3) availability of clinical information. The exclusion criteria were as follows: 1) HBV-ACLF coexisting with other chronic liver diseases; 2) liver tumours or other malignancies; 3) severe chronic extra-hepatic diseases; 4) previous liver transplantation; and 5) human immunodeficiency virus infection or use of immunosuppressive drugs. This retrospective study was approved by the Ethics Committees of the First Affiliated Hospital of Nanchang University and Jiangxi Provincial People's Hospital, and the requirement for informed consent was waived.
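The 6:4 split of the derivation cohort can be sketched as follows. This is a minimal illustration on simulated data, using scikit-learn's `train_test_split` as an analogue; the study does not specify its randomisation implementation, and the feature and outcome values here are entirely synthetic.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n_patients = 513                          # derivation cohort size reported above
X = rng.normal(size=(n_patients, 33))     # 33 simulated clinical features
y = rng.integers(0, 2, size=n_patients)   # simulated binary outcome

# Random 6:4 split into training and testing groups (308 vs. 205 patients)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=308, random_state=42
)
print(len(X_train), len(X_test))  # 308 205
```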
Data preparation, feature selection, and model training
The workflow for the development of the ML models is illustrated in Fig. 1. Thirty-three characteristics of patients with HBV-ACLF were retrospectively collected for analysis: gender, age, weight, hypertension, diabetes, cirrhosis, portal hypertension, HE, hepatorenal syndrome (HS), gastrointestinal bleeding (GIB), ascites, primary peritonitis, electrolyte disorders, number of artificial liver treatments, hormonal treatments, and the following admission laboratory tests: white blood cell (WBC) count, haemoglobin (HB) level, platelet (PLT) count, alanine aminotransferase (ALT) level, aspartate aminotransferase (AST) level, TBIL level, direct bilirubin level, albumin (ALB) level, serum creatinine (SCR) level, glucose level, triglyceride (TG) level, total cholesterol (TC) level, prothrombin time (PT), prothrombin activity (PTA), INR, fibrinogen level, D-dimer level, and hepatitis B virus DNA (HBV-DNA) level.
ML refers to statistical models with which a computer system performs tasks without explicit instructions, relying instead on patterns inferred from data[17]. Generally, ML algorithms are divided into two categories: supervised and unsupervised learning. Supervised learning involves constructing a mathematical model from a dataset, known as the training data, which contains inputs and desired outputs, referred to as supervision signals. In this study, the selected features were applied to various supervised ML models, including the least absolute shrinkage and selection operator (LASSO), support vector machine (SVM), K-nearest neighbour (KNN), logistic regression (LR), decision tree (DT), and random forest (RF). In contrast, unsupervised learning uses a dataset containing only inputs, with the system recognising patterns or structures in the data, such as groupings or clusters.
SVM
SVM is a classical classification algorithm designed to address binary or multi-class problems. Its core objective is to identify an optimal hyperplane in the feature space that maximises the margin between classes. SVM performs well with small samples, although its efficiency can be compromised when dealing with a large number of variables. The “e1071” package in the R programming language was used to implement the SVM predictive model.
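Although the study used the “e1071” R package, the maximal-margin idea can be illustrated with a short Python sketch on simulated two-class data, using scikit-learn's `SVC` as an analogue:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Two well-separated synthetic classes standing in for patient features
X0 = rng.normal(loc=-2.0, size=(50, 2))
X1 = rng.normal(loc=+2.0, size=(50, 2))
X = np.vstack([X0, X1])
y = np.array([0] * 50 + [1] * 50)

# A linear SVM finds the hyperplane that maximises the margin between classes
clf = SVC(kernel="linear", C=1.0).fit(X, y)
print(clf.predict([[-2.0, -2.0], [2.0, 2.0]]))  # [0 1]
```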
DT and RF
The “rpart” package in the R programming language was used for the DT analysis. DT is a classification and regression method named for its structure, in which the rules and decisions resemble the trunk and branches of a tree. In a DT, the predictor variable is represented as the root node and the prediction result as a leaf node, with the path between them constituting the decision rule. The algorithm identifies the optimal variables and combinations to classify the data correctly. RF is an ensemble of DTs that enhances the generalisation ability, accuracy, and stability of the model while reducing overfitting, thereby improving predictive performance.
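The relationship between a single tree and a forest of trees can be sketched in Python on simulated data; scikit-learn's `DecisionTreeClassifier` and `RandomForestClassifier` are used here as illustrative analogues of the R implementations named above:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # a simple rule the trees can learn

# A single shallow tree: root-to-leaf paths encode the decision rules
tree = DecisionTreeClassifier(max_depth=3, random_state=1).fit(X, y)

# A forest aggregates many trees, which typically stabilises predictions
forest = RandomForestClassifier(n_estimators=100, random_state=1).fit(X, y)

print(tree.score(X, y), forest.score(X, y))
```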
KNN
The “knn” package in the R programming language was used to implement the KNN algorithm. KNN measures the similarity between samples using a distance metric: for a given sample to be classified, the algorithm calculates the distance between it and all training points, identifies the k nearest neighbours, and assigns the sample to the class most common among them.
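The distance-based classification step can be sketched as follows, using simulated data and scikit-learn's `KNeighborsClassifier` as an analogue of the R implementation:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(2)
# Two compact synthetic clusters
X = np.vstack([rng.normal(-1.5, 0.5, size=(30, 2)),
               rng.normal(+1.5, 0.5, size=(30, 2))])
y = np.array([0] * 30 + [1] * 30)

# k = 5 nearest neighbours under the (default) Euclidean distance metric;
# each query point takes the majority class among its 5 closest training points
knn = KNeighborsClassifier(n_neighbors=5).fit(X, y)
print(knn.predict([[-1.5, -1.5], [1.5, 1.5]]))  # [0 1]
```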
LR
The “rms” package in the R programming language was used for the LR analysis. LR predicts binary outcomes based on a weighted combination of candidate independent variables. The regression model used an L2 penalty with the newton-cg solver. This model served as a baseline for quantitatively evaluating improvements in the performance measures.
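An L2-penalised logistic regression with the newton-cg solver, as described above, corresponds directly to scikit-learn's `LogisticRegression`; a minimal sketch on simulated data (the variable names and data are illustrative only):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
X = rng.normal(size=(300, 4))
# Outcome driven by a weighted combination of the first two features plus noise
logits = 1.5 * X[:, 0] - 1.0 * X[:, 1]
y = (logits + rng.normal(scale=0.5, size=300) > 0).astype(int)

# L2-penalised logistic regression fitted with the newton-cg solver
lr = LogisticRegression(penalty="l2", solver="newton-cg").fit(X, y)
print(lr.score(X, y))
```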
LASSO
The “glmnet” package in R was used to perform LASSO regression on the candidate variables. In LASSO regression, the absolute values of the feature coefficients gradually decrease as the penalty parameter lambda increases, eventually shrinking to zero. As lambda increases, the cross-validated error initially decreases; beyond a certain point, however, it gradually increases. The optimal lambda value corresponds to the minimum cross-validated error.
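The shrinkage behaviour described above can be sketched in Python with scikit-learn's `Lasso`, whose `alpha` parameter plays the role of glmnet's lambda. The data are simulated, with signal in only the first two features, so the count of non-zero coefficients should fall as the penalty grows:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(4)
X = rng.normal(size=(200, 10))
# Only the first two features carry signal; LASSO should zero out the rest
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(scale=0.5, size=200)

# As the penalty grows, coefficients shrink and are eliminated one by one
counts = []
for alpha in (0.01, 0.5, 5.0):
    coef = Lasso(alpha=alpha).fit(X, y).coef_
    counts.append(int(np.count_nonzero(np.abs(coef) > 1e-8)))
print(counts)  # number of surviving features at each penalty level
```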
Statistical analysis
Data were analysed using SPSS (version 21.0; IBM Corp., Armonk, NY, USA) and R (version 4.3.1). Categorical variables are expressed as frequencies, and comparisons between groups were made using the chi-square test. Normally distributed continuous variables are presented as the mean ± standard deviation and were compared using the t-test. Non-normally distributed variables and those with heterogeneous variance were compared using nonparametric tests, and one-way analysis of variance (ANOVA) was employed for comparisons among multiple groups. Differences were considered statistically significant at p < 0.05.
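The two main two-group comparisons (chi-square for categorical variables, t-test for normally distributed ones) can be sketched with SciPy on simulated, purely illustrative data; the study itself used SPSS and R for these analyses:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

# t-test: a normally distributed lab value compared between two groups
group_a = rng.normal(loc=10.0, scale=2.0, size=80)
group_b = rng.normal(loc=14.0, scale=2.0, size=60)
t_stat, p_norm = stats.ttest_ind(group_a, group_b)
print(p_norm < 0.05)  # significant at the p < 0.05 threshold

# chi-square test: a categorical variable (e.g. a complication, yes/no)
table = np.array([[30, 50],   # yes/no counts in group A
                  [45, 15]])  # yes/no counts in group B
chi2, p_cat, dof, expected = stats.chi2_contingency(table)
print(p_cat < 0.05)
```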