Participant characteristics
The selection of participants for the study is described in detail in Fig. 1. A total of 507 patients who met all inclusion criteria were enrolled between 30 June 2019 and 30 June 2020. The incidence of delirium reached 28% in our study, underscoring the need for an accurate predictive model for patients with VHD. To ensure data accuracy, expert clinicians reviewed detailed medical records to collect clinical information on diagnosis, surgical procedure, left ventricular ejection fraction (LVEF), duration of anesthesia, duration of CPB, abnormal laboratory results, and the development of delirium.
Following the grouping methods of previous studies, the enrolled cases were randomly divided into a training group (80%, N = 405) and a validation group (20%, N = 103). The baseline characteristics of all cases are described in Table 1. Data are presented as mean and standard deviation (SD) for continuous variables and as percentages for dichotomous variables. Missing data were handled with multiple imputation by chained equations[42]. To assess educational attainment, we converted the multi-categorical variable into an ordinal score. According to this admission scoring system, educational level was divided into three categories: junior high school education and below (scored 0); high school education or undergraduate degree (scored 1); postgraduate degree and above (scored 2). The average education score across the overall sample was 0.5, indicating a low level of education among our participants. Some researchers regard a low educational level as a risk factor for the development of delirium, owing to a lack of mentally stimulating activity and insufficient cognitive reserve[43, 44].
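The grouping and scoring steps above can be sketched with pandas and scikit-learn. This is a minimal illustration on toy data, not the study's pipeline: the category names, column names, and cohort are invented, and stratifying on the outcome is one common way to keep the delirium proportion consistent between the training and validation groups.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical mapping of educational attainment to the ordinal score
# described in the text (category names are illustrative).
EDU_SCORE = {
    "junior_high_or_below": 0,
    "high_school_or_undergraduate": 1,
    "postgraduate_or_above": 2,
}

def split_cohort(df, label_col="delirium", seed=0):
    """80/20 split; stratifying on the outcome keeps the delirium
    proportion nearly identical in the training and validation sets."""
    train, valid = train_test_split(
        df, test_size=0.20, random_state=seed, stratify=df[label_col]
    )
    return train, valid

# Toy cohort: 100 rows, 28 delirium cases (mirroring the 28% incidence).
df = pd.DataFrame({
    "education": ["junior_high_or_below"] * 60
                 + ["high_school_or_undergraduate"] * 35
                 + ["postgraduate_or_above"] * 5,
    "delirium": [1] * 28 + [0] * 72,
})
df["edu_score"] = df["education"].map(EDU_SCORE)  # categorical -> ordinal
train, valid = split_cohort(df)
```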
The characteristics of our participants reflect both the surgical procedure and the pathogenetic features of VHD. For example, our participants were predominantly female (59%), and the average age (55.7 years) was below 70 years, as VHD is more common in females and middle-aged individuals; this differs from other models that do not differentiate by primary disease[15]. The average cardiopulmonary bypass (CPB) duration (157.3 vs. 198.34 minutes), aortic cross-clamping duration (98.0 vs. 114.86 minutes), and anesthesia duration (259.0 vs. 476.91 minutes) were much shorter in this study than in a previous POD study in patients with type A aortic dissection (AAD)[45]. In addition to conventional postoperative laboratory indicators, the postoperative use of IABP/ECMO within the delirium assessment period was also included in this study; IABP/ECMO is associated with hemodynamic instability and internal-environment disturbances, which may contribute to the development of delirium[46, 47, 48]. We also included the pain score assessed with the Numerical Rating Scale (NRS) as a predictor of delirium[49], since poor pain management caused by inadequate analgesia or excessive sedation may trigger delirium after surgery. The average pain score of 2.2 points in this study indicates mild postoperative pain (no interference with sleep).
Model development
Table 1 and Table 2 describe the 32 complete and 20 selected characteristics, respectively, covering preoperative, intraoperative, and postoperative information. The training and validation sets were randomly selected, and the distribution of delirium was balanced between them, indicating that any variation between the two data sets reflects chance rather than the grouping itself. In both the full feature set and the simple feature set, the proportion of delirium remained consistent (with minor variation due to rounding) across the full sample (28%), training sample (28%), and validation sample (28%).
The specific development process is shown in Fig. 2. For the training group, seven classical machine learning algorithms (Logistic Regression[35, 36], Support Vector Machine[37], K-nearest Neighbors[38], Naïve Bayes (GaussianNB)[39], Perceptron[40], Decision Tree Classifier[39], Random Forest Classifier[41]) were used to develop prediction models for delirium under both the full feature set and the simple feature set. These models were then evaluated on the validation group to assess their performance.
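The seven-algorithm training step can be sketched with scikit-learn. The data here are a synthetic stand-in (`make_classification` with a ~28% positive class), and all hyperparameters are library defaults rather than the study's settings; the "training score" and "test score" are mean accuracy on the training and validation splits, scaled to percent as in Table 3.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression, Perceptron
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the clinical data: ~72% negative / ~28% positive.
X, y = make_classification(n_samples=508, n_features=32,
                           weights=[0.72], random_state=0)
X_tr, X_va, y_tr, y_va = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Support Vector Machine": SVC(),
    "K-nearest Neighbors": KNeighborsClassifier(),
    "Naive Bayes (GaussianNB)": GaussianNB(),
    "Perceptron": Perceptron(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    # Mean accuracy on each split, expressed in percent.
    scores[name] = (100 * model.score(X_tr, y_tr),
                    100 * model.score(X_va, y_va))
```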
Model performance
The validation results of the prediction models are summarized in Table 3, where the seven models developed with different machine learning algorithms are ranked by predictive performance. We used the test score and the area under the receiver operating characteristic curve (AUC) to assess the stability and accuracy of the delirium prediction models. Briefly, the higher a model's test score on a held-out dataset (internal or external validation), the better its expected predictive performance. With the full feature set, the seven models ranked from highest to lowest test score were Random Forest Classifier, Logistic Regression, Support Vector Machine, K-nearest Neighbors, Naïve Bayes (GaussianNB), Decision Tree Classifier, and Perceptron. With the selected feature set, the order was Random Forest Classifier, Support Vector Machine, Logistic Regression, Naïve Bayes (GaussianNB), Decision Tree Classifier, K-nearest Neighbors, and Perceptron. Most algorithms performed better with the full feature set than with the simple one, suggesting that models using the full feature set had greater potential for predicting delirium cases. In addition, the Random Forest Classifier had the highest training and test scores regardless of the feature set used, indicating excellent potential for predicting delirium cases.
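The ranking itself is a simple sort on validation accuracy. The sketch below hard-codes the Table 3 full-feature-set test scores purely for illustration:

```python
# Test scores (validation accuracy, %) from Table 3, full feature set.
test_scores = {
    "Random Forest": 82.35,
    "Logistic Regression": 79.41,
    "Support Vector Machine": 74.51,
    "K-nearest Neighbors": 71.57,
    "Naive Bayes (GaussianNB)": 69.61,
    "Decision Tree": 69.61,
    "Perceptron": 58.82,
}
# Sort model names from highest to lowest test score.
ranking = sorted(test_scores, key=test_scores.get, reverse=True)
```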
Table 3
Validation results of the prediction models under the full feature set and the simple feature set

| Full feature set (n = 39) | Training score | Test score | AUC |
| --- | --- | --- | --- |
| Random Forest | 100 | 82.35 | 0.86 |
| Logistic Regression | 85.68 | 79.41 | 0.73 |
| Support Vector Machine | 81.98 | 74.51 | 0.78 |
| K-nearest Neighbors | 81.48 | 71.57 | 0.62 |
| Naïve Bayes | 70.12 | 69.61 | 0.61 |
| Decision Tree | 70.01 | 69.61 | 0.76 |
| Perceptron | 67.90 | 58.82 | 0.65 |

| Simple feature set (n = 20) | Training score | Test score | AUC |
| --- | --- | --- | --- |
| Random Forest | 100 | 75.49 | 0.76 |
| Support Vector Machine | 79.51 | 73.53 | 0.64 |
| Logistic Regression | 74.57 | 72.55 | 0.64 |
| Naïve Bayes | 69.88 | 71.57 | 0.51 |
| Decision Tree | 68.88 | 71.57 | 0.72 |
| K-nearest Neighbors | 79.75 | 69.61 | 0.50 |
| Perceptron | 62.22 | 57.84 | 0.31 |
Receiver operating characteristic (ROC) curves for the prediction models, generated by plotting the true positive rate (TPR) against the false positive rate (FPR) at various threshold settings, are shown in Fig. 3A (full feature set) and Fig. 3B (simple feature set)[50]. Overall, the ROC curves in Fig. 3A are closer to the upper left corner and lie to the left of the main diagonal, indicating a greater improvement and more robust prediction for most algorithms with the full feature set. The area under the ROC curve (AUC) is a widely used criterion for evaluating the quality of classifiers (predictive models). Among all the predictive approaches, the highest AUC (0.86) was observed for the random forest classifier with the full feature set. Typically, an AUC above 0.85 indicates superior predictive value; the random forest classifier can therefore predict cases of delirium comparatively well. Even with the simple feature set (Fig. 3B), the random forest classifier (AUC = 0.76) shows relatively good predictive value compared to the other classifiers.

The decision tree classifier, a simpler learner than the random forest classifier, is prone to overfitting during training. Although tightening the algorithm's restrictions (for example, limiting tree depth) can partially reduce overfitting, the utility of features may be sacrificed at the same time[51]. The decision tree classifier had an AUC of 0.76 with the full feature set and 0.72 with the simple feature set, indicating relatively stable performance in delirium prediction. The support vector machine, a supervised learning method, can achieve good results with small sample sizes owing to its generalization ability[52, 53]. Under the full feature set, its performance (AUC = 0.78) is slightly below that of the random forest classifier.
Notably, its AUC under the simple feature set drops to 0.64, indicating relatively unstable performance. As for classical logistic regression, the method mainly used in previous clinical studies[54, 55], it had an AUC of 0.73 under the full feature set and 0.64 under the simple feature set, which is mediocre for model building. Furthermore, the AUCs of K-nearest Neighbors (0.50), Naïve Bayes (0.51), and Perceptron (0.31) under the simple feature set are at or below 0.5, suggesting prediction accuracy equivalent to (or worse than) random guessing and thus no predictive value. Even with the full feature set, these classifiers achieved AUCs of only 0.61 to 0.65, illustrating their limited predictive ability.
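The ROC/AUC evaluation described above can be reproduced with scikit-learn's `roc_curve` and `roc_auc_score`, as in this sketch (synthetic data and a default random forest stand in for the study's validation set and tuned models):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the cohort (~28% positive class).
X, y = make_classification(n_samples=508, n_features=32,
                           weights=[0.72], random_state=0)
X_tr, X_va, y_tr, y_va = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

clf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

# Class-1 probabilities provide the continuous score that is thresholded
# to trace out the ROC curve (TPR vs. FPR).
proba = clf.predict_proba(X_va)[:, 1]
fpr, tpr, thresholds = roc_curve(y_va, proba)
auc = roc_auc_score(y_va, proba)  # area under the ROC curve
```

Plotting `tpr` against `fpr` yields the curves shown in Fig. 3; the closer a curve hugs the upper left corner, the larger its AUC.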
In conclusion, the random forest classifier provided the best combination of accuracy and stability among the prediction models evaluated.