Polyphenic Risk Score Shows Robust Predictive Ability For Long-Term Future Suicidality

doi:10.21203/rs.3.rs-1442969/v1

Suicides are preventable tragedies if risk factors are tracked and mitigated. We previously developed a quantitative suicidality risk assessment instrument (Convergent Functional Information for Suicidality, CFI-S), which is in essence a simple polyphenic risk score, and deployed it in a busy urban Emergency Department, in a naturalistic cohort of consecutive patients. We report a four-year follow-up of that population (n = 482). Overall, a single administration of the CFI-S was significantly predictive of suicidality over the ensuing 4 years (occurrence: ROC AUC 80%; severity: Pearson correlation 0.44; imminence: Cox regression Hazard Ratio 1.33). The best predictive single phenes (phenotypic items) were feeling useless (not needed), a past history of suicidality, and social isolation. We next used machine learning approaches to enhance the predictive ability of the CFI-S. We divided the population into a discovery cohort (n = 255) and a testing cohort (n = 227), and developed a deep neural network algorithm that showed increased accuracy for predicting risk of future suicidality (increasing the ROC AUC from 80% to 90%), as well as a similarity network classifier for visualizing a patient’s risk. We propose that the widespread use of the CFI-S for screening purposes, with or without machine learning enhancements, can boost suicidality prevention efforts. This study also identified addressable social determinants as top risk factors for suicidality.

Suicidality

Emergency Department

Risk

Prediction

Machine Learning

Social Isolation

“It's tough to make predictions, especially about the future.”

-Yogi Berra

One person dies by suicide every 40 seconds worldwide. Suicides are preventable tragedies. The social, psychological and biological risk factors are increasingly known¹ ² ³. What has been missing in practice is a quantitative tool that identifies this risk, provides actionable information, and pivots to a personalized treatment plan. Similar to genetics, where polygenic risk scores can provide some predictive ability, we have developed in recent years for suicidality both polygenic risk scores (based on blood biomarker transcriptomic data) and a polyphenic risk score (based on phenotypic data, i.e., known risk factors for suicidality). The latter was described as an instrument, Convergent Functional Information for Suicidality (CFI-S), which was used in veterans¹ as well as in a civilian population⁴. The civilian study was a naturalistic study in a cohort of all-comers to a busy urban hospital Emergency Department. The patients received the standard 2-item suicide screening question, were seen by an attending physician who was asked to fill out a visual analog scale (VAS, based on physician gestalt) about the likelihood of future suicidality (ideation, planning, attempts, hospitalizations) in the subsequent 6 months, and received a paper version of the CFI-S. Of note, the CFI-S does not ask about suicidal ideation. At the 6-month follow-up, the CFI-S was more predictive of suicidality than the other two assessments, particularly in women⁴. An important issue still to be addressed was its ability to predict long-term risk. We describe a 4-year follow-up of that population, and analyses using state-of-the-art machine learning approaches.

Cohorts

Participants were enrolled in 2016–2017 in the study “Assessing Risk of Future Suicidality in Emergency Department Patients”⁴. All follow-up was done via electronic medical record (EMR) review. The EMR was reviewed by 3 psychiatrists independently. Scores for suicidality (suicidal ideation, suicidal planning, suicide attempt, hospitalization due to suicidality) were compared between the different scorers, who looked at all notes that had been entered in the EMR since each participant was initially enrolled in the first study (published in 2019, with the first enrollment of patients being 6/17/2016). The scores were then normalized by time: we found the shortest length of time between enrollment date and chart access, and eliminated any scores recorded at a longer duration than that shortest follow-up period.

In addition to review of the EMR to gather information regarding suicidality, the Marion County Coroner’s Office database was also searched. None of the participants were found to have died by suicide locally.

Traditional Analyses

Following the collection of data, several separate analyses were performed. First, the Receiver Operating Characteristic Area Under the Curve (ROC AUC) for predicting future suicidality was calculated for the 4-year follow-up. Second, a t-test was done comparing the CFI-S score between those with suicidality and those without any over the 4-year follow-up. This analysis was also conducted at an individual item level. Third, a Pearson correlation analysis was done to determine how the CFI-S scores relate to the severity of suicidality. Suicidality was rated by severity in the following manner: those with suicidal ideation only received a score of 1, those with a plan received a score of 2, those with suicide attempt(s) received a score of 3, and those with hospitalization(s) for suicidality received a score of 4. Evaluating suicidality as a spectrum is supported by our previous work¹ ⁵ ⁶, and is also consistent with suicidality being its own free-standing diagnostic entity⁷. Fourth, a Cox regression was performed, looking at the Hazard Ratio for the CFI-S score predicting future suicidality.
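As an illustration of the first and third analyses, the following is a minimal sketch in plain Python, not the code used in the study. It assumes that a participant with several kinds of events is assigned the highest level reached (the text does not state the rule for that case), computes the ROC AUC as the fraction of positive/negative pairs ranked correctly, and uses made-up toy data; the function names are hypothetical.

```python
import math

# Severity levels as described in the text.
SEVERITY = {"ideation": 1, "plan": 2, "attempt": 3, "hospitalization": 4}

def severity_score(events):
    """Return 0 for no suicidality, else the highest level recorded.
    Taking the maximum is an assumption made for this sketch."""
    return max((SEVERITY[e] for e in events), default=0)

def roc_auc(scores, labels):
    """AUC as the probability that a positive case outranks a negative one
    (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Toy data (invented for illustration, not study data).
cfis_scores = [20, 35, 60, 45, 80, 30]
events = [[], [], ["ideation"], [], ["ideation", "attempt"], ["plan"]]

severity = [severity_score(e) for e in events]
any_suicidality = [1 if s > 0 else 0 for s in severity]

auc = roc_auc(cfis_scores, any_suicidality)
r = pearson_r(cfis_scores, severity)
```

The pairwise AUC is quadratic in the number of participants, which is acceptable for a cohort of a few hundred; a rank-based formula would be preferable for larger data.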

Machine Learning Analyses

We performed a machine learning-based suicidality classification using the CFI-S records as input. Each patient has a CFI-S record, i.e., answers to 22 yes/no questions. The input features of our machine learning models are vectors of length 22 + 3, i.e., the 22 yes/no answers, the total number of yes answers, the total number of answers, and the CFI-S score. The experiment is designed as follows: the aforementioned features are fed into the machine learning models, and the suicidality results (ideation, planning, attempt, hospitalization) are converted to a binary indicator of whether a patient has suicidality or not. To solve this suicidality classification problem we make use of five machine learning models: naive Bayes (NB)⁸, XGBoost (XGB)⁹, random forest (RF)¹⁰, support vector machine (SVM)¹¹, and deep neural network (DNN)¹² classifiers. We train the models and tune their hyperparameters with the discovery cohort, which corresponds to the training set in machine learning terminology. We then fix the model and hyperparameters and test on an independent test cohort, which corresponds to the test set. The results in the test cohort are the better indicator of the generalizability of our approach. In this classification problem, we used AUROC, accuracy, precision, recall, and F1-score as evaluation metrics to comprehensively compare and evaluate our models. Detailed formulas for the evaluation metrics are described in Supplementary Materials.
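The feature layout and the threshold-based metrics can be sketched as follows. This is an illustrative sketch under stated assumptions, not the study's code: coding unanswered items as -1 is borrowed from the network representation described later in the Methods, the CFI-S score is passed in rather than computed because its formula is not given here, and the function names are hypothetical. The metric definitions are the standard ones; the exact formulas used in the study are in its Supplementary Materials.

```python
def build_features(answers, cfis_score):
    """Build the 22 + 3 feature vector from one CFI-S record.

    answers: list of 22 entries, each "yes", "no", or anything else for an
    unanswered item (coded -1 here, an assumption for this sketch).
    """
    if len(answers) != 22:
        raise ValueError("expected 22 item answers")
    coded = [1 if a == "yes" else 0 if a == "no" else -1 for a in answers]
    n_yes = sum(1 for a in answers if a == "yes")
    n_answered = sum(1 for a in answers if a in ("yes", "no"))
    return coded + [n_yes, n_answered, cfis_score]

def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = suicidality)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Toy record: 5 yes, 15 no, 2 unanswered; the score value is invented.
answers = ["yes"] * 5 + ["no"] * 15 + [None] * 2
features = build_features(answers, 25.0)

# Toy predictions (invented for illustration).
metrics = classification_metrics([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
```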

In the binary classification of suicidality, we would like to predict whether or not a patient will have any suicidality. In what follows, we further inspect those patients who have suicidality, and we address two prediction tasks: (i) how soon the patients would take such actions (imminence prediction), and (ii) how severe the behavior would be (severity prediction).

In imminence prediction, we take the time (in months) of the first action as the imminence label, and the input features are the same as in suicidality classification. For example, if a patient has suicidal ideation recorded 1 month after she/he takes the CFI-S, a suicide attempt recorded 3 months after, and a hospitalization recorded 4 months after, we take the earliest record and label the imminence as 1 month.
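The labeling rule in this example can be written in a few lines. The sketch below is illustrative only; the dictionary layout and the function name are assumptions, and returning `None` for a patient with no recorded events is a convention chosen for the sketch.

```python
def imminence_label(event_months):
    """Return the earliest event time in months after the CFI-S.

    event_months: dict mapping an event name to the number of months
    between the CFI-S assessment and that event.
    """
    return min(event_months.values()) if event_months else None

# The example from the text: ideation at 1 month, attempt at 3, hospitalization at 4.
record = {"ideation": 1, "attempt": 3, "hospitalization": 4}
label = imminence_label(record)
```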

In severity prediction, we take weighted severity as the label, i.e., severity is composed of four parts: ideation (SI), planning (SP), attempt (SA), and hospitalization (HP), which are weighted differently based on the severity of the action. More specifically, SI = 1, SP = 2, SA = 3, and HP = 4. Unlike suicidality classification, imminence and severity prediction are regression problems; we use DNNs for prediction and evaluate them by accuracy with prediction interval (PI), root mean squared error (RMSE), mean absolute error (MAE), R-squared, and standard deviation.

The accuracy with PI is used to calculate an accuracy for our regression problems. Assume we have a data point with feature x and ground truth label y; the DNN takes x as input and predicts the output value y’. Then, the actuarial prediction made by the DNN model is an interval, [y’-PI, y’+PI], where PI is the prediction interval calculated as z*stdev (z takes values ranging from 1.15 to 2.58 for 75–99% PI, and stdev is the standard deviation of y’s). A prediction is counted as correct when the ground truth y falls within this interval.
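A minimal sketch of this metric follows, assuming a prediction counts as correct when the ground truth lies inside [y’-PI, y’+PI]. The standard deviation is passed in explicitly; in the toy example it is taken over the predictions, which is one reading of the definition above and is an assumption of this sketch. The data and the function name are invented for illustration.

```python
import statistics

def accuracy_with_pi(y_true, y_pred, z, stdev):
    """Fraction of samples whose true label lies within y_pred +/- z * stdev."""
    pi = z * stdev
    hits = sum(1 for y, yp in zip(y_true, y_pred) if yp - pi <= y <= yp + pi)
    return hits / len(y_true)

# Toy severity labels and predictions (not study data).
y_true = [1.0, 3.0, 4.0, 2.0]
y_pred = [1.5, 2.0, 4.2, 3.5]

# Assumption: stdev is computed over the predicted values.
stdev = statistics.pstdev(y_pred)

acc_75 = accuracy_with_pi(y_true, y_pred, 1.15, stdev)  # z for a 75% PI
acc_99 = accuracy_with_pi(y_true, y_pred, 2.58, stdev)  # z for a 99% PI
```

Because the interval widens as z grows, the accuracy can only stay the same or increase with wider prediction intervals.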

Deep neural networks’ hyperparameters and training details

The DNN we used in suicidality prediction is designed for classification, with about 400k parameters. The input layer has 25 neurons and there are two hidden layers with 64 neurons each. Each fully connected hidden layer is followed by a dropout layer with a 50% keep rate, and the ReLU¹³ activation function is applied to all hidden layers. We followed standard deep neural network hyperparameter tuning methods and determined that a neural network with 400k parameters is an adequate model for this binary classification problem. Dropout is introduced to avoid overfitting. The output layer has 2 output neurons with the Softmax¹⁴ activation function. We use the Adam¹⁵ optimizer with a learning rate of 0.001 and a binary cross-entropy loss function. We split the discovery cohort (255 samples) into a training (80%) and an evaluation (20%) set for hyperparameter tuning, and the discovery results reported are evaluated on the evaluation set. After hyperparameter tuning, we train the DNN with the whole discovery cohort and lock it for testing on the test cohort (227 samples). Hyperparameter tuning, training and testing of the other machine learning models follow exactly the same procedure, for fair comparison.
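To make the layer sequence concrete, here is a hedged sketch of a forward pass through the layers named above (25 inputs, two hidden layers of 64 units with ReLU and dropout, a 2-unit softmax output), written in plain Python. It is not the authors' implementation: the weight initialization, the inverted-dropout scaling, the helper names and the toy input are all assumptions, and the network is untrained, so the output probabilities are meaningless. Training with Adam and the loss function are not shown.

```python
import math
import random

def dense(x, weights, biases):
    """Fully connected layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]

def relu(v):
    return [max(0.0, a) for a in v]

def softmax(v):
    m = max(v)  # subtract the maximum for numerical stability
    exps = [math.exp(a - m) for a in v]
    total = sum(exps)
    return [e / total for e in exps]

def dropout(v, keep_prob, rng):
    """Inverted dropout: keep each unit with probability keep_prob."""
    return [a / keep_prob if rng.random() < keep_prob else 0.0 for a in v]

def init_layer(n_in, n_out, rng):
    """Small uniform random weights and zero biases (assumed initialization)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

def forward(x, layers, rng=None, keep_prob=0.5):
    """Hidden layers use ReLU then dropout; dropout is active only if rng is given."""
    h = x
    for weights, biases in layers[:-1]:
        h = relu(dense(h, weights, biases))
        if rng is not None:  # training mode
            h = dropout(h, keep_prob, rng)
    weights, biases = layers[-1]
    return softmax(dense(h, weights, biases))

rng = random.Random(0)
layers = [init_layer(25, 64, rng), init_layer(64, 64, rng), init_layer(64, 2, rng)]

# Toy input: 22 item answers plus three summary features (invented values).
x = [1.0] * 5 + [0.0] * 17 + [5.0, 22.0, 23.0]
probs = forward(x, layers)  # inference mode, no dropout
```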

The DNN we used in imminence and severity prediction is designed for regression. The input layer has 25 neurons, and there are 2 hidden layers. The first hidden layer is a fully connected layer with 32 hidden neurons and a Sigmoid¹⁶ activation function. The second hidden layer is a fully connected layer with 64 hidden neurons and a ReLU activation function. Both hidden layers are followed by a dropout layer with a 50% keep rate. The final output layer contains only 1 output neuron, and a ReLU activation function is applied to rectify the output to a non-negative value. We follow standard deep neural network hyperparameter tuning and use the Adam optimizer with a learning rate of 0.001 and a mean squared error loss function. In severity and imminence prediction, we take only those samples with suicidality = 1, which results in 56 samples selected from the discovery cohort and 50 samples selected from the test cohort. Hyperparameter tuning, training and testing for all machine learning models in imminence and severity prediction follow the same procedure as in suicidality prediction.

Network representation

We form a similarity network of all patients based on their CFI-S records. In this network, each node represents a patient, and the link between two nodes (patients) is weighted by the similarity between them. With the answers to the questionnaire, each patient is described by a vector of size 22 + 3, where yes and no are taken as 1 and 0, and other answers are taken as -1. With these vectors, we calculate the cosine similarity between vectors as the weight between two nodes. With the node representation and weights, we first construct an all-to-all network, and then delete connections between nodes according to a threshold on the edge weights, i.e., if the weight between two nodes is less than a threshold (0.995), we delete that link. In the end, only patients with very similar CFI-S records are connected with a link, and they are located closer to each other in the graph. Graph neural networks (GNNs) are advanced neural network architectures developed based on graph theory concepts¹⁷. We also develop similarity network-based GNNs to do suicidality prediction. Network Representation Link: https://github.com/cmxxx/SI.
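The construction described above can be sketched as follows. This is an illustrative sketch, not the code in the linked repository: the function names are hypothetical, the toy vectors have 4 entries instead of 25 to keep the example short, and treating a zero-length vector as having similarity 0 is a convention chosen for the sketch.

```python
import math

def encode(answer):
    """Map an answer to 1 (yes), 0 (no), or -1 (anything else)."""
    return {"yes": 1, "no": 0}.get(answer, -1)

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention for all-zero vectors in this sketch
    return dot / (norm_u * norm_v)

def build_edges(vectors, threshold=0.995):
    """Start from all pairs and keep only links with weight >= threshold."""
    edges = []
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            weight = cosine_similarity(vectors[i], vectors[j])
            if weight >= threshold:
                edges.append((i, j, weight))
    return edges

# Toy patient vectors (shortened; real vectors have 22 + 3 entries).
patients = [
    [1, 0, 1, 50],
    [1, 0, 1, 50],
    [0, 1, 1, 2],
]
edges = build_edges(patients)
```

In this toy case only the two identical records remain linked; the third patient ends up as an isolated node.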

Traditional Analyses

After collecting four-year follow-up data through electronic medical records chart review, analyses were performed using simple statistical tools.

The overall CFI-S score was predictive of any future suicidality (ideation, planning, attempts, hospitalizations) with a ROC AUC of 0.798 and a p-value of 2.39 E-21 (Figure 1a).

The average CFI-S score for those with future suicidality was 54 vs. 31 for those without future suicidality, with a t-test p-value of 1.46 E-22 (Figure 1b).

We also examined the correlation of the CFI-S score with suicidality severity: suicidal ideation (SI) receiving a score of 1, suicide plan (SP) receiving a score of 2, suicide attempt (SA) receiving a score of 3, and hospitalization for suicidality receiving a score of 4. The Pearson’s correlation R-coefficient was 0.44, with a p-value of 2.91 E-24 (Figure 1c).

Additionally, a Cox regression was used to determine imminence of suicidality, producing a Hazard Ratio of 1.33 with a p-value of 7.53 E-03 and a one-tailed t-test with a value of 3.76 E-03 (Figure 1c).

A t-test was also performed for each individual CFI-S item between those with suicidality and those without (Figure 1d). The top item (p-value 6.29 E-26, about 12 orders of magnitude more significant than the second best) was perceived uselessness (not needed, and/or feeling like a burden to kin). The next top items, in order, were past suicidality (1.57 E-14), social isolation (2.40 E-14), hopelessness (6.17 E-13), and a past history of a mental health diagnosis (9.54 E-13).

Machine Learning Analyses

Machine learning has the ability to extract more out of data, and it has been used for various medical diagnoses, such as tree-based models in PTSD assessment¹⁸, naive Bayes, random forest, and support vector machines in lung cancer prognosis¹⁹, and XGBoost for kidney disease diagnosis²⁰. We developed a comprehensive machine learning framework for predicting future suicidality occurrence, severity, and imminence.

The future suicidality prediction is formulated as a binary classification problem. We developed a deep neural network (DNN) framework, and compared it with other classical machine learning classifiers: naive Bayes (NB), XGBoost (XGB), random forest (RF), and support vector machines (SVM).

The receiver operating characteristic (ROC) curve, accuracy, precision, recall, F1 score, and area under the receiver operating characteristic curve (AUROC) results in Figure 2(b-c) show that the constructed RF and DNN classifiers exhibit superior performance compared to the other classical machine learning classifiers for the discovery and test cohorts, respectively. For the results shown in this figure, we train the models and tune their hyperparameters with the discovery cohort, and then obtain the test results by testing the models on an independent test cohort. Performance in the test cohort is therefore a better indicator of model quality than performance in the discovery cohort. Specifically, the DNN achieves the higher results in the test cohort, which demonstrates its generalization ability. The proposed DNN model takes CFI-S information as input and learns to use it to achieve the best performance possible within the training time (see Methods section, Deep neural networks’ hyperparameters and training details, for model details).

In addition to classical machine learning and deep neural network classifiers, we constructed patient similarity networks (see Methods section, Network representation, for details). With the graph visualization shown in Figure 3, we can locate and visualize a new patient in the graph based on the collected CFI-S record, which is useful for potential early-stage screening. Imagine a case where a patient takes 15 minutes to provide the CFI-S record; we can then compute the similarity between this record and all the other records we have in the system, and visualize the location of this patient in the graph. The graph has approximately 2 parts: the smaller area located in the lower left corner is a “high-risk” area, and the larger area located in the upper right corner is a “low-risk” area. With patients located in the graph, we provide fast early-stage screening through a graph neural network (GNN). A GNN is an advanced graph-based neural network model that works well on data that can be represented as a graph or network. We formulate our GNN with this similarity network and provide a suicidality prediction. From the results shown in Figure 3(c), we can see that the similarity network-based GNN not only operates as an advanced classification model, but also provides explainability through visualization.

Unlike the prediction of future suicidality occurrence, the severity and imminence predictions are formulated as regression problems. Severity represents the weighted score reflecting the severity of a patient’s suicidality over the 4-year follow-up. Imminence refers to the time (in months) elapsed between the CFI-S assessment and a patient’s first instance of suicidality. We used our DNN framework to investigate these two regression problems. Figure 2(d-e) summarizes the prediction results of the proposed DNN model and other classical machine learning models (see also Supplementary Materials section X for details on the experimental setup and additional results). The accuracy in Figure 2d ranges from 85% to 100% for the severity prediction and from 90% to 96% for the imminence prediction, for increasing prediction intervals. The results for the test cohort are slightly lower than those in the discovery cohort. This suggests that the proposed deep learning framework can prove instrumental in suicide investigation, and may generalize well to external and future cohorts.

A simple, easy-to-administer, 22-item polyphenic risk score scale for suicidality, the CFI-S, which encompasses known risk factors but does not ask about suicidal ideation, was administered in an Emergency Department setting. The score was predictive of suicidality over the long term, i.e., the ensuing 4 years of follow-up, with an AUC of 80% and a Hazard Ratio of 1.33. This simple tool can be used in any setting, and can provide a personalized mitigation plan by looking at the items that tested positive.

Machine learning approaches boosted the predictive ability (ROC AUC) up to 90%. By combining these analyses with a network science visualization, we developed a similarity network classifier for visualizing a patient’s risk. Our CFI-S-based graph has two main components: a smaller area located in the lower left corner, which represents patients at “high risk” of suicidality, and a larger area located in the upper right corner, which represents patients at “low risk” of suicidality. This can be used for early-stage screening of suicidality risk, by showing where a new individual fits, based on the CFI-S risk score, compared with a well-studied normative cohort such as ours. For instance, one can exploit our framework to develop an AI system that can analyze the CFI-S answers of a patient during a short (5-10 minutes) interview, estimate the higher-order similarity scores between these newly recorded CFI-S scores and those in the cohort of patients prone to suicidality, visualize the position of the patient on the graph, and determine a personalized strategy to mitigate suicide risk. Our proposed framework can be extended to encompass multiple data modalities (e.g., time taken to respond, hesitation to respond to specific questions, tendency to avoid some clear answers), in order to identify additional signatures of a cognitive, physiological and behavioral nature that can help better predict suicidality.

Besides CFI-S, other types of information about the patients (e.g., age, ethnicity, career related metrics, social engagement) could be infused in the future into the machine learning framework to improve the model performance. For instance, the time spent on each CFI-S question, and facial expression images or videos, could be useful in inferring the emotion and trustworthiness of CFI-S answers. Genomic biomarker data, and other mental health phenotypic data, could also be integrated alongside the CFI-S, in a bio-psycho-social algorithm. This is the direction our future collaborative work is taking.

Lastly, it is particularly interesting, and actionable for preventive approaches, that feeling useless and socially isolated are top risk factors identified by our studies. Shrinking of one’s life leads to death.

Acknowledgements

We would like to thank Dr. Helen Le-Niculescu for useful advice and discussions. We also would particularly like to thank the participants in these studies. Without their contribution, such work to advance the understanding of suicidality risk would not be possible. This work was supported by an NIH grant (R01MH117431) and a VA Merit Award (2I01CX000139) to ABN.

Author Contributions

ABN and PB designed this study. ABN, PB, MC and KR wrote the manuscript. MC and KR analyzed the data. KR, YC, and LQ scored electronic medical records. MG and GS conducted coroner’s office data reviews. JK organized the initial ED data collection and assisted with data interpretation. All authors discussed the results and commented on the manuscript.

Conflict of Interest

ABN is listed as inventor on a patent application filed by Indiana University on suicidality risk prediction, and is a co-founder of MindX Sciences.

Data availability

The datasets generated during and/or analysed during the current study are available from the corresponding authors on reasonable request.

References

1. Niculescu, A. B. et al. Precision medicine for suicidality: from universality to subtypes and personalization. Mol Psychiatry 22, 1250-1273, doi:10.1038/mp.2017.128 (2017).
2. Boggs, J. M. et al. General Medical, Mental Health, and Demographic Risk Factors Associated With Suicide by Firearm Compared With Other Means. Psychiatr Serv 69, 677-684, doi:10.1176/appi.ps.201700237 (2018).
3. Nock, M. K. et al. Prediction of Suicide Attempts Using Clinician Assessment, Patient Self-report, and Electronic Health Records. JAMA Netw Open 5, e2144373, doi:10.1001/jamanetworkopen.2021.44373 (2022).
4. Brucker, K. et al. Assessing Risk of Future Suicidality in Emergency Department Patients. Acad Emerg Med 26, 376-383, doi:10.1111/acem.13562 (2019).
5. Levey, D. F. et al. Towards understanding and predicting suicidality in women: biomarkers and clinical risk assessment. Mol Psychiatry 21, 768-785, doi:10.1038/mp.2016.31 (2016).
6. Niculescu, A. B. et al. Understanding and predicting suicidality using a combined genomic and clinical risk assessment approach. Mol Psychiatry 20, 1266-1285, doi:10.1038/mp.2015.112 (2015).
7. Oquendo, M. A., Baca-Garcia, E., Mann, J. J. & Giner, J. Issues for DSM-V: suicidal behavior as a separate diagnosis on a separate axis. Am J Psychiatry 165, 1383-1384, doi:10.1176/appi.ajp.2008.08020281 (2008).
8. Domingos, P. & Pazzani, M. On the optimality of the simple Bayesian classifier under zero-one loss. Machine Learning 29, 103-130 (1997).
9. Chen, T. & Guestrin, C. XGBoost: A scalable tree boosting system. in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 785-794.
10. Ho, T. K. Random decision forests. in Proceedings of 3rd International Conference on Document Analysis and Recognition. 278-282 (IEEE).
11. Cortes, C. & Vapnik, V. Support-vector networks. Machine Learning 20, 273-297 (1995).
12. LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436-444 (2015).
13. Glorot, X., Bordes, A. & Bengio, Y. Deep sparse rectifier neural networks. in Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics. 315-323 (JMLR Workshop and Conference Proceedings).
14. Goodfellow, I., Bengio, Y. & Courville, A. Deep Feedforward Networks. Deep Learning 164-223 (2016).
15. Kingma, D. P. & Ba, J. Adam: A method for stochastic optimization. in International Conference on Learning Representations (San Diego, CA, 2014).
16. Han, J. & Moraga, C. The influence of the sigmoid function parameters on the speed of backpropagation learning. in International Workshop on Artificial Neural Networks. 195-201 (Springer).
17. Zhou, J. et al. Graph neural networks: A review of methods and applications. AI Open 1, 57-81 (2020).
18. Brenner, L. A. et al. Development and Validation of Computerized Adaptive Assessment Tools for the Measurement of Posttraumatic Stress Disorder Among US Military Veterans. JAMA Network Open 4, e2115707 (2021).
19. Yu, K.-H. et al. Predicting non-small cell lung cancer prognosis by fully automated microscopic pathology image features. Nature Communications 7, 1-10 (2016).
20. Ogunleye, A. & Wang, Q.-G. XGBoost model for chronic kidney disease diagnosis. IEEE/ACM Transactions on Computational Biology and Bioinformatics 17, 2131-2140 (2019).

Table 1. Aggregate Demographics

| Analyses | Cohort | Number of Participants | Gender | Ethnicity | Age Mean (SD) |
| --- | --- | --- | --- | --- | --- |
| Traditional | No Suicidality | 376 | Male 180; Female 195; Other 1 | EA 192; AA 158; Hispanic 15; Asian 2; Other 9 | 44.6 (14.8) |
| Traditional | Suicidality | 106 | Male 55; Female 51 | EA 56; AA 44; Hispanic 2; Other 2; American Indian 1; Asian 1 | 39.6 (13) |
| Machine Learning | Discovery Cohort | 255 total (Suicidality 56) | Male 128; Female 126; Other 1 | EA 136; AA 106; Hispanic 10; Asian 2; Other 1 | 43.5 (14.8) |
| Machine Learning | Test Cohort | 227 total (Suicidality 50) | Male 107; Female 120 | EA 112; AA 96; Other 9; Hispanic 7; American Indian 1; Asian 1; Other 1 | 43.5 (14.4) |

SupplementaryMaterials.docx