Assessment of important cardiovascular risk factors using ML methods: a randomized controlled trial

doi:10.21203/rs.3.rs-1713385/v1

This work is licensed under a CC BY 4.0 License

Version 1

Background: High prevalence and mortality of cardiovascular diseases (CVD) are global problems. Many countries focus their healthcare on secondary treatment, while primary prevention remains in the background. Screening for CVD risk factors (RF) within personalized medicine has great potential for the future. We present a methodology for identifying important RF as well as potentially new ones.

Materials and Methods: We worked with the dataset of patients hospitalized in the East Slovak Institute of Cardiovascular Diseases in Košice. The file contained 808 records, complete history, laboratory tests, ECG, echocardiography, and selective coronary angiography. We analyzed the importance of variables based on CART, Random forest, and Logistic regression algorithms (binary classification) and propose a new weighted agglomerative attribute importance metric. After selection of potentially important, but less known RF we re-deployed the CART algorithm on selected combinations of risk factors, while the target attribute was divided into six original classes corresponding to the severity of the coronarography finding.

Results: The important variables selected by the proposed metric are in accordance with known results, but the metric also pointed to some potentially relevant RF such as fibrinogen. The experiments suggested that fibrinogen might help determine cardiovascular risk. Its impact remains debatable, but its potential increases when it is potentiated by factors other than HDL. We also concluded that higher HDL levels might have a cardioprotective effect.

Conclusions: Our study proposed an original methodology for identifying important RF and for in-depth analysis of potentially interesting new RF. Results showed that fibrinogen could be one of the critical risk factors of cardiovascular diseases. Experimental results suggest that there may be other RF of CVD besides the traditional ones, such as the fibrinogen and HDL levels we investigated. Still, studies with a larger patient population are needed to draw significant conclusions.

variable importance

cardiovascular risk factors

machine learning methods

variable selection

data analysis

cardiovascular diseases

Cardiovascular diseases (CVD) are a global and still growing problem. They do not concern only countries with a low or medium socioeconomic status; they are also the leading cause of death in America, Europe, and Asia. According to the World Health Organization (WHO), one person dies from CVD every 36 seconds in America [1]. In Europe, CVD causes 46 times more deaths than AIDS, tuberculosis, and malaria [2]. A 2019 report says that of the total of 18.6 million deaths due to CVD, up to 58% occurred in Asian countries [3][4]. However, according to the WHO [4], up to 80% of premature heart disease and stroke can be prevented by addressing behavioral risk factors (RF), early detection, and subsequent disease management [5].

Behavioral RF solutions focus on lifestyle change, that is, on influencing modifiable risk factors such as smoking, alcohol consumption, and insufficient physical activity, together with the associated healthy eating and obesity reduction. Non-modifiable RF, such as predispositions to CVD, cannot be influenced. However, early detection of emerging symptoms can improve patients' quality of life or even significantly prolong life. This is where early screening comes in, focusing on RF combinations and on providing adequate and affordable treatment to affected patients [2].

Concerning CVD prevalence, the WHO has committed itself to reducing premature mortality from non-communicable diseases (including CVD) by 25% by 2025 [5]. This commitment is accompanied by targets such as reducing the prevalence of hypertension by 25% between 2010 and 2025, or at least 50% availability of drug therapy and counseling to prevent heart attacks and strokes.

RF screening is guided by several strategies divided into detection mechanisms (laboratory markers, noninvasive imaging methods, physical examinations) and the detection of blood pressure, lipids, coronary artery calcium scores, or electrocardiography. In addition to the laboratory tests mentioned above, costly examinations are often introduced within CVD management. However, at least in our conditions, the availability of examinations is not sufficient, and waiting times for some examinations are measured in months (for example, the waiting list for selective coronarography). In contrast, the development of technologies in conjunction with artificial intelligence and machine learning methods provides various options. We focus on determining the increased risk of CVD in individual patients (classifying the patient into a class with a positive CVD diagnosis) or pointing out the potential risk of the monitored patient parameters (identification of critical parameters that can be influenced). With this approach, it is possible to reduce waiting times for individual examinations and to reduce unnecessary examinations, as the result of an individual examination can be predicted with a certain accuracy. It is also possible to mark the risk parameters of the monitored patient based on historical data. In addition, the connection of artificial intelligence (via machine learning methods) with a medical background brings a new aspect: personalized medicine (PM). PM moves the prevention of CVD towards better primary screening of many factors at one time, and it allows researching the power of less well-known RF as well as new potentially significant RF (for example, based on proteomics or mRNA).

In this article, we focus on assessing the significance of factors affecting the severity of CVD. Our goal is not only to classify patients with a focus on heart disease, but also to confirm or refute any chance of the disease. We use selected machine learning methods to pinpoint the most critical RF of CVD. The first step in the evaluation is to verify that the results of our analyses are in line with medical practice and guidelines. Subsequently, we try to point out the less considered RF and compare the analysis results with results previously achieved using a different approach. We expect that the results achieved can be helpful in the early diagnosis of CVD and preventive screening, which is in line with the WHO's objectives of reducing CVD prevalence.

Despite numerous diagnostics and treatment suggestions, cardiovascular diseases (CVDs) continue to raise questions about their diverse risk factors. To help physicians and researchers, the authors of the study [6] conducted experiments and proposed a novel ranking and attribute selection algorithm. The team of Wei-Yen Hsu worked with a dataset from Mackay Memorial Hospital in Taipei, Taiwan. Ultimately, from more than 70 attributes, only the 21 most important characteristic factors/attributes suitable for further analysis were chosen by senior doctors and screened for recommendation in related studies, including personal physical characteristics, personal vices, and laboratory blood tests. The dependent variable hs_crp (high-sensitivity C-reactive protein, which detects low-grade inflammation [7]) was also included. The authors also identified fibrinogen as a potential risk factor for vascular infarction.

Once these 21 factors were selected, seven classifiers (Bagging, NBTree, BayesNet, RBFNetwork, Kstar, Random Forest, and J48) were applied. The RBFNetwork model achieved the best accuracy with 51.79%, followed by BayesNet with 50.52% and NBTree with 50.33%. The authors also ranked the attributes with their version of the improved mRMR algorithm (a filter-based selection method that scores a feature subset with a measure independent of any specific classifier [8]). The most important and influential attributes were age, exercise, weight, height, waist circumference, and some laboratory blood test parameters (LDL, BMI, glucose). The ranking also included, among others, the attributes corresponding to gender, smoking habit, blood triglyceride levels, HDL, and creatinine. When the number of attributes was reduced using the mRMR algorithm, the accuracy of the resulting RBFNetwork model increased to 66.9% (again the best model among all algorithms used). Higher accuracy was thus reached with a smaller number of attributes. For example, the lowest number of chosen factors was four for Bagging (under the condition of the best accuracy), while the highest number of selected factors was 15 for RBFNetwork.

In [9], the authors reviewed other cardiovascular risk factors as well as morbidity and mortality from coronary heart disease (CHD) among the Turkish population. The focus was on lipids and lipoproteins, while other relevant risk factors were also discussed. The most influential factors for CVDs were analyzed separately for men and women. The data were obtained via questionnaire, physical examination of the cardiovascular system, and recording of a resting ECG. The authors performed several experiments, using, for example, multiple forward stepwise logistic regression analyses. This method worked with 13 risk parameters and revealed that systolic blood pressure, glucose, LDL, and HDL cholesterol are independent determinants of CHD. Total and LDL cholesterol, blood pressure, waist circumference, and body weight were deemed independent determinants in women. Conventional risk factors appear to operate at the same level as in Western countries; however, the role of LDL is presumably weighted towards the latter.

The importance of systolic blood pressure, smoking status, diabetes, and BMI is also supported by a comparison of the associations between CVD risk factors in Asia and Australasia. The authors of the article [10] studied outcomes such as CHD and ischemic and hemorrhagic stroke. The hazard ratios were estimated by Cox models from risk factors including systolic blood pressure, total cholesterol, triglycerides, BMI, diabetes, and smoking status, stratified by study and sex. All were adjusted for age, the other risk factors, and regression dilution. Triglycerides were more strongly associated with CHD in New Zealand and Australia, while systolic blood pressure showed a more robust relationship with hemorrhagic stroke in Asia. The research team concluded that CVD risk factors are very similar in Caucasian and Asian populations. As in the previous study [9], the authors deemed diabetes a very significant risk factor for coronary heart disease and death.

In contrast, Fuller et al. in [11] examined classic cardiovascular risk factors together with diabetes-specific factors. They worked with 4743 diabetic patients who were followed up for 12 years, and they assessed the incidence of fatal and nonfatal CVD outcomes. Fuller et al. conclude that variables such as glycemic control, proteinuria, and retinopathy must be included in assessing cardiovascular disease risk alongside the classic risk factors of blood pressure, smoking, and dyslipidemia. Given the factors mentioned in this study and in other articles, we ought to explore the importance of blood pressure, serum cholesterol, fibrinogen, and triglycerides, as the authors deem them important.

Many epidemiologic studies have proven a clear connection between specific risk markers and CVD. The authors of the study [12] divide them into two categories: risk factors (which have been proven to be causal) and risk markers (which show associations with CVD, but for which a cause-and-effect association is yet to be proven). The authors also note that these risk markers can be classified as predisposing, such as obesity, which could work through raising blood pressure, glucose, and lipids, or as direct, such as smoking. The causally linked risk factors that the authors included are tobacco consumption, elevated LDL, low HDL, high blood pressure, elevated glucose, physical inactivity, obesity, and diet. In other words, the authors provide almost the same set of risk factors as mentioned in previous studies.

A good example is obesity, which raises blood pressure, causes dyslipidemia, and increases blood glucose. Some of the predisposing risk factors are also likely to have direct effects. Another category is risk markers, which show associations only. They include low socioeconomic status, elevated prothrombotic factors (fibrinogen, PAI-1), markers of infection or inflammation (such as CRP (C-reactive protein) or hs-CRP), elevated homocysteine, elevated lipoprotein(a), and psychological factors such as depression, anger issues, stress, or acute life events. Fibrinogen appears once again, and the authors assigned it to the risk markers category, which means that its potential risk is yet to be proven.

As Sanchez-Pinto et al. mention in [13], the main goal of variable selection methods is to increase the ratio of replicable variables with an authentic relationship with the outcome (signal) to non-replicable variables with only an idiosyncratic relationship with the outcome (noise). When this ratio increases, it becomes possible to develop appropriately fitted models that give correct and highly accurate predictions when new, unseen data are supplied. This idea is also used in our research, since our goal is to identify new and significant risk factors for CVDs. The authors analyzed two datasets, the first being a multicenter observational cohort dataset with almost 270,000 patients and the second being a single-center observational cohort dataset with 6564 critically ill children. Both datasets included laboratory results, vital signs, and patient demographics. The smaller dataset also included the medications from the first 12 hours in the ICU and admission characteristics.

Sanchez-Pinto et al. analyzed the performance of eight different variable selection methods. Four are regression-based, namely backward stepwise selection using p-value, AIC, LASSO, and Elastic Net. The researchers also used four tree-based methods, Random Forest, Regularized Random Forest, Boruta, and Gradient Boosted Feature Selection. The important part is that in Random Forest, authors deemed the variables in the generated interpretation subset determined by the algorithm as the variables selected by random forest. The second subset generated by random forest includes highly correlated attributes.

The study used two evaluation criteria, parsimony and performance change. A model is considered parsimonious when it is sparse and has good prediction accuracy [13]. In the first dataset (adult clinical deterioration cohort), Boruta was the most accurate, and Gradient Boosted Feature Selection (GBFS) was the sparsest. In the second, pediatric dataset, the most accurate model was backward stepwise selection using p-value, and the sparsest were variable selection using Random Forest and GBFS. According to the authors, model sparsity had no apparent relationship with performance for either tree-based or regression-based methods [13]. That means methods that selected fewer variables still performed the same as or even better than others, as in the study of Wei-Yen Hsu [6]. Based on this conclusion, we decided not to measure parsimony in our research, but we investigated other statistics and variable importance (VI).

This study also assigned a category from one to four to the variables in all models ranked by importance, based on the quartile of importance. The authors chose this approach to deal with the different importance metrics that the methods use and to allow comparison between methods. We instead use model performance metrics and combine them across several different models to better measure VI.

A completely different approach to the same problem is presented in [14], using factor analysis (FA), which helped us track the relationships between observed variables. The most interesting model consisted of six factors describing essential characteristics of patients (Factor 1), renal parameters and fibrinogen (Factor 2), family predisposition to CVD (Factor 3), personal history of CVD (Factor 4), patients' lifestyle (Factor 5), and echo and ECG examination results (Factor 6). The approximated FA model is presented in Table 1.

Table 1

The six-factors model of Factor Analysis, [14]
Factors	SS loadings	Factor loadings
Factor 1	1.97	Height; Gender; Weight
Factor 2	1.50	Urea; FBG; Creat
Factor 3	1.77	F_CAD; F_MI
Factor 4	1.75	P_MI; P_CAD; P_Stroke
Factor 5	1.76	Smoking; S_Freq; S_Duration; Alcohol
Factor 6	1.23	ECG_QRS; Age; ECHO_EF; ECG_RBBB

(FBG – fibrinogen, Creat – creatinine, F_CAD – Family history of Coronary Artery Disease, F_MI – Family history of Myocardial Infarction, P_MI – Personal history of Myocardial Infarction, P_CAD – Personal history of Coronary Artery Disease, P_Stroke – Personal history of Stroke, S_Freq – count of smoked cigarettes per day, S_Duration – duration of smoking in years, ECG_QRS – QRS complex, ECHO_EF – ejection fraction, ECG_RBBB – Right Bundle Branch Block)

The results obtained are mainly in line with medical practice [15]-[23]. A new impetus for further research was the conclusion of Factor 2, which points to the parameter of fibrinogen, which in some publications is considered potentially crucial in the context of CVD. Until now, this importance has not been supported by any significant research.

Research into the significance of atherosclerosis risk factors is relatively widespread, and several authors have focused on this issue from different angles depending on the methods used. The conclusions of the individual studies overlap considerably and correspond to general medical recommendations. To our knowledge, however, none of the studies, apart from the ranking of the importance of attributes as risk factors for CVD, has shifted its research towards examining and classifying new risk factors. As our previous research has indicated new possibilities in this area, we have decided, in addition to the above-mentioned order of importance of attributes, to focus our work on examining the impact of non-traditional factors on CVD.

A. Dataset

The experiments were performed on a dataset of patients who were hospitalized for selective coronarography (SC) at the East Slovak Institute of Cardiovascular Diseases in Košice in the period from June 2017 to March 2018. During hospitalization, a complete personal and family history and essential physical characteristics were recorded for each patient, and laboratory tests, echocardiography, and an ECG examination were performed. Patients in our clinical trial were treated according to the European Society of Cardiology guidelines for coronary artery disease. All information about patients was recorded in electronic health records. Using a tool that we created and tuned with our colleagues [24][25], it was possible to create a structured form of the patients' data. An overview of the dataset's structure, statistics, and attribute values is given in our previous research [14]. That publication also provides more detailed information on patient selection, inclusion and exclusion criteria, and the dataset preparation method (completion of missing data, treatment of extreme values, and other necessary steps to obtain a well-prepared dataset).

Our dataset was prepared as a two-class dataset: the attribute Nalez (finding of SC) was aggregated from six classes to two classes, where class 0 corresponds to no findings and class 1 corresponds to any positive finding of SC. Unfortunately, the dataset is unbalanced (38.61% of class 0, 61.39% of class 1). However, it is necessary to realize that the class with a finding covers several degrees of severity (from a mild finding, with coronary narrowing from 10%, to a severe finding, with narrowing of the coronary vessels of up to 100%).

B. Selected machine learning methods

Machine learning methods provide a wide range of options. With different types of data, we focus on different methods. However, it is sometimes difficult to say in advance which method will be most suitable for a particular dataset. It is not possible to say that one of the machine learning algorithms is the best. The suitability of an algorithm depends on several factors, whether the size of the data set, the nature of the attributes, the range of attribute values, or the desired outcome and focus on a specific goal. For this reason, we decided to perform experiments using different methods and approaches.

We also chose methods based on the state of the art presented above. We focused on classification algorithms, as our goal was not to predict the final value of the coronarography finding but to determine the significance of the monitored attributes concerning their impact on the target attribute, based on the classification of patients according to the severity of the coronarography finding. Random Forest and Logistic Regression were a clear choice for us. As a third algorithm, we chose the CART decision tree to compare the performance of one decision tree with that of the several decision trees contained in Random Forest models.

All the following experiments and analyses took place in the RStudio environment of the R programming language. We divided the dataset into a train and a test set in an 80:20 ratio. We further applied 10-fold cross-validation on the train set for all methods in order to fine-tune their hyperparameters. Each model was evaluated by its accuracy on both the train and the test set.
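
The split described above can be sketched as follows. This is a minimal illustration in plain Python, not the R code used in the study: the function names are hypothetical, the class counts (312 and 496) are derived from the reported 38.61% / 61.39% distribution over 808 records, and the per-class stratification with ceiling rounding is an assumption chosen because it yields a test set whose size is consistent with Table 2.

```python
import math
import random


def stratified_split(labels, train_fraction=0.8, seed=1):
    """Split record indices into train/test parts, class by class."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(members)
        # Assumption: the train share of each class is rounded up.
        n_train = math.ceil(len(members) * train_fraction)
        train_idx.extend(members[:n_train])
        test_idx.extend(members[n_train:])
    return train_idx, test_idx


def k_folds(indices, k=10, seed=1):
    """Partition the train indices into k nearly equal validation folds."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


# Synthetic labels only: class sizes derived from the reported percentages.
labels = [0] * 312 + [1] * 496
train_idx, test_idx = stratified_split(labels)
folds = k_folds(train_idx, k=10)
print(len(train_idx), len(test_idx), [len(f) for f in folds])
```

Each fold serves once as the validation part while the remaining nine are used for fitting, so hyperparameters are tuned without touching the test set.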

The importance of the attributes was evaluated using the varImp function included in the caret package. For classification tasks with Random Forest, the prediction accuracy of each tree is recorded on the out-of-bag portion of the data, and the same is done after permuting each predictor variable. The difference between the two accuracies is then averaged over all trees and normalized by the standard deviation [26].
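
The permutation idea behind this importance measure can be illustrated with a simplified sketch. It is not the caret or randomForest implementation: it uses a single hypothetical classifier and a single evaluation set instead of per-tree out-of-bag data, and it omits the normalization step. All names and the toy data are assumptions made for illustration.

```python
import random


def accuracy(predict, rows, labels):
    """Share of rows for which the model predicts the true label."""
    hits = sum(1 for row, y in zip(rows, labels) if predict(row) == y)
    return hits / len(labels)


def permutation_importance(predict, rows, labels, column, n_repeats=20, seed=0):
    """Mean drop in accuracy after shuffling the values of one column."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    drops = []
    for _ in range(n_repeats):
        shuffled = [row[column] for row in rows]
        rng.shuffle(shuffled)
        permuted = [row[:column] + [value] + row[column + 1:]
                    for row, value in zip(rows, shuffled)]
        drops.append(baseline - accuracy(predict, permuted, labels))
    return sum(drops) / n_repeats


# Toy data: column 0 determines the label, column 1 is pure noise.
data_rng = random.Random(0)
rows = [[data_rng.random(), data_rng.random()] for _ in range(200)]
labels = [1 if row[0] > 0.5 else 0 for row in rows]


def predict(row):
    """Hypothetical fitted model that only looks at column 0."""
    return 1 if row[0] > 0.5 else 0


imp_signal = permutation_importance(predict, rows, labels, column=0)
imp_noise = permutation_importance(predict, rows, labels, column=1)
print(imp_signal, imp_noise)
```

A variable the model relies on loses accuracy when permuted, while a variable the model ignores shows no drop; on real out-of-bag data the difference can also be slightly negative by chance.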

As an example of a simple model with good performance, Figure 1 presents the resulting CART model (its accuracy is 68.92% on the train set and 68.94% on the test set).

Although only four attributes (ECG_QRS, ECG_STE, ESC, P_CAD) were used in the construction of the model, slightly more attributes are important overall. An overview of the significance of the attributes is given in Table 3.

Random Forest (RaF) is characterized by its ability to deal with high-dimensional data and unbalanced classes and by its robustness to outliers; it also provides information about the importance of particular variables. The number of trees in our model was set to 350, and we reached 71.57% accuracy on the train set and 71.42% accuracy on the test set. Figure 2 represents VI via Mean Decrease Accuracy (MDA, interpreted as the number or proportion of observations that are incorrectly classified when the feature is removed) and Mean Decrease Gini (MDG, which expresses how each variable contributes to the homogeneity of the nodes and leaves in the resulting RaF).
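
The node homogeneity behind MDG can be made concrete with a small sketch of Gini impurity and of the impurity decrease achieved by one split. This is an illustration with hypothetical function names and toy labels only; in a random forest, such decreases are accumulated over all splits that use a given variable and averaged over the trees.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a node: 1 minus the sum of squared class shares."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(parent, left, right):
    """Impurity of the parent minus the size-weighted impurity of the children."""
    n = len(parent)
    weighted = (len(left) / n) * gini_impurity(left) \
        + (len(right) / n) * gini_impurity(right)
    return gini_impurity(parent) - weighted


parent = [0, 0, 0, 1, 1, 1]
perfect = gini_decrease(parent, [0, 0, 0], [1, 1, 1])   # children are pure
useless = gini_decrease(parent, [0, 1], [0, 0, 1, 1])   # children stay mixed
print(perfect, useless)
```

A split that separates the classes completely removes all impurity of the parent node, whereas a split that leaves both children as mixed as the parent contributes nothing to the variable's MDG.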

In addition, the VI of the generated model can also be obtained via the caret package. The range of VI was wide, including positive, zero, and negative values. For the following research, we worked with all VI values of this model. We offer a comprehensive overview of the VI of the RaF model in Table 3, where we summarize, in addition to the value of significance itself, the order of individual attributes within individual models.

The last algorithm chosen for this part of the experiment is Logistic Regression (LR). The highest accuracy we achieved on the train set was 71.24%. An overview of the VI of the LR model is given in Table 3. This LR model achieved 68.94% accuracy at the test set.

C. Evaluating experiments

Before we state the significance of individual attributes within the model and some final evaluation of the relevance, we will focus on evaluating the "success" of the models as such.

The presented evaluation is based on the confusion matrices of the individual models applied to the test set. We focused on the overall assessment of the models in terms of a larger number of statistics. The values of the individual statistics were comparable among models, but we were able to identify subtle differences. An overview of the mentioned statistics is given in Table 2.

Table 2

Overview of the statistics of the built models
	CART	RaF	LR
Positive (class 0)	62
Negative (class 1)	99
Prevalence	38.51%
Predicted positive	30	48	62
Predicted negative	131	113	99
TP	21	32	37
FP	9	16	25
FN	41	30	25
TN	90	83	74
Accuracy	68.94%	71.42%	68.94%
Positive predictive value (PPV)	70%	66.67%	59.68%
Negative predictive value (NPV)	68.70%	73.45%	74.75%
False omission rate	31.30%	26.55%	25.25%
False discovery rate	30%	33.33%	40.32%
F1 score	0.4565	0.5818	0.5968
True positive rate (TPR), Sensitivity	33.87%	51.61%	59.68%
True negative rate (TNR), Specificity	90.91%	83.83%	74.75%
False positive rate (FPR)	9.09%	16.16%	25.25%
False negative rate (FNR)	66.13%	48.39%	40.32%
Positive likelihood ratio (LR+)	3.7258	3.1935	2.3632
Negative likelihood ratio (LR-)	0.7274	0.5771	0.5395
Informedness	0.2478	0.3545	0.3442
Markedness	0.3870	0.4012	0.3442
Matthews correlation coefficient (MCC)	0.3097	0.3771	0.3442
Diagnostic odds ratio	5.1220	5.5333	4.3808
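
The statistics in Table 2 follow from the four confusion-matrix counts of each model. The sketch below recomputes the main ones from the TP, FP, FN, and TN rows using the standard definitions; the function and dictionary names are hypothetical, and class 0 (no finding) is treated as the positive class, as in the table.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Standard binary classification statistics from confusion-matrix counts."""
    sens = tp / (tp + fn)          # true positive rate
    spec = tn / (tn + fp)          # true negative rate
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "sensitivity": sens,
        "specificity": spec,
        "ppv": ppv,
        "npv": npv,
        "f1": 2 * tp / (2 * tp + fp + fn),
        "lr_plus": sens / (1 - spec),
        "lr_minus": (1 - sens) / spec,
        "informedness": sens + spec - 1,
        "markedness": ppv + npv - 1,
        "mcc": (tp * tn - fp * fn)
        / math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
        "dor": (tp * tn) / (fp * fn),
    }


# (TP, FP, FN, TN) per model, copied from Table 2.
counts = {"CART": (21, 9, 41, 90), "RaF": (32, 16, 30, 83), "LR": (37, 25, 25, 74)}
metrics = {name: binary_metrics(*c) for name, c in counts.items()}
for name, m in metrics.items():
    print(name, {key: round(value, 4) for key, value in m.items()})
```

Note that MCC equals the geometric mean of Informedness and Markedness when both are positive, which is why the three values coincide for the LR model, whose PPV equals its sensitivity and whose NPV equals its specificity.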

It is clear from the above table that we cannot determine a model that is rated best in all aspects. Each of the created models has some specifics. The accuracy measure evaluates RaF (71.42%) as the best model, and the F1 score indicates that the best model is LR (0.5968). However, for unbalanced datasets (such as ours, where the prevalence of absence of narrowing of the coronary arteries is 38.51%), it is considered better to focus on the Matthews correlation coefficient (MCC), which takes into account the ratio between the positive and negative class in binary classification [27]. The MCC also confirms the predominance of the RaF model (0.3771).

The CART model is characterized by the power to correctly classify negative cases (TNR − 90.91%), which in the context of our data corresponds to the presence of narrowing on the coronary vessels. On the contrary, the LR model classifies the positive cases best (TPR − 59.68%; class 0 corresponding to the absence of narrowing on the coronary vessels). However, its strength is not as significant as the strength of the CART model for coronary narrowing prediction.

The quality of the models should also be assessed according to their awareness of positive and negative cases (Informedness) and the level of confidence of the model concerning the prediction of values (Markedness). These metrics place the RaF model above the other two (Informedness 0.3545, Markedness 0.4012). The diagnostic odds ratio (DOR) is above 1 for all models.

In terms of considering the rate of correct prediction of values, the CART model achieved a higher success rate for positive cases (PPV − 70%; absence of narrowing) and the LR model for negative cases (NPV − 74.75%; presence of narrowing).

The DOR value does not indicate the strength of the individual models but suggests that they can be considered functional. Therefore, from our point of view, it would be wrong to exclude the results of any of the models. Still, we think it appropriate to consider the "performance" of individual models when determining the order of significance of the monitored attributes. For this purpose, we have chosen sensitivity, specificity, and accuracy to determine the weight of the influence of individual factors. For each model separately, the weighted value is the product of the normalized VI of an attribute and the sensitivity, specificity, and accuracy of that model. Subsequently, the weighted VI values were summed for each attribute, thus obtaining the final order of importance of the monitored attributes. We call this calculation a new weighted agglomerative attribute importance metric. Its mathematical form is the following:

\(VI = \sum_{i \in \{CART,\, RaF,\, LR\}} norm\left(VI_{i}\right) \cdot Acc_{i} \cdot Sens_{i} \cdot Spec_{i},\) where \(norm(VI_{i})\) are the normalized values of VI of the final models (CART, RaF, LR), \(Acc_{i}\) is the accuracy, \(Sens_{i}\) the sensitivity, and \(Spec_{i}\) the specificity of the final model \(i\).
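
The formula above can be sketched in a few lines of code. The text does not state which normalization is used; the sketch assumes min-max scaling of each model's VI column to the interval [0, 1], which is an assumption on our side. The function names are hypothetical. The five attributes are a subset of Table 3 chosen so that it contains the minimum and maximum VI of every model, and the performance values are taken from Table 2.

```python
def min_max(values):
    """Scale a list of values linearly to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def weighted_importance(vi_by_model, perf_by_model):
    """Sum over models of norm(VI) * accuracy * sensitivity * specificity."""
    n_attributes = len(next(iter(vi_by_model.values())))
    overall = [0.0] * n_attributes
    for model, vi in vi_by_model.items():
        acc, sens, spec = perf_by_model[model]
        weight = acc * sens * spec
        for j, scaled in enumerate(min_max(vi)):
            overall[j] += scaled * weight
    return overall


attributes = ["ECG_STE", "Gender", "FBG", "F_HT", "F_Hyperch"]
vi = {  # VI values per model, copied from Table 3
    "CART": [29.2528, 38.9403, 0.0, 0.0, 0.0],
    "RaF": [10.6827, 5.4696, 2.0364, -1.3286, 0.0],
    "LR": [5.8683, 3.7910, 1.6987, 0.6542, 0.0],
}
perf = {  # (accuracy, sensitivity, specificity), copied from Table 2
    "CART": (0.6894, 0.3387, 0.9091),
    "RaF": (0.7142, 0.5161, 0.8383),
    "LR": (0.6894, 0.5968, 0.7475),
}
overall = dict(zip(attributes, weighted_importance(vi, perf)))
for name, value in overall.items():
    print(name, round(value, 4))
```

Under the min-max assumption and with the rounded inputs from Tables 2 and 3, the computed values agree with the Overall column of Table 3 to about three decimal places for these attributes.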

The following Table 3 provides an overview of the significance of individual attributes for particular models as well as the overall VI calculated by the formula above.

Table 3

Overall overview of the variable importance of the models
Attribute	CART		RaF		LR		Overall
Attribute	VI_CART	Seq._CART	VI_RaF	Seq._RaF	VI_LR	Seq._LR	VI	Seq.
ESC	27.4563	4	9.1835	2	2.179	9	0.5343	3
Age	9.2887	7	4.3432	5	3.8243	2	0.3970	5
Gender	38.9403	1	5.4696	3	3.7910	3	0.5858	2
F_CAD	0	-	0.2077	32	0.9193	32	0.0876	31
F_Stroke	0	-	0.4146	26	1.8694	12	0.1427	22
F_MI	0	-	1.9323	12	1.7917	14	0.1777	12
F_Hyperch	0	-	0	33	0	50	0.0342	47
F_HT	0	-	-1.3286	50	0.6542	44	0.0342	48
F_DM	0	-	-0.5568	41	0.0026	49	0.0199	50
F_AoS	0	-	0	33	0	50	0.0342	47
P_CAD	35.8718	2	4.335	6	1.6291	19	0.4266	4
P_Stroke	0	-	-0.1003	35	2.1731	10	0.1454	21
P_MI	23.1907	5	4.6572	4	0.9477	31	0.3300	6
P_Hyperch	0	-	0.4980	23	0.5375	46	0.0751	38
P_HT	0	-	-1.2422	49	0.7267	42	0.0402	46
P_DM	0	-	0.3978	27	2.2989	7	0.1648	16
P_AoS	0	-	1.272	17	1.0917	28	0.1240	24
Smoking	0	-	1.7714	13	1.8538	13	0.1768	14
S_Freq	0	-	-0.2764	37	1.4324	23	0.1020	29
S_Duration	0	-	0.4391	25	0.8313	34	0.0889	30
Alcohol	0	-	-0.9126	46	1.2303	26	0.0751	37
Weight	0	-	0.3448	29	1.2563	24	0.1088	25
Height	0	-	1.3322	15	1.7671	15	0.1610	18
BMI	0	-	2.0448	10	1.4493	22	0.1627	17
BP	6.1589	8	0.2921	30	1.4591	21	0.1516	20
Urea	0	-	-0.8061	43	0.9670	30	0.0640	41
Creat	0	-	0.2689	31	1.7142	16	0.1309	23
AST	0	-	-0.4358	39	0.7823	37	0.0639	42
Sodium	3.5982	11	1.3828	14	1.9578	11	0.1919	11
Potassium	0	-	-1.0540	47	0.6576	43	0.0414	45
Chol	0	-	0.3651	28	0.5565	45	0.0726	39
TG	3.4221	12	-1.1236	48	0.7419	41	0.0627	43
HDL	0	-	1.2960	16	1.6749	18	0.1552	19
LDL	0	-	-0.8406	44	0.1873	48	0.0222	49
CRP	0	-	0.8448	20	0.9739	29	0.1069	27
Chloride	3.8274	10	3.1369	7	0.8001	35	0.1776	13
FBG	0	-	2.0364	11	1.6987	17	0.1755	15
HIV	0	-	0	33	0	50	0.0342	47
HBs	0	-	0	33	0	50	0.0342	47
ECG_HR	0	-	-0.0269	34	0.7988	36	0.0752	36
ECG_Rhythm	0	-	2.0886	9	2.6793	4	0.2283	8
ECG_PQ	3.2619	13	0.4401	24	0.7503	39	0.1025	28
ECG_QRS	4.7327	9	0.7525	21	2.5476	6	0.2128	9
ECG_QT	0	-	-0.2473	36	1.5298	20	0.1079	26
ECG_LBBB	0	-	0	33	0	50	0.0342	47
ECG_RBBB	0	-	0.523	22	0.7434	40	0.0865	32
ECG_VES	0	-	-0.7926	42	1.2421	25	0.0788	34
ECG_SVES	0	-	-0.4026	38	1.1705	27	0.0851	33
ECG_STD	0	-	1.2300	18	0.2054	47	0.0765	35
ECG_STE	29.2528	3	10.6827	1	5.8683	1	0.7761	1
ECG_T	0	-	-0.4868	40	0.9166	33	0.0696	40
ECHO_EF	16.8490	6	2.1336	8	2.2466	8	0.2986	7
ECHO_PH	0	-	0.8875	19	2.6073	5	0.1936	10
Muscle_bridge	0	-	-0.8668	45	0.7540	38	0.0513	44

As the reader can see, the significance of the attributes initially decreases sharply, but then the differences between individual attributes become small. All well-known RF can be found among the most significant RF according to our combined ranking formula, which confirms the suitability of our calculations [15][18]. Moreover, although less prominently, several less known RF have come to the foreground as well. Examples are Sodium, Chloride, fibrinogen (FBG), and the length of the interval from the beginning of the P wave to the beginning of the ventricular complex in milliseconds (ECG_PQ).

D. Identifying potential new risk factors of CVD

Our goal was to calculate the importance of the individual attributes and possibly also identify new interesting relations. We focus on several RF for which there is only limited awareness of, or doubt about, their effect on CVD. Based on this and on the conclusions of our previous research [14], we formulated the following research questions:

1. Is there a link between a higher level of fibrinogen and a more severe coronary finding, or an increased cardiovascular risk?

2. Does fibrinogen have the potential to be an atherosclerosis risk factor compared to traditional RF?

To answer these questions, we trained new models that no longer considered all attributes. Instead, we built models on selected attributes only, in line with our second research target. We selected the method with regard to coverage of the negative class, which in our data corresponds to the presence of narrowing of the coronary vessels. This is captured mainly by specificity, for which the CART model achieved the highest score. To obtain comparable results, we retained the split into train and test sets as well as the assignment of records to folds for 10-fold cross-validation. Algorithm parameters remained unchanged.

The combinations of attributes for building the models were chosen to include known factors as well, so that each model contains both well-known and less-known RF. The specific attribute settings were:
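Keeping the fold assignment fixed across experiments can be sketched as follows. This is a minimal illustration only: the record count, the seed, and the helper names `make_folds` and `cv_splits` are assumptions and do not come from the study.

```python
import random


def make_folds(n_records, k=10, seed=42):
    """Assign record indices to k folds once, reproducibly (seed is hypothetical)."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    # Every k-th shuffled index goes to the same fold.
    return [indices[i::k] for i in range(k)]


def cv_splits(folds):
    """Yield (train_indices, test_indices) pairs, one per fold."""
    for i, test_idx in enumerate(folds):
        train_idx = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        yield train_idx, test_idx


# Hypothetical training-set size; the same folds would be reused for every
# attribute combination so that the resulting metrics stay comparable.
folds = make_folds(n_records=500)
splits = list(cv_splits(folds))
```

Because the folds are computed once and reused, differences between models trained on different attribute subsets cannot be caused by a different partitioning of the records.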

1. Coronary_findings ~ P_CAD + FBG

2. Coronary_findings ~ P_CAD + FBG + HDL

As can be seen, we also included attributes that are significant in coronary risk assessment or whose effect is not clear. For example, some sources point to a cardioprotective effect of HDL: the higher the HDL level, the lower the risk of CAD. The ESC parameter denotes the CVD risk classification.

The first combination did not yield a clear relationship between FBG levels and CF severity. The distribution of patients with positive and negative CF was highly fragmented (seven different intervals of FBG values with alternating target classes), without a clear determination of CF severity (the percentage share of CF ranged approximately between 40% and 70%).

The combination of the previous RF enriched by HDL (P_CAD, FBG and HDL) pointed out an interesting fact: patient classification based on P_CAD and on FBG values is comparable. Patients with confirmed CAD and patients not yet diagnosed with CAD but with a fibrinogen level of 3.875 or above are equally classified as having coronary narrowing. The remaining conclusions resulting from the described model are shown in FIGURE 4.

As is evident, even in patients without a previous CAD diagnosis, FBG levels may indicate whether the patient is at risk of coronary narrowing (predicted probability of coronary narrowing greater than 70%). In addition, within the same FBG range, patients with higher HDL levels appear less likely to have coronary narrowing (greater than 60% probability). With low HDL levels and lower FBG levels, there is also a higher probability of coronary narrowing (over 60%).
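How a CART-style tree arrives at a numeric cut-off such as the fibrinogen threshold can be illustrated with a single split chosen by Gini impurity. The sketch below is a simplification under stated assumptions: the data are synthetic and were picked so that the cut lands at the reported value, and `gini` and `best_threshold` are hypothetical helper names, not functions from the software used in the study.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_threshold(values, labels):
    """Return (cut, weighted_gini) for the best split 'value >= cut'."""
    pairs = sorted(zip(values, labels))
    best = (None, float("inf"))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no cut is possible between identical values
        cut = (pairs[i - 1][0] + pairs[i][0]) / 2.0
        left = [y for v, y in pairs if v < cut]
        right = [y for v, y in pairs if v >= cut]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (cut, score)
    return best


# Synthetic fibrinogen values and coronary narrowing labels (1 = narrowing).
fbg = [2.9, 3.1, 3.4, 3.6, 3.8, 3.95, 4.2, 4.6, 5.0]
narrowing = [0, 0, 0, 1, 0, 1, 1, 1, 1]
threshold, impurity = best_threshold(fbg, narrowing)
print(f"cut-off: {threshold:.3f}, weighted Gini: {impurity:.4f}")
```

A full CART implementation repeats this search over all attributes and recursively over the resulting subsets, which is how the nested FBG and HDL intervals arise.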

Regarding attribute importance within this model, it was surprising that the previous presence of CAD in the patient (VI of P_CAD = 13.99) was less significant than the attribute describing FBG (VI of FBG = 21.71) or HDL (VI of HDL = 14.20). It was also surprising that, despite the low number of attributes included, the model's accuracy on the test set increased slightly (to 69.57%), as did its sensitivity (to 46.77%). However, the specificity decreased (to 83.84%).

Based on the above analysis, we can say that there is a relationship between fibrinogen levels and cardiovascular risk. However, we cannot say with certainty whether the relationship is directly proportional, that is, whether high levels of FBG lead to more severe CF. For this reason, we performed the same experiments on the dataset with the original classification of the CF (classes 0–5 in order of increasing severity of the CF). The first combination of RF, containing only previously diagnosed CAD and FBG levels, was again quite unclear in its conclusions. Despite dividing FBG into five intervals, it was not demonstrably possible to determine whether a higher FBG level could be related to a more severe finding on the coronary vessels.

By applying the second combination of attributes (P_CAD + HDL + FBG), we obtained a relatively branched tree of fibrinogen and HDL intervals. If patients had not previously been diagnosed with CAD, the model classified almost half of the cases into the class without a CF. If CAD had been previously diagnosed, the result of the CF varied depending on the intervals of FBG and HDL levels. FIGURE 5 shows the percentage representation of each class for the different intervals of FBG and HDL levels.
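For reference, the three reported metrics follow directly from the confusion matrix counts (TP, FP, TN, FN, as defined in the abbreviations). The sketch below uses hypothetical counts for illustration only; they are not the counts behind the percentages reported above.

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity (TPR) and specificity (TNR) from confusion counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # share of positives correctly detected
        "specificity": tn / (tn + fp),  # share of negatives correctly detected
    }


# Hypothetical confusion matrix of a binary classifier on a test set.
metrics = binary_metrics(tp=30, fp=15, tn=85, fn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

The example shows how accuracy can stay moderate while sensitivity and specificity diverge, which is the pattern described for the reduced model.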

CART decision tree branch for P_CAD = 1 (presence of earlier diagnostics of CAD)

Before drawing conclusions from this figure, we must again note that claims about the relationship between HDL and cardiovascular risk are ambiguous. A higher HDL level is assumed to have a cardioprotective effect, so the cardiovascular risk level should be lower (in our case, we derive the cardiovascular risk level from the severity of the CF). Therefore, we focused on both HDL and FBG levels [28].

We start from the upper left corner of FIGURE 5 (lower HDL level, higher FBG level). The first cell indicates a higher prevalence of more severe stenosis at high FBG levels (above 4.475) and low HDL levels (below 1.135). When the FBG level decreases to the interval [3.425; 4.475), the overall percentage of more severe CF also decreases. A difference is also evident at the same FBG level but a different HDL level (cut-off value 1.098), with an increasing share of negative findings for coronary vascular stenosis. We see a similar phenomenon at FBG levels in the interval [3.2; 3.425), where the CF differs depending on the lower or higher HDL level (cut-off value 1.345). Based on the experiments performed, it might seem that even a high HDL level is not entirely beneficial. However, it may have some significance in reducing cardiovascular risk. A good example is the cells on the right side of the figure. With the same HDL level (above 1.632) and a different FBG level, there is a noticeable difference in the severity of CF. At a higher FBG value (above 3.9745), the overall rate of worse findings is higher than at a lower FBG value, where, on the contrary, there is a lower prevalence of adverse findings on coronary vessels. Thus, it can be inferred that both FBG levels and HDL levels can affect the severity of coronary vascular stenosis.
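The kind of per-cell class distribution read off the figure can be sketched as a cross-tabulation of CF classes over FBG and HDL intervals. This is a simplified illustration: the patients are synthetic, the helper names are hypothetical, and a fixed grid is assumed, whereas in the study the intervals come from the tree's own splits and differ between branches.

```python
from collections import Counter, defaultdict


def interval_label(value, edges):
    """Label the half-open interval that contains value, given sorted edges."""
    lower = None
    for edge in edges:
        if value < edge:
            return f"(-inf; {edge})" if lower is None else f"[{lower}; {edge})"
        lower = edge
    return f"[{lower}; +inf)"


def class_shares(patients, fbg_edges, hdl_edges):
    """Percentage of each CF class within every (FBG interval, HDL interval) cell."""
    cells = defaultdict(Counter)
    for fbg, hdl, cf_class in patients:
        key = (interval_label(fbg, fbg_edges), interval_label(hdl, hdl_edges))
        cells[key][cf_class] += 1
    shares = {}
    for key, counts in cells.items():
        total = sum(counts.values())
        shares[key] = {c: 100.0 * n / total for c, n in sorted(counts.items())}
    return shares


# Synthetic patients: (FBG level, HDL level, CF class 0-5).
patients = [
    (4.8, 1.0, 4), (4.6, 1.1, 5), (4.9, 0.9, 2),
    (3.6, 1.0, 1), (3.9, 1.2, 3),
    (3.0, 1.7, 0), (3.1, 1.8, 0), (3.3, 1.9, 1),
]
shares = class_shares(patients, fbg_edges=[3.425, 4.475], hdl_edges=[1.135])
for cell, distribution in sorted(shares.items()):
    print(cell, distribution)
```

Each printed cell corresponds to one combination of FBG and HDL intervals, with the percentage of patients in each CF class.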

Based on the results of the experiments, we can provide answers to our research questions:

1. Is there a link between a higher level of fibrinogen and a more severe coronary finding, or an increased cardiovascular risk?

Fibrinogen alone shows a direct dependence with increased cardiovascular risk and the severity of CF, but this dependence is not as significant as when it is potentiated by other parameters.

2. Does fibrinogen have the potential to be an atherosclerosis risk factor compared to traditional RF?

Fibrinogen has the potential to be considered an RF of interest for atherosclerosis, but further research on larger patient samples is needed to confirm its equivalence to traditional RF.

IV. Discussion

More than 4 million people die each year in Europe from cardiovascular diseases. Despite yearly progress in CVD treatment, the prevalence of coronary heart disease is not decreasing significantly. Significant problems remain in primary prevention and early detection of subclinical forms of atherosclerosis. Screening for coronary artery disease in preventive medicine could be better, as data from the EUROASPIRE I-V programs confirm [15]. Many risk factors predict coronary heart disease, with different sensitivity and specificity [29]. The Framingham risk score table and the European SCORE [30] are well known. Developing technologies, in conjunction with artificial intelligence and machine learning methods, provide various options for determining increased risk of CVD (primary prevention).

One of the aims of our study was to quantify the sensitivity and specificity of non-traditional risk factors of CVD and compare them with well-known risk factors. Are increased fibrinogen levels associated with increased cardiovascular risk? Several prospective epidemiological studies convincingly show elevated fibrinogen to be a significant, independent cardiovascular risk factor, but there are also many clinical trials that found no significant cardioprotective effect of low fibrinogen levels. Understanding of the mechanisms that might be involved in the atherothrombogenic action of fibrinogen is also fragmentary. Fibrinogen strongly affects blood coagulation, blood rheology, and platelet aggregation. In addition, it has direct effects on the vascular wall and is a primary acute-phase reactant. These phenomena might constitute pathophysiological mechanisms involved in the association between fibrinogen and cardiovascular events, but their relative importance is unclear at present [31][32][33]. Our study detected a positive correlation between fibrinogen levels and increased cardiovascular risk: patients with increased fibrinogen levels had more severe selective coronarography results. There are no significant evidence-based medicine results worldwide. Our study is also limited by the size of its patient population, but many small studies confirm that increased fibrinogen levels positively correlate with coronary artery risk [34]. Does fibrinogen have the potential to be a significant atherosclerosis risk factor? The results of our study suggest so, but our population is limited in size. Analysis of the respective studies suggests that fibrinogen is an essential and independent cardiovascular risk factor, clearly associated with conventional risk factors and genetic polymorphisms. Whether or not fibrinogen is causally involved in atherothrombogenesis remains to be determined. Fibrinogen has emerged as an important cardiovascular risk marker despite unsolved issues waiting for conclusive answers [32].

For over 50 years, an inverse association between HDL cholesterol concentrations and the risk of atherosclerotic cardiovascular disease has been observed in case-control and prospective cohort studies. Evidence from human genetics and randomized clinical trials over the last 13 years indicates that the concentration of HDL cholesterol does not appear to be a viable therapeutic target for the prevention of CVD [35]. Our results are consistent with these findings. We did not detect any correlation between HDL levels and coronary artery disease. Patients with elevated HDL levels showed no cardioprotective effect, and this group had the same frequency of severe selective coronarography results as patients with decreased HDL levels [36].

The next hypothesis was based on the relationship between fibrinogen and HDL levels and cardiovascular risk. The study population was divided into groups by the levels of fibrinogen and HDL. We found that the group with increased fibrinogen levels and increased HDL levels had increased cardiovascular risk. In the group with low fibrinogen levels and low HDL levels, there was no significant result. High fibrinogen levels and low HDL levels significantly correlate with increased cardiovascular risk; the prevalence of severe coronary artery disease in this group was significant. In groups with a high HDL level and a low fibrinogen level, patients without detected coronary artery disease were prevalent. These results show the possibility of cardiovascular protection by HDL and a positive correlation between low fibrinogen levels and typical coronarography results with a low prevalence of coronary artery disease. A study with the same design has not yet been published. One small clinical study examined the relationship between glycated hemoglobin (HbA1c), fibrinogen, and HDL cholesterol (HDL-c) in cardiovascular disease in type 2 diabetes. Its results showed that the relationships between HbA1c and fibrinogen, between HbA1c and HDL-c, and between HDL-c and fibrinogen were significant only in CVD-positive patients [37].

Conclusion

The presented work describes the design of our methodology for determining the significance of attributes on the experimental group of patients. It uses the results of three machine learning algorithms: CART decision tree, Random Forest, and Logistic regression. We focus on determining the significance of attributes with an appropriate "point" rating. Subsequently, we weighted the acquired significance using the specificity, sensitivity, and accuracy of the individual models. Based on the sum of the weighted significance of the attributes, we compiled the ranking of the monitored patient characteristics. We call this the weighted agglomerative attribute importance metric.

By comparing our results with known conclusions about the significance of the observed characteristics in the diagnosis of cardiovascular diseases, we confirmed that our calculations determine the order of significance of attributes in the monitored group of patients correctly.

Next, we formulated two research questions that focus on identifying potentially new risk factors that may help determine the severity of cardiovascular risk. The questions also included determining the significance of new risk factors compared to traditional ones. Our research focused on the attribute describing fibrinogen levels in the patient's blood. We have shown a positive correlation between fibrinogen levels and increased cardiovascular risk. Patients with severe coronary artery disease have higher fibrinogen levels and, conversely, low fibrinogen levels were associated with decreased cardiovascular risk. This relationship could be explained by the inflammatory theory of atherosclerosis (fibrinogen is an inflammatory moderator). The next part of our hypothesis was based on the relationship between fibrinogen and HDL levels and cardiovascular risk. The study population was divided into groups by the levels of fibrinogen and HDL. We found that the group with increased fibrinogen levels and increased HDL levels had increased cardiovascular risk (severe stenosis of coronary arteries detected by selective coronarography). There is room to discuss the cardioprotective effect of HDL and the impact of fibrinogen as a cardiovascular risk factor. Another group had low fibrinogen levels and low HDL levels. High fibrinogen levels and low HDL levels significantly correlate with increased cardiovascular risk; the prevalence of severe coronary artery disease in this group was significant. In groups with a high HDL level and a low fibrinogen level, patients without detected coronary artery disease were prevalent. These results show the possibility of cardiovascular protection by HDL and a positive correlation between low fibrinogen levels and typical coronarography results with a low prevalence of coronary artery disease.

Finally, we must add that, given the above, fibrinogen can be considered a risk factor of interest for atherosclerosis, but further research on larger patient samples is needed to confirm its equivalence to traditional risk factors.
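The idea of a performance-weighted sum of attribute importances can be sketched as below. This is not the paper's exact formula. The sketch makes several assumptions: min-max scaling per model, a weight equal to the normalized mean of accuracy, sensitivity and specificity, entirely hypothetical model metrics, and a reading of the three importance columns of the table as CART, Random Forest and Logistic regression in that order. Only the importance values are taken from the table.

```python
def min_max(values):
    """Scale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def combined_importance(importances, model_metrics):
    """Rank attributes by the weighted sum of per-model scaled importances."""
    # Assumed weighting: mean of (accuracy, sensitivity, specificity),
    # normalized so that the model weights sum to 1.
    raw = {m: sum(model_metrics[m]) / len(model_metrics[m]) for m in importances}
    total = sum(raw.values())
    weights = {m: w / total for m, w in raw.items()}

    attributes = list(next(iter(importances.values())))
    scores = {a: 0.0 for a in attributes}
    for model, per_attribute in importances.items():
        scaled = min_max([per_attribute[a] for a in attributes])
        for attribute, value in zip(attributes, scaled):
            scores[attribute] += weights[model] * value
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Importance values for four attributes, copied from the table above.
importances = {
    "CART": {"ECG_STE": 29.2528, "FBG": 0.0, "HDL": 0.0, "LDL": 0.0},
    "RaF": {"ECG_STE": 10.6827, "FBG": 2.0364, "HDL": 1.2960, "LDL": -0.8406},
    "LR": {"ECG_STE": 5.8683, "FBG": 1.6987, "HDL": 1.6749, "LDL": 0.1873},
}
# Hypothetical (accuracy, sensitivity, specificity) per model.
model_metrics = {
    "CART": (0.68, 0.45, 0.86),
    "RaF": (0.70, 0.50, 0.82),
    "LR": (0.69, 0.48, 0.80),
}
ranking = combined_importance(importances, model_metrics)
for attribute, score in ranking:
    print(f"{attribute}: {score:.4f}")
```

Scaling each model's importances before weighting keeps one model's larger numeric range from dominating the sum; whether the original metric scales in the same way is not stated in this section.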

Abbreviations

CVD – cardiovascular diseases; RF – risk factors; CART – classification and regression tree; PM – personalized medicine; CHD – coronary heart disease; AIC – Akaike information criterion; LASSO – least absolute shrinkage and selection operator; GBFS – gradient boosted feature selection; VI – variable importance; FA – factor analysis; SC – selective coronarography; FBG – fibrinogen; Creat – creatinine; F_CAD – family history of coronary artery disease; F_MI – family history of myocardial infarction; P_MI – personal history of myocardial infarction; P_CAD – personal history of coronary artery disease; P_Stroke – personal history of stroke; S_count – count of smoked cigarettes per day; S_duration – duration of smoking in years; ECG_QRS – QRS complex; ECHO_EF – ejection fraction; ECG_RBBB – right bundle branch block; ECG_STE – presence of elevations in ST segment; RaF – random forest; MDA – mean decrease accuracy; MDG – mean decrease Gini; LR – logistic regression; TP – true positive; FP – false positive; TN – true negative; FN – false negative; PPV – positive predictive value; NPV – negative predictive value; TPR – true positive rate; TNR – true negative rate; FPR – false positive rate; FNR – false negative rate; LR+ – positive likelihood ratio; LR- – negative likelihood ratio; MCC – Matthews correlation coefficient; DOR – diagnostic odds ratio; Acc – accuracy; Sens – sensitivity; Spec – specificity; ECG_PQ – the interval length from the beginning of the P wave to the beginning of the ventricular complex in milliseconds; CF – coronary findings

Institutional Review Board Statement

This investigator-initiated trial was approved by the Ethical Committee of the Faculty of Medicine, Pavol Jozef Safarik University in Kosice, and the Ethical Committee of the East Slovak Institute for Cardiovascular Diseases in Kosice.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The datasets generated and/or analysed during the current study are not publicly available due to patient confidentiality but are available from the corresponding author on reasonable request.

Conflicts of Interest

The authors declare no conflict of interest.

Author Contributions

Conceptualization, Z.P., O.L., and J.P.; methodology, J.P.; software, Z.P., and O.L.; validation, Z.P., D.P., and J.P.; formal analysis, Z.P., and O.L.; investigation, Z.P., O.L., and D.P.; resources, D.P.; data curation, Z.P.; writing - original draft preparation, Z.P., and O.L.; writing - review and editing, Z.P., O.L., D.P., and J.P.; visualization, Z.P.; supervision, J.P.; project administration, J.P.; funding acquisition, J.P. All authors have read and agreed to the published version of the manuscript.

Funding

This work was partially supported by the Slovak Grant Agency of the Ministry of Education and Academy of Science of the Slovak Republic under grant no. 1/0685/21 and The Slovak Research and Development Agency under grant no. APVV-17-0550.

References

Centers for Disease Control and Prevention. Heart Disease Facts | cdc.gov. [online] Available at: <https://www.cdc.gov/heartdisease/facts.htm> [Accessed 31 January 2022].
Euro.who.int. Data and statistics. [online] Available at: <https://www.euro.who.int/en/health-topics/noncommunicable-diseases/cardiovascular-diseases/data-and-statistics.> [Accessed 31 January 2022].
Zhao, D., 2021. Epidemiological Features of Cardiovascular Disease in Asia. JACC: Asia, 1(1), pp.1–13; doi: 10.1016/j.jacasi.2021.04.007.
Institute for Health Metrics and Evaluation. GBD Results Tool. [online] Available at: <http://ghdx.healthdata.org/gbd-results-tool> [Accessed 31 January 2022].
Who.int. Cardiovascular diseases (CVDs). [online] Available at: <https://www.who.int/news-room/fact-sheets/detail/cardiovascular-diseases-(cvds)> [Accessed 31 January 2022].
Hsu, W., 2018. A decision-making mechanism for assessing risk factor significance in cardiovascular diseases. Decision Support Systems, 115, pp.64–77; doi: 10.1016/j.dss.2018.09.004.
Zhou, Y., Han, W., Gong, D., Man, C. and Fan, Y., 2016. Hs-CRP in stroke: A meta-analysis. Clinica Chimica Acta, 453, pp.21–27; doi: 10.1016/j.cca.2015.11.027.
Radovic, M., Ghalwash, M., Filipovic, N. and Obradovic, Z., 2017. Minimum redundancy maximum relevance feature selection approach for temporal gene expression data. BMC Bioinformatics, 18(1); doi: 10.1186/s12859-016-1423-9.
Onat, A., 2001. Risk factors and cardiovascular disease in Turkey. Atherosclerosis, 156(1), pp.1–10; doi: 10.1016/s0021-9150(01)00500-7.
Asia Pacific Cohort Studies Collaboration, 2005. A comparison of the associations between risk factors and cardiovascular disease in Asia and Australasia. European Journal of Cardiovascular Prevention & Rehabilitation, 12(5), pp.484–491; doi: 10.1097/01.hjr.0000170264.84820.8e.
Fuller, J., Stevens, L. and Wang, S., 2001. Risk factors for cardiovascular mortality and morbidity: The WHO multinational study of vascular disease in diabetes. Diabetologia, 44(S2), pp.S54-S64; doi: 10.1007/pl00002940.
Yusuf, S., Reddy, S., Ôunpuu, S. and Anand, S., 2001. Global Burden of Cardiovascular Diseases. Circulation, 104(22), pp.2746–2753; doi: 10.1161/hc4601.099487.
Sanchez-Pinto, L., Venable, L., Fahrenbach, J. and Churpek, M., 2018. Comparison of variable selection methods for clinical predictive modeling. International Journal of Medical Informatics, 116, pp.10–17; doi: 10.1016/j.ijmedinf.2018.05.006.
Pella, Z., Pella, D., Paralič, J., Vanko, J. and Fedačko, J., 2021. Analysis of Risk Factors in Patients with Subclinical Atherosclerosis and Increased Cardiovascular Risk Using Factor Analysis. Diagnostics, 11(7), p.1284; doi: 10.3390/diagnostics11071284.
De Backer, G., Jankowski, P., Kotseva, K., 2019. Management of dyslipidaemia in patients with coronary heart disease: Results from the ESC-EORP EUROASPIRE V survey in 27 countries. Atherosclerosis, 285, pp.135–146; doi: 10.1016/j.atherosclerosis.2019.03.014.
Zhou, B., Lu, Y., Hajifathalian, K., 2016. Worldwide trends in diabetes since 1980: a pooled analysis of 751 population-based studies with 4·4 million participants. The Lancet, 387(10027), pp.1513–1530; doi: 10.1016/s0140-6736(16)00618-8.
Tsai, T., Hsu, P., Lin, C., Wang, Y., Ding, Y., Liou, T., Wang, Y., Huang, S., Chan, W., Lin, S., Chen, J. and Leu, H., 2020. Factor analysis for the clustering of cardiometabolic risk factors and sedentary behavior, a cross-sectional study. PLOS ONE, 15(11), p.e0242365; doi: 10.1371/journal.pone.0242365.
D’Agostino, R., Vasan, R., Pencina, M., Wolf, P., Cobain, M., Massaro, J. and Kannel, W., 2008. General Cardiovascular Risk Profile for Use in Primary Care. Circulation, 117(6), pp.743–753; doi: 10.1161/circulationaha.107.699579.
Hobkirk, J., King, R., Gately, P., Pemberton, P., Smith, A., Barth, J. and Carroll, S., 2012. Longitudinal Factor Analysis Reveals a Distinct Clustering of Cardiometabolic Improvements During Intensive, Short-Term Dietary and Exercise Intervention in Obese Children and Adolescents. Metabolic Syndrome and Related Disorders, 10(1), pp.20–25; doi: 10.1089/met.2011.0050.
Gupta, J., Mitra, N., Kanetsky, P., Devaney, J., Wing, M., Reilly, M., Shah, V., Balakrishnan, V., Guzman, N., Girndt, M., Periera, B., Feldman, H., Kusek, J., Joffe, M. and Raj, D., 2012. Association between Albuminuria, Kidney Function, and Inflammatory Biomarker Profile in CKD in CRIC. Clinical Journal of the American Society of Nephrology, 7(12), pp.1938–1946; doi: 10.2215/CJN.03500412.
Marušič, A., 2000. Factor analysis of risk for coronary heart disease: an independent replication. International Journal of Cardiology, 75(2–3), pp.233–238; doi: 10.1016/s0167-5273(00)00337-5.
Mayer-Davis, E., Ma, B., Lawson, A., D'Agostino, R., Liese, A., Bell, R., Dabelea, D., Dolan, L., Pettitt, D., Rodriguez, B. and Williams, D., 2009. Cardiovascular Disease Risk Factors in Youth With Type 1 and Type 2 Diabetes: Implications of a Factor Analysis of Clustering. Metabolic Syndrome and Related Disorders, 7(2), pp.89–95; doi: 10.1089/met.2008.0046.
Pedrosa, R., Rodrigues, R., Padilha, K., Gallani, M. and Alexandre, N., 2016. Análise de fatores do instrumento de medida do impacto da doença no cotidiano. Revista Brasileira de Enfermagem, 69(4), pp.697–704; doi: 10.1590/0034-7167.2016690412i.
Pella, Z., Milkovic, P. and Paralic, J., 2018. Application for Text Processing of Cardiology Medical Records. 2018 World Symposium on Digital Intelligence for Systems and Machines (DISA), pp.169–174; doi: 10.1109/DISA.2018.8490631.
Kolárik, M., Paralič, J., Pella, Z., and Szalonová, L., 2022, Web based application for processing cardiology medical records, IEEE 20th Jubilee World Symposium on Applied Machine Intelligence and Informatics (SAMI 2022), (to appear).
Kuhn, M., 2022. 15 Variable Importance | The caret Package. [online] Topepo.github.io. Available at: <https://topepo.github.io/caret/variable-importance.html> [Accessed 9 March 2022].
Chicco, D. and Jurman, G., 2020. The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC Genomics, 21(1); doi: 10.1186/s12864-019-6413-7.
Rader, D. and Hovingh, G., 2014. HDL and cardiovascular disease. The Lancet, 384(9943), pp.618–625; doi: 10.1016/s0140-6736(14)61217-4.
Mach, F., Baigent, C., Catapano, A., et al., 2019. 2019 ESC/EAS Guidelines for the management of dyslipidaemias: lipid modification to reduce cardiovascular risk. European Heart Journal, 41(1), pp.111–188; doi: 10.1093/eurheartj/ehz455.
Conroy, R., 2003. Estimation of ten-year risk of fatal cardiovascular disease in Europe: the SCORE project. European Heart Journal, 24(11), pp.987–1003; doi: 10.1016/s0195-668x(03)00114-3.
Ernst, E. and Koenig, W., 1997. Fibrinogen and Cardiovascular Risk. Vascular Medicine, 2(2), pp.115–125; doi: 10.1177/1358863x9700200207.
Canseco-Avila, L., Jerjes-Sánchez, C., Ortiz-López, R., Rojas-Martínez, A. and Guzmán-Ramírez, D., 2006. Fibrinógeno. Factor o indicador de riesgo cardiovascular? [Fibrinogen. Cardiovascular risk factor or marker?]. Archivos de Cardiología de México, 76(Suppl 4), pp.S158–S172; PMID: 17469344.
Tousoulis, D., Papageorgiou, N., Androulakis, E., Briasoulis, A., Antoniades, C. and Stefanadis, C., 2011. Fibrinogen and cardiovascular disease: Genetics and biomarkers. Blood Reviews, 25(6), pp.239–245; doi: 10.1016/j.blre.2011.05.001.
Pieters, M., Ferreira, M., de Maat, M. and Ricci, C., 2021. Biomarker association with cardiovascular disease and mortality – The role of fibrinogen. A report from the NHANES study. Thrombosis Research, 198, pp.182–189; doi: 10.1016/j.thromres.2020.12.009.
Kjeldsen, E., Thomassen, J. and Frikke-Schmidt, R., 2022. HDL cholesterol concentrations and risk of atherosclerotic cardiovascular disease – Insights from randomized clinical trials and human genetics. Biochimica et Biophysica Acta (BBA) - Molecular and Cell Biology of Lipids, 1867(1), p.159063; doi: 10.1016/j.bbalip.2021.159063.
Kontush, A., 2020. HDL and Reverse Remnant-Cholesterol Transport (RRT): Relevance to Cardiovascular Disease. Trends in Molecular Medicine, 26(12), pp.1086–1100; doi: 10.1016/j.molmed.2020.07.005.
Pacilli, A., De Cosmo, S., Trischitta, V. and Bacci, S., 2013. Role of relationship between HbA1c, fibrinogen and HDL-cholesterol on cardiovascular disease in patients with type 2 diabetes mellitus. Atherosclerosis, 228(1), pp.247–248; doi: 10.1016/j.atherosclerosis.2013.02.010.

No competing interests reported.
