Introduction
According to the Global Burden of Disease study (2017), disability-adjusted life years (DALYs) due to NCD increased from 1.2 to 1.6 billion between 1990 and 2017. NCD thus caused more than 60% of DALYs worldwide [4]. NCD not only cause individual suffering but also burden society as a whole through massive monetary and non-monetary costs [4,5]. Relying on interventions after individuals are already ill, no matter how effective those interventions are, is therefore a pivotal fallacy. Instead, current developments require simple and inexpensive ways to identify high-risk individuals who can be targeted with both preventive and interventive approaches. Furthermore, it is becoming increasingly clear that many well-established risk factors (such as a Body Mass Index (BMI) outside the normal range [6] or genetic risk factors [7,8]) that are supposed to help identify individuals at high risk for certain diseases are not independent of the individual environment and do not behave the same way across individuals, which highlights the importance of personalized, tailored approaches in preventive medicine. A single risk factor may have little predictive value for negative outcomes unless it is considered systemically, that is, in the context of other physiological, environmental, psychological, and biochemical parameters and processes [e.g., 6–8]. Despite these intricacies, disease-predictive measures should also be cost-efficient so that they can be implemented in the health care system.
One concept that has become well established in the literature is allostatic load (the cumulative burden of chronic stress and adverse life events), together with its suggested allostatic load index (ALI) [9]. ALI is a cumulative multi-system risk score based on physiological and biochemical measures [10]. For each system, a risk index is calculated as the proportion of biomarkers for which an individual falls into predefined high-risk quartiles.
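The scoring rule described above can be sketched in Python as follows. This is a minimal illustration rather than the published ALI procedure: the function names, the grouping of biomarkers into systems, the `low_risk_markers` argument, and the use of sample quartiles as the "predefined" cutoffs are all assumptions made for the example.

```python
from statistics import quantiles


def high_risk_flags(values, low_is_risky=False):
    """Flag individuals whose value lies in the high-risk quartile.

    Assumption: cutoffs are the sample quartiles; published ALI variants
    may rely on other predefined cutoffs.
    """
    q1, _, q3 = quantiles(values, n=4)
    if low_is_risky:
        return [v <= q1 for v in values]
    return [v >= q3 for v in values]


def allostatic_load_index(systems, low_risk_markers=frozenset()):
    """Sum, over systems, of the proportion of flagged biomarkers.

    systems: {system name: {biomarker name: list of values per individual}}
    low_risk_markers: hypothetical set of biomarkers for which low values
    are the risky direction.
    """
    first_system = next(iter(systems.values()))
    n = len(next(iter(first_system.values())))
    scores = [0.0] * n
    for markers in systems.values():
        flags = [
            high_risk_flags(values, low_is_risky=name in low_risk_markers)
            for name, values in markers.items()
        ]
        for i in range(n):
            # Proportion of this system's biomarkers in the risky quartile.
            scores[i] += sum(f[i] for f in flags) / len(flags)
    return scores
```

Because the system scores are simply added, an individual with two flagged biomarkers in one system and an individual with one flagged biomarker in each of two systems can receive the same total, regardless of how those biomarkers interact.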
As a systemic risk score, ALI is predictive of various outcomes, including all-cause mortality [11,12], but its conceptualization has some critical limitations. First, calculating a risk score as the sum of different system risk scores does not account for intersystemic interactions or for the possible predictive effect of these interactions. This gap is unfortunate because ALI includes parameters that are not independent of each other, such as BMI and blood pressure [13]. A second concern is the practicability of ALI and its implementation in the health care system. Although ALI relies on parameters that can be assessed relatively simply, for most individuals these parameters are likely to be only partially available, which limits its predictive power. Taken together, ALI is a profound concept, but it artificially splits physiological processes that are woven into a holistic allostatic reaction, as acknowledged by its developers [14]. Furthermore, ALI lacks practicability, as underlined by the fact that it has not been implemented in routine diagnostics to date.
Given the rising number of NCD, there is an urgent need for an approach that is practicable, cost-efficient, and, ideally, based on biomarkers that are assessed in clinical routine, so that high-risk individuals can be identified and targeted with specific preventive steps. The current study aimed to develop and validate an easily accessible measure that can realistically be implemented in routine diagnostics. Toward this aim and building on ALI, five biomarkers were chosen because they cover broad physiological functionality: CRP, fibrinogen, and IL-6 are pro-inflammatory markers (i.e., positively associated with inflammation); cortisol, the end product of the hypothalamus-pituitary-adrenal axis, is an immune-modulatory mediator that plays a crucial role in the stress response; and creatinine is important for cellular energy metabolism [15–19]. In contrast to ALI, a clustering approach based on these biomarkers makes it possible to account for linear and non-linear interactions among them and to link the resulting clusters to depression and a range of somatic diseases. To examine the association between biochemical clusters and diseases, we focused on depression, heart disease, hypertension, stroke, PUD, and cancer, as these show the highest global prevalence, the fastest increase in numbers, and the most comorbidities [4]. We first clustered biochemical markers and related the clusters to odds ratios (ORs) for diseases in a U.S. population sample and then repeated this process in a Japanese cohort. To ensure representativeness, both samples were recruited via random-digit dialing, which qualifies them for studies whose results generalize to the population. To ensure that the selected biomarkers and their clustering demonstrate robust applicability across different cultures and ethnicities [20], we chose one U.S. American and one Japanese sample to generate and validate the biochemical clusters.
Methods
Collection of Biosamples and the Assessment of Biochemical Markers
MIDUS. Blood samples were collected after overnight fasting for the assessment of CRP, IL-6, and fibrinogen, according to the manufacturers' guidelines (Dade Behring Inc., Deerfield, IL for CRP and fibrinogen; R&D Systems, Minneapolis, MN for IL-6) [20]. Plasma levels of CRP and fibrinogen were assayed using an immunonephelometric assay; IL-6 was quantified using an enzyme-linked immunosorbent assay (ELISA). The laboratory inter-assay coefficient of variation was 5.7% for CRP, 13% for IL-6, and 2.6% for fibrinogen, all below the 20% acceptable range [21].
To obtain a cumulative measure of cortisol and creatinine, 12-hour overnight urine samples were also collected between 7 PM and 7 AM. Enzymatic colorimetric assays and liquid chromatography-tandem mass spectrometry were performed at the Mayo Medical Laboratory in Rochester, Minnesota. Data were excluded if participants had renal failure or severe renal decline according to the glomerular filtration rate [21].
MIDJA. CRP, IL-6, and fibrinogen were assessed analogously to MIDUS, whereas cortisol was assessed in saliva (three times a day on three consecutive days) and creatinine was assessed in blood. The nine saliva measurements were averaged and used as a representative marker of cortisol concentrations [22].
Diseases
Depression, heart disease, hypertension, stroke/transient ischemic attack (TIA), PUD, and cancer were assessed via self-report. Participants were asked whether they had been diagnosed with any of these diseases at the time of study participation.
Statistical Analyses
First, the potential collinearity of the biomarker levels was assessed by calculating Pearson correlations among CRP, fibrinogen, IL-6, creatinine, and cortisol. After randomizing the order of participants [23], we performed a k-means cluster analysis with these markers in the MIDUS sample using IBM SPSS Statistics 27. To ensure the stability of the clusters, we repeated the clustering process in subsamples [23]. Specifically, we conducted a median split based on age and performed the clustering for each group separately to assess whether the clusters are age-dependent. For the same purpose, we repeated the clustering procedure after excluding participants with a BMI outside the healthy range (below 18 or above 35). Next, the biochemical clustering performed for the whole MIDUS sample was repeated in the MIDJA cohort. Finally, z-tests were used to compare ORs for diseases among clusters.
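The exact form of the z-test is not specified here. One common way to compare two ORs is on the log scale, using the standard error of the log OR from each 2x2 table. The sketch below illustrates that variant under the assumption that the two OR estimates are independent, which may not hold if they share a comparison group; the function and argument names are hypothetical.

```python
from math import erf, log, sqrt


def odds_ratio(cases_in, noncases_in, cases_out, noncases_out):
    """Return (OR, SE of log OR) for one cluster versus a comparison group."""
    or_value = (cases_in * noncases_out) / (noncases_in * cases_out)
    se_log_or = sqrt(
        1 / cases_in + 1 / noncases_in + 1 / cases_out + 1 / noncases_out
    )
    return or_value, se_log_or


def compare_odds_ratios(or_a, se_a, or_b, se_b):
    """Two-sided z-test for the difference between two log ORs.

    Assumption: the two estimates are independent.
    """
    z = (log(or_a) - log(or_b)) / sqrt(se_a ** 2 + se_b ** 2)
    # Standard normal CDF expressed through the error function.
    p = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p
```

The sketch requires all four cell counts to be non-zero; very small clusters would need a continuity correction or an exact method instead.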
Results
Preliminary Analyses
In both MIDUS and MIDJA samples, biomarkers were positively correlated (see SI Tables 4 and 5).
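As an illustration of this collinearity check, a pairwise Pearson correlation matrix can be computed as in the following sketch. The function names and the dictionary layout (one list of values per biomarker) are assumptions for the example and do not reproduce the SPSS procedure used in the study.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Pairwise correlations for {biomarker name: list of values}."""
    names = list(columns)
    return {
        a: {b: pearson_r(columns[a], columns[b]) for b in names}
        for a in names
    }
```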
In MIDUS, 24.1% of the participants (currently or previously) had depression, 11.5% heart disease, 37.1% hypertension, 4.3% stroke/TIA, 5.3% PUD, and 13.6% cancer. In MIDJA, 4.5% of the participants had depression, 5.6% heart disease, 19.3% hypertension, 1.1% stroke/TIA, 8.3% PUD, and 5.1% cancer.
K-Means Clustering
We used z-standardized biomarkers for k-means clustering and evaluated the clustering results for k = 2 to 6 clusters in MIDUS. When k = 2, the cluster patterns were not distinct enough; when k = 4 or above, some clusters were very small (i.e., smallest cluster portion: 8%). Based on the principle of parsimony and the aim of ensuring meaningful differences among clusters, k = 3 was selected for the subsequent analyses. Figure 2 illustrates the distributions of the three identified clusters with respect to the biochemical markers. We replicated all three clusters in the younger MIDUS cohort as well as clusters 1 and 2 in the older MIDUS cohort (SI Figures 7 and 8). We further replicated all three clusters in the BMI-restricted MIDUS cohort (SI Figure 9).
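The selection step can be illustrated with a small, self-contained sketch that z-standardizes the biomarkers, runs k-means for k = 2 to 6, and reports the share of the smallest cluster for each k. It uses a plain Lloyd-style k-means with random initial centers, which is an assumption and not necessarily the algorithm implemented in SPSS; all function names and the `seed` parameter are hypothetical.

```python
import random
from statistics import mean, stdev


def zscore_columns(rows):
    """z-standardize each biomarker column (using the sample SD)."""
    columns = list(zip(*rows))
    mus = [mean(c) for c in columns]
    sds = [stdev(c) for c in columns]
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, seed=0, max_iter=100):
    """Plain Lloyd-style k-means; returns one cluster label per row."""
    rng = random.Random(seed)
    centers = [list(r) for r in rng.sample(rows, k)]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: squared_distance(row, centers[j]))
            for row in rows
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == j]
            if members:  # keep the old center if a cluster runs empty
                centers[j] = [mean(col) for col in zip(*members)]
    return labels


def smallest_cluster_share(labels, k):
    """Fraction of participants in the smallest of the k clusters."""
    return min(labels.count(j) for j in range(k)) / len(labels)


def evaluate_k_range(rows, ks=range(2, 7), seed=0):
    """Smallest-cluster share for each candidate k on z-scored data."""
    z_rows = zscore_columns(rows)
    return {k: smallest_cluster_share(kmeans(z_rows, k, seed), k) for k in ks}
```

Because k-means depends on its starting centers, a real analysis would typically rerun each k with several seeds or participant orderings before comparing solutions.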
Then, the 3-cluster solution from MIDUS was validated in the MIDJA sample; the results are shown in Figure 3.
As depicted in Figures 2 and 3, cluster 1 is characterized by average levels in all biochemical measures. Cluster 2 is characterized by high and above-average levels of CRP, IL-6, and fibrinogen. Cluster 3 is characterized by high and above-average levels of cortisol and creatinine but average levels of CRP, fibrinogen, and IL-6.
Associations Between Biochemical Clusters and Disease States
MIDUS. Cluster 2 had the highest ORs for all considered diseases compared with clusters 1 and 3 (Figure 4, SI 10).
MIDJA. Cluster 3 had the highest ORs for heart disease, hypertension, and PUD, cluster 2 had the highest ORs for stroke and cancer, and cluster 1 had the highest ORs for depression (Figure 5).
To compare this cluster-based approach to a well-established clinical biomarker that is associated with a broad range of NCD, the number of diagnoses among individuals in cluster 2 was compared to the number of diagnoses among individuals with CRP concentrations above the clinical cutoff (> 3 mg/L) [24]. Disease burden was higher in cluster 2, with a mean of 1.6 diagnoses (SD = 1.16; 0.9 diagnoses for individuals not assigned to cluster 2), than among individuals above the CRP cutoff, with a mean of 1.2 diagnoses (SD = 1.07; 0.9 diagnoses for individuals below the cutoff).
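This descriptive comparison amounts to summarizing the number of diagnoses under two alternative groupings of the same participants. A minimal sketch follows, assuming hypothetical record keys (`n_diagnoses`, `cluster`, `crp_mg_l`) and function names; it is not the authors' analysis code.

```python
from statistics import mean, stdev


def burden_by_group(participants, in_group):
    """Mean and SD of diagnosis counts inside a group, and mean outside it.

    participants: list of dicts with the hypothetical keys
    "n_diagnoses", "cluster", and "crp_mg_l".
    in_group: function mapping one participant to True or False.
    """
    inside = [p["n_diagnoses"] for p in participants if in_group(p)]
    outside = [p["n_diagnoses"] for p in participants if not in_group(p)]
    return {
        "mean_in": mean(inside),
        "sd_in": stdev(inside),
        "mean_out": mean(outside),
    }


def in_cluster_2(p):
    return p["cluster"] == 2


def above_crp_cutoff(p, cutoff_mg_l=3.0):
    # Clinical CRP cutoff of 3 mg/L as stated in the text.
    return p["crp_mg_l"] > cutoff_mg_l
```

Calling `burden_by_group(participants, in_cluster_2)` and `burden_by_group(participants, above_crp_cutoff)` on the same list yields the two sets of means that are contrasted in the text.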
Discussion
The findings reveal three distinct and interculturally stable biochemical clusters observable in the general population. Cluster 1 is characterized by average levels of all biomarkers, cluster 2 by high inflammation-related mediators coupled with low cortisol and creatinine, and cluster 3 by high levels of cortisol and creatinine. The stability of the clusters is supported by their replication in the MIDJA sample as well as in the BMI-restricted, the younger (below the age median), and the older (above the age median) MIDUS cohorts. In the older MIDUS cohort, however, only clusters 1 and 2 were replicated; cluster 3 was not. One explanation could be that, due to an age-related increase in systemic inflammation [25], older individuals were not assigned to cluster 3, which is characterized by low inflammation.
Relating clusters to diseases, cluster 2 showed the highest ORs for depression, heart disease, hypertension, stroke, and cancer in MIDUS (Figure 4). These findings are in line with previous evidence suggesting that CRP, IL-6, and fibrinogen are associated with depression [26,27], coronary heart disease [28–31], blood pressure [32], stroke [33–35], and cancer [36,37]. In contrast to these previous studies, however, the clustering approach used here made it possible to account for well-known collinearities between biomarkers and thus promotes a more holistic perspective. Specifically, the findings build on previous studies suggesting a link between inflammation and diseases [25] by demonstrating that it might not be one specific biomarker but a specific biochemical pattern (i.e., high CRP, IL-6, and fibrinogen coupled with low cortisol and creatinine) that is associated with diseases. This idea is supported by the observation that individuals in cluster 2 descriptively show a higher disease burden than individuals above the clinically well-established CRP cutoff.
Interestingly, we found no differences in the ORs for PUD between clusters despite the role of inflammation in its pathology [38]. Future research may aim to further examine the role of inflammatory signaling in the pathology of PUD.
While the cluster with high levels of CRP, IL-6, and fibrinogen can be considered a high-risk cluster, cluster 3, with high levels of cortisol and creatinine but low inflammation, may be considered a protective cluster in MIDUS. ORs for most diseases were lower in cluster 3 than in the high-risk cluster and also lower than in cluster 1, which has average levels of all biomarkers. For cancer, this difference was significant, potentially suggesting a protective character of this cluster. This would contrast with studies suggesting a link between hypercortisolism and disease outcomes [39,40]. However, the combination of low inflammation with high cortisol and creatinine, as in cluster 3, might indicate the integrity of the glucocorticoid negative feedback system, which protects from negative health outcomes [41]. Longitudinal studies may examine the consequences of this specific biochemical pattern. Toward this aim, we will examine MIDUS follow-up data (10 years after biomarker assessments) with respect to mortality outcomes.
In MIDJA, cluster 2 seems to be a high-risk cluster only for stroke and cancer, whereas for the other diseases considered, cluster 1 or cluster 3 shows the highest burden. One aspect to consider here is that the MIDJA sample (N=378) and especially cluster 2 (N=30) were very small. It is, therefore, possible that the present findings lack reliability. However, different biochemical patterns may be associated with different outcomes in the Japanese compared with the U.S. American population because moderating mechanisms such as BMI, nutrition, and medication differ between populations [41]. This idea is supported by the finding that, although approximately 8% of participants were assigned to cluster 2 in both MIDUS and MIDJA, the disease burden in MIDJA was much lower than in MIDUS. This highlights the importance of the individual aspects of disease susceptibility mentioned above and the role of interactions among cultural, lifestyle, and biochemical factors: while assignment of a U.S. American individual to cluster 2 might be associated with a high disease burden, this might not be the case for a Japanese individual with a similar biochemical profile. Future studies should examine the biochemical clusters identified here in other cultural contexts to promote a better understanding of their associative and predictive character in multiple populations. From a preventive perspective, this may also help to further refine targeted prevention, that is, to better understand which biochemical profile is associated with which disease susceptibility under which conditions.
Limitations. Our work has several strengths, such as the validation of the clusters in an independent Japanese sample and the representative character of the cohorts. Yet, the findings face limitations. First, the present study is cross-sectional and does not allow causal inferences. Second, the MIDJA sample was relatively small; it is, therefore, possible that the ORs lack reliability. Third, methodological inconsistencies between the cohorts (urine cortisol and creatinine levels in MIDUS; average saliva levels of cortisol and blood levels of creatinine in MIDJA) may have affected the clustering process. Fourth, diseases were assessed via self-report, which bears the risk of reporting bias.
Conclusion. While the interactions among biomarkers make it challenging to distinguish their outcomes, the design of the current study helps to gain a better understanding of the biochemical patterns that are present in the general population and of how these patterns contribute to different physiological states on a systemic scale. We identified and replicated three distinct biochemical signatures in two mid-life populations, including one cluster with collinearly occurring elevated levels of CRP, fibrinogen, and IL-6 as well as low levels of cortisol and creatinine that showed the highest prevalence of stroke and cancer.
Future longitudinal studies should test the predictive character of the clusters found in this study because, if the clusters are indeed predictive in terms of risk evaluation, they would represent a valuable clinical tool for both the diagnosis and the prevention of diseases. Specifically, if high-risk individuals can be identified by the clustering approach presented here, these individuals could be provided with personalized treatment options, including psychotherapy (e.g., in cases where CM is prevalent), anti-inflammatory drugs, or treatment supplements (e.g., nutrition and exercise plans).