In this study, we conducted an untargeted metabolomics analysis of plasma to identify both disease-specific and shared risk factors across 14 chronic conditions in an EstBB subcohort of 991 individuals. We demonstrated the value of well-phenotyped population-based biobank data for identifying predictive metabolic markers. Notably, we observed predominantly disease-specific signals rather than widespread commonality among multiple chronic diseases. Nevertheless, among the signals common to the studied chronic diseases, the risk factors that gout shared with T2D, AFF, and lipidemias stood out prominently, suggesting potential metabolic interactions between these conditions. The number of shared predictors decreased when we adjusted for prevalent comorbidities, which implies that comorbidities may contribute to the shared incident risk signature observed across chronic conditions. Additionally, we showed that a high proportion of the identified predictors had previously reported associations with gut microbial composition.
The highest number of incident metabolic risk associations was identified for gout, potentially due to its high comorbidity rate and its role as a risk factor for other conditions. Gout is an arthritic condition in which hyperuricemia leads to urate deposits in the tissues. Previous studies on gout have shown associations with metabolic syndrome and chronic kidney disease[33, 34]. We are not aware of any untargeted metabolomics studies investigating risk factors of gout. Nevertheless, previous research on prevalent gout has established connections with altered amino acid levels and with perturbations in purine, glycerophospholipid, sphingolipid, and carbohydrate metabolism[35]. These findings were partially replicated in our study. For instance, a substantial number of N-acetylated amino acids (n = 15) were unequivocally associated with an increased risk of gout. These protein degradation products have been linked to various incident chronic diseases[12]. Notably, among these amino acids, N-acetylalanine showed the strongest association with incident gout and has previously been associated with an elevated risk of renal disease, heart failure, and mortality, as well as reduced glomerular performance[12, 36]. The direct effect of N-acetylalanine on renal disease, heart failure, and mortality has been shown to be fully and partially mitigated by creatinine and uric acid, respectively[12]. Moreover, within incident gout cases, N-acetylated amino acids were highly correlated with levels of creatinine, a well-known marker of kidney function. Among carbohydrate predictors, Pietzner et al. demonstrated a similar mitigation effect by creatinine and uric acid for N-acetylneuraminate, N-acetylglucosamine/N-acetylgalactosamine, arabitol/xylitol, and erythronate[12], all of which were uniquely associated with an increased risk of gout in our study. This suggests a link between loss of kidney function and the development of gout.
While our analysis accounted for the prevalence of the 13 other conditions for each incident condition, further adjustment for factors such as markers of renal function (e.g., eGFR) might be necessary for a more detailed understanding of chronic disease risk in a multimorbidity setting.
Within the studied conditions, we observed a limited (8%) concurrence of metabolite incident risk factors. In contrast, Pietzner et al. reported a 65.5% overlap for metabolite predictors among 27 noncommunicable diseases, including ten cancer types, using data sourced from hospitalization and cancer registries[12]. This represents a crucial distinction from our study, as we not only obtained data from the aforementioned registries but also integrated EHR data from primary care and other relevant registries. For example, MacRae et al. suggested using EHRs from various registries for the classification of clinical data, as they reported a higher age of onset of multimorbidity within the same patient cohort when relying on information derived from hospitalizations than when relying on data obtained from primary care sources[37]. This suggests that relying solely on hospitalization data might result in inaccurate estimation of comorbidities, likely influencing findings on disease risk and the reported interconnectivity among chronic diseases.
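As a minimal sketch of how a concurrence figure of this kind can be computed, the Python snippet below counts the fraction of distinct predictors that are associated with two or more conditions. This definition of overlap is an assumption made for illustration and may differ from the one used in this study or by Pietzner et al.; the function name `shared_fraction` and the metabolite labels are hypothetical, and the input is toy data rather than study results.

```python
from collections import Counter


def shared_fraction(predictors_by_condition):
    """Fraction of distinct predictors associated with two or more conditions.

    `predictors_by_condition` maps a condition name to an iterable of
    metabolite identifiers that were significant for that condition.
    """
    counts = Counter(
        metabolite
        for metabolites in predictors_by_condition.values()
        for metabolite in set(metabolites)  # count each metabolite once per condition
    )
    if not counts:
        return 0.0
    shared = sum(1 for n_conditions in counts.values() if n_conditions >= 2)
    return shared / len(counts)


# Hypothetical toy input (assumption): not results from the study.
toy_predictors = {
    "gout": {"metabolite_a", "metabolite_b", "metabolite_c"},
    "T2D": {"metabolite_c", "metabolite_d"},
    "AFF": {"metabolite_e"},
}

# Only metabolite_c appears for more than one condition: 1 of 5 distinct predictors.
print(shared_fraction(toy_predictors))
```

Under this definition the result depends strongly on how many conditions are studied and on how cases are ascertained, which is one reason overlap figures from different data sources are hard to compare directly.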
We noticed that, for most of the studied conditions, the prevalence of comorbidities was substantially higher in incident cases than in controls. In response, we extended the analysis by including baseline comorbidity information as additional covariates. This aligns with a recent study emphasizing the need to distinguish disease-specific changes from confounding by pre- and comorbidities[9]. More specifically, Fromentin et al. employed a design that incorporated not only healthy and clinically ill individuals but also subjects with dysmetabolic morbidities, enabling the comparison of metabolic signatures across various disease states and clinical stages. Similarly, our approach aimed to disentangle condition-specific effects from the multimorbidity signal. Adjustment for comorbidities reduced the number of both disease-specific and shared predictors, with the most pronounced impact observed in the conditions that initially exhibited the highest number of associations, namely gout, lipidemias, and T2D. Therefore, associations initially thought to be common might be attributable to shared comorbid conditions rather than being independent associations across various diagnoses. For future studies, a more thorough consideration of distinct comorbidity profiles could enhance the detection of risk factors[38]. We also propose that registry-based electronic health record (EHR) data could be used to select subjects at specific stages of disease progression, each with their respective comorbidity profiles, together with appropriately matched controls.
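The way a shared comorbidity can create an apparent association may be illustrated with a small worked example. The sketch below does not reproduce the covariate-adjusted models used in this study; it swaps in a simpler, classical technique, the Mantel-Haenszel stratified odds ratio, and applies it to hypothetical counts chosen so that the metabolite has no effect within either comorbidity stratum. All numbers and function names are assumptions for illustration only.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio of a 2x2 table.

    a, b: cases and non-cases with a high metabolite level
    c, d: cases and non-cases with a low metabolite level
    """
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio over a list of (a, b, c, d) tables."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical counts (assumption): one 2x2 table per comorbidity stratum.
strata = [
    (40, 40, 10, 10),   # comorbidity present: high metabolite is common, risk is high
    (5, 45, 20, 180),   # comorbidity absent: high metabolite is rare, risk is low
]

# Crude table: collapse the strata by summing each cell across them.
crude = odds_ratio(*(sum(cell) for cell in zip(*strata)))
adjusted = mantel_haenszel_or(strata)

print(round(crude, 2))   # 3.35: apparent association when the comorbidity is ignored
print(adjusted)          # 1.0: no association once the comorbidity is accounted for
```

In this toy setting the crude association arises only because the comorbidity raises both the metabolite level and the disease risk, which is the pattern that adjustment for baseline comorbidities is intended to remove.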
We also demonstrated predominantly disease-specific associations among metabolites linked to the microbiome in the external studies we reviewed. For example, indolepropionate and 3-phenylpropionate were exclusively associated with reduced risk of lipidemias and AFF, respectively. Both metabolites have been associated with reduced chronic disease risk[12]. In the same study by Pietzner et al., the effects of these metabolites were not significantly mediated by any of the available routine clinical parameters, including renal markers. Dekkers et al. showed that several Eubacteriales species and, more specifically, Faecalibacterium prausnitzii are positively associated with the aforementioned metabolites[25]. In addition, we observed multiple associations between microbially produced indole-derived metabolites and reduced risk of chronic diseases. In contrast, previous studies have linked indoxyl sulfate with the progression of chronic kidney disease and cardiovascular disease[39]. Therefore, it could be hypothesized that in individuals with normal renal function, maintaining optimal levels of uremic toxins could protect against cardiovascular issues.
Previous research indicates that unidentified compounds contribute significantly to the variability in gut microbial profiles among individuals. Importantly, in our study as well, a noteworthy proportion of these metabolites were associated with an increased or decreased risk of chronic diseases. Our results may therefore warrant reevaluation once these unknown metabolites have been identified and their roles elucidated. It is also important to consider that our assessment of microbiome-metabolite associations was based on previous research that analyzed the explained variance of blood metabolites. However, the proportion of variance explained is much higher for the fecal metabolome than for the blood metabolome[40]. Furthermore, a recent study reported markedly superior accuracy in predicting the levels of fecal metabolites from taxonomic profiles compared to predicting blood metabolites[41]. Consequently, low explained variance for a specific blood metabolite should not be interpreted as an absence of a connection to the gut microbiota. The same study also emphasized that, compared to blood metabolites, fecal metabolites show more robust associations with prevalent cardiometabolic diseases, suggesting that fecal levels could serve as a better proxy for identifying risk factors for chronic diseases modulated by the gut microbiota. Despite this, our results indicate that plasma levels of microbiome-derived metabolites can serve as a proxy for incident chronic disease.
Our study differs from prior research on the simultaneous investigation of metabolite risk factors for incident disease in several respects. First, the use of extensive registry data distinguishes our approach from studies that depend on self-reported or single-registry data; it enhances the robustness of our findings by providing a less biased and more objective, standardized diagnosis status for multiple chronic conditions. Second, the inclusion of a wide range of frequently occurring chronic disorders, from cardiac and metabolic conditions to mood disorders, contributes to a comprehensive exploration of risk factors, and we identified significant associations for all investigated conditions. Third, alongside the conventional analysis, we adjusted risk predictions for comorbidities. This results in a more nuanced evaluation of disease-specific risk factors, as well as those shared among diseases. Notably, the potential confounding effects arising from comorbidities might not have been comprehensively addressed in previous studies[12]. Last, we demonstrate a high number of microbially associated metabolites among the significant incident disease predictors. This was achieved by integrating recently published data on the explained variance of metabolite levels attributed to the gut microbiome. Through this approach, we were able to link previously established microbiome-metabolite associations to a large proportion of the significant predictors.
This investigation has certain limitations that warrant consideration. Crucially, the statistical power of our study is constrained by the small sample sizes of the condition-specific incident case groups. Moreover, the absence of a validation cohort could affect the generalizability of our findings. However, replicating this specific selection of diverse conditions would require extensive collaboration with partner institutions, which may be limited by the wide scope of this study. In addition, we opted not to employ any additional inclusion or exclusion criteria specific to any particular disease. While a single ICD-10-based definition of chronic conditions might impose limitations, it does provide a standardized and simple approach that facilitates assessment and replication for a broad selection of diseases. Finally, we did not account for the confounding effects of treatment or medication intake.