Early diagnosis of degenerative diseases can improve treatment outcomes and reduce healthcare costs. However, finding reliable biomarkers is difficult due to several factors, such as low statistical power, selection bias, and biological variability1. Studies seeking to identify disease biomarkers often use cross-sectional designs, comparing samples from patients and healthy controls at a single time point2. This approach has several limitations. First, selection bias, arising from group differences in age, lifestyle, medication, or comorbidities, may lead to findings that fail to validate. Second, studies utilizing post-diagnosis samples often introduce biases due to treatment effects, lifestyle changes, or psychological stress, which complicate both the definition of the case group and the construction of suitable control groups. Third, biomarkers identified in symptomatic patients might reflect non-specific inflammation rather than distinct disease pathology, raising the question of whether control groups should include individuals with other conditions. Finally, day-to-day variability, influenced by factors such as diet, lifestyle, and circadian rhythm, contributes to inter-individual variance, further impeding biomarker discovery.
To overcome these challenges, we introduce longitudinal and pre-diagnostic samples from the Danish Blood Donor Study (DBDS), focusing on participants later diagnosed with osteoporosis3. The DBDS encompasses over 2.7 million plasma samples from more than 165,000 healthy donors included since 2010, and has previously been used for longitudinal studies4. By linking the DBDS data with national health registries, we identified participants who were diagnosed with osteoporosis after their first donation. We then selected their pre-diagnostic samples, as well as samples from matched controls who did not receive an osteoporosis diagnosis within one year of the last donation. We matched cases and controls based on sex, age, sampling site, analytical batch, and plasma sample storage time, excluding participants with certain disease and medication histories before becoming donors. For each participant, we used up to six samples spaced 8 to 14 months apart (Fig. 1a). The latest sample was collected within one year before diagnosis for the cases, and within the same period for the controls. The demographic characteristics of the participants are summarized in Table S1.
In total, we included 78 participants (39 cases and 39 controls) with longitudinal data and 120 participants (60 cases and 60 controls) with single-sample data. Approximately 80% of the cases continued to donate blood after diagnosis, and around 96% of these were diagnosed without a fracture event. We performed untargeted metabolomics on the plasma samples using liquid chromatography-mass spectrometry (LC-MS) in both positive and negative ionization modes. This approach allowed us to capture the signatures of thousands of metabolites without prior knowledge of their identity or function, enabling a comprehensive and unbiased exploration of the metabolome.
The LC-MS metabolomics analysis yielded metabolomic profiles of 3221 and 1429 features in positive and negative ionization mode, respectively, after data cleaning and normalization (Fig. S1-S2). Principal component analysis (PCA) of the metabolomics data revealed no strong patterns of osteoporosis in the first two principal components or the following four components (Fig. 1b and Fig. S3). This was expected, as previous findings indicate that osteoporosis is not a major source of variance in the metabolome5, 6. Additionally, we found that within-participant variance was on the same scale as between-participant variance in the first two principal components, impeding identification of between-participant differences (Fig. 1c). Notably, later components explained the between-participant variance in the negative ionization data (Fig. S4).
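The comparison of within- and between-participant variance can be illustrated with a simple sum-of-squares decomposition of the scores on one principal component. The following is a minimal sketch and not the analysis code used in this study; the function name, the toy scores, and the participant labels are hypothetical.

```python
from statistics import mean


def sum_of_squares_components(scores, participants):
    """Split the total sum of squares of one component's scores into a
    between-participant part and a within-participant part."""
    groups = {}
    for score, pid in zip(scores, participants):
        groups.setdefault(pid, []).append(score)
    grand_mean = mean(scores)
    # Between: spread of the participant means around the grand mean.
    between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    # Within: spread of each sample around its own participant mean.
    within = sum((s - mean(g)) ** 2 for g in groups.values() for s in g)
    return between, within


# Hypothetical toy scores: two participants with two samples each.
scores = [1.0, 3.0, 2.0, 4.0]
participants = ["A", "A", "B", "B"]
between_ss, within_ss = sum_of_squares_components(scores, participants)
```

When the within-participant part is as large as the between-participant part, as in this toy example, a single sample is a poor representative of its donor on that component.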
We then built predictive models based on two different strategies: a single-sample model and a longitudinal model. The single-sample model used the most recent sample from each participant (n = 198 participants) and compared the metabolite levels between cases and controls. The longitudinal model used all the available samples from each participant (n = 78 participants, 420 samples) and compared the metabolite changes over time (Fig. S5). The metabolite changes were represented as feature slopes for each individual rather than the relative feature abundance, thus normalizing inter-individual effects on abundance.
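The slope representation can be sketched as an ordinary least-squares fit of each feature against sampling time, computed separately for each participant. This is a minimal illustration under assumptions: the record layout, the function names, the time unit (months), and the toy values are hypothetical, and the study's actual slope estimation may differ.

```python
from statistics import mean


def feature_slope(times, values):
    """Ordinary least-squares slope of one feature against sampling time."""
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two paired observations")
    t_bar, v_bar = mean(times), mean(values)
    sxx = sum((t - t_bar) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("sampling times must not all be identical")
    sxy = sum((t - t_bar) * (v - v_bar) for t, v in zip(times, values))
    return sxy / sxx


def participant_slopes(samples):
    """Return {participant: {feature: slope}} from (participant, time, features) records."""
    by_participant = {}
    for pid, time, features in samples:
        by_participant.setdefault(pid, []).append((time, features))
    slopes = {}
    for pid, records in by_participant.items():
        records.sort(key=lambda record: record[0])  # order samples by time
        times = [t for t, _ in records]
        slopes[pid] = {
            name: feature_slope(times, [f[name] for _, f in records])
            for name in records[0][1]
        }
    return slopes


# Hypothetical toy data: time in months, one feature "m1".
samples = [
    ("A", 0, {"m1": 10.0}), ("A", 12, {"m1": 11.0}), ("A", 24, {"m1": 12.0}),
    ("B", 0, {"m1": 110.0}), ("B", 12, {"m1": 109.0}), ("B", 24, {"m1": 108.0}),
]
slopes = participant_slopes(samples)
```

In the toy data, participants A and B differ roughly tenfold in baseline abundance, yet their slopes differ only in sign. This shows how the slope removes each individual's baseline level from the comparison.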
We compared elastic net regression with random forest models to evaluate the effects of non-linearity and interactions. The elastic net model performed the best in all settings, but its performance in the single-sample setting was comparable to chance (Fig. 2a, Fig. S6). This indicated no detectable metabolite differences between cases and controls at the latest time point. However, the longitudinal elastic net model yielded significant AUCs of 0.75 and 0.68 in negative and positive ionization mode, respectively (Fig. 2b). This suggests that the longitudinal setup identifies metabolites that change over time and captures the dynamic signal of osteoporosis progression (Fig. 2c-d).
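For readers less familiar with the metric, the AUC can be read as the probability that a randomly chosen case receives a higher model score than a randomly chosen control, with 0.5 corresponding to chance. The sketch below computes it by direct pair counting; the function name, labels, and scores are hypothetical and are not results from this study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of case/control pairs in which the case
    scores higher than the control (ties count as half a pair)."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical predicted probabilities: 1 = case, 0 = control.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
```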
The longitudinal elastic net model assigned non-zero coefficients to 52 features in positive ionization mode and 42 features in negative ionization mode (Extended Data 1). We identified and annotated 24 of the selected features across both datasets (Fig. 2c-d and Extended Data 1). Among these, several compounds, including amino acids (derivatives), bisphenol A, cortisol, and hippuric acid, have previously been associated with bone metabolism and osteoporosis7–12.
For hippuric acid, we observed a time-dependent increase in the healthy controls and a decrease in the cases. This agrees with the literature, which states that hippuric acid increases with age but remains low for individuals with degenerative diseases or frailty11. In addition, hippuric acid has been shown to inhibit osteoclast differentiation, resulting in decreased bone resorption and higher bone mineral density (BMD)12. Bisphenol A binds to the estrogen receptor and is an exogenous driver of osteoporosis. In the longitudinal setting, the contribution of bisphenol A might be explained by its deposition in adipose tissue8, 9, 13. Hence, deposited exogenous compounds might serve as biomarkers if they prove robust to day-to-day variance. Finally, amino acid metabolites have been associated with osteoporosis in other studies, in agreement with the amino acid derivatives we identified6, 7, 14.
These findings suggest that the longitudinal model uses biologically relevant features for predicting osteoporosis. Some of the metabolites remained unknown or unconfirmed, and further efforts should be made to elucidate their identity and function.
Comprising 78 participants with longitudinal samples and several thousand untargeted molecules, our pilot study faces challenges due to its sample size. Although the large number of molecules increases the risk of overfitting, we found that the method validates across independent batches in hold-one-batch-out cross-validation, with an AUC of 0.68 in the negative ionization dataset (Fig. S7). We attribute the small performance loss to the reduced training data when more data is held out. The validation indicates that the method is stable and suggests that more data might improve performance. However, we cannot conclude whether the predictors and models will pass biological validation on new cohorts.
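The hold-one-batch-out scheme can be sketched as follows: each analytical batch is held out once as the test set, and the model is trained on all remaining batches. This is a minimal illustration of the splitting logic only; the function name and batch labels are hypothetical, and model fitting is omitted.

```python
def leave_one_batch_out(batches):
    """Yield (held_out_batch, train_indices, test_indices), holding out
    each analytical batch exactly once."""
    for held_out in sorted(set(batches)):
        train = [i for i, b in enumerate(batches) if b != held_out]
        test = [i for i, b in enumerate(batches) if b == held_out]
        yield held_out, train, test


# Hypothetical batch label for each participant.
batches = ["b1", "b1", "b2", "b2", "b3", "b3"]
folds = list(leave_one_batch_out(batches))
```

Because cases and controls were matched on analytical batch, every held-out batch contains both classes, so an AUC can be computed for each fold. In scikit-learn, `LeaveOneGroupOut` provides the same kind of split.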
The method validation is partially explained by the longitudinal setup, which improves the chances of selecting robustly measured features across individuals. Untargeted metabolomics is susceptible to technical variance (Fig. S1), meaning that any method improving stability also enhances the likelihood of identifying novel biomarkers. We found that batch effects and storage time accounted for the most variance in the data, but we regressed out these effects (Fig. S2). Other methods, such as proteomics, where platforms like OLINK have matured, might also provide results robust to technical variation15.
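Regressing out a categorical batch effect from a single feature is equivalent to subtracting each batch's mean from its samples. The sketch below shows only this simplest case, under assumptions: the function name and values are hypothetical, and a continuous covariate such as storage time would require an additional regression term that is not shown here.

```python
from statistics import mean


def remove_batch_means(values, batches):
    """Residualise one feature on batch membership: subtract each batch
    mean and add back the grand mean to keep the original scale."""
    grand_mean = mean(values)
    batch_mean = {
        b: mean([v for v, vb in zip(values, batches) if vb == b])
        for b in set(batches)
    }
    return [v - batch_mean[b] + grand_mean for v, b in zip(values, batches)]


# Hypothetical feature intensities with a strong offset between two batches.
values = [1.0, 3.0, 11.0, 13.0]
batches = ["b1", "b1", "b2", "b2"]
corrected = remove_batch_means(values, batches)
```

After correction, both batches share the same mean, while the differences between samples within each batch are preserved.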
The difficulty in distinguishing between cases and controls could be explained by similar pathogenesis progression among controls. A majority of participants in the case group (80%) remained healthy enough to continue to give blood after receiving a diagnosis of osteoporosis. As age increases, BMD decreases, and with a lifetime risk of osteoporosis exceeding 10% in the Danish population, it is likely that some controls also experience osteoporosis progression without being diagnosed16, 17. This suggests that BMD might not differ substantially between cases and controls. Unfortunately, BMD measurements were not available.
Given the progressive nature of osteoporosis, classification models cannot represent the true progression state. We suggest interpreting the model probabilities as osteoporosis acceleration scores rather than binary outcomes. Acceleration scores offer a means for individual feedback on healthy aging and reduce the model performance demands. Established markers such as cholesterol and blood pressure guide lifestyle modifications to prevent cardiovascular diseases18. Similarly, osteoporosis progression scores could promote physical activity to preserve BMD in the aging population19.
Despite our thorough study design, the findings may not generalize to high-risk patients assessed in an outpatient clinic. Instead, the biomarkers identified in the current study, which used a design with asymptomatic cases and controls, suggest a potential use in screening an asymptomatic population. In such a setting, diseases covarying with osteoporosis might bias the model, meaning that disease-specific acceleration scores should also rely on appropriate controls with, e.g., orthopedic or degenerative diseases.
To our knowledge, only one other study has performed deep longitudinal sampling using untargeted metabolomics, aiming to describe aging profiles20. Other longitudinal studies have used few time points or short sampling periods to understand, e.g., diabetes-induced polyneuropathy21, Alzheimer’s disease22, and general health profiles23. Although these studies were successful, we believe that more samples and follow-up years may provide deeper insights into disease progression. We conclude that this study warrants more research in the Danish Blood Donor Study, which has thousands of participants, some with longitudinal data for over ten years.