Structure, function and dynamics of the human gut microbiome are generally studied in cross-sectional or short-term longitudinal settings. Contemporary microbiome variation is partially explained by host variables such as age, gender, stool consistency/transit time, health status, diet, and medication1. However, the gut environment is a dynamic ecosystem, continuously perturbed by daily dietary intake and egestion or occasional exposures to medication and disease5. Isolated events as well as long-term lifestyle choices can permanently alter the microbiome2, yet long-term temporal effects are understudied. A recent study revealed that diet allowed prediction of the future microbiome only up to two days after food consumption3. On the other hand, incomplete recovery of the original gut microbiota even after six months of antibiotic exposure implies that, when strong enough, perturbation effects can persist over longer terms4. As host health and lifestyle continuously impact the microbiome environment over time, a consistent, enduring collection of host data is necessary to study long-term cumulative effects of life history, especially for long-lived hosts like humans.
Here, we capitalized on the community-based North-Italian Bruneck Study cohort (n=304), which prospectively collected long-term individualized host metadata (i.e. food intake, lifestyle, medication, blood chemistry and clinical assessments) over 26 years (1990 to 2016) in 5-year intervals6. Fecal samples collected in 2016 (age 65-98) were subjected to quantitative microbiota profiling (QMP) enabling association of current absolute microbiome abundances with historical metadata7. Using this unique dataset, we explored (i) the aging microbiome, combining Bruneck and Belgian Flemish Gut Flora Project (FGFP; n = 2,215; aged 18 to 85) population cohorts, (ii) associations of historical variables and current microbiome, and (iii) predictive capacity of lifestyle history on current microbiome.
Whether microbiota diversity changes with age is a debated subject. Prior assessments of alpha-diversity during aging gave conflicting results (with both positive8,9 and negative10,11 associations), attributed to confounding by residence type9 and/or frailty12. In our combined cohort (age range 18-98, combined n=2,519; relative microbial profiling (RMP)), we observed a significant positive correlation between age and microbial richness (Spearman rho range 0.12 to 0.18, p < 0.05; Figure 1a). As only three subjects in the combined cohort were in nursing homes, the positive correlation observed here is likely independent of residence type. Further analyses of Bray-Curtis dissimilarities showed that elderly (age≥65) community composition differed significantly from that of young adults (18≤age<30) and adults (30≤age<65; pairwise Adonis R2 [0.003:0.01], FDR = 0.001; Extended Data Fig. 1a and Supplementary Table 1). Taxa increasing with age were mostly from the rare tail of the abundance curve (Extended Data Fig. 1b and Supplementary Tables 2, 3). In contrast, top-ranked core taxa such as Faecalibacterium, Roseburia, and Anaerostipes were negatively correlated with age in the elderly (Spearman rho [-0.28:-0.27], adjusted for gender, BMI, stool moisture, and antibiotics intake, FDR<0.1; Figure 1b and Extended Data Fig. 1c, d). Reduction of these major butyrate-producers might suggest unfavorable metabolic changes in the elderly gut.
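As an illustration of the rank-based statistic used for the age-richness association, the following minimal Python sketch computes an unadjusted Spearman rho from scratch. The toy ages and richness values are hypothetical, and the sketch does not reproduce the covariate adjustment (gender, BMI, stool moisture, antibiotics intake) applied to the taxon-level correlations in the elderly.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman rho is the Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data: age and observed genus richness for six subjects.
ages = [22, 35, 47, 66, 71, 90]
richness = [60, 58, 72, 75, 70, 88]
rho = spearman_rho(ages, richness)
print(f"Spearman rho = {rho:.3f}")
```

Working on ranks rather than raw values makes the statistic insensitive to the scale of richness, which is one reason it is a common choice for skewed microbiome data.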
We next studied whether the distinct elderly gut microbiome could be due to cumulative exposures to various perturbations. Prior to the determination of long-term life-history effects on the elderly microbiome, we quantitatively investigated contemporary microbial community covariates in the Bruneck Study cohort based on a db-RDA analysis after removing collinear variables (as in 1; Supplementary Table 4). Highly ranked contemporary covariates linked with microbial community composition were qualitative and quantitative proxies of transit time like current stool moisture, defecation frequency, hard stools, and obstipation (db-RDA, adjusted R2 [1.5:2.4%], FDR < 0.1; Figure 2b, and Supplementary Table 5a). These variables were consistently found associated with microbial community composition in different age groups (Extended Data Fig. 2a), and previously identified as top covariates in the FGFP cohort1. In total, we observed a 7.0% non-redundant cumulative effect size from 11 significant contemporary covariates (db-RDA, FDR < 0.1; Supplementary Table 5a) using a forward stepwise redundancy analysis (Supplementary Table 5b), in line with estimates in adult populations1. Analysis of combined effect size of significant covariates per category revealed that current health parameters (e.g. liver stiffness) and bowel habits (stool moisture, defecations, and hard stools) had similar (redundant) cumulative explanatory power for elderly community variation (12.4 to 12.8% of variation; Supplementary Table 6a). Total additive effect size including further categories (medication, diet, anthropometric feature, and lifestyle) was 16.6%.
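Distance-based redundancy analysis starts from a sample-by-sample dissimilarity matrix. The sketch below shows how such a matrix could be built, assuming the Bray-Curtis dissimilarity used for the age-group comparison is also the input here (the text does not state this explicitly). The sample names and abundance profiles are hypothetical, and the ordination and adjusted R2 steps are not reproduced.

```python
from itertools import combinations


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two non-negative abundance vectors."""
    total = sum(u) + sum(v)
    if total == 0:
        raise ValueError("both profiles are empty")
    shared = sum(min(a, b) for a, b in zip(u, v))
    return 1 - 2 * shared / total


def pairwise_bray_curtis(profiles):
    """Return {(name_a, name_b): dissimilarity} for every pair of samples."""
    names = sorted(profiles)
    return {
        (a, b): bray_curtis(profiles[a], profiles[b])
        for a, b in combinations(names, 2)
    }


# Hypothetical abundance profiles over three genera.
profiles = {
    "s1": [10, 0, 5],
    "s2": [5, 5, 5],
    "s3": [0, 20, 0],
}
distances = pairwise_bray_curtis(profiles)
for pair, value in distances.items():
    print(pair, round(value, 3))
```

A value of 0 means identical profiles and 1 means no shared taxa, so the matrix summarizes compositional differences that covariates such as stool moisture are then asked to explain.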
Next, we sought to assess the influence of the extensive array of historical parameters collected during prior Bruneck Study evaluations (1990-2016) on the current microbiome composition. We first performed a db-RDA as above using the historical parameters of each year as explanatory variables (Supplementary Table 7a), identifying historical variables contributing significantly to a cumulative model that also included present variables (Supplementary Table 7b). Overall, significant historical variables were mostly related to beta blocker use, blood parameters, and diet (adjusted R2 [0.60:0.80%], FDR<0.1). Interestingly, inclusion of these significant historical parameters of each evaluation (db-RDA, FDR<0.1; Supplementary Table 7a) significantly increased the cumulative non-redundant effect size to 8.5% (likelihood ratio test p < 0.05) and the total additive effect size to 20.7%, indicating potential explanatory power of long-term historical covariates for the current microbiome (Supplementary Table 6b).
To better capture long-term lifestyle and health effects, we further investigated overall historical trends of variables by using the average across all years and the difference (i.e. delta) between each year and the year 2016 for continuous variables, and counts of event occurrence for categorical variables. Analysis of averaged covariates across the years yielded few significant results; only average dumpling intake.1995-2016 was significant (adjusted R2 = 0.75%, FDR < 0.1; Supplementary Table 8a), likely as a proxy for a more traditional lifestyle. Covariate analysis of change (delta) in historical host parameters identified multiple non-collinear parameters independent of the time period covered (i.e., beta blocker usage, hemoglobin, alanine transaminase (ALT), and non-sport physical activity; db-RDA, adjusted R2 [0.63:1.11%], FDR<0.1; Supplementary Table 8b and Extended Data Fig. 2b). These were again analyzed together with the 11 significant contemporary covariates to calculate non-redundant cumulative effect sizes. Beta blocker.1990-2016, non-sport physical activity.2005-2016, hemoglobin.1990-2016, and Alanine transaminase.2005-2016 were shown to have significant explanatory power in addition to contemporary covariates, significantly raising the cumulative non-redundant effect size to 8.5% (likelihood ratio test p < 0.05; Supplementary Table 8c).
Finally, we combined all significant contemporary and historical features (Supplementary Tables 5a, 7a, 8a, and 8b) in one comprehensive db-RDA analysis and found a final set of 14 variables significantly explaining the current microbiome variation, of which three were historical ones from 2010 (gamma-glutamyl transferase (GGT), caloric intake, and cereal fiber score), one was average dumpling intake (1995-2016), and four were duration of beta blocker treatment (1990-2016), long-term changes in non-sport physical activity (2005-2016), hemoglobin (1990-2016), and alanine transaminase (ALT; 2005-2016). Altogether, they significantly increased the final cumulative non-redundant effect size to 10.4% (likelihood ratio test p < 0.05; Figure 2a and Supplementary Table 8d) and the total additive effect size to 25.5% (Figure 2c and Supplementary Table 6c). Overall, this shows that the inclusion of historical data resulted in a 33.4% increase in non-redundant explanatory power for global microbiota variation.
We further explored the relationship of these historical variables with the current microbiome by focusing on current absolute taxonomic group abundances as well as community enterotypes, based on Dirichlet Multinomial Modeling-based clustering previously validated across multiple cohorts13–15. Previous studies detected four enterotypes7, dominated by either Bacteroides (B1 and B2 enterotypes, with B2 having, among other features, lower microbial load and abundance of Faecalibacterium compared to B1)16, Prevotella (P), or Ruminococcaceae (R). All four enterotypes were present in the Bruneck cohort. Using the significant historical variables identified in the above db-RDA analyses (FDR<0.1; Figure 2a and Supplementary Table 8d), we first analyzed beta blocker treatment in association with community diversity. By dividing subjects into three groups (chronic (beta blocker treatment in both 1990 and 2016), current, and none (not medicated in 1990 and 2016)), we found that beta blocker treatment was linked to a significant compositional shift (beta-diversity; Adonis R2 = 0.013, p < 0.001; Extended Data Fig. 2c and Supplementary Table 9a), but not to alpha-diversity (Kruskal-Wallis test, p > 0.05; Extended Data Fig. 2d). Enterotype prevalence differed significantly between the three groups (Fisher’s exact test, FDR < 0.1 for B2 versus B1 and P enterotypes; Supplementary Table 9b). In particular, the B2 enterotype was more prevalent in beta blocker-treated individuals than other enterotypes (Kruskal-Wallis test, FDR < 0.1; Extended Data Fig. 2e). Further analysis of specific taxonomic associations identified a list of bacteria more abundant in subjects who did not use beta blockers (Generalized linear model (GLM), FDR<0.1, adjusted for age and stool moisture; Supplementary Table 10), which could be potential targets for remediation strategies if future studies confirm a causal link for this association.
Analysis of average dumpling intake (1995-2016), a historical covariate with the second-largest effect size, showed that subjects with the P enterotype had higher dumpling intake than those with the B1 enterotype (Kruskal-Wallis test, FDR < 0.1; Extended Data Fig. 2f and Supplementary Table 11). This could be explained by the association of the P enterotype with long-term dietary patterns15 and perhaps by the impact of traditional dietary intake or lifestyle on the current microbiome, given that dumplings are a key component of traditional food in the Bruneck region.
We next looked at the change in non-sport physical activity between 2005 and 2016. We first searched for taxa associated with physical activity shifts (i.e., the change from the past to the present) as well as with current levels of physical activity. Although no genera were associated with both variables, butyrate-producing bacteria (i.e., Roseburia, Faecalibacterium, and Butyricicoccus) significantly increased with long-term physical activity (Spearman rho [0.18:0.21], FDR<0.1, adjusted for age and stool moisture; Supplementary Tables 12-13). The positive influence of exercise on gut health has gained recent attention, with elevated abundances of Roseburia and Faecalibacterium reported in fit individuals and those who perform regular exercise17–20. In order to study effects of changing physical activity, we clustered subjects into four categories: high activity in the past and at present (cluster 1), high in the past and low at present (cluster 2), low in the past and high at present (cluster 3), and low in the past and at present (cluster 4). Interestingly, subjects who had recently increased physical activity as well as those who had consistently maintained high activity exhibited a reduced ratio of (dysbiotic) B2 to non-B2 enterotypes (pairwise Chi-Square test, FDR < 0.1; Figure 2e and Supplementary Table 14a). This suggests a beneficial role of physical activity for the healthy elderly gut ecosystem.
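The four-cluster scheme and the pairwise comparison of B2 versus non-B2 counts can be sketched as follows. This is a minimal illustration under stated assumptions: the high/low cut-off (`threshold`) is hypothetical because the text does not specify how "high" and "low" were defined, the subject records are toy data, the chi-square statistic is computed without continuity correction, and the FDR correction across pairs is omitted.

```python
import math
from collections import Counter


def activity_cluster(past, present, threshold):
    """Map past/present activity to clusters 1-4 as defined in the text."""
    high_past, high_now = past >= threshold, present >= threshold
    if high_past and high_now:
        return 1  # high in the past and at present
    if high_past:
        return 2  # high in the past, low at present
    if high_now:
        return 3  # low in the past, high at present
    return 4      # low in the past and at present


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column of the table sums to zero")
    stat = n * (a * d - b * c) ** 2 / denom
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical records: (past activity, present activity, enterotype).
subjects = [(5, 6, "R"), (6, 5, "P"), (5, 5, "B1"), (7, 6, "B2"),
            (1, 2, "B2"), (2, 1, "B2"), (1, 1, "B1"), (2, 2, "B2")]
counts = Counter()
for past, present, enterotype in subjects:
    cluster = activity_cluster(past, present, threshold=3)
    counts[(cluster, enterotype == "B2")] += 1

# Compare cluster 1 (high-high) against cluster 4 (low-low).
table = (counts[(1, True)], counts[(1, False)],
         counts[(4, True)], counts[(4, False)])
stat, p_value = chi_square_2x2(*table)
print(table, round(stat, 3), round(p_value, 3))
```

With cell counts this small, an exact test would be more appropriate than the chi-square approximation; the toy table only demonstrates the mechanics.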
Finally, we studied changes in hemoglobin between 1990 and 2016. Analysis of taxonomic associations with both current hemoglobin and its change over time showed that another butyrate-producing bacterial genus, Coprococcus, was significantly associated with high levels of current hemoglobin as well as with hemoglobin increase over time (Spearman rho [0.15:0.21], FDR<0.1, adjusted for age and stool moisture; Figure 2f and Supplementary Tables 15-16). The oxygenation-dependent metabolic state of colonocytes has been found to be associated with gut butyrate producers and is also linked to outgrowth of opportunistic aerobic species21. Therefore, our results suggest that not only current but also historical adequate levels of hemoglobin and oxygen could be of value in retaining a healthy microbiome in later life. Using the previous clustering approach, the B2/non-B2 ratio was significantly higher in hemoglobin cluster 2 (high-to-low) compared to cluster 4 (low-to-low) (pairwise Chi-Square test, FDR < 0.1; Figure 2g and Supplementary Table 14b), implying that a drop in hemoglobin, rather than consistently low levels throughout the years, is associated with dysbiosis. Higher abundance of Coprococcus (butyrate producer) in cluster 4 versus cluster 2 is in line with this finding (GLM, FDR < 0.1, adjusted for age and stool moisture; Supplementary Table 17). As an accurate reflection of anemia, low hemoglobin levels can be a feature of recent major bleedings or surgery or of inflammatory, infectious, or neoplastic disorder22. Therefore, a drop in hemoglobin levels could be an indication of many conditions as well as poor health. Analysis of changes in ALT between 2005 and 2016 using the same approaches as above showed that only the current ALT levels were significantly associated with Methanobrevibacter but not with the enterotype ratio (Spearman rho = -0.18, FDR<0.1, adjusted for age and stool moisture; Supplementary Table 18, Extended Data Fig. 2g).
Next, we estimated the predictive power of past lifestyle and physiological parameters on current microbiome composition. We investigated both effect sizes of single markers and the predictive potential of more complex machine learning-based models. To explore single markers, we first investigated long-term predictability by focusing on the three significant individual historical variables from the year 2010 (db-RDA, FDR < 0.1; Figure 2a and Supplementary Table 8d) and enterotypes, but no significant associations emerged (Kruskal-Wallis, p>0.05; Supplementary Table 19). The lack of statistical significance for these variables implied limited predictive power of single parameters. Therefore, we sought to determine the predictive power of life history using a combination of variables to predict the current microbiome enterotype, as well as to investigate how far back this combined information remains informative. To this aim, we applied a Random Forest classifier with feature selection to predict the current enterotype for each past sampling year based on diet, blood parameters, health, anthropometrics, and lifestyle, using only variables that were available across all years for parallel comparison. Models derived from a random training dataset (70% of the total) were applied to test data (the remaining 30%) in order to estimate predictive power and avoid overfitting. Models gave fair levels of prediction for the Prevotella (P; for 2000 and 2016) and Bacteroides2 enterotype (B2; for 2010; Figure 3a and Extended Data Fig. 3b). The year 2005 had overall lower predictive power, possibly because it is a switching year in age demographics (see Extended Data Fig. 3a). Parallel 10-fold cross-validation identified additional fair levels of prediction for the R enterotype in the year 2010 (Extended Data Fig. 3c). Interestingly, the prediction variables selected for each sampling year showed distinct patterns for the various enterotypes (Figure 3b and Extended Data Fig. 3d).
For example, variables belonging to multiple categories were selected for B2, while for the P enterotype a much larger contribution of dietary variables was visible across the years. Unlike the two other enterotypes, in B2 more variables from the medication class were selected in the three most recent time points, implying an association with health deterioration (Extended Data Fig. 3e and Supplementary Table 20). Overall, our study provides first evidence that the current gut microbiome is indeed predictable by past variables. Nonetheless, further validation in independent cohorts with a similar long-term sampling protocol would be warranted to confirm these results.
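The evaluation protocol described above (a random 70%/30% training/test split and a parallel 10-fold cross-validation) can be sketched with the standard library alone. This illustrates only how samples might be partitioned; the Random Forest classifier and the feature selection step are not reproduced, and the function names, seed, and toy enterotype labels are hypothetical.

```python
import random


def holdout_split(n_samples, train_fraction=0.7, seed=0):
    """Randomly split sample indices into training and test sets."""
    rng = random.Random(seed)
    order = list(range(n_samples))
    rng.shuffle(order)
    cut = round(n_samples * train_fraction)
    return sorted(order[:cut]), sorted(order[cut:])


def k_fold_indices(n_samples, k=10, seed=0):
    """Assign every sample index to exactly one of k near-equal folds."""
    rng = random.Random(seed)
    order = list(range(n_samples))
    rng.shuffle(order)
    return [sorted(order[i::k]) for i in range(k)]


# Hypothetical enterotype labels for 100 subjects.
labels = ["B1"] * 40 + ["B2"] * 20 + ["P"] * 20 + ["R"] * 20
train_idx, test_idx = holdout_split(len(labels))
folds = k_fold_indices(len(labels))

# A classifier would be fit on train_idx and scored on test_idx; in
# cross-validation, each fold serves once as the held-out set.
print(len(train_idx), len(test_idx), [len(f) for f in folds])
```

Keeping the test indices untouched during model fitting and feature selection is what lets the held-out performance serve as a guard against overfitting.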
In conclusion, we show that an individual’s life history has long-term effects on the assembly of the gut microbiome. We report, for the first time, the predictability of the current gut microbiome by historical host parameters along with distinct characteristics of the elderly microbiome using a quantitative approach. In a large-scale, lifespan-covering combined cohort (n = 2,519), we confirm a positive correlation between community richness and age. There is compelling evidence that the gut microbiome rapidly expands from birth until early childhood and stabilizes in adulthood23–25. Our results indicate that the gut remains dynamic at advanced age and is influenced by the host’s life history. Specifically, we found that changes in an individual’s medication history, non-sport physical activity, and hemoglobin levels over time were linked to the individual’s current microbiome. Further, we could predict an individual’s current enterotype based on host parameters from as early as 15 years prior. Overall, these results suggest that long-term history of host laboratory parameters, medication, diet, and lifestyle can exert significant impacts on the current microbiome, highlighting the first key variables that are important for maintaining a healthy gut at a later life stage.