Nutrition-related chronic diseases are widely considered to be examples of dysbiosis whose incidence is strongly shaped by socio-economic influences on the human diet. In this study, we aimed to identify phenomena associated with nutrition transition that may drive microbiota patterns at the level of society. In particular, we looked for microbiota features that distinguish pre-transition societies from post-transition societies, and we explored which dimensions of the environment are most likely to have driven the difference. We postulated that features discriminating pre- and post-transition societies should show intermediate values in samples from societies actively undergoing transition. If true, sampling such active-transition societies is predicted to provide a more useful dynamic response range for identifying environmental drivers.
To test our hypotheses, we focussed on the concept of microbial enterotypes. It has been proposed that categorizing microbiotas into subtypes based on dominant features of their community structure (the enterotype concept) is a potential discriminator between pre- and post-transition societies [17]. Here, we tested this proposition, and its potential correlation with diet and disease, in a cohort of young females from Bali, which represents a narrow demographic group from a society that has only very recently undergone transition [23, 24, 40, 41]. The analysis of diet and microbiota characteristics in this distinctive cohort, integrated with a meta-analysis of representative human microbiota datasets, provides new insights into potential drivers of enterotypes and their relevance to public health applications.
The enterotype concept was first proposed by Arumugam et al. (2011), who coined the term to distinguish three clusters found in a multi-country human gut microbiota dataset based on Jensen-Shannon distance [15]. Similar patterns have since been reported in numerous other studies [5, 14, 16, 17]. However, there has been much controversy over how to use the concept of enterotypes as a tool to simplify the description of patterns in the human microbiota. It is evident that enterotypes are not intrinsically discrete biological entities; rather, their observation is a property of the methods used to visualize beta diversity in the data [16, 17]. Consequently, a robust definition of enterotypes for the purpose of assessing their representation across populations in meta-analyses is very challenging. In a recent perspective article, Costea et al. proposed a unified nomenclature and guidelines for categorization aimed at facilitating the use of the concept for diverse datasets [17]. They proposed the terms ET-P, ET-B, and ET-F for clusters in which the attractor is the taxon Genus:Prevotella, Genus:Bacteroides, or Phylum:Firmicutes, respectively. In this study, we have largely adhered to those guidelines, although we have taken a distinct approach to assessing the application of the concept in our meta-analysis, using generalized additive models of marker taxa to support the findings.
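As a minimal illustration of the distance measure mentioned above, the following sketch computes the Jensen-Shannon distance (the square root of the Jensen-Shannon divergence, using base-2 logarithms) between two relative-abundance profiles. The function name and the example profiles are hypothetical and are not taken from the study; the sketch shows the metric only, not the clustering procedure applied to it.

```python
import math


def jensen_shannon_distance(p, q):
    """Jensen-Shannon distance between two non-negative abundance profiles.

    Both profiles are normalised to sum to 1 before comparison.
    With base-2 logarithms the result lies between 0 and 1.
    """
    p_total, q_total = sum(p), sum(q)
    p = [x / p_total for x in p]
    q = [x / q_total for x in q]
    # Mixture distribution M = (P + Q) / 2
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Kullback-Leibler divergence; zero-abundance terms contribute nothing.
        # b is the mixture, so b_i > 0 whenever a_i > 0.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    return math.sqrt(0.5 * kl(p, m) + 0.5 * kl(q, m))


# Hypothetical genus-level profiles (Prevotella, Bacteroides, other genera)
prevotella_rich = [0.60, 0.05, 0.35]
bacteroides_rich = [0.05, 0.55, 0.40]
distance = jensen_shannon_distance(prevotella_rich, bacteroides_rich)
```

A matrix of such pairwise distances is the usual input to a clustering or ordination step, which is where the data dependence discussed in this section arises.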
In our beta diversity analyses of the Bali cohort, we found strong support for segregation into two clusters, which we refer to as Type-P and Type-B in accord with the proposed enterotype nomenclature. We used rigorous statistical tests to scrutinize these groupings, which showed that Prevotella and Bacteroides were indeed the main drivers of the clustering (although not the sole drivers). These two genera have been consistently reported to differ between industrialized and subsistence societies [1, 5, 7, 11, 15–17, 42]. In our network analysis, co-abundance patterns were generally consistent with the cluster analysis: Prevotella or Bacteroides as the ‘hub genus’ was significantly over-represented in individuals assigned to the corresponding community type. However, there were samples in which neither Prevotella nor Bacteroides was dominant. Although we did not see a third cluster in the Bali cohort, we note that four individuals in the Type-B cluster were dominated by CAG4, which includes various genera typically associated with the proposed third enterotype (ET-F). Our data are thus consistent with the existence of three similar types of community structure, but their visualization through clustering methods was not possible in the Bali dataset.
We then tested the proposition that the enterotypes would discriminate pre- and post-transition societies and, furthermore, that metrics based on these community types would be related to the stage of nutrition transition in that society. A challenge for this meta-analysis is that both the enterotype assignment and the determination of CAGs are products of a data-dependent clustering approach. As such, they are not well suited to a meta-analysis across multiple datasets. Consistent with this limitation, we did not see clusters that would permit facile classification into two or three community types in our analyses of the combined dataset, nor were such clusters found in a similar study by Gorvitovskaia et al. [5]. As an alternative, we used the hub taxa, Prevotella and Bacteroides, to quantify Bali’s position in the multi-country dataset and projected their relative abundance data onto the ordination using regression models. The result showed that Prevotella and Bacteroides abundances were strongly correlated with the major differences in microbiotas across the six populations.
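The idea of projecting a taxon's abundance onto an ordination can be sketched with an ordinary least-squares fit of abundance against two ordination axes. This is a deliberately simplified linear stand-in, not the regression models used in this study; the function name, the axis scores, and the abundance values are all hypothetical.

```python
def fit_abundance_to_ordination(axis1, axis2, abundance):
    """Least-squares fit of abundance ~ b1 * axis1 + b2 * axis2 (after centring).

    Returns (b1, b2, r_squared). The vector (b1, b2) points in the direction
    of increasing abundance within the ordination plane, and r_squared
    measures how much of the abundance variance the two axes explain.
    """
    n = len(abundance)

    def centre(values):
        mean = sum(values) / n
        return [v - mean for v in values]

    x1, x2, y = centre(axis1), centre(axis2), centre(abundance)
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))

    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("ordination axes are collinear or constant")

    # Solve the 2x2 normal equations by Cramer's rule.
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det

    ss_total = sum(c * c for c in y)
    if ss_total == 0:
        raise ValueError("abundance is constant across samples")
    ss_residual = sum((c - b1 * a - b2 * b) ** 2 for a, b, c in zip(x1, x2, y))
    return b1, b2, 1 - ss_residual / ss_total


# Hypothetical ordination scores and Prevotella relative abundances
axis1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
axis2 = [1.0, -1.0, 0.0, -1.0, 1.0]
prevotella = [0.05, 0.15, 0.25, 0.40, 0.55]
b1, b2, r_squared = fit_abundance_to_ordination(axis1, axis2, prevotella)
```

A non-linear smoother would replace the linear fit where the abundance surface is curved; the linear version is shown only because it makes the projection step explicit.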
Our data also showed that Prevotella and Bacteroides abundances in Bali were intermediate compared with those of pre- and post-transition societies. Our findings are consistent with our hypothesis that a nutritional transition is associated with a gradient of diverging community states (measured here as enterotype and hub taxon incidence), whereby the Bali cohort broadly occupied an intermediate position between pre- and post-transition societies in the multi-country dataset. These findings suggest that Bali is undergoing a shift in the meta-microbiota of its society. A plausible explanation is that recent industrialization of the food system has driven changes in microbial exposure and diet [23, 24, 40, 41], both of which are known to influence community assembly.
It has previously been found that enterotypes correlate with diet, including choices between animal-based and plant-based food [13, 14, 22]. Owing to their association with nutrition, enterotypes have also been associated with increased incidence of cardiometabolic disorders (e.g. obesity, cholesterol) [17, 43–46]. To investigate the extent to which nutrition may alter the microbiota and affect obesity incidence, we applied various combinations of multivariate regression models to the Bali dataset. These exploratory analyses showed a striking lack of significant correlation between CAGs or community type and contemporary diet. Nor did we find associations with carbohydrate and fibre source (primarily rice). We found only a weak association between Bacteroides enrichment and higher protein intake (predominantly meat in the Bali dataset).
The lack of significant patterns may be due to the limited variety of food items in the Bali cohort, and thus these findings would require verification in a larger sample set. Nevertheless, our data do not support the idea that patterns of community type (or marker taxa) within a society will be reflected in short-term diet observations. This does not exclude effects of long-term diet as a driver of microbiota differences across populations. It is possible that different patterns of Prevotella and Bacteroides relative abundance between subsistence and industrialized societies reflect different levels of fibre intake, animal-based food intake, and fat-sugar-enriched processed food products [5, 13, 14], but the drivers of this difference are likely to be more complex than short-term diet habits.
Although our Bali cohort consists entirely of young females, all of them were raised during the period of rapid lifestyle transition. Given that 1) the timescale over which diet influences the observation of enterotype-like patterns, or the time point in developmental history when the patterns become fixed, is not yet clear; 2) enterotypes can become stable in early life; and 3) trans-generational microbe exposure influences microbiota diversity, it is possible that differences in their early development have driven the striking divergence in microbiota pattern. Our data are consistent with the Thai-US immigration study, which reported that living in an industrialised society induced a replacement of Prevotella with Bacteroides in the human gut and that this replacement progressed across generations [8]. Recent studies in the Central African Republic (BaAka Pygmies and Bantu populations) and Nepal (Himalayan populations) also showed similar transitional states in the microbiota between rural and urban populations [6, 9]. Collectively, these data indicate that changes in human socio-economic conditions over time are reflected in the frequency distribution of Prevotella and Bacteroides across human societies.
Claims for microbiota association with nutrition-related disease began with obesity [44, 47]. Although the small size of the obese cohort in this study limits our ability to detect robust differences, our findings are instructive with regard to the use of simplifying metrics in health associations. Five of the six obese Bali individuals in our study were assigned to the Type-P community, which had lower microbial diversity. However, it is unclear whether this difference in diversity is due exclusively to obesity or arises because the trait is linked to the Type-P community. These findings contrast with the prevailing view that microbiotas commonly found in pre-industrial societies, typified by Type-P (or ET-P), have higher diversity and lower risk for NRCD [17, 43–46]. However, we point out that the apparent associations of diversity and obesity with Type-P here could be a product of the categorization into community types. Those Type-P individuals classified as obese in our study were distinguished by a higher abundance of CAG9, and it was only this CAG that showed any statistical support in multivariate models.
Costea et al. have proposed that one means of circumventing the issue in enterotype categorization is to assign enterotypes in relation to a reference dataset [17]. In our view, this approach has potential merit, but it does not address the main limitation of the method. The concept of enterotyping (at least as it has so far been promulgated) essentially reduces community structure to one dominant dimension. Therefore, its application to predicting the influence of the microbiota in individuals (such as disease risk or treatment outcome) depends on the assumption that a significant fraction of the relevant influence of the microbiota on that outcome is captured in the set of one-dimensional categories. Even in our small dataset, it is clear that relying solely on enterotype classifications obscures the multi-dimensional nature of the microbiota contribution to health outcomes [16]. However, our data also show that enterotype markers do have strong predictive value for some population-level characteristics. We propose that the concept of enterotypes has very limited value in predicting properties of an individual, but that it is useful in predicting properties of a demographic group, and thus enterotypes may inform study design.
Together with other findings, our data highlight the biological significance of enterotypes, particularly their implications for disease association in future studies. Failure to address population heterogeneity between experimental groups may lead to false discoveries of disease biomarkers, particularly in societies undergoing socio-economic change. As shown in this study, the presence of two distinct community types in the Bali microbiota is a significant confounder for identifying obesity markers. We anticipate that microbiota disease association studies will be more robust if consideration is given to the baseline distribution of enterotypes (or proxy taxa), since the enterotype is predicted to reflect extrinsic and intrinsic factors influencing community composition in the sampled population. In human societies, these factors include, but are not limited to, ethnicity, cultural restrictions, early-life development, nutritional choices, other demographic factors, disease co-factors, pre-existing community composition, and inter-species relationships within the microbiota.
It has long been recognized that clinical trials require treatment groups to be matched for confounding factors such as age, gender, and socioeconomic status. An emerging model is that the gut microbiota exhibits the property of multi-stability, but that in the stable states adopted, a large proportion of the microbiota variance occurring across human societies can be explained by the nutrient environment and the inter-species interactions that influence assembly. Whilst acknowledging the caveats described above, we propose that the enterotype concept can potentially underpin the development of useful tools to facilitate validation of life-history matched cohorts in studies of NRCD. This may be achievable through relatively simple proxy measures of enterotyping, such as abundance-ubiquity relationships of Prevotella in study cohorts, or simple indices for Prevotella and Bacteroides [5, 8]. Such microbiota-matched cohorts could inform the design of clinical trials and ultimately the development of precision medicine strategies for obesity and comorbid diseases.
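To make the notion of a simple proxy measure concrete, the sketch below shows two candidate quantities: a per-sample log-ratio of Prevotella to Bacteroides relative abundance, and the ubiquity of Prevotella in a cohort (the fraction of samples in which it exceeds a detection threshold). Both definitions, the pseudocount, the threshold, and the example values are assumptions chosen for illustration; the text does not specify a particular index.

```python
import math


def pb_log_ratio(prevotella, bacteroides, pseudocount=1e-6):
    """log10 ratio of Prevotella to Bacteroides relative abundance.

    The pseudocount (an assumed value) avoids division by zero when a
    genus is undetected. Positive values indicate Prevotella dominance,
    negative values Bacteroides dominance.
    """
    return math.log10((prevotella + pseudocount) / (bacteroides + pseudocount))


def prevotella_ubiquity(abundances, detection=0.001):
    """Fraction of samples whose Prevotella relative abundance exceeds
    the detection threshold (threshold is an assumed value)."""
    if not abundances:
        raise ValueError("cohort must contain at least one sample")
    detected = sum(1 for a in abundances if a > detection)
    return detected / len(abundances)


# Hypothetical cohort: (Prevotella, Bacteroides) relative abundances per sample
cohort = [(0.40, 0.05), (0.02, 0.35), (0.0, 0.50), (0.25, 0.20)]
ratios = [pb_log_ratio(p, b) for p, b in cohort]
ubiquity = prevotella_ubiquity([p for p, _ in cohort])
```

Comparing the distribution of such values between study arms before testing for disease associations is one way cohorts might be checked for a baseline imbalance in community type.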