Cecum microbiome composition in rabbits
Most of the reads generated from the whole metagenome sequencing of cecum samples (74.8%) were assigned to the phylum level and only 3.19% to the genus level. The low assignment rate at the genus level compared with the phylum level is due to the fact that LCA algorithm in SqueezeMeta applies very conservative identity thresholds to make reliable taxonomic assignments (60% for genera and 42% for phyla, as suggested by Luo et al.(65)). The caecal microbiome was mainly composed of Firmicutes (83.12% of the total abundance) and Bacteroidetes (13.45%) phyla, and other subdominant phyla such as Verrucomicrobia (1.19%), Proteobacteria (0.76%), Actinobacteria (0.76%), and Tenericutes (0.1%). Other bacterial phyla such as Fibrobacteres, Chloroflexi, Armatimonadetes, Elusimicrobia were less represented with relative abundances ≤ 0.01%. Methanogenic archaea from the phylum Euryarcheaota had a relative abundance of 0.04%, while in the fungal phylum Ascomycota accounted for 0.12%, Chytridiomycota for 0.002% and Basidiomycota for 0.0001%. At genera level, Ruminoccocus was the most abundant identified genus (21.7%), followed by Alistipes (19.6%), Bacteroides (5.43%), Clostridium (5.16%), Faecalibaculum (4.45%) and Akkermansia (3.54%). The most abundant methanogenic archaea was Methanobrevibacter with 0.52% abundance (see Additional files 2 and 3).
Because microbiome data are compositional, log-ratio transformation is strongly recommended to avoid spurious results when comparing groups or calculating microbiome-microbiome or microbiome-trait associations(51). Recently, we proposed to analyze –omics data using an additive log-ratio transformation in which the denominator is a variable with low variance between samples and high Procrustes correlation with the standard pairwise log ratio system(52), so it does not bias the true Euclidean distances between samples. In the taxonomic composition of the caecal microbiome, phylum Fibrobacteres and genus Eubacterium were the best candidates for use as denominators in the transformation because they preserved the Euclidean distances between individuals calculated based on all pairwise log-ratios standard (Procrustes correlations of 0.992 for phylum and 0.998 for genus, see Additional file 4 and Additional file 5). The variances of their natural logarithms were very small (0.0165 for phylum and 0.0442 for genus), which attributes most of the variation in the additive log ratio to the numerator, simplifying its interpretation.
Microbiability of IMF
The term microbiability, first used by Difford et al.(66), quantifies the percentage of phenotypic variation between individuals of a certain trait that is due to the variation in their microbiome composition. Our results suggest that the microbiome explains a large proportion of IMF variation. When the microbiome relationship matrix was fitted as Euclidean distances in alr-transformed genera abundances, microbiability was 44.5% [15.4%, 74.8%], with a probability of being greater than 25% of 0.86; and in alr-transformed phyla abundances it was 23.7% [7.4%, 43.8%], with a probability of being greater than 25% of 0.38. The better fitness observed at lower taxonomic resolution can be explained by the fact that clustering the counts at higher taxonomic ranks could hide the variation among individuals that arises from genera with versatile functions for lipid metabolism. The greater explanation ability at low taxonomic levels also observed for resilience traits in rabbits(67). Due to the divergent selection, the microbiability is greater in our study than in other populations. However, the median of the microbiability distribution estimated using genera was still considerable, 38.8% [12.9%, 70.9%] in the H line and 43.2% [13.5%, 76.2%] in the L line when calculated separately for each line; and the probability of the microbiability being greater than 25% was 0.81 in H and 0.83 in L. Although there are no previous records of IMF microbiability in rabbits, other studies performed in pigs provided lower values ranging from 6%(68) to 26%(69, 70). However, these studies include additional random effects (e.g. a genomic effect) in the models whose allocated variances partially overlap with the variation accounted for by the microbiome, and are therefore not comparable with our estimate. In addition, different methods were used to calculate metagenomic distances between individuals, the effect of which can make microbiability vary up to 20%(70). In our data, the microbial variation between rabbits that fits the IMF captured by microbiability can be divided into: (i) microbial abundances affecting IMF and influenced by host genetics; and (ii) microbial abundances affecting IMF but not influenced by host genetics. However, when examining the correlated response to selection for IMF in the microbiota by comparing the microbiome of the divergent lines, we mainly target the part of the microbiome that is influenced by the genes involved in IMF and inherited by classical Mendelian vertical transmission. In other words, changes in allele frequencies for the mentioned genes that favour colonisation by such microbes. While vertical transmission of microbes unaffected by host genetics may occur from the selected mother to the kit via the birth canal, lactation, and/or coprophagy of the mother's soft faeces in the nest(71), the stability of such microbial populations in the digestive tract of the rabbit under standardised environmental conditions remains unclear(72–75). Moreover, the studied population in the 10th generation consists of 89 animals randomly selected from the two lines, and the microbial abundances influencing IMF and strongly influenced by the environment should be randomly distributed among the animals.
Divergent IMF-enterotypes as a response to selection
The differences between IMF lines on the microbiome features are expected to be mainly due to host genetic divergence caused by selection. However, our finite rabbit population is also subjected to random genetic changes due to genetic drift that could also affect the microbiome. This genetic drift was corrected by focusing on the microbial taxa that differed between lines and were also linearly associated with IMF, as first proposed in Martínez-Álvaro et al. (37).
Alpha diversity was not associated to host IMF genes, as H and L lines had similar microbial diversity, measured as an adjusted alpha diversity index at either the phylum (\({H{\prime }}_{adj}=\)0.13 with s.e. of 0.02 for the H line, and \({H{\prime }}_{adj}=\)0.12 with s.e. of 0.02 for the L line, p-value of the Kruskal-Wallis test was 0.29) or genus level (\({H{\prime }}_{adj}=\)0.51 with s.e. of 0.04 for the H line, and \({H{\prime }}_{adj}=\)0.51 with s.e. of 0.037 for the L line, p-value=0.85). Other studies have also shown that there is no association between alpha diversity metrics and adiposity traits(76). However, selection altered specific microbiome abundances in the cecum at the phyla and genera levels, being the effect of selection more evident at lower taxonomic resolution, which allowed us to distinguish two divergent enterotypes related to IMF genetics. First, divergent microbial abundances between lines were identified by fitting a PLS-DA model at phyla and genera level (Additional files 6 and 7). Based on 14 alr-transformed phyla and 66 alr-transformed genera, PLS-DA models presented a cross-validation classification accuracy between lines of 71.0% (phyla) and 94.3% (genera), which decreased to 51.8% (phyla) and 48.9% (genera) when permuting the labels of the samples. Then, microbial abundances that explained IMF variation were selected by a linear PLS regression. Seventeen alr-transformed phyla and 74 alr-transformed genera identified predicted IMF variation with a Q2 accuracy of 11.8% (phyla) and 65.7% (genera), in cross-validation (Additional files 8 and 9), which decreased to -14% (phyla) and − 31% (genera), when IMF values were permuted. Finally, ten alr-transformed phyla and 51 alr-transformed genera that overlapped in both PLS-DA and PLS models were used to define the two IMF enterotypes at the phyla and genera levels (Additional file 10 and Fig. 1A).
The 51 alr-transformed genera differed markedly between the two IMF enterotypes corresponding to the two IMF lines. When fitted as predictors in a new PLS-DA (Additional file 11 and Fig. 1B), the two IMF enterotypes were clearly distinguished with 91.2% accuracy in cross-validation (48.5% after permutation). The PLS-DA algorithm aims to maximize the covariances between the dependent (H/L) and independent (microbiome data) variables, which could inflate the discrimination of the two enterotypes(58). To confirm our results without forcing the algorithm to distinguish between the two groups, a MDS analysis was computed, commonly used to define enterotypes(60–62), and a PERMANOVA to test the line effect (Fig. 1C), which declared the line effect to have a p-value ≤ 0.001. The two IMF-enterotypes based on the 10 alr-transformed phyla were less differentiated (Additional files 10 and 12). In this case, the MDS sample plot did not differentiated between lines, and the PERMANOVA test resulted in a line effect with a p-value ≤ 0.01.
Main microbial drivers of the two IMF-enterotypes
The main drivers of the two IMF-enterotypes at phylum and genus levels were examined based on the VIP of the PLS-DA models fitted with either 51 alr-transformed genera or 10 alr-transformed phyla (VIP > 1 was considered relevant(77)). In addition, enrichment or depletion of alr-transformed microbial abundance between the two IMF enterotypes was quantified as the differences between lines (in units of SD, also known as effect size) by fitting a linear model that included the sex effect (Fig. 1D and Additional files 10 and 13). At the phyla level, the two enterotypes were dominated by either Verrucomicrobia and Chloroflexi phyla in the H-enterotype (+ 0.63 SD and + 0.42 SD greater abundance, with a probability of this difference being greater than 0 (P0) ≥ 0.97), or Armatimonadetes and Euryarcheaota in the L-enterotype (-0.46 and − 0.43 SD, P0 ≥ 0.98). In addition, the H-enterotype was enriched in Proteobacteria (+ 0.34, P0 = 0.94), Candidatus Saccharibacteria (+ 0.33 SD, P0 = 0.93) and Elusimicrobia (+ 0.31 SD, P0 = 0.92) bacterial phyla. Interestingly, the two fungal phyla Chytridiomycota and the much less abundant Basidiomycota were also enriched in the H-enterotype (+ 0.34 SD, P0 = 0.94); however, the L-enterotype was enriched in Ascomycota (-0.45 SD, P0 = 0.98).
Several studies indicate that phylum Firmicutes are more abundant than Bacteroidetes in obese individuals (higher FB ratio)(19–23), Firmicutes are thought to be more effective than Bacteroidetes in extracting energy from food, promoting more efficient caloric intake and subsequent weight gain(13). However, Firmicutes abundance or FB ratio was not greater in the H line. The FB ratio was 7.51 (SD = 0.55) in the H and 7.16 (SD = 0.59) in the L line, and there was a low statistical evidence of a greater ratio in the H line (H-L = 0.38 and SD 0.79, P0 = 0.68). Many other studies dispute this association between FB ratio and obesity(24, 78, 79) proposing that compositional changes at the family, genus, or species level could be more influential(18), owing to the considerable heterogeneity of the gut microbiome. Our findings align with this perspective. Out of the 51 genera distinguishing the enterotypes, 31 belonged to Firmicutes and 5 to Bacteroidetes, with enrichment either in H (16 Firmicutes and 3 Bacteroidetes) or L (15 Firmicutes and 2 Bacteroidetes) enterotypes. H-enterotype was mainly enriched in Hungateiclostridium, Limosilactobacillus, Lysinibacillus, Porphyromonas and Gemminger (effect sizes from + 0.5 SD to + 0.67 SD; P0 ≥ 0.99); meanwhile L-enterotype was enriched in Fonticella, Megasphaera, Exiguobacterium, Flintibacter, Coprococcus, Candidatus Amulumruptor and Odoribacter (effect sizes from − 0.33 SD to -0.61 SD, P0 ≥ 0.93). The genus Ruminoccus (21.7% of total relative abundance), Lactobacillus and Turicibacter were also enriched in the L-enterotype, with effect sizes of -0.39, -0.38 and − 0.35 SD (P0 ≥ 0.95) and VIP ≥ 0.86. The versatility of various genera belonging to phyla Firmicutes and Bacteroidetes concerning obesity, observed in our study, is supported by prior research. For example, Gemminger was found to be more abundant in obese children than in normal weight individuals(78); Erysipelatoclostridium, an H-enterotype enriched genus within the Firmicutes (+ 0.45 SD, P0 = 0.98), was linked to diet-induced obesity in gnotobiotic mice(80); and periodontal infection with the Bacteroidetes genus Porphyromonas induced obesity, glucose tolerance, hepatic steatosis and altered endocrine function in brown adipose tissue when administered to pregnant(81) or obese(82) mice. Conversely, Flintibacter is linked to bile acid metabolism(83) (secondary metabolites of bile acids were also more prevalent in the L-enterotype(36), along with more cholesterol(32)). Obese mice on a high-fat diet had lower Turicibacter(23) abundance compared to controls, and certain Lactobacillus strains were also associated with reduced fat storage(84, 85), although this can vary by strain(86, 87). The enrichment of the Firmicutes genera Coprococcus, Megasphaera and Ruminoccocus in the L-enterotype could relate to greater microbial propionate production via acrylate fermentation (specifically Coprococcus catus and Megasphaera eldesnii) or the propanediol pathway (Ruminococcus obeum)(88). This is consistent with our previous functional metagenomic study, where propionyl-CoA carboxylase pccA gene abundance, found in the genome of taxa from the order Clostridiales (which includes Coprococcus and Ruminococcus) and the family Ruminococcaceae, was greater in the L line,. Propionate inhibits de novo liver lipogenesis(89), lower in the L line(32), and promotes leanness by stimulating anorexigenic satiety hormones via G-protein coupled receptors, thereby inhibiting food intake, which was also lower in the L line(90). Given that rabbit caecal short-chain fatty acids likely account for 30–50% of energy requirements, this could significantly affect their energy metabolism(91). Finally, Ododribacter, characterizing the L-enterotype, is a succinate-consuming bacteria found at lower abundance in obese individuals(92), raising succinate levels, known to have adipose-tissue inflammation properties(93). Also, a lower abundance of sucC, which catalyzes the synthesis of succininyl-CoA from succinate, was observed in the L-enterotype in our functional metagenomics study(37).
Several Proteobacteria genera also played an important role in differentiating the two enterotypes. H-enterotype was enriched in Oxalobacter, Desulfovibrio, Bartonella, Legionella and Sutterella with effect sizes ranging from + 0.37 to + 0.57 SD, and P0 ≥ 0.96. Instead, only Escherichia (Fig. 1C) dominated the L-enterotype with a large effect size (-0.69 SD, P0 = 1.00). Zhang et al. (2020) observed an increase in Desulfovibrio abundance in high-fat-diet-induced obese mice with greater fat mass and liver weight than controls(23), linked to greater fatty acid synthase gene expression in the liver. The greater propensity of the L-enterotype to harbour Escherichia may be due to its ability for secondary bile acids conversion by metabolising tryptophan(88), certain Lactobacillus species also sharing this capability(88). Primary bile acids synthesized in the liver from cholesterol aid lipid digestion in the gut through its conversion into secondary bile acids by few members of gut microbiota (e.g. Escherichia and Lactobacillus)(88, 94). In the gut, bile acids regulate microbial survival and also activate inflammatory pathways when they bind to cellular receptors in different organs of the body such as liver, stomach, ileum, colon, or white adipose tissue(95) (e.g., G-coupled receptor TGR5(96) and farnesoid X receptor(97)). Our previous research noted correlated responses to selection in microbial genes involved in bile acid transport(37); and metabolomics showed higher secondary bile acid metabolites and cholesterol levels in the L line(36), possibly linked to Escherichia and/or Lactobacillus actions in the cecum.
Another interesting result was the enrichment of the abundant genus Akkermansia (3.53% relative abundance) in the H-enterotype (+ 0.31 SD, P0 = 0.93); contrary to expectations based on studies associating this mucin degrader with alleviating metabolic syndrome (e.g. by stabilization of the gut barrier function, the secretion of antibacterial peptides, or control of inflammation)(98, 99). However, these studies primarily explored dietary interventions, not genetic background of IMF. In our functional microbiome analysis(37), a positive correlated response to selection in microbial genes involved in the biosynthesis of lipopolysaccharides, peptidoglycans and other microbial endotoxins known to contribute to the development of fat mass(9, 13) was found, which were allocated in the genome of Akkermansia, suggesting Akkermansia’ s potential role in genetically-induced obesity, as confirmed in this taxonomic analysis. One hypothesis is Akkermansia muciniphila’s involvement in acetate production through pyruvate decarboxylation to acetyl-CoA(88). Acetate, more obesogenic than propionate, enters the systemic circulation and reaches peripheral organs such as the liver, where it stimulates hepatic lipid synthesis(100) and constitutes the main substrate for de novo lipogenesis in rabbits(101), and the brain, where it influences appetite and fat storage by promoting the secretion of insulin by the pancreas and ghrelin by the gastric mucosa(102). Finally, the L-enterotype was characterized by an enrichment of the ubiquitous methanogenic archaea Methanobrevibacter (0.52% relative abundance), with − 0.51 SD, P0 = 0.99, consistent with the greater abundance of hdr involved in methane metabolism(37). Obesity has been associated with increased H2 transfer between bacterial fermenters and H2 using-species like methanogens (13, 20, 103). Zhang et al. (2009) (104) suggested that this mechanism, which releases intraluminal pressure, promotes fermentation of non-digestible carbohydrates and thus short-chain fatty acid production and energy intake. However, it remains unclear how increased energy production can occur while carbon is lost through methane release(79). Moreover, acetogenesis and sulfate reduction are considered thermodynamically more favourable pathways for H2 depletion in monogastric animals (e.g. rabbits(105) and humans(106)) than methanogenesis. More likely, the greater methanogens abundance observed in the L-enterotype is hypothesized to result in increased carbon loss from the use of H2, reducing energy uptake, as hypothesized in pigs and humans(86, 107, 108).
In addition to characterising the H and L-enterotypes, putative associations between these microbial genera were examined to investigate putative symbioses or competencies between microbial communities (Additional file 14). Our results show weak Pearson correlations between the 10 selected phyla (≤ |0.11|) and between the 51 selected genera (≤ |0.49|), with the exception of the genera Kazachstania and Saccharomyces (Pearson correlation = 0.83).This study demonstrates a correlated response to selection on IMF in the composition of the caecal microbiota; however, the genetic host mechanisms that produce the optimal environmental conditions to favour colonization of the taxa enriched in each enterotype are unknown. As an example, Akkermansia is associated with a genetic marker near PLD1, a gene already known to affect body mass index(17). More generally, studies suggest that host genes that modulate the microbiota may be related to the presence of receptors responsible for host-microbiota interactions (e.g. RAPGEF1, encoding a protein factor associated with G protein-coupled receptors(109)), with the secretion of fucosylated mucus glycans in gastrointestinal physiology and mucosa, with salivary amylase, lactase persistence, or various gastric enzymes(91, 109–113). In any case, characterization of the host genes that modulate these microbial mechanisms is beyond the scope of this study, and an appropriate genomic association study targeting the abundance of these microbes is needed to elucidate this point.
From a practical perspective: microbial biomarkers to predict host IMF genetic background
From a practical point of view, finding a simple microbial biomarker that can be used to classify new animals into the two enterotypes or predict their IMF could be an attractive outcome. To this end, selbal algorithm(64) was used to identify the smallest number of microbial genera that, when combined in a single microbial balance or isometric log-contrast have the highest classification or prediction accuracy. This differs from PLS-DA or PLS, whose goal in this work was to identify all microbial variables with discriminatory ability between lines or IMF prediction ability, without necessarily avoiding redundancy. The purpose of the selected microbial balance is classification or prediction and not a new representation of the data as alr, as explained in Rivera-Pinto et al. (64). The microbial balance for IMF linear prediction after CV was formed by 26 microbial genera abundances from the 51 genera having a response to selection. It contained the geometrical mean of Legionella, Lysinbacillus, Porphyromonas, Marasmitruncus, Lactonifactor, Hungateiclostridium, Trichuris, Methanosphaera, Moorella, Akkermansia, Marvinbryantia, Faecalitalea, Aminipila, Limosilactobacillus, Murimonas, Listeria, Desulfovibrio and Gemminger in the numerator; and Escherichia, Massilistercora, Coprococcus, Actinomyces, Ruminococcus, Catenibacter, Flintbacter and Megasphaera in the denominator (Fig. 2A). When this balance was fitted as a covariate in a linear regression to predict IMF, prediction accuracy was 0.69 (SD of 0.12) when cross-validated, and in a logit regression for classification between enterotypes, classification accuracy was 93.04% (SD of 0.25%). Note that fitting a linear regression to predict IMF based on the selected balance assumes that the same biological principles govern IMF when it is selected to increase (H-line) and decrease (L-line). However, if this is not the case, a separate within-line analysis would be more appropriate. A less complex balance was additionally obtained by limiting the number of genera comprising the balance up to 10. In this case, a balance formed by 9 genera composed of the geometric mean of Legionella, Lysinbacillus, Phorpyromonas, Marasmitruncus and Lactonifactor in the numerator and the geometric mean of Escherichia, Massilistercora, Coprococcus and Actinomyces in the denominator (Fig. 2B) was selected. This balance was capable to predict IMF with a prediction accuracy of 0.50 (SD of 0.18) and to classify between the enterotypes with 82.4% (SD of 1.1%) accuracy. After further external validation, these biomarkers could be used to predict the genetic background of individuals with IMF.