The microbiome composition is determined by GxE interactions in BXDs
In this study, we leveraged a BXD mouse panel of 32 strains fed with a CD or HFD from 8 to 29 weeks of age to explore the genetic and dietary determinants of the gMxB. As described previously9, BA data was collected from plasma (20 weeks), feces (24 weeks), and liver (29 weeks. Plasma samples were collected before (T0), 30 minutes (T30), and 60 minutes (T60) after re-feeding with their respective diet. These mice were euthanized in a controlled postprandial state (Method) and multiple organs were collected for investigation (Fig. 1A). As expected, HFD feeding significantly changed the body weight and weights of multiple tissues of the mice (e.g., cecum) (Fig. 1B). To elucidate the dietary effects on the gut microbiome globally, we first performed principal coordinate analysis (PCoA) and found that mouse microbiome profiles (obtained from 16S ribosomal RNA sequencing) were significantly separated by diet (Fig. 1C), suggesting that 21 weeks of HFD feeding shifted the microbiome composition in BXD mice. In line with previous studies31,32, the alpha diversity — the microbial diversity of an ecological community — was significantly decreased upon HFD, whether we consider the richness (the number of observed bacteria within a sample), evenness (the distribution of bacteria within a sample), or Shannon index (the bacteria diversity within a sample) (Fig. 1D). The identified bacteria belonged to nine phyla, among which Firmicutes (also known as Bacillota) was the most abundant one, followed by Bacteroidota (also known as Bacteroidetes), in both CD and HFD (Fig. 1E).
Changes of gut microbiome in response to HFD were strain-specific (Fig. 1E and S1A). For example, the abundance of Firmicutes was increased upon HFD in the C57BL/6J strain, while it was decreased in the DBA/2J (Fig. 1E). Similarly, the Shannon index was significantly decreased upon HFD globally and in most BXD strains, but it was negligible in certain BXD strains (e.g., BXD6 and BXD11) (Fig.S1A). We also found that the Bray–Curtis dissimilarity of individuals within a strain is significantly smaller than that of individuals between strains separately under CD and HFD conditions, suggesting that genetic background profoundly affects microbiome composition in the BXD population (Fig.S1B). Overall, these observations not only illustrated the HFD effects on the gut microbiome, but also highlighted the important roles of genetic background and GxE, enabling us to investigate genetic factors regulating the diversity of diet-induced gut microbiome challenges.
Effects of HFD feeding on the genus-level cecal microbiome and colon transcriptome
To further investigate changes in gut microbiome abundance upon HFD, we performed analysis of composition of microbiomes (ANCOM) at the genus level of the gut microbiome in a strain-independent mode and identified 18 differentially abundant genera (DAGs) with absolute log(Fold change) > 1 and Benjamini-Hochberg(BH)-adjusted P value < 0.05 (Fig. 2A). Among these 18 DAGs, 12 belonged to the Firmicutes phylum (in dark blue). Of note, Lactococcus was the most increased genus upon HFD, while Turicibacter was significantly enriched in CD but was barely detected in HFD. Lachnospiraceae UCG-001 was another genus that significantly decreased upon HFD. The changes of these genera upon HFD were consistent with previous reports31,33,34. We then performed ANCOM for each strain and found that not all bacteria could be identified in all BXDs, indicating the impacts of genetic background on gut microbiome composition (Fig. 2B). For example, Siraeum group and Turicibacter were not detected in almost half of the BXD strains in both diets (in grey) (Fig. 2B). As expected, the bacterial changes upon HFD in most BXD strains were concordant, but not in all BXD strains, suggesting an effect of GxE. For example, the abundance of Acetatifactor was increased upon HFD in BXD81 and C57BL/6J, whereas it was decreased or not detected in the other BXD strains (Fig. 2B).
To explore the effect of HFD on host gene expression and its interaction with the microbiome, we profiled the colon transcriptome. While the transcriptomes of HFD- and CD-fed mice did not separate well from each other in the principal component analyses (PCA, Fig.S2A), we obtained similar differentially expressed genes as was previously described in another BXD study35 upon HFD but collected during the fasted state. For example, serum amyloid A 1 and 3 (Saa1 and Saa3), which are involved in the inflammatory response36, were up-regulated, while the cytochrome P450, family 2, subfamily c, polypeptide 55 (Cyp2c55), which prevents colon tumorigenesis37, and Carboxylesterase 2 (Ces2a), which was decreased in the experimental colitis model38, were down-regulated upon HFD (Fig.S2B). This suggests that the effects of HFD on colon gene expression are robust. The host gene-gut microbiome association is critical to intestinal disorders39, therefore we investigated the association between transcriptome and microbiome and the effect of dietary challenge on their interaction. We performed Procrustes analyses using paired data in each diet and observed a significant concordance between host gene expression and gut microbiome composition across BXD strains under CD (P value = 0.013, Fig. 2C), but not HFD (P value = 0.376, Fig. 2D), suggesting that HFD may disrupt the overall host gene-microbiome communication.
Uncovering HFD-specific host gene-gut microbiome associations
Although there is no overall association between host colon genes and gut microbiome under HFD, a group of gut bacteria and host genes can still be associated. To uncover the HFD effects on the group-level associations between the colon transcriptome and the gut microbiome, we applied sparse canonical correlation analysis (CCA)40,41, a machine learning-based approach that has been used in multi-omics integration41 (e.g., gene expression and microbiome datasets39) to detect associations between subsets of bacteria and host colon genes under CD or HFD conditions (Fig. 3A). The significant associations between bacteria genera and host gene groups were termed as components (C). Enrichment analyses were then performed on host genes within each component to unveil the broad biological functions of each component (Fig. 3A). The first 10 sparse CCA components were computed in each diet, and 6 components were significantly correlated in CD (Fig. 3B), while only 3 significantly correlated components were found in HFD (Benjamini-Hochberg (BH)-adjusted P value < 0.05) (Fig. 3C).
The bacteria genera within component 1 under CD (C1_CD), includes Faecalibaculum, HT002, Lactobacillus, and Turicibacter (Fig. 3B), with the most correlated genus being Lactobacillus (CCA coefficient: -0.67). Lactobacillus has been shown to modulate cell growth42, and indeed, C1_CD was related to genes involved in cell proliferation and cellular stress, notably MYC targets v143, oxidative phosphorylation (OXPHOS)44, and heat shock factor 145 (HSF1) targets (Fig. 3D). Genes in C2_CD were associated not only with MYC targets v1 and OXPHOS, but also with DNA repair, the dysregulation of which can induce metabolic disorders46. Consistent with this, the most correlated genus in this component, Lachnospiraceae NK4A136 group (CCA coefficient: 0.89) that belongs to Lachnospiraceae family, was negatively correlated with obesity-related biomarkers in mice47. Under HFD condition, bacteria in the C6_HFD were significantly related to the mitochondrial unfolded protein response (UPRmt) mediators, while those in the C7_HFD were significantly associated with inflammatory responses (Fig. 3E). The same bacteria, e.g., Lachnospiraceae NK4A136 group and Lachnospiraceae FCS020 group, are related to genes involved in cell proliferation under CD and inflammation under HFD (Fig. 3D-E). This observation suggests that the same bacterium can be involved in different biological processes by interacting with different bacteria and host genes39. Taken together, we found two subsets of bacteria in CD that are related to cell proliferation and OXPHOS, while the inflammation-related bacteria group was identified in HFD, suggesting that specific bacteria are either a cause or a consequence of host inflammation upon HFD feeding.
HFD feeding reshapes the associations between gut microbiome and BAs
Emerging evidence has shown that the gut microbiome contributes to metabolic homeostasis or metabolic disorders through its interaction with BAs18,23,48. The ratio between primary BAs (produced by the host in the liver) and secondary BAs (modified by the gut microbiome) gives an indication of the actions that the gut microbiome imposes on the BA pool. In CD, a positive correlation was found between the Shannon index and the secondary/primary BA ratio in the feces and liver, but not in the plasma, and these correlations were absent upon HFD (Fig. 4A). This may be because HFD feeding decreases the proportion of secondary BAs and increases that of primary BAs9,49, and HFD decreases the alpha diversity of gut microbiome31,32 (Fig. 1D), further affecting the overall association between these two layers (Fig. 4A). Likewise, we found several diet-dependent and tissue-dependent correlations. In particular, the abundance of Muribaculum, Peptococcus, and Bifidobacterium were positively correlated to secondary/primary BA ratio in all biological compartments analyzed, while positive correlations between Desulfovibrio and secondary/primary BA ratio were only observed in the liver and plasma under CD. A positive correlation between feces and plasma secondary/primary BA ratio with Rikenellaceae RC9 gut group, as well as a negative correlation between liver and plasma secondary/primary BA ratio with Lachnoclostridium were also demonstrated under HFD (Fig. 4A). In addition, we uncovered three positively conserved correlations in both diets: secondary/primary BAs in the plasma and liver and Peptococcus, secondary/primary BAs in the feces and liver and Odoribacter, as well as secondary/primary BAs in the plasma and Helicobacter (Fig. 4A). These results suggest that these genera might be involved in the regulation of the amount of these functionally different BA classes.
To establish the potential links between the abundance of each bacterium and the level of each BA, we performed lasso penalized regression and Pearson correlation on these paired data (Fig. 4B). We fitted the lasso penalized regression using the level of each BA as the response variable and the abundances of gut microbial taxa as predictors. 30 and 15 robust bacterium-BA associations (indicated by + or ++) were identified by stability selection under CD and HFD, respectively (Fig. 4C-D). Certain conserved associations across both diets were identified, such as NK4A214 group-Taurodeoxycholic acid (TDCA) and Faecalibaculum-Chenodeoxycholic acid (CDCA) in plasma. However, most associations were diet specific, indicating that HFD shifts the bacterium-BA associations (Fig. 4C-D). In CD-fed mice, Turicibacter was consistently positively correlated with CA in plasma at different time points (Fig. 4C). We also uncovered other associations, for example, GCA-900066575 was positively linked with CDCA in livers in CD (Fig. 4C), while negative associations under HFD included Bilophila-Tauro-β-muricholic acid (TβMCA) and Bacteroides-Taurochenodeoxycholic acid (TCDCA) in plasma at 30 minutes after a test meal gavage (T30) (Fig. 4D). Overall, these analyses provide insights into the gut microbiome-BA crosstalk and highlight potentially novel bacteria-BA associations.
Dissecting the genetic loci regulating gMxBs across diets
In addition to dietary effects, host genetics also play crucial roles in regulating gut microbiome and BA composition. We asked whether certain host genetic factors affect both layers, potentially indicating a common cause. We performed QTL mapping on the cecal microbiome and identified two QTL peaks located on chromosome (Chr) 4, which were related to the abundance of Turicibacter under CD (Fig.S3A), and one peak located on Chr 7 that was related to Bacteroides in both diets (Fig.S3A and B). Under HFD, QTL peaks for Alistipes and Lachnoclostridium were located on Chr 7, and those of Bacteroides and Oscillibacter were on Chr X (Fig.S3B). In parallel, we repeated the QTL mapping for absolute and relative (as a fraction of the total, BA X /all) abundance of each BA in each biological compartment9 with the latest version of the genotype matrix. By comparing the BA and microbiome QTL peaks (Table 1 and Supplementary information 1), we found two shared genetic loci on Chr 4 that were associated with the abundance of Turicibacter and BAs under CD. The first one (gMxB1 QTL) affected the CA/all ratio in plasma and liver as well as CA, Chenodeoxycholic acid (CDCA), and the Taurocholic acid (TCA)/all ratio in plasma. The second one (gMxB2 QTL) was only associated with CA/all ratio in the liver (Fig. 5A-B), which is consistent with the positive correlation between Turicibacter abundance and total CA and CDCA in plasma (Fig. 4C). Under HFD, another shared genetic locus (gMxB3 QTL) on Chr 7, was associated with Bacteroides and two fecal secondary BAs: 7-keto Lithocholic acid (7-keto-LCA) and 7-keto Deoxycholic Acid (7-keto-DCA)/all (Fig. 5C-D). The nearby, but not overlapping, locus on Chr 7 (gMxB4 QTL) was associated with the abundance of Lachnoclostridium and the amount of α-Muricholic acid (α-MCA), Taurolithocholic acid (TLCA), and Ursodeoxycholic acid (UDCA) under HFD (Fig. 5C and E).
Table 1
Co-mapping QTL peaks across both bacterial abundance and BA profiles under CD and HFD, related to Fig. 5A-E. Chr, chromosome; Pos, position; LOD, logarithm of the odds; proximal and distal regions are in Mbp.
QTL | Genus | Chr | Pos (Mb) | LOD | Proximal (Mb) | Distal (Mb) | Diet | Bile Acids |
gMxB1 | Turicibacter | 4 | 62.65 | 4.29 | 46.35 | 68.27 | CD | Plasma: CA, CDCA, CA/all, TCA/all Liver: CA/all |
gMxB2 | Turicibacter | 4 | 74.25 | 3.77 | 68.27 | 88.04 | CD | Liver: CA/all |
gMxB3 | Bacteroides | 7 | 56.50 | 3.98 | 55.65 | 63.44 | HFD | Feces: 7-keto-LCA, 7-keto-DCA/all |
gMxB4 | Lachnoclostridium | 7 | 45.80 | 4.43 | 40.92 | 46.33 | HFD | Feces: α-MCA, TLCA, UDCA |
Table S1
List of the BAs measured, their abbreviations, accurate m/z ratio, corresponding internal standards (used for quantification) and retention times (RT). Chromatographic resolution was essential for the separation of isobaric species.
Bile Acid | Abbreviation | H− (m/z) | Internal Standard | RT min |
Lithocholic acid | LCA | 375.29047 | d4-LCA | 20.6 |
7-Ketolithocholic acid | 7Keto-LCA | 389.26973 | d4-GDCA | 15.5 |
Murocholic acid | MDCA | 391.28538 | d4-CA | 12.4 |
Ursodeoxycholic acid | UDCA | 391.28538 | d4-GDCA | 13.3 |
Chenodeoxycholic acid | CDCA | 391.28538 | d4-CDCA | 18.3 |
Deoxycholic acid | DCA | 391.28538 | d4-DCA | 18.5 |
Isodeoxycholic acid | isoDCA | 391.28538 | d4-GDCA | 19.7 |
3-Oxocholic acid | 3Oxo-CA | 405.26465 | d4-TCDCA | 12.6 |
7-Ketodeoxycholic acid | 7Keto-DCA | 405.26465 | d4-GCA | 11.0 |
α-Muricholic acid | α-MCA | 407.28030 | d4-GCA | 10.8 |
β-Muricholic acid | β-MCA | 407.28030 | d4-GCA | 11.2 |
ω-Muricholic acid | ω-MCA | 407.28030 | d4-GCA | 10.4 |
γ-Muricholic acid | MCA | 407.28030 | d4-GCA | 11.8 |
Cholic acid | CA | 407.28030 | d4-CA | 12.6 |
Glycolithocholic acid | GLCA | 432.31193 | d4-GDCA | 13.5 |
Glycoursodeoxycholic acid | GUDCA | 448.30685 | d4-GCA | 10.2 |
Glycohyodeoxycholic acid | GHDCA | 448.30685 | d4-CA | 10.6 |
Glycochenodeoxycholic acid | GCDCA | 448.30685 | d4-GCDCA | 12.9 |
Glycocholic acid | GCA | 464.30176 | d4-GCA | 10.3 |
Taurolithocholic acid | TLCA | 482.29457 | d5-TLCA | 13.0 |
Tauroursodeoxycholic acid | TUDCA | 498.28948 | d4-TCA | 5.1 |
Taurohyodeoxycholic acid | THDCA | 498.28948 | d4-TCDCA | 5.5 |
Taurochenodeoxycholic acid | TCDCA | 498.28948 | d4-TCDCA | 10.6 |
Taurodeoxycholic acid | TDCA | 498.28948 | d5-TDCA | 10.9 |
Tauro-α-muricholic acid | TαMCA | 514.28440 | d4-TαMCA | 2.1 |
Tauro-β-muricholic acid | TβMCA | 514.28440 | d4-TβMCA | 2.3 |
Taurocholic acid | TCA | 514.28440 | d4-TCA | 5.8 |
Supplemental information |
Supplemental information 1. QTL peaks of both bacterial abundance and BA profiles within identified gMxBs under CD and HFD, related to Fig. 5A-E and Table 1. |
Different studies have demonstrated the roles of gMxB in the development of non-communicable diseases7,22,50, such as metabolic dysfunction-associated fatty liver disease51 or colorectal cancer52. To dissect the effects of gMxB on metabolic traits, we first checked the correlation between these three bacteria genera, Turicibacter, Bacteroides, and Lachnoclostridium, and the phenotypic traits in BXD mice and found that the abundance of Turicibacter was positively correlated with the cecum weight adjusted by total body weight (% of BW) and body weight loss under CD. In addition, the abundance of Bacteroides was negatively correlated to the cecum (% of BW) and BW loss under CD, and it was also negatively associated with the spleen (% of BW) in HFD (Fig. 5F). In line with these observations, in the differential abundance analyses, we found that the abundance of Turicibacter was increased in mice fed with CD (lean mice), whereas that of Bacteroides was increased under HFD (obese mice) (Fig. 2A-B). In addition, the abundance of Lachnoclostridium was positively associated with cecum (% of BW) under CD and soleus (% of BW) under HFD (Fig. 5F). These findings suggest that the three bacteria, Turicibacter, Bacteroides, and Lachnoclostridium, may regulate metabolic traits, in a diet-dependent manner.
Prioritizing the potential regulators of gMxBs and unveiling their biological roles in humans
To further investigate the candidate genes that potentially regulate these gMxBs, we filtered genes under these four gMxB QTLs using the following criteria: (1) having known segregating variants in the BXD strains predicted to have moderate/high impacts (orange), and (2) highly (in brownish red) or specifically (in red) expressed in intestine obtained from the human protein atlas53 (https://www.proteinatlas.org/humanproteome/tissue/intestine) (Fig. 6A). For the 4 identified gMxB QTLs, we identified respectively 9, 6, 1, and 48 genes that satisfied these two criteria and were considered candidate genes that might mediate the corresponding gMxB effects in BXD mice (Fig. 6A-B). To further prioritize these candidate genes, we retrieved suggestive (p < 10− 4) and significant (p < 10− 8) genetic variant-bacterium associations of human GWAS results from MiBioGen54. We then performed MR analysis to explore the causal effects of the gene expression of candidates in human colons on the corresponding bacteria genera (BH-adjusted P value < 0.05), using this human GWAS result54 as outcome and expression QTLs in the human GTEx55 as exposure (Fig. 6A-B). For gMxB1, the gene expression of prostaglandin reductase 1 (PTGR1) in human sigmoid colons had a causal effect (Beta = 0.09 and BH-adjusted P value < 0.05) on fecal Turicibacter abundance based on MR analysis (indicated in dark green, Fig. 6B). For gMxB2, we found that genetic variants within protein tyrosine phosphatase receptor type D (PTPRD) are suggestively associated with the abundance of Turicibacter in human GWAS analysis (indicated in blue, Fig. 6B). However, PTPRD did not have a significant eQTL in human colons (indicated in light green), preventing us from exploring the causality of the gene expression of PTPRD (Fig. 6B). Furthermore, the expression of these two candidate genes were positively correlated to the genes involved in BA metabolism (Fig. S4). These results highlight Ptgr1 and Ptprd as candidate mediators of the effects of gMxB1 and gMxB2, respectively, under CD (Fig. 6B).
Candidate genes identified under the HFD gMxB3 and gMxB4 QTLs were not found to have known associations with Bacteroides or Lachnoclostridium in human GWAS and MR analyses, preventing us from further prioritizing genes under these QTLs using these two approaches. Instead, we selected gamma-aminobutyric acid type A (GABAA) receptor subunit beta3 (Gabrb3), the only candidate gene that has high-impact genetic variants in BXDs and eQTLs in human colon and is highly expressed in human intestines, as a potential modulator regulating gMxB3. For gMxB4 QTL, 18 candidate genes with eQTLs in human colons were identified under this QTL peak, but we could not further shortlist them. It is possible that they interact with each other and then modulate the gMxB4 by synergy.
To better understand the roles of the three putative BA-microbiome co-regulators PTGR1, PTPRD and GABRB3 in humans, we examined their associations with clinical traits and diseases in three populations–the FinnGen study56, the million veteran program (MVP)57, and the UK Biobank (UKBB)58,59. Genetic variants within or near PTPRD and GABRB3 were associated with several metabolic disorders, such as heart failure, diabetes, or liver fibrosis, in the FinnGen and MVP studies (Fig. 6C-D). Variants within PTGR1 were related to colorectal cancer in the FinnGen study (Fig. 6C). Moreover, genetic variants within these three candidates were suggestively (p < 10− 4) associated with several metabolic traits in the UKBB, including cholesterol-related traits and fat mass (Fig. 6E-G). Genetic variants within PTPRD were suggestively associated with heart failure in all three human populations, an effect that may be mediated by Turicibacter abundance or CA, CDCA, or TCA levels. The abundance of Turicibacter has previously been linked with risk factors of CVDs, like serum cholesterol60. Similarly, CDCA or CA treatments can reduce plasma triglycerides and increase low-density lipoprotein cholesterol61, which may explain the effect of the gMxB on CVDs. Overall, these data, together with the co-localized QTL results, suggest that changes in the BA-gut microbiome crosstalk can influence metabolic or cardiovascular health.