Breastmilk samples (colostrum) from 43 individuals aged 20-32 years from the Monterrey metropolitan area were used in this study. All deliveries were at term with a mean of 39.4 weeks of gestation. A total of 18 samples from mothers with BMI ≤ 24.9 kg/m2 (non-obese), without GDM were considered as controls. Twelve milk samples were collected from mothers with obesity (defined as BMI ≥ 30 kg/m2) and without GDM, as well as 13 samples were obtained from mothers with GDM and a BMI ≤ 29.9 kg/m2. None of the participants in the gestational diabetes study group had obesity; however, eight of them were overweight (BMI ≥ 25 kg/m2). Most of our participants were multiparous, and regarding delivery, 24 neonates were born by caesarean section and 25 of included women were given antibiotics during labor. The sex distribution of the newborn was 48.8% of males and 51.16% of females. Clinical and demographics of the participants are summarized in Table 1.
We classified data according to the pathophysiological condition and the newborn gender resulting in a total of six study subgroups, namely: obesity positive, GDM negative - female (Ob-F; n=8); obesity positive. GDM negative - male (Ob-M; n=4); GDM positive obesity negative - female (GD-F; n=6); GDM positive obesity negative - male (GD-M; n = 7), healthy normal weight - female (NW-F; n = 8), and healthy normal weight - male (NW-M; n = 10).
After birth, breastmilk was collected within the first 24 hours and colostrum was appropriately stored until analysis. Using NGS, 2,635,830 high quality reads were obtained with a mean of 61,298 ± 16,928 sequencing reads per sample. After the removal of possible contaminants and rare taxa (≤ 25 reads in total), 1,675 amplicon sequence variants (ASVs) were assigned at 29 phyla, 57 classes, 96 orders, 150 families, 224 genera and 126 species using Greengenes database with a confidence identity level set at 99%. We found increased prevalence of Pseudomonas, Gemellales and Enterobacter in samples from participants with GDM and obesity.
Colostrum compositional microbiota was dominated by Proteobacteria and Firmicutes in all the samples.
We found that breastmilk samples were overrepresented by Proteobacteria with a relative abundance mean of 63.3% ± 26.5% (range 17.5% - 97.1%) and Firmicutes (21.1% ± 20.7%, range 0.5% - 60.2%). Less represented phyla were Bacteroidetes (9.4% ± 12%, range 0.3% - 35.9%), Actinobacteria (3.5% ± 4.7%, range 0.2% - 24.1%) and WPS-2 (1.8% ± 5.2%, range 0% - 21.4%) (Figure 1A). The classification of “Other” represents phyla with less than 1% of total relative abundance.
Bacteria belonging to the Proteobacteria phylum was more abundant in obesity-male (Ob-M, 88.1% ± 10.1%) but less abundant in obesity-female (Ob-F, 48.9% ± 27.3%). The highest relative abundance of Firmicutes was shown for obesity-female (Ob-F, 38.8% ± 17.9). Actinobacteria showed higher abundance in all the female groups compared to their male counterpart. Bacteroidetes were more abundant in healthy groups with 13.7% ± 14.2% and 11.8% ± 15.7 for female (NW-F) and male (NW-M) respectively. This behavior was also observed for WPS-2, a recently candidate division, which had the highest relative abundance in healthy-female (NW-F, 5% ± 9.2%) (Figure 1A).
Pseudomonas, Gemellales and Enterobacter are overrepresented in samples of individuals with obesity and GDM
Overall, the most abundant genera were Pseudomonas (22.4% ± 24%), Gemellales (7.6% ± 10.3), Ralstonia (6.7% ± 10.6%), Herbaspirillum (5.1% ± 8.6%), Streptococcus (4.9% ± 11.8%), Enterobacteriaceae (4.5% ± 3.5%), Chryseobacterium (4.2% ± 9.7%) and Sphingomonas (4% ± 9.7%) (Figure 2). The “Other” category represents the taxa with less than 1% overall. According to the subgroup classification, in the healthy normal weight subgroups, Chryseobacterium (p < 0.05) and Sphingomonas (p < 0.10) were more abundant (9% ± 15.1%, 6.1% ± 6.2% for female (NW-F) and 9% ± 13.5%, 5.9% ± 5.7% for male (NW-M) respectively). While not statistically significant, obesity-male (Ob-M) and GD-male (GD-M) have the higher prevalences of Pseudomonas, with 55.7% ± 6.6% for obesity and 23.6% ± 23.5 for GDM. A similar pattern was also observed for Enterobacter (p < 0.05; Ob-M, 12.3% ± 2.2% and GD-M, 3.7% ± 4.8). Streptococcus presence was enriched in the obesity-female subgroup (Ob-F, 10.9% ± 13.8%). Breastmilk from subjects with GDM showed higher prevalence of “Other genera” (24.9% ± 12.3%, 18.3% ± 12.3% for female and male respectively), suggesting that the biggest part of this contribution to the relative abundance of both subgroups is due to genera with less than 1% in total. This is supported by the rarefaction curves which reveal that GDM–female (GD-F), GDM–male (GD-M) and obesity-female (Ob-F) had higher values of estimated number of observed ASVs (Figure 1B). Gemellales were more abundant in subgroups with female baby (p < 0.10), distinctively in obese (Ob-F, 13.6% ± 14.2%) and with gestational diabetes (GD-F, 12.9% ± 11.5%). A full distribution of taxa is show in Figure 2.
Alpha and beta diversity metrics show a distinct microbial composition for GD-F, Ob-M and newborn gender-related samples
We used a general linear model (glm) using alpha diversity metrics at a sequencing depth of 23,572 (data not shown) in order to quantify the influence of GDM, obesity (BMI ≥ 30 kg/m2), cervicovaginitis, antibiotic exposure, multiparity and sex of the baby. As a result of the analysis, only maternal physiopathology (GDM, obesity and healthy) and the sex of the baby showed statistically significant association (p ≤ 0.05) for Shannon index, phylogenetic diversity and observed ASVs (Figure 3A-C). GDM subgroups presented the highest values in all alpha indexes. In addition, our results suggest that, in general, female subgroups had higher diversity compared to male subgroups. Fisher’s comparisons indicate that statistical difference was only significant between healthy-female (NW-F) and GDM-female (GD-F) for Shannon index. Breastmilk samples from obesity-male subgroup (Ob-M) had the lowest levels of alpha diversity and were statistically different to all of the subgroups, including their female counterpart (Figure 3A-C).
We estimated microbiome beta diversity using the unweighted UniFrac distance (Figure 3D). Our results show that obesity-male (Ob-M) and healthy-male (NW-M) subgroups cluster separately from the rest of the samples (PERMANOVA; p = 0.001; 999 permutations). Using the unweighted distance matrix, we generated a PCoA biplot in order to show that the clustering was significant for obesity-male (Ob-M; p ≤ 0.05) and GDM-female (GD-F; p ≤ 0.05) compared to healthy-male (NW-M). Arrows in the plot represent the correlation at family level with PCoA axes, indicating their contribution to the variation (Figure 3D). While samples from GDM-female (GD-F), GDM-male (GD-M), healthy-female (NW-F) and obesity-female (Ob-F) show high similarity regarding microbial composition, the unweighted measurement indicates that there is a phylogenetic difference between obesity-male (Ob-M) and the rest of subgroups (p ≤ 0.05).
We used the beta-diversity compositional Aitchison’s distance in order to assess the compositional nature of data (PERMANOVA; p = 0.027; 999 permutations) (Figure 3E). The robust principal component analysis (RPCA) biplot, which allows to examine the variation of samples and taxa, did not show a clear separation of any subgroup. PERMANOVA tests and pairwise comparisons only showed obesity-male (Ob-M) was different to GDM-female (GD-F; p ≤ 0.05) and that obesity-female subgroup (Ob-F) was statistically different to its male counterpart (Ob-M) and both healthy subgroups (p ≤ 0.05). The 10 taxa presented as vectors in the plot are the most significant drivers of the location of samples (Figure 3E).
Breastmilk core and differentially abundant taxa
We defined the breastmilk core microbiota as taxonomical families present in all samples with a minimum 1% of total mean relative abundance. Overall, 9 families were identified as the core taxa and comprise 69% ± 23.3% of the total (Table 2). The most abundant were Pseudomonadaceae with a general mean of 22.4% ± 24%, followed by Oxalobacteraceae (11.8% ± 10.9%) and Enterobacteriaceae (11.3% ± 7.2). These results demonstrate the high variability of the core bacteria among subgroups and individuals. The five most abundant families belonging to the core, were found to describe the majority of the variation in the ordination space observed in the unweighted PCoA biplot, and were represented as arrows (Figure 3D). However, no clear participation of families to the formation of subgroups was visualized with the implement of UniFrac metrics.
We used the Aldex2 tool (Fernandes et al., 2013) in order to identify differences in ASV abundance between subgroups. We determined the taxa that were driving the difference between the subgroups and obtained effect plots (based on the effect size), which allowed us to visualize if the variation was higher between or within subgroups. Given the high variability amongst samples, we only observed differentially abundant ASVs with a significant expected Benjamini-Hochberg corrected p value of Welch’s t test (q ≤ 0.1) in three sample pairs (Figure 4). In the GDM-female (GD-F) vs healthy-female (NW-F) the family Rhodococcaceae was different (Figure 4A). In obesity-male vs healthy-male (Ob-M vs NW-M) we found a total of 7 families (Bdellovibrionaceae, Burkholderiaceae, Halomonadaceae, Pseudomonadaceae, Shewanallaceae, TM7-1 and Vibrionaceae) with an absolute effect size greater than 1, of which 5 had a q-value ≤ 0.05, and only 1 was part of the breastmilk core (Figure 4B). This phenomenon was also observed in obesity-male vs obesity-female (Ob-M vs Ob-F), in which only 4 of the (Comamonadaceae, Enterobacteriaceae, Pseudomonaceae and Sphingobacteriaceae) 12 significant different taxa found corresponded to the core families (Figure 4C). Based on the median difference between subgroups, we observed that in all comparisons, obesity-male (Ob-M) had significant higher abundance for differential taxa found. In addition, GDM-female (GD-F) had higher prevalence of Rhodococcaceae compared to healthy-female (NW-F).