Sequencing coverage and bacterial diversity in the gut of BSFL
The primary objective of this study was to examine the composition of the gut microbiota in relation to genetics, diet, and larval age, and to decipher the involvement of specific taxa in the use of dietary macronutrients and larval performance. Approximately 30,000 raw reads per sample were collected from Illumina MiSeq. After thorough quality assessments and filtering, the average sequence yield per sample, suitable for further analysis, was 22,777 ± 10,053 reads (standard deviation (SD)). Good's coverage was determined to be 99.99% ± 0% as calculated from the QIIME sample-frequency output at the genus level, and 99.98% ± 0% at the OTU level. This metric provides insight into the completeness of sampling in capturing the bacterial diversity present in the dataset. Shannon alpha-diversity was evaluated based on diet, age of larvae, and genetic line (Figure 1).
Figure 1. Bacterial community diversity of BSFL as assessed using the Shannon α diversity metric before (D0) and after the introduction of test diets (D5, top; D10, bottom) for Line D (LD, filled circles) and the wild-type line (WT, open circles). The diets are arranged from most homogeneous diet types (CF to P) on the left to most heterogeneous diet types (MHP to NUS) on the right. Diet abbreviations are provided in Table 1.
While overall, a slightly higher diversity was observed for LD as compared to WT (p = 0.018; Student’s t-test), the differences decreased with larval age (D5: p = 0.08) and were insignificant at D10 (D10: p = 0.21). Similarly, no significant differences in bacterial richness were observed between experimental days for each genetic line (p > 0.05). However, differences in bacterial richness were observed between genetic lines fed on the same diet as well as within lines between certain diets (p < 0.05). Higher bacterial richness was observed on plant-based (CF, P, PKM, MLP, RIB, SBM and OKA) as compared to meat-based diets (M, MHP and NUS; p < 0.001). This was observed independently for both genetic lines. To further investigate the differences in alpha-diversity between individual diets and genetic lines, pairwise Wilcoxon rank-sum tests were performed, and results corrected for FDR. In both WT and LD, significant differences were observed between certain plant-based diets and meat-based diets; for example, MLP, OKA, P, and PKM (both lines), as well as RIB (only LD) were significantly different to M (all q < 0.05), and MLP, and P (both lines), as well as OKA (only LD) were significantly different to MHP (all q < 0.05). In WT however, no significant differences were observed between CF and other diets (all q > 0.05). Furthermore, differences were observed even between individual plant-based diets in WT whereby MLP induced a higher bacterial richness than RIB and SBM (both q < 0.05). A similar trend was observed in LD as well, however, with interesting additional differences observed between diets. Within plant-based diets, there was a difference in richness induced by CF as opposed to MLP and P, and P as opposed to RIB and SBM (all q < 0.05). Notably, the bacterial richness induced by P (heterogeneous pure plant diet) was significantly higher in LD as compared to WT (q < 0.05). Amongst all the diets tested, NUS (the most heterogenous diet) induced the least bacterial richness consistently across both genetic lineages.
Influence of genetics and larval age on gut bacterial community composition
We further aimed to investigate how genetic and environmental factors influenced bacterial community composition in the larval gut. Overall, Firmicutes D and Proteobacteria were the most abundant phyla with relative abundances of 42.3% ± 18.8% and 34.4% ± 19.5% in LD and 44.4% ± 18.2% and 29.3% ± 18.3% in WT, respectively, followed by Firmicutes A (LD: 7.5% ± 9.9%, WT: 11.3% ± 14.4%), Actinobacteroidota (LD: 7.7% ± 10.4%, WT: 6.6% ±11.8%), and Bacteroidota being the least abundant phylum (LD: 7.3% ± 10.1%, WT: 6.6% ± 7.8%). Firmicutes D and Proteobacteria were predominantly present in D0 samples, together accounting for 77.6% ± 4.7% and 80.5% ± 11.4% in LD and WT bacterial communities, respectively. It was obvious that, despite having been fed the same diet, the bacterial communities of each of the genetic lineages were enriched in different phyla. For example, when fed on meat-based diets, Proteobacteria were more abundant in LD as compared to WT (LD: 55.9% ± 15.1%, WT: 44% ± 13%, p = 0.016). Inversely, Firmicutes D were significantly more abundant in WT larvae when fed on meat-based diets (LD: 36.3% ± 13.9%, WT: 45.3% ± 11.2%, p = 0.041). Campylobacterota appeared to be a rare taxon in general (maximum relative abundance of only 6.7% across all samples). However, interestingly, this phylum was observed in most samples collected from LD regardless of the diet being offered (except for CF and NUS), while absent from WT.
Multivariate analysis was done using bacterial profiling data at genus-level resolution to determine if genetic lineage, larval age, or diet alone explained gut bacterial differences, or if they exerted combined effects in shaping the BSFL bacterial communities. Interestingly, the combined effect of genetic line and larval age showed the (p = 0.001, residual degrees of freedom = 80). Based on the Bray-Curtis dissimilarity metric, average within-line dissimilarity was higher for WT than LD (0.78 ± 0.15 vs. 0.73 ± 0.14; p < 0.001 (Student’s t-test)). PCoA was used to further visualise genetic differences and temporal changes in bacterial community structure across different diets (p = 0.001, Table S2; Figure 2).
Figure 2. Principal Coordinates Analysis of the BSFL bacterial gut microbiota in two different BSF populations feeding on ten different diets and sampled at three different time points. The Bray-Curtis dissimilarity metric was used. Refer to Table 1 for the details of the diets. The filled circles depict LD and open circles depict WT.
D0 samples from the two genetic lineages showed distinct clustering. D5 and D10 samples from the two genetic lineages, fed on P, MLP, and M (same substrates, different combinations), showed similar bacterial community compositions respective to their genetic lineages, suggesting a divergence of bacterial communities in the two lines resulting from differential adaptation to these substrates. However, D5 and D10 samples from larvae fed on MHP clustered together irrespective of genetic lineage, except for D5 WT larvae, which clustered away from the other samples. For two of the diets, PKM and RIB, D5 samples displayed proximity to D0 samples, rather than D10 samples as observed for the remainder of diets. This suggested a slower adaptation of bacterial communities to these fibre-rich plant substrates. Overall, the diets showed to be an important driving factor resulting in some larvae displaying stronger genetic effects and others displaying stronger age effects in the shaping of their gut microbiota.
SIMPER (Similarity Percentage) analysis, a multivariate statistical approach, revealed specific bacterial genera underlying the observed differences between the genetic lineages (Table 2; Table S3).
Table 2. SIMPER (Similarity Percentage) analysis was used to identify genus-level taxa that showed significantly different relative abundances between the two genetic lineages across the tested diets over experimental days (D5 and D10). The table displays the average relative abundances ± SD (%) of significant bacterial genera for each genetic line. Only those diets are shown on which certain bacterial genera were differentially enriched between the genetic lines. The significance codes (Signif. codes) are provided for the corresponding p-values. The complete list of diets is presented in Table S3
Levilactobacillus, Ligilactobacillus, and Morganella were present in higher abundances in LD compared to WT (all p < 0.05), while Paenibacillus_J and Lacrimispora were more abundant in WT than LD (all p < 0.05). Further investigation of the most influential genera on genetic lineage based on diets showed Morganella to be significantly higher in MHP fed LD larvae (LD: 21% ± 15.9%, WT: 5.4% ± 5.1%; p < 0.05; Table 2). Interestingly, within the genus Paenibacillus, Paenibacillus_C and Paenibacillus_J appeared to be differentially abundant in P and MLP fed larvae. In contrast to Paenibacillus_J, Paenibacillus_C was significantly more abundant in LD when fed on P or MLP (both p < 0.001). However, no significant differences were observed between genetic lineages when larvae were fed on CF, RIB, PKM, NUS, and M.
Following the same statistical approach, we next identified genera that were significantly associated with the age of the larvae (i.e., D5 and D10), regardless of lineage and diet. Lactiplantibacillus (D5: 11.4% ± 15.9%, D10: 0.5% ± 1.2%; p < 0.001), Klebsiella (D5: 3.2% ± 5.9%, D10: 0.7% ± 1.3%; p < 0.01), Levilactobacillus (D5: 2.9% ± 3.4%, D10: 0.6% ± 1.1%; p < 0.001), and Lysinibacillus (D5: 2% ± 5%, D10: 0.6% ± 1.1%; p < 0.05) exhibited significantly higher abundance in D5 samples as compared to D10 samples. Conversely, Dysgonomonas (D5: 1.0% ± 2.6%, D10: 5.1% ± 7.8%; p < 0.001), Ignatzschineria (D5: 0.01% ± 0.006%, D10: 2.2% ± 6.7%; p < 0.05), Scrofimicrobium (D5: 0.7% ± 2.3%, D10: 11.4% ± 13.2%; p < 0.001), Sphingobacterium (D5: 0.2% ± 0.8%, D10: 2.6% ± 5.5%; p < 0.01), and Morganella (D5: 3.9% ± 7%, D10: 8.3% ± 10.5%; p < 0.01) were significantly more abundant in D10 samples.
Influence of diet on gut bacterial community composition and correlation of bacterial taxa with dietary macronutrients
Multivariate analysis also revealed a significant effect of diet on the composition of bacterial communities in the gut of individual BSFL populations (p = 0.001, Table S2). Members of Firmicutes A, for example, were most highly abundant in MHP, MLP, P (similar substrates at different ratios) and PKM diets (>15% in relative abundance whilst their abundance in other diets was below 5%; Figure 3A).
Figure 3. Relative abundances of the most abundant bacterial taxa found in the gut of BSFL fed on the ten different diets. A) Relative abundances of most dominant phyla across all tested diets. B) Relative abundances of top 15 most abundant bacterial genera across all tested diets. “Other” represents all taxa that showed < 1% relative abundance in all samples. Error bars depict the 95% confidence interval, the lower and upper boundaries of the box represent the 25th and 75th percentiles respectively, the line within the box signifies the 50th percentile (median), and any outliers are denoted by open (WT) or filled circles (LD).
In addition, Proteobacteria were higher in abundance in meat-based diets, and lower in plant-based diets (meat-based: 50% ± 15.1%, plant-based: 25% ± 15.5%, F(1, 117) = 66.2, p < 0.001), whereas Actinobacteroidota and Bacteroidota were observed in plant-based diets, but were almost absent from meat-based diets (plant-based: 10.1% ± 12.3% and 10% ± 9.6%, respectively; meat-based: 0.4% ± 0.3% and 0.9% ± 1.9%, respectively, F(1, 117) = 24.5 p < 0.001, F(1, 117) = 31.5 p < 0.001).
Not only did we detect an effect of diet category (meat-based vs. plant-based) on shaping the gut bacterial communities (Figure 3B) but dietary macronutrients in the diets also appeared to be significantly correlated with certain bacterial genera (Table S4, Figure S1) resulting in different gut bacterial communities between diets (RDA analysis, Figure 4). For example, both types of plant-based diets, MLP and P, appeared to harbour members of Dysgonomonas (MLP: 11.9 ± 9.0% and P: 11.8% ± 8.6%), Lacrimispora (MLP: 11.9% ± 7.6%, P: 8.4% ± 6.1%), Paenibacillus_J (MLP: 7.5% ± 7.7%, P: 14.3% ± 7.2%) and Paenibacillus_C (MLP: 3.1% ± 3.2%, P: 6% ± 7.1%), whereas these genera were present at considerably lower relative abundances in samples from all other diets (<2.5%). This was in congruence with the RDA results, whereby an association of these four distinct genera was observed with fat and protein content of the diet (Figure 4, Figure S1).
Figure 4. Redundancy analysis (RDA) biplot showing the correlation between bacterial taxa and dietary macronutrients. The taxonomic data were transformed using the Hellinger transformation with a scaling factor of 2. Dietary macronutrients as well as the top 10 bacterial taxa are illustrated as arrows, denoting both their direction and the degree of association with the samples. Sample positions relative to these arrows indicate their similarity or dissimilarity based on the variables. Arrows pointing towards sample clusters indicate strong influence of a variable on those samples. The angles between arrows reflect correlations among variables; smaller angles denote stronger correlations, while larger angles signify weaker ones. Taxa situated in the direction of the macronutrients may be significantly impacted by them.
Genera such as Scrofimicrobium and Sphingobacterium seemed to be more prominent in larvae fed on plant- as compared to those fed on meat-based diets (meat-based: 0.3% ± 0.3% and 0.1% ± 0.4%, plant-based: 8.9% ± 12.2% and 2% ± 4.8% respectively, F(1, 117) = 17, p < 0.001 and F(1, 117) = 5.8, p < 0.05; Table 1, Figure 3B). Sphingobacterium also displayed a strong positive association with carbohydrate content in the homogeneous plant-based diets. Inversely, Providencia was generally higher in abundance in meat-based diets (meat-based: 32.3% ± 20.6%, plant-based: 8% ± 8%, F(1, 117) = 85.3, p < 0.001). Similarly, Morganella and Proteus showed significantly higher relative abundances on meat-based diets compared to plant-based diets (p < 0.05 and p < 0.001, respectively). Members of the genus Lactiplantibacillus and Levilactobacillus were highly abundant in CF and in other plant-based diets but interestingly in M (pure meat) as well. Klebsiella, one of the most prominent genera across all samples, did not show any significant association with plant or meat-based diets (p > 0.05). Similarly, members of the family Enterococcaceae (classified and unclassified) were present in larvae on all diets.
Bacterial community structure and larval performance
Spearman’s Rank correlation test was used to assess for any significant relationships between larval performance and gut bacterial taxa across the different diets as observed by Zhang et al. (2023) (Table 3). For this, bacterial community composition at D10 was used as this closely corresponded to the larvae’s harvesting age used for performance analyses (Zhang et al. (2023)). Due to the identified genetic influence on bacterial community composition, the two different genetic lines were analysed separately. Out of several parameters analysed, for the current study, biomass conversion ratio (BCR) of the larvae, prepupal weight (PW), protein and fat conversion rate, prepupal protein content and prepupal fat content were selected for the correlation analysis due to their relevance in industrial applications.
Table 3. Spearman’s Rank coefficient was calculated to explore the correlation between larval performance (biomass conversion ratio (BCR), prepupal weight (PW), protein and fat conversion rates, prepupal fat, and protein content) and gut bacterial taxa at the genus level on experimental day 10. A positive value indicates a positive correlation, while a negative value indicates a negative correlation. Significance of correlating parameters was determined using Student’s t-test followed by calculation of p-values (levels of significance are indicated by asterisk(s)). Only significant taxa (p < 0.05) are shown, with their corresponding Spearman’s Rank coefficients (R-values). Taxa that were present at less than 1% relative abundance in all samples were summarized and named “<1%”.
For both lines, several taxa were identified that correlated either positively or negatively with the various parameters of larval performance. In LD, the genus that showed the most and some of the strongest positive correlations with larval performance was Providencia. Providencia was positively correlated with BCR, PW, protein conversion, as well as fat conversion (Table 3). Members of the genus Proteus also showed significant positive correlations with BCR, PW, and fat conversion rate. The genus Bacillus was positively correlated with PW, while Klebsiella demonstrated a positive correlation with protein conversion, and Morganella was positively correlated with prepupal protein content. In contrast, Scrofimicrobium and Sphingobacterium exhibited negative correlations with most larval performance metrics, except for protein conversion rate (both taxa) and fat conversion rate (Sphingobacterium only). Enterococcus and Corynebacterium exhibited negative correlations with protein conversion rate (both taxa) and prepupal fat content (Corynebacterium only), while Paenibacillus and Lacrimispora correlated negatively with fat conversion rate. Members of the genus Dysgonomonas displayed significant negative correlations with BCR, PW, and fat conversion rate.
In WT, the genus that showed the most positive correlations with larval performance was also Providencia (Table 3). Providencia displayed positive correlations with all larval performance metrics, except for prepupal protein content. The genus Bacillus was again positively correlated with PW. Members of the genus Peptostreptococcus showed positive correlations with both BCR and protein conversion rate. On the contrary, Scrofimicrobium exhibited negative correlations with PW and fat conversion rate, while Sphingobacterium and Klebsiella were negatively correlated with PW. As opposed to the results from LD, in WT, Morganella and Proteus displayed negative correlations with several performance parameters. Corynebacterium was again negatively associated with prepupal fat content. Lactiplantibacillus and Paenibacillus were negatively correlated with BCR and prepupal protein, respectively.
Prevalence threshold determines the range of core bacterial taxa
Thus far, several studies have identified a core microbiota in BSFL. However, the methods used differed between studies. Authors opted to include either 1) prevalence data only (detection of taxa in a defined percentage threshold of samples), 2) abundance data only (relative abundance of taxa is set as the threshold across the samples thus uncommon and rare taxa that fall below this threshold will be omitted), or 3) a combination of both (Neu et al., 2021; Risely, 2020). To investigate the effect of methodology on the ability to detect an overall core across all samples in this data set as well as to investigate the possibility of lineage-specific core taxa notwithstanding age, and diet, the percent prevalence was calculated for each genus-level taxon across all samples (Figure 5A), and within each fly lineage (Figure 5B).
Figure 5. Analysis of shared genera across all samples (A), and across the two different genetic lines used in this study (B) notwithstanding experimental day (larval age) or diet. Dashed lines indicate the prevalence ranges. Only taxa that showed a prevalence of > 50% across all samples or in at least one line are shown. Taxa that showed a significantly different prevalence between the two lineages, but with one lineage fulfilling the prevalence criterion of >90% are marked as significant with asterisks as follows: *: p ≤ 0.05, **: p ≤ 0.01, ***: p ≤ 0.001.
The threshold plays a crucial role in data interpretation, as stricter thresholds naturally detect a lower number of shared “core” genera. While as many as 30 genera were shared across >50% of all samples, only two genera, namely Providencia_A_732258 and an unclassified genus within the family Enterococcaceae, were shared across 100% of all samples regardless of genetic lineage, diet, or age of the larvae and represented the core in this study. Moreover, Morganella and Enterococcus_H_360604, occurred in 100% of LD samples and >95% of WT samples. These may be argued to belong to the core microbiota as well. Relaxation of the prevalence threshold to >80% extended the core by a further nine genus-level taxa (Figure 5A). Furthermore, it was evident that certain bacterial genera showed significantly different prevalences depending on the genetic lineage of their host, suggesting the existence of lineage-specific core microbiota. At a prevalence threshold of >90%, Proteus, Scrofimicrobium, Corynebacterium, Vagococcus_B, and Lysinibacillus_304693 were significantly more prevalent in LD, while Paenibacillus_J_366884 was significantly more prevalent in WT (p ≤ 0.05; Student’s t-test; Figure 5B).