Experimental design and tissue collection. Sixty-three weanling deer mice (Peromyscus maniculatus) were purchased from the Peromyscus Genetic Stock Center at the University of South Carolina (SM2 strain) and the National Institutes of Health (NIH) Rocky Mountain Laboratory and housed in the University of New Mexico Animal Research Facility (Albuquerque). These deer mice come from wild-derived stocks that have been in captivity since ~1995 [46]. The mice were individually housed (n = 8–13 per treatment) in 18 × 12 inch plastic cages for the entire duration of the experiment to avoid microbial sharing across individuals, at a temperature of ~23 (± 2) °C and a 12-h photoperiod. Cages had wire mesh bottoms so that animals did not have access to feces that fell through the screen [50].
Mice were fed either a synthetic or semi-natural diet with systematically altered weight percentages of protein and carbohydrates (Table 1) for ~4 months (100–115 days). The only protein source in the synthetic diet was casein powder, while sucrose was the soluble carbohydrate source. In the semi-natural diet, cricket powder (www.budscricketpowder.com) was the primary protein source and a combination of cornmeal and cornstarch was the primary carbohydrate source; the cornmeal used also contains ~6% protein by weight. For both the synthetic and semi-natural diets, we systematically altered the weight percent of protein (P) and carbohydrates (C) to create three diet treatments: 2.5P:57.5C (low protein treatment), 5P:55C (medium protein treatment), and 10P:50C (high protein treatment). Each diet was equally enriched with fortified salt (4%), vitamin mixture (1%), and fat in the form of corn oil (5%). The cricket powder used also contains ~5% fat by weight, but the total fat content was maintained at 5% across all semi-natural diets. All dry ingredients were homogenized with ~4 L of water and stored frozen (−20 °C). Each treatment group contained at least eight mice, and each mouse was given 20–30 g of food per day. Access to food and water was ad libitum; food was replenished daily, and water was replenished as needed and at least weekly. Age-matched (~60 ± 5 days) and sex-matched mice were randomly assigned to one of the six diet treatments.
After four months, the mice were euthanized via exposure to CO2, and tissues were immediately collected under sterile conditions for genetic and stable isotope analysis. Liver, biceps femoris muscle, and three intestinal tissues (duodenum, ileum, and cecum) were dissected and stored frozen at −70 °C. All animal handling and husbandry were conducted with the approval of the University of New Mexico Institutional Animal Care and Use Committee (21-201214-MC).
Table 1
Diet Macromolecular Composition. Weight percent proportions of protein (casein, cricket powder, or cornmeal), carbohydrates (sucrose, cornmeal, or cornstarch), and fat (corn oil) for the low, medium, and high-protein treatments in the synthetic and semi-natural diets. Each diet was equally enriched with a fortified salt (4%), vitamin mixture (1%), and proportion of fat (5%).
Diet | Treatment | Protein (casein) | Carbohydrates (sucrose) | Cellulose | Fat (corn oil) |
Synthetic | Low Protein (2.5%P:57.5%C) | 0.025 | 0.575 | 0.300 | 0.050 |
Synthetic | Medium Protein (5%P:55%C) | 0.050 | 0.550 | 0.300 | 0.050 |
Synthetic | High Protein (10%P:50%C) | 0.100 | 0.500 | 0.300 | 0.050 |
Diet | Treatment | Protein (cricket powder) | Protein (cornmeal) | Carbohydrates (cornstarch) | Carbohydrates (cornmeal) | Cellulose | Fat (corn oil) | Fat (cricket powder) |
Semi-Natural | Low Protein (2.5%P:57.5%C) | 0.020 | 0.005 | 0.508 | 0.067 | 0.237 | 0.043 | 0.007 |
Semi-Natural | Medium Protein (5%P:55%C) | 0.040 | 0.010 | 0.422 | 0.128 | 0.228 | 0.036 | 0.014 |
Semi-Natural | High Protein (10%P:50%C) | 0.080 | 0.020 | 0.230 | 0.270 | 0.237 | 0.021 | 0.029 |
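As a consistency check on Table 1, the component fractions of each semi-natural diet can be summed by macronutrient and compared with the nominal protein, carbohydrate, and fat targets. The short sketch below uses only the values listed in Table 1; the dictionary layout and the function name are illustrative assumptions, not part of the original workflow.

```python
# Component weight fractions for the semi-natural diets, copied from Table 1.
# Protein = (cricket powder, cornmeal); carbohydrate = (cornstarch, cornmeal);
# fat = (corn oil, cricket powder).
SEMI_NATURAL = {
    "low": {"protein": (0.020, 0.005), "carbohydrate": (0.508, 0.067), "fat": (0.043, 0.007)},
    "medium": {"protein": (0.040, 0.010), "carbohydrate": (0.422, 0.128), "fat": (0.036, 0.014)},
    "high": {"protein": (0.080, 0.020), "carbohydrate": (0.230, 0.270), "fat": (0.021, 0.029)},
}

# Nominal targets stated in the text: protein, carbohydrate, fat.
TARGETS = {
    "low": (0.025, 0.575, 0.050),
    "medium": (0.050, 0.550, 0.050),
    "high": (0.100, 0.500, 0.050),
}


def macronutrient_totals(components):
    """Sum the ingredient fractions contributing to each macronutrient."""
    return {name: round(sum(parts), 3) for name, parts in components.items()}


for treatment, components in SEMI_NATURAL.items():
    totals = macronutrient_totals(components)
    print(treatment, totals, "targets:", TARGETS[treatment])
```

Summing in this way shows how the protein contributed by cornmeal and the fat contributed by cricket powder are offset by the other ingredients to hold each macronutrient at its nominal level.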
DNA Extraction and 16S rRNA Amplicon Sequencing. DNA was extracted from intestinal samples (duodenum, ileum, and cecum) and mouse food using a variation of the cetyltrimethylammonium bromide (CTAB) method [11, 51, 52], which includes the addition of 0.1 mm silica beads (at 5% sample volume) for a 30-second mechanical bead beating step. Universal bacterial primers 515F 5'-GTGYCAGCMGCCGCGGTAA-3' [53] and 806R 5'-GGACTACHVGGGTWTCTAAT-3' [54] were used to amplify the V4 region of the microbial small subunit ribosomal RNA (rRNA) gene in triplicate. The resulting amplicons were cleaned and normalized with SequalPrep™ Normalization Plates (A1051001) and then purified with Agencourt AMPure XP magnetic beads (Beckman Coulter, Brea, CA, USA). Finally, sequencing libraries were prepared with the Nextera XT Index Kit v2 following manufacturer instructions and sequenced on the Illumina MiSeq platform (MiSeq Reagent Kit v3, 600-cycle, MS-102-3003) [55].
Bioinformatic Processing. R packages (version 4.3.0, Oksanen, 2016) including DADA2 (version 1.16, Callahan et al., 2016) and phyloseq (version 1.44.0, McMurdie & Holmes, 2013) were used to analyze the resulting 16S rRNA sequencing reads. The resulting amplicon sequence variants (ASVs) were assigned taxonomy to the genus level using the SILVA v.138.1 database [59]. Twenty-nine ASVs were identified as possible contaminants from the extraction and PCR negative controls and removed using the decontam R package (version 1.2.0, [60]) with its prevalence-based contaminant identification and default threshold of 0.1. ASVs identified as chloroplasts and mitochondria were also removed. Of the 5254 ASVs initially identified, 4007 were retained. Samples with a sequencing depth of less than 2,000 reads were removed from the dataset. The median sequencing depth was 48,250 (range 2,039–182,105). At least seven individuals per diet treatment group were retained for the subsequent genetic analyses; the remaining samples were discarded because a low number of reads resulted in insufficient sample coverage. Samples were repeatedly rarefied to 8000 reads (n = 1000), which maintained coverage above 98% and eliminated potential bias due to rarefaction [61–63]. Bray-Curtis dissimilarity measurements [64, 65] were used to assess differences in gut microbial communities among individuals by diet type (synthetic vs semi-natural), protein content (low, medium, and high), gut sample section (duodenum, ileum, cecum), as well as maternal and sex effects. Differences among groups were tested jointly using adonis (PERMANOVA) models with 10,000 permutations, and F, R2, and p values are reported. Lastly, ASV counts were transformed to relative abundances, and low-abundance taxa (phyla with less than 1% relative abundance) were removed for community composition analyses.
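To illustrate the logic of repeated rarefaction followed by Bray-Curtis dissimilarity, the sketch below subsamples two count vectors to a common depth many times and averages the resulting dissimilarities. It is a minimal standard-library illustration with toy counts, not the R workflow used here; the function names, the toy data, and the choice to average dissimilarities across iterations are assumptions made for the example.

```python
import random


def rarefy(counts, depth, rng):
    """Subsample `depth` reads without replacement from a vector of ASV counts."""
    pool = [asv for asv, n in enumerate(counts) for _ in range(n)]
    if len(pool) < depth:
        raise ValueError("sample has fewer reads than the rarefaction depth")
    rarefied = [0] * len(counts)
    for asv in rng.sample(pool, depth):
        rarefied[asv] += 1
    return rarefied


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: 1 - 2 * sum(min(a_i, b_i)) / (sum(a) + sum(b))."""
    shared = sum(min(x, y) for x, y in zip(a, b))
    return 1.0 - 2.0 * shared / (sum(a) + sum(b))


def mean_rarefied_bray_curtis(a, b, depth, n_iter, seed=0):
    """Average Bray-Curtis dissimilarity over repeated rarefactions."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_iter):
        total += bray_curtis(rarefy(a, depth, rng), rarefy(b, depth, rng))
    return total / n_iter


# Toy ASV count vectors (hypothetical data, four ASVs per sample).
sample_a = [500, 300, 200, 0]
sample_b = [100, 300, 400, 200]
print(mean_rarefied_bray_curtis(sample_a, sample_b, depth=800, n_iter=50))
```

Averaging over many subsamples reduces the dependence of the result on any single random draw, which is the motivation for rarefying repeatedly rather than once.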
Quantification of Bacterial Biomass. The quantification of total gut bacteria was performed by quantitative PCR (qPCR), which measured the copy number of the 16S rRNA gene as a proxy for bacterial cell number [66, 67]. The V3-V4 region of the bacterial and archaeal 16S rRNA gene was targeted using universal prokaryotic primers [68] Pro341F (5’-CCT ACG GGN BGC ASC AG-3’) and Pro805R (5’-GAC TAC NVG GGT ATC TAA TCC-3’). The qPCR assay was performed in duplicate for four replicates of each diet type per intestinal location (72 samples total) as previously described [67]. The total bacterial 16S rRNA gene copies were calculated based on the standard curve and were normalized to the starting sample mass to reduce the effect of gut size [69]. To adjust for multiple 16S rRNA copies [70], our resulting 16S rRNA amplicon sequence variants (ASVs) were merged as paired-end reads and evaluated using the rRNA operon copy number database rrnDB Estimate [71], which produced an abundance table adjusted by the average 16S copy number of the highest taxonomic rank (Reynebeau and Takacs-Vesbach 2024 in prep). Lastly, the proportion of ASVs with multiple 16S rRNA gene copies was quantified and adjusted accordingly (Reynebeau and Takacs-Vesbach 2024 in prep).
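The two calculations described above, converting qPCR cycle thresholds to gene copies per unit sample mass and adjusting ASV abundances for 16S rRNA copy number, can be sketched as follows. This is a simplified illustration under stated assumptions: the standard curve is taken to be linear in log10(copies) with a hypothetical slope and intercept, duplicate Ct values are averaged before conversion, and the per-ASV copy numbers are placeholders rather than rrnDB values.

```python
def copies_from_ct(ct, slope, intercept):
    """Invert a standard curve of the form Ct = slope * log10(copies) + intercept."""
    return 10 ** ((ct - intercept) / slope)


def copies_per_gram(ct_replicates, slope, intercept, sample_mass_g):
    """Mean Ct of technical replicates -> gene copies normalized to sample mass."""
    mean_ct = sum(ct_replicates) / len(ct_replicates)
    return copies_from_ct(mean_ct, slope, intercept) / sample_mass_g


def copy_number_adjusted_abundance(asv_counts, copy_numbers):
    """Divide each ASV count by its 16S copy number, then rescale to proportions."""
    adjusted = {asv: n / copy_numbers[asv] for asv, n in asv_counts.items()}
    total = sum(adjusted.values())
    return {asv: value / total for asv, value in adjusted.items()}


# Hypothetical standard curve and duplicate Ct values for one sample.
SLOPE, INTERCEPT = -3.32, 38.0
print(copies_per_gram([21.3, 21.5], SLOPE, INTERCEPT, sample_mass_g=0.05))

# Hypothetical ASV counts and 16S copy numbers per genome.
counts = {"ASV1": 100, "ASV2": 100}
copies = {"ASV1": 1, "ASV2": 4}
print(copy_number_adjusted_abundance(counts, copies))
```

In the second example, two ASVs with equal read counts end up with unequal adjusted abundances because the taxon carrying four 16S copies per genome is represented by fewer cells per read.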
Amino Acid δ13C Stable Isotope Analysis. Mouse tissues were lipid extracted with three 24-hour soaks in a 2:1 chloroform:methanol solution, rinsed with deionized water, and freeze dried for 24–48 hours. Approximately 5 mg of dried mouse tissues, 2–3 mg dietary casein, 20–30 mg cornmeal, and 2–3 mg cricket powder were hydrolyzed and derivatized to N-trifluoroacetic acid isopropyl esters [72] alongside an in-house amino acid reference material containing a mixture of commercially available AA (Besser et al., 2022). The δ13C values of 13 AA, including six considered essential for mice (Valine (Val), Leucine (Leu), Isoleucine (Ile), Phenylalanine (Phe), Lysine (Lys), and Threonine (Thr)), were measured on a Thermo Fisher Scientific Trace 1310 Gas Chromatograph (GC, Bremen, Germany) outfitted with a 60 m × 0.32 mm ID × 1.0 µm BPX5 column (SGE Analytical Science), then converted to CO2 via a Thermo Scientific GC Isolink II (Bremen, Germany) combustion interface coupled to a Delta V Plus isotope ratio mass spectrometer (Bremen, Germany) at the University of New Mexico Center for Stable Isotopes. Isotope values are expressed in delta (δ) notation and reported in parts per thousand or per mil (‰): δ = [(Rsample−Rreference)/Rreference], where R = 13C/12C; the internationally accepted reference standard is Vienna Pee Dee Belemnite (VPDB). Each sample was injected in duplicate; mean amino acid δ13C values were calculated across injections and corrected to account for the carbon added during the derivatization process. In-house amino acid reference material was analyzed at least every two samples or four injections [10, 11, 15, 32]. Mean within-run standard deviation (SD) of the in-house AA reference material ranged from 0.3‰ (Val, Leu, Ile) to 0.5‰ (Tyr) for δ13C (SI Table 1). See SI Table 2 for the measured AAESS δ13C values of mouse muscle, dietary casein, dietary cricket powder, and dietary cornmeal.
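The delta calculation and the derivatization correction can be sketched in a few lines. The correction shown is a commonly used mass-balance form in which a co-derivatized reference AA of known δ13C is used to cancel the contribution of reagent carbon; it is offered as an assumption about the general approach, not as the exact procedure used here. The fraction of derivative carbon that originates from the amino acid (p_aa_carbon) and all numerical values are illustrative.

```python
def delta_permil(r_sample, r_reference):
    """Delta notation: (R_sample - R_reference) / R_reference, expressed in per mil."""
    return (r_sample - r_reference) / r_reference * 1000.0


def correct_derivatization(delta_deriv_sample, delta_deriv_ref,
                           delta_ref_known, p_aa_carbon):
    """Remove the reagent-carbon contribution from a derivatized AA measurement.

    Mass balance for both sample and reference (assumed form):
        delta_derivatized = p * delta_AA + (1 - p) * delta_reagent
    Subtracting the reference equation from the sample equation eliminates
    delta_reagent, giving the expression returned below.
    """
    return (delta_deriv_sample - delta_deriv_ref
            + delta_ref_known * p_aa_carbon) / p_aa_carbon


# Illustrative isotope ratios (not measured values).
print(delta_permil(r_sample=0.0101, r_reference=0.0100))

# Illustrative case: half of the carbon in the derivative comes from the AA.
# Reference AA: known -20 per mil, measured -30 per mil after derivatization.
# Sample AA: measured -26 per mil after derivatization.
print(correct_derivatization(-26.0, -30.0, -20.0, p_aa_carbon=0.5))
```

Because the reference material is derivatized and analyzed alongside the samples, the reagent δ13C never needs to be measured directly under this formulation.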
Amino Acid δ13C Statistical Analysis. Statistical analyses were performed on all measured AAESS (Val, Leu, Ile, Phe, Lys, and Thr). First, Shapiro–Wilk and Levene's tests in R packages stats and car were used to test for normality and homogeneity of variance [73]. Next, differences in AAESS δ13C values of mouse muscle and their potential amino acid sources (dietary or microbial) were evaluated using Kruskal–Wallis and pairwise Wilcoxon rank-sum tests in R package stats.
We measured AAESS δ13C values of all potential dietary AA sources for each diet type: casein in the synthetic diet, and cricket powder and cornmeal in the semi-natural diet. δ13C values of AAESS synthesized by gut microbes from dietary carbohydrates (cornmeal and sucrose) (SI Table 3) were estimated as previously described [11]. Briefly, AAESS-specific fractionation factors (Δ13C) from Larsen et al., 2009 for Actinomycetota and Rhodococcus grown on a medium with sucrose as the sole carbon source were subtracted from the mean (± SD) δ13C value (−12.0 ± 0.5‰) of dietary carbohydrates (cornmeal and sucrose) [74]. Larsen et al., 2009 did not report discrimination factors for Lys, so we used a Δ13C value of 0.0‰ reported by Abelson & Hoering (1961) [75]. The true digestibility of individual AAESS within each protein source was accounted for (SI Table 4). AA-specific digestibility values for cricket powder are from Malla et al., 2022, who measured the standardized ileal digestibility of house crickets (A. domesticus) fed to growing pigs [76]. AA-specific digestibility data for casein and cornmeal protein are from Keith & Bell (1988); data for Lys in cornmeal were not reported, so we assumed a digestibility of 72%, which is the mean digestibility of the other AAESS [77].
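The arithmetic behind the microbial estimates and the gap-filled digestibility can be sketched as below. Only the carbohydrate δ13C (−12.0‰) and the Lys discrimination value (0.0‰) are taken from the text; the other discrimination factors and the digestibility values are hypothetical placeholders chosen for illustration, and the function names are assumptions.

```python
CARBOHYDRATE_D13C = -12.0  # mean d13C of dietary carbohydrates (per mil), from the text


def microbial_aa_d13c(carbohydrate_d13c, discrimination):
    """Estimate microbial AA d13C by subtracting each AA-specific discrimination
    factor from the carbohydrate d13C, as described in the text."""
    return {aa: carbohydrate_d13c - d for aa, d in discrimination.items()}


def fill_missing_with_mean(digestibility):
    """Replace unreported (None) digestibility values with the mean of the rest."""
    known = [v for v in digestibility.values() if v is not None]
    mean_known = sum(known) / len(known)
    return {aa: (mean_known if v is None else v) for aa, v in digestibility.items()}


# Hypothetical discrimination factors (per mil), except Lys = 0.0 from the text.
discrimination = {"Val": 1.5, "Leu": 2.5, "Lys": 0.0}
print(microbial_aa_d13c(CARBOHYDRATE_D13C, discrimination))

# Hypothetical digestibility fractions with Lys unreported.
digestibility = {"Val": 0.70, "Leu": 0.74, "Lys": None}
print(fill_missing_with_mean(digestibility))
```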
To statistically quantify the proportional contribution of AAESS sources (dietary vs microbial) to mouse skeletal muscle, we ran a Bayesian mixing model at default parameters using R package MixSIAR [78]. Mixing model input included measured δ13C values of AAESS in dietary protein sources (synthetic: casein; semi-natural: cricket powder and cornmeal), estimated δ13C values of AAESS synthesized by gut microbes using dietary carbohydrates as a carbon source, and measured δ13C values of AAESS in mouse muscle. Since mice are unable to synthesize AAESS, we assumed direct routing of these compounds and applied trophic discrimination factors of zero for each AAESS in the mixing model [32, 33]. This modeling approach provides a minimum estimate of the microbial AAESS contribution to host muscle because it only includes microbial AAESS synthesis from carbohydrates; gut microbes could also be synthesizing AAESS from other compounds (e.g., non-essential AA) available in the diet.
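The intuition behind the mixing model can be shown with a deterministic two-source, single-tracer calculation. This is a deliberately simplified analogue, not MixSIAR: the Bayesian model uses all six AAESS simultaneously and propagates source and consumer variability, whereas the sketch below solves a linear mass balance for one AA with a trophic discrimination factor of zero. All δ13C values are hypothetical.

```python
def microbial_fraction(d13c_muscle, d13c_diet, d13c_microbial):
    """Solve d13c_muscle = f * d13c_microbial + (1 - f) * d13c_diet for f.

    Assumes direct routing (trophic discrimination factor of zero) and
    clips the result to the interval [0, 1].
    """
    if d13c_microbial == d13c_diet:
        raise ValueError("sources are isotopically indistinguishable")
    f = (d13c_muscle - d13c_diet) / (d13c_microbial - d13c_diet)
    return max(0.0, min(1.0, f))


# Hypothetical values (per mil) for a single essential AA.
print(microbial_fraction(d13c_muscle=-22.4, d13c_diet=-25.0, d13c_microbial=-12.0))
```

The calculation also makes clear why source separation matters: as the dietary and microbial δ13C values converge, the denominator shrinks and the estimated fraction becomes increasingly sensitive to measurement error.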
Relative Essential Amino Acid Supply and Demand Calculations. To estimate AAESS demand as a percent of dry diet for mice, we used the standard AA requirements for laboratory mice (Mus musculus) reported by the National Research Council [79, 80]. Since Mus musculus has an overall protein requirement of 20% and Peromyscus prefers diets of ~15% protein [81–83], we used a conversion factor of 0.75 (15/20 = 0.75) to convert the demand for Mus musculus to that of Peromyscus. To estimate relative AAESS supply in the synthetic diet treatments, we used AA concentration data for casein [84] to calculate the relative proportions of each AA available in the low, medium, and high protein diet treatments (SI Table 5). To estimate relative AAESS supply in the semi-natural diet treatments, we used AA concentration data for cornmeal [85] and cricket powder (budscricketpowder.com) to calculate the relative proportions of each AA available in the low, medium, and high protein treatments (SI Table 5).
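The supply and demand arithmetic can be sketched as follows. The 0.75 conversion factor comes from the text; the Mus musculus requirement and the AA concentration in the protein source are hypothetical placeholders, and treating the ingredient weight fraction as pure protein is a simplifying assumption of this example.

```python
PEROMYSCUS_FACTOR = 15 / 20  # conversion factor from the text (0.75)


def peromyscus_demand(mus_requirement_pct_diet):
    """Scale a Mus musculus AA requirement (% of dry diet) to Peromyscus."""
    return mus_requirement_pct_diet * PEROMYSCUS_FACTOR


def aa_supply_pct_diet(protein_fraction_of_diet, aa_fraction_of_protein):
    """AA supplied by a protein source, expressed as % of dry diet."""
    return protein_fraction_of_diet * aa_fraction_of_protein * 100.0


# Hypothetical inputs: requirement of 0.8% of diet, AA making up 6% of protein.
demand = peromyscus_demand(0.8)
for label, protein_fraction in (("low", 0.025), ("medium", 0.050), ("high", 0.100)):
    supply = aa_supply_pct_diet(protein_fraction, 0.06)
    print(label, "supply/demand =", round(supply / demand, 3))
```

A supply-to-demand ratio below 1 in this framing indicates that the diet alone would not meet the estimated requirement for that AA.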
Shotgun Metagenomic Sequencing. A subset of extracted DNA comprising three cecum samples from each diet treatment (18 total) was selected and sent to Diversigen (New Brighton, MN, USA) for shotgun metagenomic sequencing using BoosterShot. Only samples with more than 60,000 16S rRNA gene sequencing reads and at least 200 ng/µL of genomic DNA were submitted for analysis. Sequencing libraries were prepared with a proprietary procedure adapted from the Illumina Nextera XT kit (Illumina, 15032355). Libraries were sequenced at Diversigen on an Illumina NovaSeq 6000 using paired-end 2 × 150 reads to a minimum of 7 million (7M) reads per sample. The raw reads were quality controlled using the metaWRAP-Read_qc module from the metaWRAP pipeline [86]. Briefly, default module settings were applied to trim raw sequence reads (to a minimum quality score > 20) and remove sequencing adapters. Human contamination was removed using the indexed human genome (hg38, NCBI RefSeq GCF_000001405.40) via bmtagger. Next, we indexed the host genome, Peromyscus maniculatus (HU_Pman_2.1.3, NCBI RefSeq GCF_003704035.1), and re-ran the metaWRAP-Read_qc module to remove host contamination from all samples [86]. The resulting cleaned reads (quality score > 30 for all samples) were assembled with the metaWRAP-Assembly module using metaSPAdes with default parameters and k-mer sizes 21, 33, 55 [87]. The assembly was binned into metagenome-assembled genomes (MAGs) with the metaWRAP-Binning module using three metagenomic binning algorithms at their default settings: MaxBin2 v.2, metaBAT2 v.2, and CONCOCT v1.10 [88–90]. The metaWRAP-Bin_refinement module was used to consolidate multiple binning predictions into a new, improved bin set using the bin refinement software CheckM v.1.1.3 (completion > 50%, contamination < 10%) [91]. In total, 49 bins were identified, all of medium-quality draft standard according to the Genomic Standards Consortium [92].
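The bin quality criterion stated above (completion > 50%, contamination < 10%) amounts to a simple filter over per-bin quality estimates. The sketch below applies it to hypothetical records; the bin names, values, and record layout are illustrative assumptions and do not reproduce actual CheckM output.

```python
MIN_COMPLETION = 50.0      # percent, threshold from the text
MAX_CONTAMINATION = 10.0   # percent, threshold from the text


def passes_quality(bin_record):
    """Keep bins with completion > 50% and contamination < 10%."""
    return (bin_record["completion"] > MIN_COMPLETION
            and bin_record["contamination"] < MAX_CONTAMINATION)


# Hypothetical per-bin quality estimates.
bins = [
    {"name": "bin.1", "completion": 92.3, "contamination": 1.8},
    {"name": "bin.2", "completion": 48.0, "contamination": 0.5},
    {"name": "bin.3", "completion": 75.1, "contamination": 12.4},
]
retained = [b["name"] for b in bins if passes_quality(b)]
print(retained)
```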
To calculate the abundance of each bin standardized to the size of each individual sample, we used the metaWRAP-Quant_bins module, which uses Salmon [93] to calculate abundances expressed as genome copies per million reads. The assembled MAGs were taxonomically classified using the bin annotation tool (BAT) [94]. Briefly, BAT queries open reading frames against the full NCBI non-redundant reference database (nr) [95] using DIAMOND [96]. Taxonomic classifications are made at low taxonomic ranks only if closely related organisms are present in the reference database, which results in high classification precision even for sequences from poorly characterized organisms [94]. Prodigal (version 2.6.3) was used to translate DNA from each bin to protein [97].
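One plausible way to summarize contig-level abundances into a single per-bin value is a length-weighted average, so that long contigs contribute in proportion to the amount of genome they represent. The sketch below illustrates that idea with hypothetical contig lengths and abundances; it is an assumption about the general approach and is not taken from the metaWRAP source code.

```python
def bin_abundance(contigs):
    """Length-weighted mean abundance across the contigs of one bin.

    `contigs` is a list of (length_bp, abundance) tuples, where abundance is a
    per-contig, depth-normalized value (hypothetical units in this example).
    """
    total_length = sum(length for length, _ in contigs)
    if total_length == 0:
        raise ValueError("bin has no sequence")
    weighted = sum(length * abundance for length, abundance in contigs)
    return weighted / total_length


# Hypothetical bin with one short, high-coverage contig and one long contig.
example_bin = [(1000, 10.0), (3000, 2.0)]
print(bin_abundance(example_bin))
```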
Resulting proteomes were uploaded to GapMind, a tool for reconstructing and annotating AA biosynthesis pathways in prokaryotic genomes [98]. GapMind uses the MetaCyc database of metabolic pathways and enzymes [99] to identify the most plausible pathway for making each AA based on experimentally characterized proteins. Each pathway is broken down into a list of steps described by enzyme commission (EC) numbers or terms based on experimentally characterized protein sequences from Swiss-Prot [100] and protein families from TIGRFam [101] or Pfam [102]. GapMind identifies candidates for each pathway step by searching a genome of interest for similar proteins using ublast [103] and for members of protein families using HMMER [104]. Each candidate is then given a confidence score of high, medium, or low. High-confidence candidates must align at 40% identity and 80% coverage to a characterized protein (excluding similar proteins with other described functions). Medium-confidence candidates must align at 30% identity with 80% coverage (includes similar proteins with other functions), align at 40% identity with 70% coverage (sufficient even if similar to proteins with other functions), or be similar to a curated protein from Swiss-Prot. Low-confidence candidates must align at 30% identity and 50% coverage (includes similar proteins with other functions). Taken together, GapMind finds the highest-confidence pathway for synthesizing each AA present in the genome of interest [98].
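The alignment thresholds described above can be expressed as a small decision rule. The sketch below is a simplification that considers only percent identity and coverage, treating each stated threshold as a minimum; it ignores the checks on similarity to proteins with other functions, the Swiss-Prot criterion, and HMM-based family matches, so it should be read as an illustration of the tiers rather than a reimplementation of GapMind.

```python
def classify_candidate(identity_pct, coverage_pct):
    """Assign a simplified confidence tier from alignment identity and coverage."""
    if identity_pct >= 40 and coverage_pct >= 80:
        return "high"
    if (identity_pct >= 30 and coverage_pct >= 80) or \
       (identity_pct >= 40 and coverage_pct >= 70):
        return "medium"
    if identity_pct >= 30 and coverage_pct >= 50:
        return "low"
    return None  # below all thresholds: not a candidate


# Hypothetical alignments: (percent identity, percent coverage).
for hit in [(45, 85), (35, 85), (42, 72), (31, 55), (20, 90)]:
    print(hit, classify_candidate(*hit))
```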
To visualize which microbial groups are capable of synthesizing AAESS, Sankey diagrams [105] were generated by linking bacterial taxonomic groups to their AAESS biosynthesis genes exported from GapMind [98]. Only GapMind AA biosynthesis pathway steps with medium to high confidence scores [106] were evaluated. Additionally, only AAESS compatible with AA stable isotope analysis (Val, Leu, Ile, Phe, Lys, and Thr) were evaluated.