Phage propagation and plaque assays
3 phages from different families, namely phage T4 (Myoviridae), phage c2 (Siphoviridae), and phage Phi X174 (Microviridae) (Table S1) were produced in this study. The host of phage T4, Escherichia coli DSM 613, was grown in LB broth (Merck, Kenilworth, NJ, USA) at 37°C with shaking at 225 rpm. Phage c2’s host, Lactococcus lactis MG1363, was grown in M17 broth (Merck, Kenilworth, NJ, USA) supplemented with 5 mM CaCl2 at 30°C without shaking. For the host of Phi X174, Escherichia coli ATCC 13706 was grown in BHI broth (Merck, Kenilworth, NJ, USA) containing 10 mM CaCl2 and MgCl2 at 37°C and shaking at 225 rpm. For phage propagation, phages were incubated with their respective host bacteria overnight and then subject to centrifugation at 5000× g for 30 min at 4°C to remove cell debris. The phage stocks were prepared by the 0.45 µm filter and stored at 4°C. Phages’ infectivities in the filtrates were enumerated by plaque assay as described below.
Preparation of phage spiked fecal samples
A fresh fecal sample was donated by an anonymous healthy adult donor and mixed thoroughly with 5 different preservation buffers, namely StayRNA (A&A Biotechnology, Gdynia, Poland), CANVAX (Canvax Biotech, Córdoba, Spain), DNA/RNA Shield (Zymo Research, Irvine, CA, USA), RNAlater (Sigma-Aldrich) and SM (Sodium chloride/Magnesium sulfate) buffer (lab preparation, 200 mM NaCl, 10 mM MgSO4, 50 mM Tris-HCl (1 M, pH 7.5)). Then the fecal suspensions were spiked with different volumes of phages to a final concentration of 3 × 105 plaque-forming units per milliliter (3 × 105 PFU/mL) of each phage and mixed gently to form the spiked fecal samples.
Storage at different temperatures and times
The above prepared spiked fecal samples were stored at different temperatures (room temperature at 25°C, refrigeration at 4°C, freezing at −20°C and −80°C) and the infectivity of phages was detected at different time points (0, 1, 7, 14, 30, and 100 days) by plaque assays. Briefly, 100 μL of the spiked fecal sample was centrifuged and at 12,000 rpm for 10 mins and 10 μL of the supernatant with different dilutions containing the phages (c2, Phi X174, and T4) was mixed with 200 μL of their respective overnight cultured hosts LactococcuslLactis MG1363, Escherichia coli ATTC 13706B1 and Escherichia coli DSM 613 and left to settle for 10 min at room temperature. 5 mL of media containing 0.5% agarose pre-warmed at 40°C was mixed with the phage sample and bacterial culture and poured to the top of a pre-warmed agar plate (1.5%). The double-layer plates were first solidified at room temperature and then incubated overnight at the corresponding growth temperature of the bacterial host. On the next day, the phage plaques were counted, and PFU/mL was calculated. The infectivity of the phage was calculated as the following formula:
Phage infectivity (%) = Plaque amounts/Dilution factors/Added volume of diluted phage/The initial concentration of phage×100. The genome recovery rate of the spiked phages was determined by qPCR at 0, 1 and 100 days. Besides, we also simulated a field sampling condition that fecal samples cannot be stored under refrigeration or freezing condition immediately. Herein, we kept the spiked fecal sample at 25°C for 2 days and then transferred them to 4°C and −80°C for 14 days for the infectivity and genome recovery study, respectively (Fig. 1). We also stored the same batch of spiked fecal samples for the virome diversity study, where a baseline sample was prepared at day-0, and samples stored at 4°C and −80°C for 14-day, −80°C for 500-day were prepared for metavirome sequencing.
Purification and pretreatment of virome DNA
Phage isolation and purification were carried out according to our previous method with minor modifications [33]. Briefly, the Centriprep 50K was replaced by Centrisart® I centrifugal ultrafiltration unit (MWCO 100 kDa, Sartorius Stedim Biotech GmbH) and the up-concentration step was done at 2500× g for 30 min at 4°C or 25°C. Extra centrifugation times were applied for some difficult samples (Table S4). QIAmp viral RNA mini kit (Qiagen, Hilden, Germany) was used for the extraction of viral DNA/RNA from the concentrated virome solution. And then the extracted nucleic acids were amplified by Multiple Displacement Amplification (MDA) with the Genomephi V3 kit (GE Healthcare Life Science, Marlborough, MA, USA), and the amplification time was done at 30°C for 30 min. Finally, the amplified DNA was cleaned with a Genomic DNA Clean & ConcentratorTM kit (Zymo Research, Irvine, CA, USA).
Phage quantification by quantitative real-time PCR (qPCR)
The DNA from phage T4, c2, and Phi X174 was quantified by real-time qPCR using SYBR Green Master Mix (Roche, Basel, Switzerland) on CFX96 Touch Real-Time PCR Detection System (Bio-Rad, Hercules, California, USA). 10 μM of forward and reverse primers targeting the specific T4, c2 and Phi X174 genome were added to 20 μL reactions, which were run using the following setup: initial stage at 50°C for 2 min, hot start at 95°C for 10 min, followed by 40 cycles of (i) 95°C for 15 s, (ii) 55°C for 20 s, and (iii) 60°C for 40 s [33]. Serial 10-time dilutions of phages (T4, c2, and Phi X174) genomic DNA were used to generate standard curves. After the qPCR amplification, a melting curve analysis (95°C for 15 s, 60°C for 60 s, 95°C for 30 s and 60°C for 15 s) was performed. Each reaction was performed in duplicates and the designed primers for specific phages and targeted positions were listed in Table S2.
To investigate the activity of universal nuclease in the buffer of RNAlater, CANVAX and SM buffer, the exogenous DNA was spiked into the above buffers and carried out the same extraction procedure. And also the fecal samples kept in the above buffers was up-concentrated and washing with SM buffer up to 5 times to verify if washing can improve fecal virome quality.
Metavirome sequencing and data pre-processing
The concentration of the MDA amplified and cleaned DNA was measured by Qubit dsDNA HS Assay Kit (ThermoFisher Scientific, Waltham, MA, USA). The library was constructed using the Nextera XT kit (Illumina, San Diego, CA, USA) and purified by AMPure XP beads according to the manufacturer’s protocol. Constructed libraries were sequenced using 2×150 bp paired-end settings on an Illumina NextSeq550 platform.
The average sequencing depth for the metavirome was 3,646,735 reads/sample (Table S6, min. 661,636 reads and max. 6,529,708 reads). The raw reads were trimmed from adaptors and barcodes and the high quality sequences (>95% quality) using Trimmomatic v0.35 [40], with a minimum size of 50nt were retained for further analysis. High quality reads were de-replicated and checked for the presence of Phi X174 using BBMap (bbduk.sh) [41]. Virus-like particle-derived DNA sequences were subjected to within-sample de-novo assembly-only using Spades v3.13.1 [42]. and the contigs with a minimum length of 2,200 nt, were retained. Contigs generated from all samples were pooled and de-replicated at 90% identity using BBMap (dedupe.sh) [41] . Prediction of viral contigs/genomes was carried out using VirSorter2 [43] (“full” categories | dsDNAphage, ssDNA, RNA, Lavidaviridae, nucleocytoplasmic large DNA viruses (NCLDV) | viralquality ≥ 0.66), vibrant [44] (High-quality | Complete), and checkv [45] (High-quality | Complete). Taxonomy was inferred by blasting viral ORF against viral orthologous groups [46] and the Lowest Common Ancestor (LCA) for every contig was estimated based on a minimum e-value of 10e-5. Following assembly, quality control, and annotations, reads from all samples were mapped against the viral (high-quality) contigs (vOTUs) using the bowtie2 [47] and a contingency-table of reads per Kbp of contig sequence per million reads sample (RPKM) was generated, here defined as vOTU-table. Code describing this pipeline can be access in github: github.com/jcame/virome_analysis-FOOD.
Data analysis
The infectivity and genome recovery were visualized by GraphPad Prism (v8.0.1) or R software (v4.1.2). Analysis of viral community α- and β-diversity were performed using packages Phyloseq (v1. 36.0) [48] and Vegan (v2.5.6) in R. For α-diversity analysis, all the indexes were calculated with t.test using packages ggsignif (v0.6.3). Bray-Curtis distance metrics were calculated for β-diversity analysis and unconstrained ordination was performed using principal coordinate analysis (PCoA). R package gggenomes was used to visualize the functional genes of annotated viral contigs [49].