Method Validation
To validate our workflow, we tested the automated DNA extraction and the three PCR protocols in Table 1 on samples collected from four estuaries. Forty-eight samples were collected during the same period (May 2023) from four different locations within each estuary (12 from each site) to ensure that seasonal or time-related variables do not confound across-site comparisons.
We then reviewed the results to evaluate the steps that provide the greatest benefit in terms of number of fish species and fish Amplicon Sequence Variants (ASVs). For clarity and consistency, we use the term species to refer to both Operational Taxonomic Units (OUT
s) and instances where ASVs could not be matched to the species level but were assigned to a higher taxonomic level (e.g. the genus Fundulus).
Field collection
The samples used in this study were collected from a range of NERRS sites (see above) as part of a pilot project incorporating eDNA into a long-term multi-site monitoring network. All the samples were 1-liter water samples, collected from the surface, then transported to a nearby lab for filtering. Each sample was filtered through 1.5um porosity glass fiber filters and filters were replaced if clogging occurred. The filters were stored in pre-prepared tubes containing 4ml Longmire’s buffer (Longmire et al. 1997) to preserve the samples. Up to 3 filters from the sample were included in one tube.
Laboratory analyses and sequencing
All lab work was conducted in dedicated Biosafety Level 2 (BSL2) molecular laboratory facilities at the University of New Hampshire. All steps from sample extraction through sequencing were performed in separate, dedicated spaces within the building. We adhered strictly to standard laboratory cleaning practices for BSL2 including separation of Pre and Post-PCR laboratory spaces, physical separation of PCR Mix preparation, and DNA addition space. All experiments were conducted under laminar flow hoods and Biosafety cabinets for sample preparation, DNA extraction, purification and amplification. All surfaces were treated with 70% ETOH, 10% bleach and dH2O and were exposed to UV light before and after experimentation.
DNA extraction
Extraction Method Comparison.
Beads-based extraction methods have been successfully applied in many eDNA studies 22,23. We adapted this technique for use on a KingFisher Flex System (Thermo Fisher, Waltham, MA). We compared the Kingfisher protocol 18 to a Qiagen-based extraction which we have used in previous studies 24 to evaluate effectiveness.
KingFisher automated extraction: Sample vials containing filters in 400ml Longmire's buffer were received and entered into the lab tracking system. 10 μl of Proteinase K (0.02X) was mixed with 90 μl of Longmire buffer (LM) and added to each sample vial bringing the total volume to 500 ml. The mixture was incubated at 56°C for 90 minutes. For DNA isolation, 320 μl of Illumina SPRI paramagnetic beads (CAT: 20060057) were added to the lysed samples to ensure a 1:8 sample-to-beads ratio. DNA extraction was performed using The KingFisher Flex System (Thermo Fisher, Waltham, MA), which included two 80% ethanol washes, followed by elution in 100 μl of 10mM Tris-HCl. To see our detailed protocol visit protocol.io18.
Qiagen-based extraction: Filters were placed in a lyse and spin basket with 400 ml of buffer ATL and 20 μl of proteinase K and incubated at 56 ◦C for one hour, then centrifuged. The remainder of the filter extraction was performed on a QIAcube Connect system (QIAGEN®, Hilden, Germany) following the QIAamp DNA Mini protocol (Qiagen Cat. 51326). This kit is designed for extraction of DNA from tissue and is readily adapted to the Qiacube automated system. Although the reagents are proprietary the manufacturer notes that the method removes inhibition and contaminants.
Samples from both methods were eluted to 100 μl. DNA concentration was determined using the Qubit dsDNA HS Assay kit (Thermo Fisher, Waltham, MA), per the manufacturer's instructions. To monitor contamination, extraction blanks, consisting of H2O and plain buffer without filter, were extracted with each sample set. Once extracted, 50 μl of the sample was diluted 1:5 and stored in a -20C freezer for use in this project. The remaining sample was archived in a -80C freezer for potential future use.
Inhibition Removal. ZYMO OneStep-96 PCR Inhibitor Removal Kit (CAT: D6035) was used to remove inhibition from all samples following the manufacturer's instructions. Briefly, for 96 well plates, 150 µl of Prep Solution were added to each Silicon-A™-HRC Plate well, incubated for 5 minutes, and then centrifuged at 3,500 x g for 5 minutes. Subsequently, 50 µl of DNA was added to the prepared plate, mounted on a new Elution Plate, and centrifuged at 3,500 x g for 3 minutes. The filtered DNA was used for subsequent PCR reactions.
PCR Verification Amplification of target regions was verified on 2% E-Gel electrophoresis (Thermo Fisher, Waltham, MA, CAT: G820802). Absent or muted bands in samples with verified DNA concentrations indicate that the samples are likely to contain inhibiting agents. E-Gel images are also used as an initial screen for successful amplification of positive controls, and absence of contamination in negative controls.
PCR Amplification - Platinum. We found that the touchdown profile reduced off target amplification, but for many sites the target band was still effectively hidden by amplification of bacteria. To increase target amplification, we used Platinum SuperFi II polymerase. The Platinum enzyme was designed for higher fidelity and specificity and increased resistance to PCR inhibitors and to have a universal primer annealing temperature of 60°C which allows multiplexing PCR. However, we found that some samples amplified using the Platinum polymerase still exhibited inhibition. In these cases, adding an additional inhibition removal step improved amplification. Amplification Protocol: The PCR reaction was established in a total volume of 20 μl, incorporating 10 μl of Platinum SuperFi II Master Mix (2X), 2 μl of each equimolar primer mix (MiFish-U and MiFish-E, 5 uM each), 4 μl of genomic DNA, and 2 μl of PCR-grade H2O. Nextera adapters were added to each primer. The amplification followed a Touchdown (TD) PCR strategy to reduce non-specific amplification. The protocol began with a lid temperature set at 105°C, maintaining a reaction volume of 20 μl. The initial denaturation step was at 95°C for 3 minutes, followed by a cycling phase starting at 94°C for 30 seconds, an annealing step at 69.5°C decreasing by 1.5°C per cycle for 30 seconds, and an extension at 72°C for 1 minute and 30 seconds for 13 cycles. This was succeeded by an additional cycling phase: 94°C for 30 seconds, 60°C for 30 seconds, and 72°C for 45 seconds for 25 cycles, concluding with a final extension at 72°C for 10 minutes and a hold at 4°C indefinitely. A step-by-step protocol in available on protocols.io 18,19.
PCR replicates.
The value of PCR replicates on fish species detection, was tested with the optimized protocol incorporating Platinum and Zymo clean up. Samples from each of the four sites were tested. For each sample, three replicate PCRs were performed. These replicates were then sequenced as separate samples.
PCR Amplification KAPA. Our initial protocol used KAPA HiFi HotStart (KAPA) and a touchdown (TD) thermal profile modified from Pitz et al (2020). The cycling conditions were the same as described for Platinum above except that the annealing temperature of the second cycling phase was set to 55°C. The PCR reaction was established in a total volume of 12 μl, incorporating 6 μl of KAPA Master Mix (2X), 0.7 μl of the forward and 0.7 μl of the reverse MiFish-U primers, 2 μl of genomic DNA, and 2.6 μl of PCR-grade H2O.
Primers. The MiFish-U primer was designed to amplify bony fish species while the MiFish-E 3 targets elasmobranch (sharks and rays). Both types are important in estuarine systems, and multiplexing these primers expands the list of potential target species.
Table 2 shows the primer and adapter sequences used in this study.
Table 2. Primer & adapter sequences
MiFish-MIX-F (forward)
|
MiFish-U-F: 5′- TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGGTCGGTAAAACTCGTGCCAGC-3′
|
MiFish-E-F: 5′- TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGGTTGGTAAATCTCGTGCCAGC-3′
|
MiFish-MIX-R (reverse)
|
MiFish-U-R: 5′- GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG CATAGTGGGGTATCTAATCCCAGTTTG-3′
|
MiFish-E-R: 5′- GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG-CATAGTGGGGTATCTAATCCTAGTTTG-3′
|
Library preparation and Sequencing
PCR products were submitted to the Hubbard Center for Genome Studies (HCGS) for library preparation and sequencing, on an Illumina Novaseq 6000 instrument. Briefly, the amplicon products from the 1st PCR were prepared by incorporating dual-index barcodes and Illumina sequencing adapters (P5 and P7) into the DNA fragments. The PCR amplification conditions were set as follows: initial denaturation at 94°C for 3 minutes, followed by 15 cycles of denaturation at 94°C for 20 seconds, annealing at 55°C for 30 seconds, and extension at 72°C for 15 seconds. A final extension phase was conducted at 72°C for 7 minutes to ensure complete amplification of target sequences. The amplified libraries were purified using BluePippin to selectively remove short fragments and primer dimers.
Sequence Processing and Taxonomic Assignment. Fastp v 0.23.0 was used to trim the poly-g tails and low-quality sequences (mean quality < 25, window size 2 bp from the raw, demultiplexed fastq files 25. The demultiplexed fastqs were then imported to qiime2 (qiime2-2022.8) for adapter trimming, denoising, and taxonomic classification 26. The cutadapt plugin in qiime2 was used to trim primer sequences and filter out read-pairs that do not contain the priming sequences 27. Read-pairs were retained if the MiFish primer sequences matched the allowed error rate of 0.1. Retained read-pairs were then denoised, merged, and checked for chimeras with the DADA2 qiime2 plugin 28. Forward and reverse reads were truncated to 110 and 105 base-pairs respectively and were required to have at least 12 base-pairs (qiime2 default) of overlap to merge the forward and reverse reads. To reduce the number of false positive unique ASVs, we used the ‘pseudo’ pooling method where samples are initially denoised independently, and then denoised a second time with prior knowledge of the ASVs that are found in at least two samples in the initial denoising step. We then classified the ASVs with the ‘classify-consensus-vsearch’ and the ‘sci-kit learn’ feature classifiers in qiime2 using the MitoFish database5. We ran feature classification at default parameters except we set the number of accepted reference sequences (percent-identity) to 90% (default is 80%), and the number of accepted sequences to ‘all’ instead of the top 10 (maxaccepts). The BLAST search results were retained for confirming classifications for each ASV.
Phylogenetic Analysis
Amplicon Sequence Variants (ASVs) identified from each treatment were aligned using the MAFFT software 29. The alignment was optimized under the E-INS-i algorithm using a gap open penalty of 1.5 and an offset value of 0.1, with the scoring matrix fine-tuned using a kappa value of 50. These settings were chosen to enhance the alignment accuracy of closely related fish ASVs. The aligned ASVs were used to reconstruct phylogenetic trees through the maximum likelihood method implemented in RAxML 30. The trees were generated under the General Time Reversible (GTR) model with a gamma distribution to accommodate rate variation across sites. This modeling choice is particularly effective for sequences exhibiting high variability, as is common with environmental DNA samples from diverse fish populations.
Statistical Analyses
A one-way Analysis of Variance (ANOVA) was performed separately for Fish OTUs and Fish ASVs using the `aov` function in R to determine if the observed differences using our optimized protocol were statistically significant from the other treatments. The null hypothesis for each ANOVA was that the means of Fish OTUs and Fish ASVs for all treatment groups were equal.
The primary objective of the statistical analysis was to evaluate the effects of different treatments on both Fish OTUs and Fish ASVs. The treatments included were KAPA.CTRL (control), KAPA.ZYMO (with Zymo clean-up), and Platinum.ZYMO (with Zymo clean-up). The p-values for the treatment comparisons were extracted from the ANOVA results. If the p-value was less than 0.05, indicating significant differences among the treatment groups, a post-hoc Tukey's Honest Significant Difference (HSD) test was conducted using the ‘TukeyHSD’ function in R to identify which specific treatment groups differed from each other. All statistical analyses were conducted using R software (version 4.3.2) with the ‘aov’ function for ANOVA and the ‘TukeyHSD’ function for post-hoc comparisons. To further validate our optimized protocol on data collected from stream waters in New England, we used a T-test to determine if there was a statistically significant difference between the control treatment "KAPA.CTRL" and the optimized protocol "Platinum.ZYMO".