An Optimized eDNA Protocol for Fish Tracking in Estuarine Environments

doi:10.21203/rs.3.rs-4740372/v1

Article

This work is licensed under a CC BY 4.0 License

Environmental DNA (eDNA) is revolutionizing how we investigate biodiversity in aquatic and terrestrial environments. It is increasingly used for detecting rare and invasive species, assessing biodiversity loss and monitoring fish communities, as it is considered a cost-effective and noninvasive approach. Some environments, however, can be challenging for eDNA analyses. Estuarine systems are highly productive, complex environments, but samples collected from these settings may exhibit PCR inhibition and a low fish read recovery. Here we present an approach for detecting fish in turbid, highly productive estuarine systems. The workflow includes bead-based extraction, inhibition removal, high fidelity and specificity DNA polymerase (Platinum SuperFi II) and multiplexing the universal MiFish primers. Applying this hybrid method to a variety of complex estuarine samples with known inhibition we have more than doubled the number of recovered fish species while removing most of the off-target amplification.

Biological sciences/Biotechnology

Biological sciences/Ecology

Biological sciences/Molecular biology

Earth and environmental sciences/Ecology

Earth and environmental sciences/Environmental sciences

eDNA

MiFish

Platinum SuperFi II

PCR Inhibition

Estuarine

fish species

Multiplex PCR

The application of environmental DNA (eDNA) metabarcoding has significantly reshaped how biodiversity is evaluated in aquatic and terrestrial ecosystems. In aquatic systems, this technique has become a useful tool for assessing and monitoring fish communities^1,2. The method involves PCR amplification of DNA extracted from water samples using primers that target universal gene markers, such as mitochondrial or nuclear ribosomal genes, to identify a wide array of species. Miya et al.³ made a significant contribution by developing universal PCR primers (MiFish) for broad-range fish eDNA metabarcoding, which support the detection of diverse fish species from environmental samples. Kawato et al.⁴ built on this by introducing a method for detecting deep-sea fish species that multiplexes two of the MiFish primer sets originally developed by Miya et al. (MiFish-U and MiFish-E) and uses a new high-fidelity enzyme. These primers, optimized for enhanced specificity and coverage, demonstrate the potential for precise eDNA analysis in deep aquatic ecosystems. While this primer set is widely used and is complemented by the curated MitoFish database⁵, co-amplification of non-target sequences, particularly from bacteria, can be problematic. Various solutions, including touchdown PCR protocols⁶ and primer design modifications (e.g.⁷), have been applied with varying degrees of success.

Despite these efforts, PCR in environmental samples from turbid environments remains challenging^8–10. Turbidity refers to suspended particles that reduce light transmissivity in water⁸. Particles may be from inorganic sediments or organic material from plankton or other biomass suspended in the water column. These particles can complicate eDNA analyses in several ways: filter clogging can reduce the amount of water filtered or increase the number of filters to be processed, and high organic material may contain inhibitory compounds¹¹. In areas with high biological productivity, fish DNA may comprise a relatively small portion of total DNA recovered from a sample. In addition to low target eDNA concentrations, these samples often exhibit PCR inhibition, where the presence of humic acids, and other organic and inorganic compounds can significantly reduce the efficiency and accuracy of the PCR⁸.

In response to these challenges, we have developed an eDNA workflow with two main protocols, optimized for detecting fish in turbid estuarine samples. We evaluated each step in these protocols to identify a workflow that is adaptable to robotic platforms, cost-efficient, and maximizes both the number of fish species detected and the number of fish ASVs generated. Our final workflow includes an automated KingFisher DNA extraction and an optimized PCR protocol that combines the Zymo OneStep™ PCR Inhibitor Removal Kit, Platinum SuperFi II polymerase, and a touchdown program with multiplexed MiFish primers.

The protocols in this workflow were tested on 48 samples collected from four estuaries in the United States’ National Estuarine Research Reserve System (NERRS): San Francisco Bay CA, Mission-Aransas TX, Apalachicola FL and Jacques Cousteau NJ. This geographical diversity in site selection was intended to ensure a broad spectrum of environmental conditions, sample variability and species diversity. Initially we tested a variety of published PCR protocols as well as several polymerases and inhibitor removal approaches on samples from these sites. We then selected the most promising alternatives for more detailed testing. The PCR workflows we evaluated are summarized in Table 1 below. All PCR comparisons were performed on samples extracted using the automated KingFisher extraction method.

Table 1

PCR workflows evaluated in this study.
Protocol	Inhibition removal	DNA dilution	PCR touchdown	Primers
KAPA-CTRL	None	1:5	2nd step: 55°C	MiFish-U
KAPA-ZYMO	Zymo	1:5	2nd step: 55°C	MiFish-U
Platinum-ZYMO	Zymo	1:5	2nd step: 60°C	MiFish-U + MiFish-E

This approach builds on the foundations laid by Miya, Kawato^3,4, and others by incorporating automated DNA extraction and advanced PCR design to significantly reduce off-target amplification, which consists mainly of bacterial sequences^4,12, and to enhance PCR efficiency and reliability in the presence of inhibitors. The overall goal of this work is to improve eDNA-based fish species detection from problematic turbid samples collected in estuarine environments.

DNA Extraction

As interest in molecular monitoring increases, the use of robotic systems in eDNA extraction and handling is becoming more common, especially for processing larger numbers of samples. Robotic systems offer advantages such as increased efficiency and consistency, though they also have potential disadvantages, including the risk of cross-contamination and higher initial costs. Our goal was to validate an automated bead-based extraction method with comparable performance to commonly used column-based methods like QIAGEN kit-based extractions.

Samples were extracted with the automated KingFisher protocol, using magnetic beads, and DNA concentrations ranged between 1.84 ng/μl and 25.8 ng/μl (Fig. S1, Supplementary Information). As expected, we observed variability among the sites, with Mission-Aransas displaying the highest yields (Fig. S1). Variability within each site reflects conditions at each of the four sampling locations (Table S1, Supplementary Information). We did not observe any cross-contamination effects, and we found no significant differences between DNA concentrations from samples obtained with the bead-based KingFisher extraction and the QIAGEN protocol (p-value = 0.7; Fig. S2, Supplementary Information). This suggests that our automated DNA extraction using magnetic beads performs comparably to the column-based method under various conditions.

PCR Inhibition Removal and Optimization

Initial evaluations of PCR amplification were conducted through visual inspection of E-Gels. The absence of amplification in samples, where target species were expected, indicates the presence of inhibitory compounds as seen in samples from San Francisco Bay⁸ (Fig.1a. KAPA-CTRL). When an inhibition removal step was introduced to the same protocol using Zymo, amplification was observed, but primarily of off-target sequences, as evidenced by bands appearing higher than the expected ~400 bp with adapters (Fig.1b. KAPA-ZYMO). Further improvements were achieved by utilizing Platinum SuperFi II polymerase (Fig.1c. Platinum-ZYMO), which resulted in amplification within the expected size range for fish (~300bp with adapters). This underscores the effectiveness of a tailored polymerase in enhancing target-specific amplification while reducing off-target effects. The improved specificity can be attributed to the hot-start mechanism of the Platinum SuperFi II polymerase, which activates the enzyme only at high temperatures, thereby preventing the extension of misprimed targets and primer-dimers. Additionally, the high processivity and fidelity of this polymerase help minimize the occurrence of nonspecific products. Our experience illustrates that while adding an inhibition removal step incurs additional costs, it substantially enhances species detection by mitigating the impact of PCR inhibitors present in estuarine samples.

Balancing Amplification Fidelity and Costs

Inhibition is prevalent across many estuarine samples processed in our lab, often muting or completely preventing PCR amplification. Laboratory managers might opt to apply inhibition removal selectively based on initial PCR results or incorporate it routinely into the workflow to consistently improve outcomes. Additionally, challenges with off-target amplification are common, particularly when using the KAPA protocol post-inhibitor removal with Zymo (Fig.1b). The selective application of advanced polymerases can significantly enhance the fidelity of amplification (Fig.1c), focusing sequencing efforts more efficiently on target species and reducing interference from non-target sequences. There is a risk however, that an overly specific polymerase will result in non-detection of some target species.

Optimizing Fish Detection

To enhance the accuracy and efficiency of fish species detection in estuarine environments, we implemented several methodological adjustments (see Methods). When developing this study, we selected sites known from our previous eDNA analyses to be problematic due to turbid water and common inhibitors (Table S2, Supplementary Information). Even relatively small increases in the number of species detected through eDNA significantly improve the method's utility in these challenging conditions. KAPA-CTRL, our standard lab protocol without Zymo cleanup, established a baseline of 59 species detected across all sites. Introducing an inhibition removal step with Zymo (KAPA-ZYMO) increased the mean number of fish species from 3 to 5 per sample, for a total of 78 species, which illustrates the effectiveness of Zymo cleanup in our settings. Trials with Platinum SuperFi II were initially conducted without Zymo cleanup. However, amplification was poor, indicating high sensitivity to inhibitors, and these trials did not progress to sequencing (data not shown). Zymo cleanup was therefore included in all subsequent experiments. With the combination of Zymo cleanup and a refined PCR touchdown program using Platinum SuperFi II (Platinum-ZYMO), the mean number of fish species rose to 7 per sample, with a total of 98 species detected across all sites, an overall increase of approximately 25.64% over the KAPA-ZYMO setup (Fig. 2a).

The impact of inhibition removal on ASV numbers was minimal (mean of 13 ASVs for KAPA-CTRL and 14 with KAPA-ZYMO); however, the comprehensive integration of Zymo with the Platinum protocol markedly increased the mean number of ASVs per sample to 46 (Fig. 2b).

When considering individual sites, each of the four locations showed an increase in fish detection, regardless of the different environmental conditions at each site (Fig. 3). Specifically, the number of species at the Jacques Cousteau (JC) site rose from 29 with KAPA-ZYMO to 42 with Platinum-ZYMO, an increase of approximately 44.83%. Similarly, the number of unique species at Padilla Bay (PB) increased from 17 to 26, a 52.94% rise, and Mission-Aransas (MA) experienced a 34.78% increase, with species counts rising from 23 to 31. The San Francisco (SF) site saw a more modest increase of 25.00%, from 20 to 25 species. These enhancements across sites can be attributed to the combined efficacy of Zymo cleanup and Platinum amplification, integrated with a modified touchdown PCR program, which collectively improved the detection of fish across these diverse estuarine environments.
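The percentage increases quoted above follow directly from the reported species counts. The short Python sketch below recomputes them as a consistency check; the counts come from the text, while the dictionary layout and function name are illustrative choices and not part of the authors' analysis.

```python
# Species counts per site as (KAPA-ZYMO, Platinum-ZYMO), taken from the text.
species_counts = {
    "JC": (29, 42),  # Jacques Cousteau
    "PB": (17, 26),  # Padilla Bay
    "MA": (23, 31),  # Mission-Aransas
    "SF": (20, 25),  # San Francisco
}


def percent_increase(before: int, after: int) -> float:
    """Relative increase from `before` to `after`, in percent, rounded to 2 dp."""
    return round(100.0 * (after - before) / before, 2)


site_increase = {
    site: percent_increase(before, after)
    for site, (before, after) in species_counts.items()
}

# Totals across all sites reported in the text: 78 (KAPA-ZYMO) vs 98 (Platinum-ZYMO).
overall_increase = percent_increase(78, 98)

for site, pct in site_increase.items():
    print(f"{site}: +{pct:.2f}%")
print(f"All sites: +{overall_increase:.2f}%")
```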

The MiFish-E primer was added in response to feedback from resource managers that elasmobranchs (rays and sharks) were not being detected in areas where they were known to be present. Although we did detect skate (family Rajidae) with MiFish-U alone, the combined primers detected four additional elasmobranch taxa belonging to three different families: Dasyatidae (Dasyatidae spp., Hypanus sabinus), Rajidae (Rajidae spp.), and Rhinopteridae (Rhinoptera spp.).

Polymerase Optimization and Enhanced Read Counts

The Platinum-ZYMO method yields more fish reads because off-target amplification is reduced, leaving more sequencing capacity for the target species. Fig. 4a shows a strong correlation between raw reads and fish reads using the Platinum-ZYMO approach (β = 0.615, p < 0.0001, R² = 0.95). The correlation is weaker, although still significant (β = 0.0000451, p = 0.00844, R² = 0.42), for ASVs (Fig. 4b).

Reducing off-target amplification and increasing fish reads has several positive implications for resource management: low-abundance sequences are more likely to be amplified, which supports identification of rare species and increases detection sensitivity. Read counts for individual species are higher, allowing greater confidence in detections. Higher numbers of more diverse ASVs also raise the possibility of finer-scale resolution of genetic diversity within species. A maximum likelihood tree of unique fish ASVs detected in samples from the San Francisco Bay estuary suggests that the highest diversity of unique ASVs was identified with the Platinum-ZYMO method (Fig. 5).

Challenges of ASV Interpretation

The interpretation of amplicon sequence variants (ASVs) presents unique challenges, particularly in distinguishing true biological signals from technical noise. The abundance of reads can reflect the nuances of extraction, PCR, and sequencing processes rather than actual biological abundance. Rare ASVs, which may appear only once across datasets, can represent genuinely rare biological entities, heteroplasmy¹³, or PCR artifacts such as errors or stochastic variations in sample processing.

In our analysis, the high sensitivity of the Platinum SuperFi II protocol combined with Zymo cleanup (PLATINUM-ZYMO) enabled the detection of a broader array of fish ASVs, capturing true biodiversity more effectively (Fig. 5). For example, in San Francisco Bay, which was the most refractory site due to increased PCR inhibition, KAPA-ZYMO detected 40 ASVs assigned to 20 species, whereas PLATINUM-ZYMO yielded 284 unique ASVs assigned to 25 species (Fig. 5, ML Tree). Applying a cutoff threshold of 1% of fish reads to remove rare ASVs resulted in 26 ASVs for KAPA-ZYMO assigned to 19 species and 25 ASVs for PLATINUM-ZYMO assigned to 20 species. The 1% cutoff, however, represented only 40 reads for KAPA-ZYMO but 3,010 reads for PLATINUM-ZYMO.

By setting a minimum cutoff of 50 reads per ASV for PLATINUM-ZYMO, we successfully recovered all 25 fish species and the majority of ASVs (223). This demonstrates the method's robustness in detecting species while managing potential artifacts. Denoising methods like DADA2, which we employed, enhance the reliability of detected ASVs by removing errors and chimeras, thereby reflecting true biological variation.
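The contrast between a relative cutoff (1% of fish reads) and an absolute cutoff (50 reads per ASV) can be illustrated with a minimal Python sketch. The function name and the read counts are hypothetical; the totals are chosen only so that 1% corresponds to 40 reads in the shallow library and 3,010 reads in the deep one, matching the figures quoted above. This is not the authors' filtering code.

```python
from typing import Dict, Optional


def filter_asvs(
    read_counts: Dict[str, int],
    min_fraction: Optional[float] = None,
    min_reads: Optional[int] = None,
) -> Dict[str, int]:
    """Keep ASVs whose read count meets every threshold that is supplied."""
    total = sum(read_counts.values())
    threshold = 0.0
    if min_fraction is not None:
        threshold = max(threshold, min_fraction * total)
    if min_reads is not None:
        threshold = max(threshold, float(min_reads))
    return {asv: n for asv, n in read_counts.items() if n >= threshold}


# Hypothetical libraries (made-up counts, for illustration only).
shallow = {"asv_a": 3000, "asv_b": 900, "asv_c": 55, "asv_d": 45}       # 4,000 reads
deep = {"asv_a": 250000, "asv_b": 48000, "asv_c": 2900, "asv_d": 100}  # 301,000 reads

# A 1% cutoff means 40 reads in the shallow library but 3,010 in the deep one,
# so the same ASVs survive in one library and are discarded in the other.
shallow_relative = filter_asvs(shallow, min_fraction=0.01)
deep_relative = filter_asvs(deep, min_fraction=0.01)

# A fixed 50-read cutoff does not scale with sequencing depth.
deep_absolute = filter_asvs(deep, min_reads=50)

print(sorted(shallow_relative), sorted(deep_relative), sorted(deep_absolute))
```

Because a relative threshold grows with library size, a deeply sequenced library can lose well-supported ASVs that a shallow library would keep, which is why the choice of cutoff matters when comparing protocols with very different fish read yields.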

Rare ASVs can either be single ASVs for a species (rare taxa) or secondary ASVs that show natural variation or heteroplasmy (rare haplotypes). While they contribute to species richness, careful interpretation is necessary to avoid distorting community composition analyses. Bioinformatic processing using denoising methods such as DADA2, along with clustering, distance-based methods, and establishing appropriate cutoff thresholds, are critical steps in validating the authenticity of rare ASVs.

PCR Replicates

Replicate samples may be collected or generated at any point in the eDNA workflow, from field sampling to sequencing replicates. Multiple studies have highlighted the benefits of replicates in enhancing the reliability of biological data while also recognizing that additional samples increase the cost of analyses ^14–16. Balancing the value of the additional information against budget and time constraints is crucial for researchers and managers. In this protocol, we recommend conducting triplicate PCRs for each sample, which are then pooled back into one sample for sequencing. Our testing of PCR replicates (see Methods section) in this sample set confirms earlier findings ¹⁷: increasing PCR replicates substantially improves the detection of fish species, supporting data robustness and reducing the likelihood of spurious results due to PCR variability (Fig. S3, Supplementary Information).

Application to other water systems.

Although this workflow was designed to improve fish detection in estuarine environments, it may be useful in other water systems. We tested an additional data set collected from freshwater streams in New England (Fig. 6) where turbidity and inhibition have not been noted. In these 32 samples the new approach detected a significantly higher number of fish ASVs (p-value < 0.001) with an average of 16 for the test control (KAPA.CTRL) and 34 with the new protocol (Platinum.ZYMO). Those ASVs were assigned to 41 species for KAPA.CTRL and 46 for Platinum.ZYMO. This may be useful when surveying rare species or other organisms where amplifying read counts could enhance detection.

The final recommended workflow (Fig. 7) includes a bead-based extraction performed on a robotic system, inhibition removal, multiplexed primers with Platinum SuperFi II polymerase and triplicate PCR (Fig. S3). The addition of the Zymo cleanup, a more expensive polymerase and triplicate PCR increases the overall costs; researchers and managers should consider project goals and site conditions when selecting a laboratory protocol. Our goal was to combine existing, proven methods into a protocol that optimizes detection of fish species in estuaries. All the components of this workflow have been used in other studies, and the steps are expected to be familiar and readily adapted by labs conducting eDNA work. The PCR steps can be completed by hand or on a robotic system, allowing high-throughput processing of large sample numbers.

This workflow optimizes fish detection with the MiFish 12S primer. Adding additional primers targeting other mitochondrial regions ^20,21 would likely increase the number of species detected.

Method Validation

To validate our workflow, we tested the automated DNA extraction and the three PCR protocols in Table 1 on samples collected from four estuaries. Forty-eight samples were collected during the same period (May 2023) from four different locations within each estuary (12 from each site) to ensure that seasonal or time-related variables do not confound across-site comparisons.

We then reviewed the results to evaluate the steps that provide the greatest benefit in terms of number of fish species and fish Amplicon Sequence Variants (ASVs). For clarity and consistency, we use the term species to refer to both Operational Taxonomic Units (OTUs) and instances where ASVs could not be matched to the species level but were assigned to a higher taxonomic level (e.g. the genus Fundulus).

Field collection

The samples used in this study were collected from a range of NERRS sites (see above) as part of a pilot project incorporating eDNA into a long-term multi-site monitoring network. All the samples were 1-liter water samples, collected from the surface, then transported to a nearby lab for filtering. Each sample was filtered through 1.5 μm porosity glass fiber filters, and filters were replaced if clogging occurred. The filters were stored in pre-prepared tubes containing 4 ml Longmire’s buffer (Longmire et al. 1997) to preserve the samples. Up to 3 filters from the same sample were included in one tube.

Laboratory analyses and sequencing

All lab work was conducted in dedicated Biosafety Level 2 (BSL2) molecular laboratory facilities at the University of New Hampshire. All steps from sample extraction through sequencing were performed in separate, dedicated spaces within the building. We adhered strictly to standard laboratory cleaning practices for BSL2 including separation of Pre and Post-PCR laboratory spaces, physical separation of PCR Mix preparation, and DNA addition space. All experiments were conducted under laminar flow hoods and Biosafety cabinets for sample preparation, DNA extraction, purification and amplification. All surfaces were treated with 70% ETOH, 10% bleach and dH2O and were exposed to UV light before and after experimentation.

DNA extraction

Extraction Method Comparison.

Bead-based extraction methods have been successfully applied in many eDNA studies ^22,23. We adapted this technique for use on a KingFisher Flex System (Thermo Fisher, Waltham, MA). To evaluate its effectiveness, we compared the KingFisher protocol ¹⁸ to a Qiagen-based extraction that we have used in previous studies ²⁴.

KingFisher automated extraction: Sample vials containing filters in 400 μl Longmire's buffer were received and entered into the lab tracking system. 10 μl of Proteinase K (0.02X) was mixed with 90 μl of Longmire buffer (LM) and added to each sample vial, bringing the total volume to 500 μl. The mixture was incubated at 56°C for 90 minutes. For DNA isolation, 320 μl of Illumina SPRI paramagnetic beads (CAT: 20060057) were added to the lysed samples to ensure a 1:8 sample-to-beads ratio. DNA extraction was performed using the KingFisher Flex System (Thermo Fisher, Waltham, MA), which included two 80% ethanol washes, followed by elution in 100 μl of 10 mM Tris-HCl. Our detailed protocol is available on protocols.io¹⁸.

Qiagen-based extraction: Filters were placed in a lyse and spin basket with 400 μl of buffer ATL and 20 μl of proteinase K and incubated at 56°C for one hour, then centrifuged. The remainder of the filter extraction was performed on a QIAcube Connect system (QIAGEN®, Hilden, Germany) following the QIAamp DNA Mini protocol (Qiagen Cat. 51326). This kit is designed for extraction of DNA from tissue and is readily adapted to the QIAcube automated system. Although the reagents are proprietary, the manufacturer notes that the method removes inhibitors and contaminants.

Samples from both methods were eluted in 100 μl. DNA concentration was determined using the Qubit dsDNA HS Assay kit (Thermo Fisher, Waltham, MA), per the manufacturer's instructions. To monitor contamination, extraction blanks, consisting of H2O and plain buffer without a filter, were extracted with each sample set. Once extracted, 50 μl of the sample was diluted 1:5 and stored in a -20°C freezer for use in this project. The remaining sample was archived in a -80°C freezer for potential future use.

Inhibition Removal. ZYMO OneStep-96 PCR Inhibitor Removal Kit (CAT: D6035) was used to remove inhibition from all samples following the manufacturer's instructions. Briefly, for 96 well plates, 150 µl of Prep Solution were added to each Silicon-A™-HRC Plate well, incubated for 5 minutes, and then centrifuged at 3,500 x g for 5 minutes. Subsequently, 50 µl of DNA was added to the prepared plate, mounted on a new Elution Plate, and centrifuged at 3,500 x g for 3 minutes. The filtered DNA was used for subsequent PCR reactions.

PCR Verification. Amplification of target regions was verified on 2% E-Gel electrophoresis (Thermo Fisher, Waltham, MA, CAT: G820802). Absent or muted bands in samples with verified DNA concentrations indicate that the samples are likely to contain inhibiting agents. E-Gel images are also used as an initial screen for successful amplification of positive controls and absence of contamination in negative controls.

PCR Amplification - Platinum. We found that the touchdown profile reduced off-target amplification, but for many sites the target band was still effectively hidden by amplification of bacteria. To increase target amplification, we used Platinum SuperFi II polymerase. This enzyme was designed for higher fidelity and specificity, increased resistance to PCR inhibitors, and a universal primer annealing temperature of 60°C, which allows multiplex PCR. However, we found that some samples amplified using the Platinum polymerase still exhibited inhibition. In these cases, adding an inhibition removal step improved amplification.

Amplification Protocol: The PCR reaction was set up in a total volume of 20 μl, comprising 10 μl of Platinum SuperFi II Master Mix (2X), 2 μl of each equimolar primer mix (MiFish-U and MiFish-E, 5 μM each), 4 μl of genomic DNA, and 2 μl of PCR-grade H2O. Nextera adapters were added to each primer. The amplification followed a touchdown (TD) PCR strategy to reduce non-specific amplification. The lid temperature was set to 105°C and the reaction volume to 20 μl. Initial denaturation at 95°C for 3 minutes was followed by 13 touchdown cycles of 94°C for 30 seconds, annealing for 30 seconds starting at 69.5°C and decreasing by 1.5°C per cycle, and extension at 72°C for 1 minute and 30 seconds. This was followed by 25 cycles of 94°C for 30 seconds, 60°C for 30 seconds, and 72°C for 45 seconds, then a final extension at 72°C for 10 minutes and an indefinite hold at 4°C. A step-by-step protocol is available on protocols.io ^18,19.
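To make the touchdown schedule concrete, the following Python sketch expands the cycling parameters exactly as they are written above into a per-cycle list of annealing temperatures. The function and variable names are illustrative assumptions; this is a sketch for checking a thermocycler program, not the authors' instrument file.

```python
from typing import List


def touchdown_annealing_temps(start_c: float, step_c: float, cycles: int) -> List[float]:
    """Annealing temperature for each touchdown cycle, dropping by step_c per cycle."""
    return [start_c - step_c * i for i in range(cycles)]


# Parameters as stated in the text.
TD_START_C = 69.5        # annealing temperature of the first touchdown cycle
TD_STEP_C = 1.5          # decrease per cycle
TD_CYCLES = 13           # number of touchdown cycles
FIXED_ANNEAL_C = 60.0    # annealing temperature of the second phase (Platinum)
FIXED_CYCLES = 25        # number of cycles in the second phase

td_annealing_c = touchdown_annealing_temps(TD_START_C, TD_STEP_C, TD_CYCLES)
annealing_by_cycle = td_annealing_c + [FIXED_ANNEAL_C] * FIXED_CYCLES
total_cycles = len(annealing_by_cycle)

for cycle, temp in enumerate(annealing_by_cycle, start=1):
    print(f"cycle {cycle:2d}: anneal at {temp:.1f} C")
```

For the KAPA protocol the same expansion applies with the second-phase annealing temperature set to 55°C, as listed in Table 1.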

PCR replicates.

The effect of PCR replicates on fish species detection was tested with the optimized protocol incorporating Platinum and Zymo cleanup. Samples from each of the four sites were tested. For each sample, three replicate PCRs were performed. These replicates were then sequenced as separate samples.

PCR Amplification - KAPA. Our initial protocol used KAPA HiFi HotStart (KAPA) and a touchdown (TD) thermal profile modified from Pitz et al. (2020). The cycling conditions were the same as described for Platinum above, except that the annealing temperature of the second cycling phase was set to 55°C. The PCR reaction was set up in a total volume of 12 μl, comprising 6 μl of KAPA Master Mix (2X), 0.7 μl of the forward and 0.7 μl of the reverse MiFish-U primers, 2 μl of genomic DNA, and 2.6 μl of PCR-grade H2O.
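As a quick check on the two reaction recipes in this section, the Python sketch below sums the component volumes and scales them for a batch. The Platinum recipe assumes that "2 μl of each equimolar primer mix" means 2 μl of the forward mix plus 2 μl of the reverse mix; the 10% pipetting overage and the batch size of 144 reactions (48 samples in triplicate) are illustrative assumptions, not values from the protocol.

```python
import math
from typing import Dict

# Per-reaction volumes in microlitres, as given in the text.
platinum_mix_ul = {
    "Platinum SuperFi II Master Mix (2X)": 10.0,
    "MiFish-MIX-F primer mix": 2.0,   # assumed reading of "2 ul of each primer mix"
    "MiFish-MIX-R primer mix": 2.0,
    "template DNA": 4.0,
    "PCR-grade H2O": 2.0,
}
kapa_mix_ul = {
    "KAPA Master Mix (2X)": 6.0,
    "MiFish-U forward primer": 0.7,
    "MiFish-U reverse primer": 0.7,
    "template DNA": 2.0,
    "PCR-grade H2O": 2.6,
}


def scale_mix(mix_ul: Dict[str, float], n_reactions: int, overage: float = 0.10) -> Dict[str, float]:
    """Batch volumes for every component except template, with a pipetting overage."""
    factor = n_reactions * (1.0 + overage)
    return {name: vol * factor for name, vol in mix_ul.items() if name != "template DNA"}


platinum_total = sum(platinum_mix_ul.values())
kapa_total = sum(kapa_mix_ul.values())
print(f"Platinum reaction: {platinum_total:.1f} ul, KAPA reaction: {kapa_total:.1f} ul")

# Hypothetical batch: 48 samples amplified in triplicate.
platinum_batch = scale_mix(platinum_mix_ul, n_reactions=144)
for name, vol in platinum_batch.items():
    print(f"{name}: {vol:.1f} ul")
```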

Primers. The MiFish-U primer set was designed to amplify bony fish species, while MiFish-E ³ targets elasmobranchs (sharks and rays). Both groups are important in estuarine systems, and multiplexing these primers expands the list of potential target species.

Table 2 shows the primer and adapter sequences used in this study.

Table 2. Primer & adapter sequences

MiFish-MIX-F (forward)

MiFish-U-F: 5′- TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGGTCGGTAAAACTCGTGCCAGC-3′

MiFish-E-F: 5′- TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGGTTGGTAAATCTCGTGCCAGC-3′

MiFish-MIX-R (reverse)

MiFish-U-R: 5′- GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGCATAGTGGGGTATCTAATCCCAGTTTG-3′

MiFish-E-R: 5′- GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGCATAGTGGGGTATCTAATCCTAGTTTG-3′
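Each oligo in Table 2 is an adapter overhang followed by the gene-specific MiFish primer. As a minimal Python sketch, the code below strips the shared adapter prefix and reports where the U and E primers differ. The adapter strings are taken as the common leading portion of the oligos in Table 2, and the helper names are illustrative.

```python
from typing import List

# Adapter overhangs: the shared prefix of the forward and reverse oligos in Table 2.
FWD_ADAPTER = "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG"
REV_ADAPTER = "GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG"

OLIGOS = {
    "MiFish-U-F": "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGGTCGGTAAAACTCGTGCCAGC",
    "MiFish-E-F": "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGGTTGGTAAATCTCGTGCCAGC",
    "MiFish-U-R": "GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGCATAGTGGGGTATCTAATCCCAGTTTG",
    "MiFish-E-R": "GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGCATAGTGGGGTATCTAATCCTAGTTTG",
}


def strip_adapter(oligo: str, adapter: str) -> str:
    """Return the gene-specific primer that follows the adapter overhang."""
    if not oligo.startswith(adapter):
        raise ValueError("oligo does not start with the expected adapter")
    return oligo[len(adapter):]


def mismatch_positions(a: str, b: str) -> List[int]:
    """0-based positions at which two equal-length primers differ."""
    if len(a) != len(b):
        raise ValueError("primers must have equal length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


primers = {
    name: strip_adapter(seq, FWD_ADAPTER if name.endswith("-F") else REV_ADAPTER)
    for name, seq in OLIGOS.items()
}

fwd_diff = mismatch_positions(primers["MiFish-U-F"], primers["MiFish-E-F"])
rev_diff = mismatch_positions(primers["MiFish-U-R"], primers["MiFish-E-R"])
print(primers)
print("forward U/E mismatches:", fwd_diff, "reverse U/E mismatches:", rev_diff)
```

The U and E primers differ at only a few positions, which is consistent with running them together in one multiplexed reaction at a single annealing temperature.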

Library preparation and Sequencing

PCR products were submitted to the Hubbard Center for Genome Studies (HCGS) for library preparation and sequencing on an Illumina NovaSeq 6000 instrument. Briefly, the amplicon products from the first PCR were prepared by incorporating dual-index barcodes and Illumina sequencing adapters (P5 and P7) into the DNA fragments. The PCR amplification conditions were as follows: initial denaturation at 94°C for 3 minutes, followed by 15 cycles of denaturation at 94°C for 20 seconds, annealing at 55°C for 30 seconds, and extension at 72°C for 15 seconds. A final extension was conducted at 72°C for 7 minutes to ensure complete amplification of target sequences. The amplified libraries were purified using BluePippin to selectively remove short fragments and primer dimers.

Sequence Processing and Taxonomic Assignment. Fastp v0.23.0 was used to trim poly-G tails and low-quality sequences (mean quality < 25, window size 2 bp) from the raw, demultiplexed fastq files ²⁵. The demultiplexed fastq files were then imported into QIIME 2 (qiime2-2022.8) for adapter trimming, denoising, and taxonomic classification ²⁶. The cutadapt plugin in QIIME 2 was used to trim primer sequences and filter out read-pairs that did not contain the priming sequences ²⁷. Read-pairs were retained if they matched the MiFish primer sequences within the allowed error rate of 0.1. Retained read-pairs were then denoised, merged, and checked for chimeras with the DADA2 QIIME 2 plugin ²⁸. Forward and reverse reads were truncated to 110 and 105 base-pairs respectively and were required to overlap by at least 12 base-pairs (the QIIME 2 default) to be merged. To reduce the number of false positive unique ASVs, we used the ‘pseudo’ pooling method, in which samples are initially denoised independently and then denoised a second time with prior knowledge of the ASVs found in at least two samples in the initial denoising step. We then classified the ASVs with the ‘classify-consensus-vsearch’ and scikit-learn feature classifiers in QIIME 2 using the MitoFish database⁵. We ran feature classification at default parameters, except that we set the minimum percent identity (perc-identity) to 90% (default is 80%) and the maximum number of accepted reference sequences (maxaccepts) to ‘all’ instead of the default top 10. The BLAST search results were retained for confirming classifications for each ASV.
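The truncation lengths and the minimum overlap together set an upper limit on the amplicon length that can still be merged. The Python sketch below works through that arithmetic. The truncation lengths (110 and 105) and the 12 bp minimum overlap come from the text; the example insert lengths are assumptions for illustration, based on the commonly cited MiFish target range of roughly 163 to 185 bp after primer removal.

```python
TRUNC_FORWARD = 110   # forward read length after truncation (from the text)
TRUNC_REVERSE = 105   # reverse read length after truncation (from the text)
MIN_OVERLAP = 12      # minimum overlap required to merge (from the text)


def merged_overlap(trunc_f: int, trunc_r: int, amplicon_len: int) -> int:
    """Number of bases shared by the two truncated reads for a given amplicon length."""
    return trunc_f + trunc_r - amplicon_len


# Longest amplicon that still satisfies the minimum overlap.
max_mergeable_length = TRUNC_FORWARD + TRUNC_REVERSE - MIN_OVERLAP

# Illustrative insert lengths (assumed, not measured in this study).
example_lengths = [163, 170, 185, 210]
overlaps = {n: merged_overlap(TRUNC_FORWARD, TRUNC_REVERSE, n) for n in example_lengths}
mergeable = {n: ov >= MIN_OVERLAP for n, ov in overlaps.items()}

print(f"maximum mergeable amplicon length: {max_mergeable_length} bp")
for n in example_lengths:
    print(f"{n} bp insert -> overlap {overlaps[n]} bp, mergeable: {mergeable[n]}")
```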

Phylogenetic Analysis

Amplicon Sequence Variants (ASVs) identified from each treatment were aligned using the MAFFT software ²⁹. The alignment was optimized under the E-INS-i algorithm using a gap open penalty of 1.5 and an offset value of 0.1, with the scoring matrix fine-tuned using a kappa value of 50. These settings were chosen to enhance the alignment accuracy of closely related fish ASVs. The aligned ASVs were used to reconstruct phylogenetic trees through the maximum likelihood method implemented in RAxML ³⁰. The trees were generated under the General Time Reversible (GTR) model with a gamma distribution to accommodate rate variation across sites. This modeling choice is particularly effective for sequences exhibiting high variability, as is common with environmental DNA samples from diverse fish populations.

Statistical Analyses

A one-way Analysis of Variance (ANOVA) was performed separately for Fish OTUs and Fish ASVs using the `aov` function in R to determine if the observed differences using our optimized protocol were statistically significant from the other treatments. The null hypothesis for each ANOVA was that the means of Fish OTUs and Fish ASVs for all treatment groups were equal.

The primary objective of the statistical analysis was to evaluate the effects of different treatments on both Fish OTUs and Fish ASVs. The treatments were KAPA.CTRL (control, without Zymo clean-up), KAPA.ZYMO (with Zymo clean-up), and Platinum.ZYMO (with Zymo clean-up). The p-values for the treatment comparisons were extracted from the ANOVA results. If the p-value was less than 0.05, indicating significant differences among the treatment groups, a post-hoc Tukey's Honest Significant Difference (HSD) test was conducted using the `TukeyHSD` function in R to identify which specific treatment groups differed from each other. All statistical analyses were conducted in R (version 4.3.2). To further validate our optimized protocol on data collected from stream waters in New England, we used a t-test to determine whether the difference between the control treatment (KAPA.CTRL) and the optimized protocol (Platinum.ZYMO) was statistically significant.
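For readers who want to see what the one-way ANOVA computes, the Python sketch below derives the F statistic from between-group and within-group sums of squares. The analysis in this study was run in R; this code is only an illustration of the calculation, and the per-sample species counts are made-up values chosen so that the group means equal the reported means of 3, 5, and 7 species per sample.

```python
from statistics import mean
from typing import List, Sequence, Tuple


def one_way_anova_f(groups: Sequence[Sequence[float]]) -> Tuple[float, int, int]:
    """Return (F statistic, between-group df, within-group df) for a one-way ANOVA."""
    all_values: List[float] = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical species counts per sample (illustration only, not study data).
kapa_ctrl = [2, 3, 4]
kapa_zymo = [4, 5, 6]
platinum_zymo = [6, 7, 8]

f_stat, df_between, df_within = one_way_anova_f([kapa_ctrl, kapa_zymo, platinum_zymo])
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

The p-value is then obtained by comparing this statistic with the F distribution on the same degrees of freedom, which is the step that `aov` reports in R.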

Author Contributions

A.W. and F.E.B. conceived the experiment, designed the methodology, and led the writing of the manuscript. F.E.B. conducted the experiments and optimization steps. M.K. and W.K.T. conducted the initial DNA extraction optimization. J.M. and J.S. conducted the bioinformatic analyses. H.G. conducted initial sample processing. A.W., F.E.B., and J.M. analyzed data. J.S. and W.K.T. analyzed stream data. All authors contributed critically to the drafts and gave their final approval for publication.

Data availability statement

The raw datasets used and analyzed in this study are available on the NCBI Sequence Read Archive (SRA) under submission number SUB14620788. Supplementary data (Version v3), including ASVs and comprehensive lists of detected fish species, can be accessed on Zenodo at https://doi.org/10.5281/zenodo.12753119.

Acknowledgment

We thank the NERRS Research Collaborative, NOAA, and specifically the following NERRS sites: San Francisco Bay, Padilla Bay, Mission Aransas, and Jacques Cousteau.

Bioinformatic analyses were supported by New Hampshire-INBRE through an Institutional Development Award (IDeA), P20GM103506, from the National Institute of General Medical Sciences of the NIH.

Conflict of Interest Declaration

The authors declare no conflict of interest.

References

Blackman, R. et al. Environmental DNA: The next chapter. Mol Ecol 33, (2024).
Nagarajan, R. P. et al. Environmental DNA Methods for Ecological Monitoring and Biodiversity Assessment in Estuaries. Estuaries and Coasts 45, 2254–2273 (2022). https://doi.org/10.1007/s12237-022-01080-y
Miya, M. et al. MiFish, a set of universal PCR primers for metabarcoding environmental DNA from fishes: Detection of more than 230 subtropical marine species. R Soc Open Sci 2, (2015).
Kawato, M. et al. Optimization of environmental DNA extraction and amplification methods for metabarcoding of deep-sea fish. MethodsX 8, (2021).
Zhu, T., Sato, Y., Sado, T., Miya, M. & Iwasaki, W. MitoFish, MitoAnnotator, and MiFish Pipeline: Updates in 10 Years. Mol Biol Evol 40, (2023).
Pitz, K., Truelove, N., Nye, C., Michisaki, R. P. & Chavez, F. Environmental DNA (eDNA) 12S Metabarcoding Illumina MiSeq NGS PCR Protocol (Touchdown) V.2. (2020) doi:10.17504/protocols.io.bcppivmn.
Stoeckle, M. Y. et al. Current laboratory protocols for detecting fish species with environmental DNA optimize sensitivity and reproducibility, especially for more abundant populations. ICES Journal of Marine Science 79, 403–412 (2022).
Holmes, A. E. et al. Evaluating environmental DNA detection of a rare fish in turbid water using field and experimental approaches. PeerJ 12, (2024).
Kumar, G., Farrell, E., Reaume, A. M., Eble, J. A. & Gaither, M. R. One size does not fit all: Tuning eDNA protocols for high- and low-turbidity water sampling. Environmental DNA 4, 167–180 (2022).
Hunter, M. E., Ferrante, J. A., Meigs-Friend, G. & Ulmer, A. Improving eDNA yield and inhibitor reduction through increased water volumes and multi-filter isolation techniques. Sci Rep 9, (2019).
Schrader, C., Schielke, A., Ellerbroek, L. & Johne, R. PCR inhibitors - occurrence, properties and removal. Journal of Applied Microbiology 113, (2012). https://doi.org/10.1111/j.1365-2672.2012.05384.x
Stoeckle, M. Y. et al. Current laboratory protocols for detecting fish species with environmental DNA optimize sensitivity and reproducibility, especially for more abundant populations. ICES Journal of Marine Science 79, 403–412 (2022).
Rensch, T., Villar, D., Horvath, J., Odom, D. T. & Flicek, P. Mitochondrial heteroplasmy in vertebrates using ChIP-sequencing data. Genome Biol 17, (2016).
Macher, T. H. et al. Beyond fish eDNA metabarcoding: Field replicates disproportionately improve the detection of stream associated vertebrate species. Metabarcoding Metagenom 5, 59–71 (2021).
Shirazi, S., Meyer, R. S. & Shapiro, B. Revisiting the effect of PCR replication and sequencing depth on biodiversity metrics in environmental DNA metabarcoding. Ecol Evol 11, 15766–15779 (2021).
Stauffer, S. et al. How many replicates to accurately estimate fish biodiversity using environmental DNA on coral reefs? Ecol Evol 11, 14630–14643 (2021).
Miya, M. Environmental DNA Metabarcoding: A Novel Method for Biodiversity Monitoring of Marine Fish Communities. (2021) doi:10.1146/annurev-marine-041421.
El Baidouri, F. et al. Automated eDNA Extraction from Estuarine Samples Using Magnetic Beads. (2024) doi:10.17504/protocols.io.5jyl82jn9l2w/v1.
El Baidouri, F., Gilbert, H. L., Watts, A., El, F. & Unh, B. 12S PCR Metabarcoding Protocol for Fish Detection in Estuarine Samples. (2024) doi:10.17504/protocols.io.3byl49wqogo5/v1.
Riaz, T. et al. EcoPrimers: Inference of new DNA barcode markers from whole genome sequence analysis. Nucleic Acids Res 39, (2011).
Evans, N. T. et al. Fish community assessment with eDNA metabarcoding: Effects of sampling design and bioinformatic filtering. Canadian Journal of Fisheries and Aquatic Sciences 74, 1362–1374 (2017).
Guthrie, A. M., Nevill, P., Cooper, C. E., Bateman, P. W. & van der Heyde, M. On a roll: a direct comparison of extraction methods for the recovery of eDNA from roller swabbing of surfaces. BMC Res Notes 16, (2023).
Sanches, T. M. & Schreier, A. D. Optimizing an eDNA protocol for estuarine environments: Balancing sensitivity, cost and time. PLoS One 15, (2020).
Crane, L. C., Goldstein, J. S., Thomas, D. W., Rexroth, K. S. & Watts, A. W. Effects of life stage on eDNA detection of the invasive European green crab (Carcinus maenas) in estuarine systems. Ecol Indic 124, (2021).
Chen, S., Zhou, Y., Chen, Y. & Gu, J. fastp: An ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34, i884–i890 (2018).
Bolyen, E. et al. Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2. Nature Biotechnology 37, (2019). https://doi.org/10.1038/s41587-019-0209-9
Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J 17, (2011).
Callahan, B. J. et al. DADA2: High-resolution sample inference from Illumina amplicon data. Nat Methods 13, (2016).
Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software version 7: Improvements in performance and usability. Mol Biol Evol 30, (2013).
Stamatakis, A. RAxML version 8: A tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30, (2014).


Reviews received at journal
11 Sep, 2024
Reviewers agreed at journal
31 Aug, 2024
Reviewers invited by journal
12 Aug, 2024
Editor assigned by journal
03 Aug, 2024
Editor invited by journal
29 Jul, 2024
Submission checks completed at journal
23 Jul, 2024
First submitted to journal
14 Jul, 2024
