HV69-70 and Del156-157/R158G mutation assay development for Alpha and Delta variants. The HV69-70 and Del156-157/R158G mutation assays were developed in silico using Primer3Plus (https://primer3plus.com/) to target mutations present in the Alpha and Delta variants of concern, respectively. These mutations were chosen because they are present in high percentages of the associated variant sequences in GISAID and because they are deletions; PCR assays targeting deletions are generally more specific than those targeting single nucleotide polymorphisms (SNPs). Mutation sequences were obtained from outbreak.info/compare-lineages. The parameters used in assay development, which controlled sequence length, GC content, and melt temperatures, are provided in the Supplementary Information (SI). Primer and probe sequences are provided in Table 1, and schematics of the assays are provided in Figures S1, S2, and S3.
Primer and probe sequences were screened for specificity in silico using NCBI BLAST and then tested in vitro against the following: a virus panel (NATtrol™ Respiratory Verification Panel, ZeptoMetrix) that includes several influenza viruses and coronaviruses; “wild-type” gRNA from SARS-CoV-2 strain 2019-nCoV/USA-WA1/2020 (ATCC® VR-1986D™), which does not contain the mutations (hereafter referred to as WT-gRNA); heat-inactivated SARS-CoV-2 Alpha (SARS-CoV-2 variant B.1.1.7, ATCC® VR-3326HK™); and synthetic gRNA from Twist Bioscience (South San Francisco, California, USA) for the Beta (Twist control 16), Gamma (Twist control 17), Kappa (Twist control 18), and Delta (Twist control 23) variants. Details on sample processing are provided in the SI. The sensitivity and specificity of the mutation assays were further tested by diluting variant gRNA containing the mutations in no (0 copies), low (100 copies), and high (10,000 copies) backgrounds of WT-gRNA (details in SI).
Wastewater sample collection. Two publicly owned treatment works (POTWs) that serve populations of Santa Clara County, California, USA (San Jose, SJ) and Sacramento County, California, USA (Sacramento, SAC) were included in the study. Each serves approximately 1,500,000 residents; further descriptions of these POTWs can be found in Wolfe et al.14.
Samples were collected by POTW staff using sterile technique in clean, labeled bottles. Approximately 50 ml of settled solids were collected each study day from each POTW. At SAC, settled solids samples were grab samples. At SJ, POTW staff manually collected a 24 h composite sample. Samples were immediately stored at 4°C, transported to the lab, and processed within 6 hours of collection.
Samples were collected daily for a larger wastewater surveillance effort starting in October 2020 (Wolfe et al.14), and a subset of these samples is used in the present study. The dates of samples included in this study were chosen to span the period prior to and including the presumed emergence of the Alpha and Delta variants in the communities. Prior to the presumed emergence of the Alpha and Delta variants, 1-4 samples per month were analyzed; during periods thought to correlate with emergence and spread, samples were analyzed from three times per week to daily. At SJ, 133 and 48 samples were included for HV69-70 and del156-157/R158G analyses, respectively. At SAC, 64 and 48 samples were included for HV69-70 and del156-157/R158G analyses, respectively. Samples from as early as July 2020 and as late as August 2021 were included.
Samples were processed as described in Wolfe et al.21. Briefly, solids were dewatered by centrifugation and a dry weight measurement was taken. Solids were resuspended in a DNA/RNA shield solution (Zymo Research) spiked with Bovine Coronavirus (BCoV) as a positive recovery control and homogenized. Prior to 26 May 2021, RNA was extracted from the homogenized sample. Thereafter, the final vortexing step was omitted and RNA was extracted from the resultant supernatant. Comparative analysis suggests that removing the final vortexing step reduced variability among replicates. RNA was extracted using the Chemagic™ Viral DNA/RNA 300 Kit H96 for the PerkinElmer Chemagic 360, followed by PCR inhibitor removal with the Zymo OneStep-96 PCR Inhibitor Removal Kit. RNA was then processed immediately (within 24 h of sample collection) to measure concentrations of the N gene and PMMoV, as well as BCoV recovery. The N gene codes for the SARS-CoV-2 nucleocapsid, and the specific region of the genome targeted by our assay is conserved across all SARS-CoV-2 genomes. PMMoV is highly abundant in human stool and domestic wastewater globally22,23 and is used here as an internal recovery and fecal strength control for the wastewater samples. RNA samples from San Jose (SJ) were stored between 0 and 7 days before they were run for HV69-70 and the N gene in a multiplex assay. RNA samples from SJ and Sacramento (SAC) were stored between 15 and 190 days prior to being analyzed for del156-157/R158G and the N gene, and RNA samples from SAC were stored between 15 and 300 days prior to being analyzed for HV69-70 and the N gene. The N gene was run a second time in all samples as a check for RNA degradation during storage.
Further details are available in the SI.
ddRT-PCR. Droplet digital RT-PCR assays for the mutations and N gene targets (see Table 1 for primer and probe sequences, purchased from IDT) were performed on 20 µl samples from a 22 µl reaction volume, prepared using 5.5 µl of template mixed with 5.5 µl of One-Step RT-ddPCR Advanced Kit for Probes (Bio-Rad 1863021), 2.2 µl of Reverse Transcriptase, 1.1 µl of DTT, and primers and probes at final concentrations of 900 nM and 250 nM, respectively. Droplets were generated using the AutoDG Automated Droplet Generator (Bio-Rad). PCR was performed using a Mastercycler Pro with cycling conditions described in the SI. Droplets were analyzed using the QX200 Droplet Reader (Bio-Rad). All liquid transfers were performed using the Agilent Bravo (Agilent Technologies).
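The per-reaction volumes above can be scaled to a plate with simple arithmetic. The following is a minimal sketch, not the authors' protocol: the function names and the `overage` parameter are hypothetical, and the volume left after template, kit reagent, Reverse Transcriptase, and DTT is assumed here to be taken up by primers, probes, and water.

```python
# Per-reaction volumes (µl) as stated in the text; the total reaction volume is 22 µl.
REACTION_VOLUME_UL = 22.0
PER_REACTION_UL = {
    "template": 5.5,
    "kit_reagent": 5.5,            # One-Step RT-ddPCR Advanced Kit for Probes
    "reverse_transcriptase": 2.2,
    "dtt": 1.1,
}


def remaining_volume_ul():
    """Volume per reaction not accounted for by the listed components.

    Assumption: this remainder holds primers, probes, and water.
    """
    return REACTION_VOLUME_UL - sum(PER_REACTION_UL.values())


def master_mix_volumes(n_wells, overage=0.10):
    """Scale the non-template components to n_wells plus a pipetting overage.

    The overage fraction is a hypothetical convenience, not a value from the text.
    """
    scale = n_wells * (1.0 + overage)
    mix = {
        name: volume * scale
        for name, volume in PER_REACTION_UL.items()
        if name != "template"  # template is added per well, not to the master mix
    }
    mix["primers_probes_water"] = remaining_volume_ul() * scale
    return mix


if __name__ == "__main__":
    for component, volume in master_mix_volumes(n_wells=96).items():
        print(f"{component}: {volume:.1f} µl")
```

Under these assumptions, 7.7 µl of each 22 µl reaction remains for primers, probes, and water.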
Positive controls consisted of BCoV and PMMoV gene block controls (dsDNA purchased from IDT), gRNA of SARS-CoV-2 (strain 2019-nCoV/USA-WA1/2020, ATCC® VR-1986D™), gRNA of SARS-CoV-2 Alpha lineage (SARS-CoV-2 variant B.1.1.7, ATCC® VR-3326HK™), and gene block controls for Del156-157/R158G. Details of the number of wells assayed for samples and controls are listed in the SI. Results from replicate wells were merged for analysis. Thresholding was done using QuantaSoft™ Analysis Pro Software (Bio-Rad, version 1.0.596). A sample was recorded as positive only if it had at least 3 positive droplets. For the wastewater samples, three positive droplets correspond to a concentration between ~500 and 1000 cp/g; the range in values is a result of the range in the equivalent mass of dry solids added to the wells.
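The positivity rule and the droplet-based quantification can be illustrated with the standard Poisson correction used in digital PCR. This is a minimal sketch rather than the QuantaSoft implementation: the function name is hypothetical, and the nominal droplet volume of 0.85 nl is an assumption that is not stated in the text.

```python
import math

DROPLET_VOLUME_UL = 0.00085  # ASSUMED nominal droplet volume (0.85 nl); not from the text
MIN_POSITIVE_DROPLETS = 3    # positivity threshold stated in the text


def merged_well_concentration(positive, total, droplet_volume_ul=DROPLET_VOLUME_UL):
    """Return (copies per µl of reaction, is_positive) for merged replicate wells.

    `positive` and `total` are droplet counts summed over the replicate wells.
    """
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need total > 0 and 0 <= positive < total")
    # Poisson correction: mean copies per droplet from the fraction of negative droplets.
    copies_per_droplet = -math.log(1.0 - positive / total)
    concentration = copies_per_droplet / droplet_volume_ul
    return concentration, positive >= MIN_POSITIVE_DROPLETS


if __name__ == "__main__":
    # Illustrative droplet counts only; not data from the study.
    conc, is_positive = merged_well_concentration(positive=3, total=100_000)
    print(f"{conc:.4f} copies/µl reaction, positive={is_positive}")
```

Because the correction depends on the total number of accepted droplets, merging replicate wells before applying it lowers the concentration that three positive droplets represent.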
For the wastewater samples, concentrations of RNA targets were converted to concentrations per dry weight of solids in units of copies/g dry weight using dimensional analysis. The total error is reported as standard deviations and includes the errors associated with the Poisson distribution and the variability among the 10 replicates. The recovery of BCoV was determined by normalizing the concentration of BCoV by the expected concentration given the value measured in the spiked DNA/RNA shield. BCoV was used solely as a process control; samples were rerun in cases where the recovery of BCoV was less than 10%. All wastewater data, along with additional reporting details per the MIQE24 and EMMI25 guidelines, are publicly available at the Stanford Digital Repository (https://doi.org/10.25740/zf117dn1545).
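The dimensional analysis and the BCoV recovery check can be sketched as follows. This is an illustration under stated assumptions, not the authors' calculation: the function names, the eluate volume, and the equivalent dry mass are hypothetical parameters, and the example numbers are not study data. Only the 22 µl reaction volume, the 5.5 µl template volume, and the 10% rerun threshold come from the text.

```python
def copies_per_gram_dry_weight(
    copies_per_ul_reaction,
    eluate_volume_ul,          # hypothetical: total RNA eluate volume from the extraction
    dry_mass_extracted_g,      # hypothetical: equivalent dry mass of solids in that eluate
    reaction_volume_ul=22.0,   # from the text
    template_volume_ul=5.5,    # from the text
):
    """Convert a ddRT-PCR reaction concentration to copies per g dry weight."""
    # copies/µl reaction -> copies/µl template (undo dilution of template in the reaction)
    copies_per_ul_template = copies_per_ul_reaction * reaction_volume_ul / template_volume_ul
    # copies/µl template -> total copies in the eluate -> copies per g of dry solids
    copies_in_eluate = copies_per_ul_template * eluate_volume_ul
    return copies_in_eluate / dry_mass_extracted_g


def bcov_recovery(measured_bcov, expected_bcov):
    """Fraction of spiked BCoV recovered (measured divided by expected)."""
    return measured_bcov / expected_bcov


def needs_rerun(recovery, minimum_recovery=0.10):
    """Samples with BCoV recovery below 10% were rerun, per the text."""
    return recovery < minimum_recovery


if __name__ == "__main__":
    # Illustrative values only.
    print(copies_per_gram_dry_weight(1.0, eluate_volume_ul=50.0, dry_mass_extracted_g=0.01))
    print(needs_rerun(bcov_recovery(500.0, 10_000.0)))
```

Keeping each unit conversion on its own line makes it easy to check that the units cancel to copies/g.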
Incident COVID-19 cases and COVID-19 case isolate sequences. Sewershed service area boundary shapefiles were provided by each POTW. Sewershed boundaries were compared to ZIP code boundaries (2020 US Census TIGER/Line shapefiles) in ArcGIS Pro version 2.7.3 (ESRI, Redlands, CA). All ZIP codes with >50% of land area within sewershed boundaries were considered to be within the POTW service area and used for further analysis. The number of PCR-confirmed COVID-19 cases reported to CDPH as a function of episode date (earliest of either specimen collection date or date of symptom onset) among individuals residing within each sewershed was determined using methods reported previously21.
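The ZIP code inclusion rule reduces to a comparison of areas once the overlap between each ZIP code and the sewershed has been computed in a GIS. The sketch below assumes those areas are already available as plain numbers; the function name, the dictionary layout, and the placeholder ZIP labels are hypothetical, and the geometric intersection itself (done in ArcGIS Pro in the study) is not reproduced.

```python
def zips_in_service_area(zip_land_area, overlap_area, threshold=0.5):
    """Return ZIP codes with more than `threshold` of their land area in the sewershed.

    zip_land_area: dict of ZIP code -> total land area
    overlap_area:  dict of ZIP code -> land area inside the sewershed (same units)
    """
    selected = []
    for zip_code, total in zip_land_area.items():
        inside = overlap_area.get(zip_code, 0.0)
        # Strictly greater than 50%, matching the ">50%" rule in the text.
        if total > 0 and inside / total > threshold:
            selected.append(zip_code)
    return sorted(selected)


if __name__ == "__main__":
    # Placeholder labels and areas, for illustration only.
    land = {"ZIP_A": 10.0, "ZIP_B": 8.0, "ZIP_C": 4.0}
    inside = {"ZIP_A": 6.0, "ZIP_B": 4.0}
    print(zips_in_service_area(land, inside))
```

A ZIP code with exactly half of its land area inside the boundary is excluded under this strict reading of the rule.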
COVID-19 case isolate whole genome sequence data used in this study were sourced from results provided to CDPH by the CDC, which included results from laboratory partners contracted to sequence case isolates across the United States. Sequence data were assigned to a sewershed based on the ZIP code provided for the sample, with the time of sample collection assigned as the date. The PANGO lineage was assigned based on the version available at the time the data were extracted, with the most recent results using pangoLEARN and pango-designation v1.2.66 (ref. 26).
Estimates of VOC abundance between February 1 and July 31, 2021 were calculated by dividing the number of sequences identified as Alpha or Delta (according to WHO definition and including all PANGO sublineages Q.* and AY.*, respectively) by the total number of isolates sequenced from individuals residing in the sewersheds with specimens collected over the previous 14 days. To estimate the proportion of total COVID-19 cases with an isolate that was sequenced, we divided the total number of isolates sequenced by the number of COVID-19 cases reported to CDPH, by episode date, over the same 14-day period and sewershed. To estimate the time needed between isolate sample collection, sequencing, and report receipt, and the impact of that time delay on VOC estimates for the sewershed community, we compared changing 14-day VOC abundance estimates over time against a final estimate that was generated after the end of the analysis period (August 24, 2021).
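The 14-day VOC abundance and sequencing-coverage estimates can be sketched as below. This is an illustration, not the authors' code: the function names and record layout are hypothetical, the window is assumed to cover the 14 days ending on and including the estimate date, and B.1.617.2 as the parent Delta lineage is added from general knowledge rather than from the text.

```python
from datetime import date, timedelta


def who_label(pango_lineage):
    """Map a PANGO lineage to Alpha, Delta, or Other, including Q.* and AY.* sublineages."""
    if pango_lineage == "B.1.1.7" or pango_lineage.startswith("Q."):
        return "Alpha"
    if pango_lineage == "B.1.617.2" or pango_lineage.startswith("AY."):
        return "Delta"
    return "Other"


def trailing_voc_fraction(isolates, as_of, voc, window_days=14):
    """Fraction of isolates collected in the trailing window that belong to `voc`.

    isolates: iterable of (collection_date, pango_lineage) tuples.
    Returns None when no isolates fall in the window.
    """
    start = as_of - timedelta(days=window_days)
    labels = [who_label(lineage) for day, lineage in isolates if start < day <= as_of]
    if not labels:
        return None
    return sum(1 for label in labels if label == voc) / len(labels)


def sequenced_fraction(n_sequenced, n_cases):
    """Proportion of reported cases with a sequenced isolate over the same window."""
    return n_sequenced / n_cases if n_cases else None


if __name__ == "__main__":
    # Illustrative records only; not study data.
    records = [
        (date(2021, 5, 20), "B.1.1.7"),
        (date(2021, 6, 1), "Q.3"),
        (date(2021, 6, 10), "AY.4"),
        (date(2021, 6, 14), "B.1.617.2"),
    ]
    print(trailing_voc_fraction(records, date(2021, 6, 14), "Delta"))
```

Re-running the same calculation on successive data extracts shows how an estimate for a given date changes as delayed sequencing reports arrive, which is the comparison against the final estimate described above.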
Pearson correlations were performed between the wastewater mutation and case isolate variant data sets, comparing the mean ratio of each mutation (HV69-70 or del156-157/R158G) to the N gene in wastewater with the proportion of sequenced case isolates characterized as Alpha or Delta, each averaged over the previous 14 days. Statistical significance was set at p<0.05; analyses were carried out using RStudio version 1.4.1106.
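The correlation itself was computed in R; the Python sketch below only illustrates the calculation, with hypothetical function names and made-up input values. It returns Pearson's r and the associated t statistic; obtaining the p-value would additionally require the t distribution with n - 2 degrees of freedom, which is left out to keep the example dependency-free.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two sequences of equal length with at least 3 points")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


if __name__ == "__main__":
    # Made-up 14-day averages: mutation/N ratio in wastewater vs. VOC fraction in isolates.
    wastewater_ratio = [0.02, 0.10, 0.25, 0.45, 0.70]
    isolate_fraction = [0.05, 0.12, 0.30, 0.50, 0.65]
    r = pearson_r(wastewater_ratio, isolate_fraction)
    print(f"r = {r:.3f}, t = {t_statistic(r, len(wastewater_ratio)):.2f}")
```

Because both series are 14-day trailing averages, consecutive points overlap in time, which is worth keeping in mind when interpreting the resulting p-value.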