Optimizing staining through fractional factorial experimental design
Our first objective was to test whether a fractional factorial experimental design can yield an improved protocol for FVM-based detection of viruses treated with a general nucleic-acid stain. We used the protocols developed by Brussaard et al. and Huang et al. as scaffolding for our design. Brussaard et al. recommended fixing the sample with glutaraldehyde at a final concentration of 0.5%, flash-freezing in liquid nitrogen, diluting in Tris-EDTA (TE) buffer, staining with SYBR Green I at a final dilution of 5 × 10⁻⁵ of the commercial stock, and incubating the sample with the stain for 10 min in the dark at 80°C. Huang et al. concluded that better results for reclaimed-water samples could be obtained by using a 0.2% glutaraldehyde concentration, omitting flash-freezing, staining at room temperature for 15 minutes, using SYBR Gold instead of SYBR Green I, and staining at a final dilution of 1 × 10⁻⁴. We combined treatment steps from the two protocols into a 2⁶⁻² resolution IV fractional factorial experimental design (replicated 4×) to assess main and interaction effects of six two-level factors on nucleic-acid staining of T4 for FVM analysis: (1) stain concentration, (2) staining temperature, (3) staining time, (4) additive (glutaraldehyde), (5) diluent, and (6) stain type.
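For readers unfamiliar with the notation, the following minimal sketch shows how the 16 unique runs of a two-level, six-factor, quarter-fraction design can be enumerated. The factor names, the mapping of factors to columns, and the generators (E = ABC, F = BCD, a common textbook choice for a resolution IV design) are assumptions made for illustration; they are not necessarily the ones used in this study.

```python
from itertools import product

# Factor names follow the six factors listed in the text. The column order and
# the generators E = ABC, F = BCD are ASSUMPTIONS for illustration only.
FACTORS = ["stain_conc", "stain_temp", "stain_time",
           "additive", "diluent", "stain_type"]

def fractional_factorial_2_6_2():
    """Return the 16 runs of a 2^(6-2) design as dicts of -1/+1 levels."""
    runs = []
    for a, b, c, d in product((-1, 1), repeat=4):  # full factorial in A-D
        e = a * b * c  # generator E = ABC
        f = b * c * d  # generator F = BCD
        runs.append(dict(zip(FACTORS, (a, b, c, d, e, f))))
    return runs

design = fractional_factorial_2_6_2()
# Four replicates of every unique run, as in a design "replicated 4x".
replicated = [dict(run, replicate=r) for r in range(1, 5) for run in design]

print(len(design), "unique runs;", len(replicated), "runs including replicates")
```

With these generators, half of the 16 unique runs sit at the high level of each factor, so a two-level additive factor is present in eight runs and absent in eight. In any resolution IV design, main effects are not aliased with two-factor interactions, but two-factor interactions are aliased with one another.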
A representative suite of results plots is displayed in Figure 1. Collective results from the T4 optimization are also summarized graphically in Figure S1. A distinct target population was only visible for the eight glutaraldehyde-treated runs. Indeed, glutaraldehyde addition had a highly significant (p < 0.001) effect on total event count, mean fluorescence intensity (MFI; a measure of brightness achieved through nucleic-acid staining), and the fluorescence coefficient of variation (CV; a measure of the tightness of the target population). Adding glutaraldehyde increased the total sample event count by 65,402 events, increased MFI by 360 units, and decreased fluorescence CV by 9 percentage points.
There are three possible explanations for the glutaraldehyde-induced increase in sample event count:
1. Glutaraldehyde increases the presence of fluorescent phantom events (e.g., colloidal particles [15]).
2. Glutaraldehyde raises the fluorescence of non-target events (e.g., bacterial debris) above the fluorescence threshold.
3. Glutaraldehyde raises the fluorescence of target events (here, T4) above the fluorescence threshold.
To test explanations (1) and (2), we used FVM to compare untreated and glutaraldehyde-treated 0.2-µm-filtered phosphate-buffered saline (PBS) after staining with SYBR Gold. We also compared FVM data collected on untreated and glutaraldehyde-treated samples of the negative control (bacterial host propagated and purified without virus infection) stained with SYBR Gold. In neither case did FVM reveal a distinct target population or a substantial increase in event count after glutaraldehyde addition. These results suggest that glutaraldehyde addition not only helps visibly separate the target signal from non-target events, but also increases the absolute number of target events detected through FVM. The target event count for the eight runs that incorporated glutaraldehyde was approximately 10⁹–10¹⁰ events/mL: about an order of magnitude greater than the qPCR-based titer (10⁸–10⁹ gc/mL) and about two orders of magnitude greater than the culture-based titer (10⁷–10⁸ PFU/mL). These discrepancies may be attributed to factors such as non-specific staining of particles (e.g., cellular debris) in FVM, losses during DNA extraction for qPCR, and the aforementioned challenges with plate-based culturing.
The fractional factorial design enabled quantification of main and two-way interaction effects for each factor tested. Results are shown in Figure 2 and Table S1. We performed this quantification first on all events within analysis bounds. Though the quantification analysis suggested the presence of numerous significant main effects as well as several significant two-way interaction effects between glutaraldehyde and other experimental factors, results were compromised by the fact that the analysis did not distinguish between target and non-target events. Because a distinct target population was only visible for glutaraldehyde-treated runs, and because our goal was to develop a staining protocol that most successfully separates the target population from background, we also performed the quantification using only data from target events identified in glutaraldehyde-treated runs.
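As an illustration of what quantifying a main or two-way interaction effect means in a two-level design, the sketch below computes each effect as the mean response at the high level minus the mean response at the low level. The function names and the toy response values are made up for illustration and are not data from this study; significance testing is omitted.

```python
from itertools import product
from statistics import mean

def main_effect(levels, response):
    """Mean response at the high (+1) level minus mean at the low (-1) level."""
    high = [y for x, y in zip(levels, response) if x == 1]
    low = [y for x, y in zip(levels, response) if x == -1]
    return mean(high) - mean(low)

def interaction_effect(levels_a, levels_b, response):
    """Two-way interaction: the main effect of the elementwise product column."""
    product_column = [a * b for a, b in zip(levels_a, levels_b)]
    return main_effect(product_column, response)

# Toy 2^2 layout with a made-up response (NOT data from the study):
# y = 10 + 3*A - 2*B + 1*A*B, so the true effects are 6, -4, and 2.
runs = list(product((-1, 1), repeat=2))
A = [a for a, _ in runs]
B = [b for _, b in runs]
y = [10 + 3 * a - 2 * b + 1 * a * b for a, b in runs]

print(main_effect(A, y), main_effect(B, y), interaction_effect(A, B, y))
```

In a fractional design the same arithmetic applies, but each estimated two-way interaction also carries the contribution of any interaction aliased with it.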
No statistically significant two-way interaction effects were observed in the target-only analysis. However, including glutaraldehyde as a variable in the experimental design meant that only a small subset of two-way interaction effects between non-glutaraldehyde factors was analyzed. Future work could explore other possible two-way interaction effects. The target-only quantification analysis also did not identify any statistically significant main effects on MFI. Diluent was the only variable that had a significant main effect on event count: the main effect of using TE buffer instead of MQ water was −7,807 events, with a p-value of 0.023. This result may be explained by the increased tendency of free stain to form colloids in low-ionic-strength water [16].
Stain temperature and diluent had highly significant (p < 0.001) main effects on CV. Staining at 50°C decreased CV by 2.7 percentage points; using TE buffer decreased CV by 4.4 percentage points. Stain concentration had a strongly significant (0.001 < p < 0.01) effect on CV: staining at 1 × 10⁻⁴ times the sample volume increased CV by 1.8 percentage points. Stain time and stain type both had significant (0.01 < p < 0.05) main effects on CV. Staining for 15 minutes decreased CV by 1.2 percentage points; staining with SYBR Gold rather than SYBR Green I increased CV by 1.5 percentage points. We conclude that stain temperature and diluent are the most important sample-preparation factors besides glutaraldehyde addition. In other words, dilution in TE buffer and staining at 50°C meaningfully increase the “tightness” of the T4 fluorescence signal, thereby aiding discrimination of T4 from background.
We also conclude that using SYBR Green I (instead of SYBR Gold) and staining for 15 minutes (instead of 1 minute) could improve target discrimination of T4 slightly further. But these small potential gains must be weighed against drawbacks. Staining for one minute is more conducive to near-real-time FVM analysis than staining for 15. Moreover, SYBR Green I exhibits a large fluorescence enhancement upon binding to DNA but not RNA. A protocol using SYBR Green I will be less effective than SYBR Gold at detecting a wide variety of viruses, since the latter exhibits a large fluorescence enhancement upon binding to DNA and RNA. Future work could explore these tradeoffs for environmental samples.
Overall, our results suggest that a protocol for reliably identifying and quantifying T4 bacteriophage through FVM involves a combination of treatments recommended by Brussaard et al. and Huang et al. We recommend diluting the sample in TE buffer to achieve an FVM analysis rate of about 10²–10³ events/second, adding glutaraldehyde at a final concentration of 0.5%, and staining with either SYBR Green I or SYBR Gold (depending on whether the species of interest include both DNA and RNA viruses) at 5 × 10⁻⁵ times the sample volume at 50°C for at least 1 minute.
Automating data analysis through density-based clustering
Clustering approach
Our second objective was to explore cluster analysis as an objective, automated alternative to manual gating. Specifically, we tested whether density-based clustering can aid and improve analysis of viral surrogates in complex matrices. The OPTICS algorithm developed by Ankerst et al. (1999) underlies the most widely used density-based clustering strategies [17]. OPTICS outputs all points in a dataset ordered by a characteristic “reachability distance”, and generates a reachability plot that can be used to identify clusters by looking for “valleys” of low reachability distance separated by “peaks” of noise. The most straightforward way to extract clusters from the reachability plot is to set a single global reachability threshold. Unfortunately, this approach fails when the number of targets and the spatial density of FVM data generated by those targets are variable, as is often the case in real-world environmental samples (Figure S2).
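The limitation of a single global threshold can be seen in a toy example. The sketch below applies a simplified threshold rule to a hand-made reachability sequence (not data from this study) in which two dense clusters are separated by a low peak and a third, sparser cluster sits in a shallower valley. The function name and all values are hypothetical, and the rule ignores core distances, so it is only an approximation of how threshold-based extraction is usually implemented.

```python
def extract_by_threshold(reachability, threshold):
    """Label points in OPTICS order. Consecutive points whose reachability is
    below the threshold form one cluster; all other points are noise (-1)."""
    labels, current, in_cluster = [], -1, False
    for r in reachability:
        if r < threshold:
            if not in_cluster:  # a new valley starts a new cluster
                current += 1
                in_cluster = True
            labels.append(current)
        else:
            labels.append(-1)
            in_cluster = False
    return labels

# Hand-made reachability plot: dense cluster, low peak (0.5), dense cluster,
# high peak (3.0), then a sparse cluster whose valley sits at 0.8.
reach = [3.0, 0.1, 0.1, 0.1, 0.5, 0.1, 0.1, 0.1, 3.0, 0.8, 0.8, 0.8]

low = extract_by_threshold(reach, 0.3)   # separates dense clusters, misses sparse one
high = extract_by_threshold(reach, 1.0)  # finds sparse cluster, merges dense ones
print(low)
print(high)
```

Because the sparse cluster's valley (0.8) lies above the peak separating the two dense clusters (0.5), no single threshold recovers all three clusters in this toy sequence.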
Alternative options are (i) extracting clusters via manual selection of peaks and valleys on the reachability plot, or (ii) using an algorithm to perform the selection automatically (Figure S3). Ankerst et al. suggested extracting clusters automatically by identifying “steep up” and “steep down” areas on the reachability plot characterized by the ξ steepness parameter, but ξ must be laboriously tuned by trial and error. The opticskxi package available in R [18] provides a variant cluster-extraction algorithm that “iteratively investigates the largest differences” in steepness until either a given number of clusters is defined or the maximum number of iterations is reached [19]. We compared results obtained through manual gating to results obtained through OPTICS combined with either manual or opticskxi-based cluster extraction[1] for two datasets, as described below.
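To give a feel for extraction that targets a fixed number of clusters, the sketch below cuts a reachability sequence at its k − 1 highest interior peaks. This is a deliberately simplified stand-in and is not the opticskxi algorithm, which works on steepness differences and is not reproduced here. The function name and the hand-made reachability values are hypothetical.

```python
def split_at_largest_peaks(reachability, k):
    """Split an OPTICS ordering into k contiguous clusters by cutting at the
    k - 1 highest interior local maxima of the reachability plot.
    Simplified illustration only; NOT the opticskxi algorithm."""
    n = len(reachability)
    # Interior local maxima: higher than the previous point, not lower than the next.
    peaks = [i for i in range(1, n - 1)
             if reachability[i] > reachability[i - 1]
             and reachability[i] >= reachability[i + 1]]
    highest = sorted(peaks, key=lambda i: reachability[i], reverse=True)[:k - 1]
    cuts = set(highest)
    labels, current = [], 0
    for i in range(n):
        if i in cuts:  # the peak point opens the next cluster
            current += 1
        labels.append(current)
    return labels

# Same style of hand-made reachability plot: two dense clusters split by a low
# peak (0.5), then a high peak (3.0) followed by a sparse cluster.
reach = [3.0, 0.1, 0.1, 0.1, 0.5, 0.1, 0.1, 0.1, 3.0, 0.8, 0.8, 0.8]

print(split_at_largest_peaks(reach, 3))
print(split_at_largest_peaks(reach, 2))
```

Unlike a global threshold, this rule adapts to valleys of different depth, but it assigns every point, including points near the peaks, to some cluster, and its output depends entirely on the chosen k.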
Mixed-target experiment
A variety of microbiological targets may be present and of interest in a real-world setting such as a water-treatment plant. To test whether density-based clustering can accurately detect and quantify waterborne viruses alongside other specimens, we prepared a solution containing known concentrations of viral and non-viral targets in the submicron size range. The targets were φ6 and T4 bacteriophages as well as fluorescent polystyrene beads of 0.2, 0.5, and 0.8 µm in diameter. T4 was included as an environmentally relevant viral surrogate that generates a clear FVM signal; φ6 was included to represent viral classes that are not detectable through FVM as distinct populations but may still generate an indeterminate “virus-like particle (VLP)” signal [6]; and beads were included because they are highly uniform and similar in size to many viral and bacterial classes. Combining biological and engineered targets enabled us to test the performance of density-based clustering on a mixed-density dataset. We collected FVM data on 10 replicates of each of five dilutions of the mixed-target solution. The 0.8 µm bead component was kept undiluted as a control/reference.
Figures 3, 4, and 5 respectively illustrate results from manual gating, OPTICS ordering + manual extraction, and OPTICS ordering + opticskxi-based extraction of the mixed-target data. We note several features of the results. First, manual extraction labeled far more points as noise than did opticskxi. This is because for manual extraction, we separated valleys from peaks by setting cutpoints at the apparent “knees” of the reachability plot curves. Opticskxi, by contrast, tends to set cutpoints at or near the curve peaks.
Second, the different data-analysis strategies yielded somewhat different clusters. In manual gating we drew six gates: one for each of the three bead sizes, one for T4, one for φ6 and other virus-like particles (VLPs), and one for an additional apparent cluster corresponding to 0.5 µm bead doublets.[2] Neither manual extraction nor opticskxi identified a cluster matching the manual gates drawn for φ6/VLPs and for the 0.5 µm doublet. Manual extraction tended to identify events falling within these gates as noise, while opticskxi tended to assign events falling into the φ6/VLP gate to the T4 cluster and events falling into the 0.5 µm doublet gate to the 0.5 µm bead cluster. Both OPTICS-based approaches frequently detected two separate clusters within the side scatter (SSC) vs. fluorescence (FITC) region designated by manual gating as corresponding to 0.2 µm beads. Inspecting the data revealed that some of the events with the same SSC and FITC signal intensity ranges had meaningfully different forward scatter (FSC) signal intensities.
To numerically compare results across the different data-analysis approaches, then, we established four “buckets” corresponding to (1) viruses (including T4, φ6, and other VLPs), (2) 0.2 µm beads, (3) 0.5 µm beads (including 0.5 µm doublets), and (4) 0.8 µm beads. Tables S2 and S3 show expected and average detected event counts across the three approaches for each bucket; Figure 6 plots these data. There were clear differences between the theoretical and detected event counts for each bucket. Event counts were higher than expected for the 0.2 and 0.5 µm bead buckets, slightly lower than expected for the 0.8 µm bead bucket, and much lower than expected for the virus bucket. The bead-bucket discrepancies can be explained by the fact that manufacturer-provided concentrations of the bead solutions were only approximate (accurate to within an order of magnitude). Discrepancies for the virus bucket can be explained by the fact that φ6, a small and difficult-to-stain enveloped virus, emits only a faint FITC signal. A majority of the φ6 particles spiked into the mixed-target solution were likely not stained brightly enough to rise above the FITC limit of detection [6].
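The bucketing step can be sketched as a simple mapping from per-event cluster or gate labels to the four buckets, followed by a comparison against expected counts. The label names, counts, and expected values below are hypothetical and serve only to illustrate the bookkeeping; they are not values from Tables S2 and S3.

```python
from collections import Counter

# Hypothetical cluster/gate names; the four buckets mirror those in the text.
BUCKET_OF = {
    "T4": "virus",
    "phi6_VLP": "virus",
    "bead_0.2um": "bead_0.2um",
    "bead_0.5um": "bead_0.5um",
    "bead_0.5um_doublet": "bead_0.5um",
    "bead_0.8um": "bead_0.8um",
}

def bucket_counts(event_labels):
    """Count events per bucket; labels not in BUCKET_OF (e.g. noise) are skipped."""
    return Counter(BUCKET_OF[lab] for lab in event_labels if lab in BUCKET_OF)

def detected_to_expected(detected, expected):
    """Ratio of detected to expected counts for each bucket with an expectation."""
    return {b: detected.get(b, 0) / n for b, n in expected.items() if n > 0}

# Made-up per-event labels and expectations, for illustration only.
labels = (["T4"] * 30 + ["phi6_VLP"] * 10 + ["bead_0.5um"] * 45
          + ["bead_0.5um_doublet"] * 5 + ["noise"] * 10)
expected = {"virus": 80, "bead_0.5um": 40}

counts = bucket_counts(labels)
ratios = detected_to_expected(counts, expected)
print(dict(counts), ratios)
```

Merging gates into coarser buckets in this way allows approaches that produce different numbers of clusters to be compared on a common footing.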
Focusing on detected event counts, results were generally consistent across all three data-analysis approaches for the bead buckets. For the virus bucket, event counts from manual gating and opticskxi were similar to each other but generally higher than event counts from manual extraction. This is because while engineered particles generate tightly grouped data of fairly uniform density, viral targets tend to generate more unevenly dispersed FVM data. Consider how each of our three data-analysis approaches handled the clusters associated with T4 and φ6. For manual gating, we established relatively large T4 and φ6 gates. Any point falling within these gates was hence categorized as part of the virus bucket. For the OPTICS-based methods, there was not a clear shift in reachability distance marking the transition from T4 to φ6/VLPs: reachability distance increased gradually towards the border of the T4 cluster, then increased at roughly the same rate as the T4 cluster border bled into the φ6/VLP region. This resulted in manual extraction and opticskxi delivering divergent results. Opticskxi tended to assign high-reachability-distance points included in a given reachability curve (points near the peak) to the same cluster as low-reachability-distance points (points near the valley). Since OPTICS placed many points corresponding to the T4 and φ6/VLP regions on the same reachability-plot curve, opticskxi assigned those points to the T4 cluster. By contrast, manually set cutpoints assigned points near the valley of the T4/φ6/VLP curve to the T4 cluster and points near the peak to noise.
Environmental-spike experiment
We performed a modified version of the mixed-target experiment to assess whether automated clustering can accurately detect and quantify waterborne viruses in a challenging environmental matrix, where the presence of an increased background signal could confound FVM analysis and/or alter the target signal.[3] Specifically, we spiked a T4/bead solution into tertiary-treated, 0.2 µm-filtered wastewater effluent diluted 10×. The T4/bead solution was the same as the one used in the mixed-target experiment, but with φ6 and 0.5 µm beads omitted. We also prepared an identical but unspiked solution for comparison. We again collected FVM data on 10 replicates each of the spiked and unspiked solutions, analyzing data by both manual gating and density-based clustering.
Figures 7, 8, and 9 illustrate results from manual gating, OPTICS ordering + manual extraction, and OPTICS ordering + opticskxi-based extraction of environmental-spike data. Manual gating identified the three targets: a 0.8 µm bead cluster, a 0.2 µm bead cluster, and, for the T4-spiked sample, a T4 cluster partially obscured by signal from the wastewater matrix but still clearly within the previously established T4 gate. Detected event counts obtained through manual gating were roughly in line with expected event counts, exhibiting the same discrepancies observed in the mixed-target experiment (Table S4). We also observed a low-SSC, high-FITC cluster in most of the replicate runs for both the T4-spiked and unspiked samples. The identity of particles in this cluster is unknown.
The two OPTICS-based clustering approaches yielded quite different results. As in the mixed-target experiment, manual cluster extraction successfully detected the 0.8 µm bead cluster, the 0.2 µm bead cluster, and often a sub-cluster in the 0.2 µm bead zone corresponding to particles exhibiting similar FITC and SSC intensities but different FSC intensities. Manual extraction also detected one or more clusters in the low-FITC, low-SSC region corresponding to φ6/VLPs in the mixed-target experiment, and hence to background (including natural virus particles) in the wastewater matrix. Manual extraction typically did not clearly distinguish the T4 cluster, nor did it detect the low-SSC, high-FITC cluster of unknown identity.
For opticskxi-based cluster extraction, the constraining k parameter meant that opticskxi did not yield as many clusters as manual extraction. Rather, opticskxi consistently detected a 0.8 µm bead cluster, a cluster that included the 0.2 µm beads but also many apparent noise points, and a cluster that included the T4/VLP/background region. The latter sometimes spilled over to include much of the 0.2 µm bead region. Opticskxi occasionally detected the higher-FSC sub-cluster in the 0.2 µm bead region, occasionally detected the low-SSC, high-FITC cluster of unknown identity, and never detected a clearly distinct T4 cluster.
We also generated OPTICS orderings of the environmental-spike data using only the SSC vs. FITC dimensions. We did so because (i) the reachability plots from the environmental-spike data were so complex, (ii) we set manual gates exclusively based on the SSC vs. FITC pseudocolor density plot (i.e., without considering FSC), and (iii) we were concerned (as discussed further below) that OPTICS might over-weight FSC signal intensities for virus data. Figures S4 and S5 contain representative plots illustrating manual extraction and opticskxi results, respectively, using these reduced-dimension orderings. The reachability plots of these orderings were simpler but did not yield substantially better results, especially for detecting T4.