Patterns of E. coli abundance associated with environmental and urban infrastructure features. To gain insight into the ecological processes modulating the population structure of E. coli, we assessed the joint and independent contributions of urban features and local stream habitat conditions across 14 stream reaches (locations) distributed throughout a heavily polluted urban stream network with heterogeneous built-in sanitation and drainage infrastructure (Fig. 1). The characterization of urban features included demographic, hydraulic (impervious surfaces, drainage density and road density) and sanitary conditions (sanitary sewers, septic tanks and potable water) (Table 1). Local habitats were characterized by their physicochemical profile (pH, conductivity, temperature, dissolved oxygen, DIN, SRP, DOC, turbidity, chlorides and iron) and by local hydraulic and habitat parameters of the stream section (physical dimensions, water flow, flow velocity and macrophyte coverage) (Table 1). Spatial predictors were also included in the analysis to account for directional and correlated effects (see Appendix A1 of the Supplementary Material). E. coli abundance was quantified at each location, and the phylogenetic affiliation of isolates was assigned by molecular methods.
Table 1
Basic statistical summary of the measured environmental (unshaded) and urban infrastructure (shaded) parameters. DIN, dissolved inorganic nitrogen; SRP, soluble reactive phosphorus; DOC, dissolved organic carbon; SD, standard deviation.
Parameter | Range | Mean | SD |
pH | 5.27–8.02 | 7.36 | 0.74 |
Conductivity (µS/cm) | 762–1,533 | 1,078 | 189 |
Temperature (°C) | 17.4–23.1 | 19.6 | 1.7 |
Dissolved oxygen (mg/L) | 2.5–10.0 | 4.1 | 2.0 |
DIN (mg/L N-NO₃⁻) | 5.6–20.0 | 12.4 | 4.3 |
SRP (mg/L P-PO₄³⁻) | 0.8–7.3 | 2.3 | 1.7 |
DOC (mg/L) | 6.2–77.9 | 18.0 | 18.0 |
Turbidity (NTU) | 6–109 | 34 | 33 |
Chloride (mg/L) | 38–107 | 70 | 22 |
Iron (µg/L) | 23.5–337 | 135 | 91.6 |
Macrophyte coverage (%) | 0–92 | 38 | 31 |
Water flow (m³/s) | 3.0×10⁻²–3.8 | 0.9 | 1.2 |
Flow velocity (m/s) | 3.21×10⁻²–0.43 | 0.21 | 0.12 |
Depth (cm) | 28–80 | 46 | 16 |
Drinking water coverage (%) | 0.01–31.1 | 13.4 | 12.5 |
Sanitary sewer density (dwelling/ha) | 0.0–16.2 | 3.5 | 5.4 |
Septic tank density (dwelling/ha) | 0.1–28.9 | 14.0 | 9.9 |
Impervious surface (%) | 5–75 | 53 | 27 |
Drainage density (m/km²) | 0–3,837 | 1,519 | 1,235 |
Road density (m/km²) | 1,248–22,369 | 13,983 | 7,323 |
Population density (inhab./ha) | 0.3–135.0 | 68.7 | 45.4 |
E. coli (cfu/ml) | 3–10,700 | 2,735 | 3,732 |
The surveyed locations varied widely in environmental conditions and urban infrastructure across the hydrological network (Table 1 and Fig. 1). E. coli abundance exceeded the level of 235 cfu/100 ml recommended by the US-EPA (2012)¹ for recreational freshwaters at all studied sites, which poses a great risk for people in direct contact with the studied streams. The analysis of urbanization proxies (infrastructure coverage, impervious surface coverage and population density) allowed us to characterize the watershed as predominantly urban (mean impervious surface of 53%), with a low-urbanized area located in the headwaters of Las Piedras stream. The rest of the basin is highly urbanized, and impervious surface values are higher downstream (> 70%). However, the coverage of sanitary infrastructure services, such as drinking water or sewerage, is heterogeneous across the watershed, reflecting differences in the level and quality of urbanization. A PCA revealed a gradient of urban infrastructure along the watershed (Fig. 2a) and grouped locations into distinct clusters based on their features. The first cluster includes locations from the upper section of Las Piedras stream, with higher macrophyte coverage and DIN, together with lower infrastructure coverage. The second cluster groups locations from the upper section of the San Francisco stream, which mainly show intermediate infrastructure coverage. Lastly, the third cluster is composed of locations from densely populated areas with highly developed hydraulic and sanitary infrastructure, which are associated with lower nutrient levels and high flow velocity. One location in the Santo Domingo stream (SD1) was not grouped in any of the identified clusters, as it was mainly associated with elevated levels of DOC, E. coli abundance and turbidity.
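The exceedance statement above compares values in different units: Table 1 reports E. coli in cfu/ml, whereas the recreational threshold is given per 100 ml. The following minimal sketch makes the conversion explicit using only the range reported in Table 1; the function and constant names are illustrative and not part of the original analysis.

```python
# Threshold quoted in the text, in cfu per 100 ml.
THRESHOLD_CFU_PER_100ML = 235


def cfu_per_100ml(cfu_per_ml):
    """Convert a count expressed per ml into a count per 100 ml."""
    return cfu_per_ml * 100


def exceeds_threshold(cfu_per_ml, threshold=THRESHOLD_CFU_PER_100ML):
    """True if the converted count is strictly above the threshold."""
    return cfu_per_100ml(cfu_per_ml) > threshold


# Minimum and maximum E. coli abundance from Table 1 (cfu/ml).
table_min, table_max = 3, 10_700

for value in (table_min, table_max):
    print(value, "cfu/ml ->", cfu_per_100ml(value), "cfu/100 ml;",
          "exceeds" if exceeds_threshold(value) else "does not exceed")
```

Because even the lowest reported value is above the threshold after conversion, the claim holds for every site within the reported range.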
A network of global Pearson's correlations was used to explore co-variation patterns between variables (see Figure S4, Supplementary Appendix A3 for the overall significance analysis). The resulting network was analyzed in terms of the influence or relative importance of each variable (see Fig. 2b and Table S3 in Appendix A3 of the Supplementary Material for full-network metrics). Urban infrastructure variables were placed close to each other by their positive and significant correlations, mainly interacting through the proportion of impervious surface (expected influence, |EI| = 4.80), drinking water coverage (|EI| = 3.65) and population density (|EI| = 3.45). Among environmental variables, flow velocity (|EI| = 2.48), water flow (|EI| = 2.46) and, with lower relative influence, macrophyte coverage (|EI| = 1.91) concentrated the links with urban infrastructure, mainly interacting with sanitary sewer density (|EI| = 2.95) and impervious surface. A second branch of interactions was observed between impervious surface and both conductivity (|EI| = 2.54) and chlorides (|EI| = 1.52). Moreover, pH (|EI| = 4.82) and DOC (|EI| = 3.60) were key variables within the environmental matrix, gathering the largest number of links. Local environmental variables such as DOC, SRP, iron, conductivity, chlorides and DIN also clustered in a second group of positive and significant co-variation. E. coli abundance was significantly associated with nutrients such as SRP and DOC (positively) and with physicochemical conditions such as pH (negatively) and conductivity (positively), suggesting that urbanization parameters may indirectly influence E. coli abundance through environmental factors. Finally, a cluster of strong and significant correlations, both positive and negative, was found among sanitary sewer density, pH, water flow, turbidity, depth and macrophyte coverage.
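To make the expected influence metric concrete, the sketch below computes one-step expected influence as the sum of the signed edge weights attached to each node, taking pairwise Pearson correlations as edge weights. This is an assumption about the variant used: the text does not state how edges were filtered or which implementation was applied, and the `min_abs_r` cut-off and the toy data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def expected_influence(data, min_abs_r=0.0):
    """One-step expected influence: signed sum of a node's edge weights.

    data maps variable name -> list of observations (one per location).
    Edges with |r| below min_abs_r (hypothetical filter) are dropped.
    """
    names = list(data)
    ei = {name: 0.0 for name in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(data[a], data[b])
            if abs(r) >= min_abs_r:
                ei[a] += r
                ei[b] += r
    return ei


# Toy data (hypothetical values, not taken from the study).
toy = {
    "impervious": [5, 20, 40, 60, 75],
    "conductivity": [800, 900, 1000, 1300, 1500],
    "macrophytes": [90, 70, 40, 20, 5],
}
for name, value in expected_influence(toy).items():
    print(name, round(value, 2), "|EI| =", round(abs(value), 2))
```

Reporting |EI|, as in the text, ranks variables by the magnitude of their net signed connectivity; positive and negative edges can cancel, so a node with many strong links of opposite sign may still have a low |EI|.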
Spatial distribution of E. coli phylogenetic groups. A total of 326 environmental isolates were characterized using Clermont's multiplex PCR method for E. coli phylogroup assignment. We detected most of the phylogenetic groups, except for phylogroup C. Phylogroup A was the most abundant in all locations sampled, with a relative frequency of up to 50% at most sites, followed by phylogroup B1 (Fig. 3). Mean relative frequency was 64% for phylogroup A, 16% for B1, 10% for D, 5% for F/G, 3% for E and 2% for B2 (Fig. 3b). Notably, a single isolate of cryptic clade IV (further confirmed by a specific multiplex PCR assay for the identification of cryptic clades within the genus Escherichia) was collected at location SF5. To our knowledge, this is the first report of a cryptic clade member in surface waters of South America.
A correlation analysis of community structure metrics applied to E. coli phylogenetic composition showed that phylogroup richness (α) was negatively correlated with total mean abundance of E. coli (correlation coefficient ρ = −0.56; p = 0.045) (Fig. 3c). In addition, Pielou's evenness index (J′) was positively correlated with richness (correlation coefficient ρ = 0.77; p = 0.002).
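For reference, the two community metrics used above can be sketched as follows. Richness is taken as the number of phylogroups present at a location, and Pielou's evenness as J′ = H′ / ln(S), where H′ is the Shannon index and S the richness. The isolate counts are hypothetical, and returning 0 when fewer than two phylogroups are present is a convention chosen here, since J′ is undefined in that case.

```python
import math


def richness(counts):
    """Number of phylogroups with at least one isolate."""
    return sum(1 for c in counts.values() if c > 0)


def pielou_evenness(counts):
    """Pielou's J' = H' / ln(S), with H' the Shannon index (natural log)."""
    present = [c for c in counts.values() if c > 0]
    s = len(present)
    if s < 2:
        return 0.0  # J' is undefined for S < 2; 0 is a convention used here
    total = sum(present)
    shannon = -sum((c / total) * math.log(c / total) for c in present)
    return shannon / math.log(s)


# Hypothetical isolate counts per phylogroup at one location.
site = {"A": 16, "B1": 4, "D": 3, "B2": 0, "E": 1}
print("richness:", richness(site))
print("evenness:", round(pielou_evenness(site), 3))
```

J′ ranges from 0 to 1 and reaches 1 only when all phylogroups present are equally frequent, so a location dominated by phylogroup A yields a low value regardless of how many other groups occur.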
Disentangling the effects of environmental, urban and spatial factors on patterns of E. coli abundance and phylogenetic composition. A variance partitioning analysis (Fig. 4) showed that environmental and urban infrastructure predictors had important effects on the distribution of E. coli abundance, mainly through the fractions they shared with the spatial matrix (23% and 18%, respectively). In addition, the fraction shared by all three predictor sets accounted for 13% of the total variance. The spatial matrix of AEMs showed a strong pure contribution (27%), while the environmental and infrastructure matrices exhibited markedly lower pure contributions (< 3%) (Fig. 4a). In terms of overall influence, 42% of the total contribution to the explained variance was related to the environmental matrix and 39% to the urban infrastructure matrix, indicating a similar contribution of both sets of predictors to the observed patterns of E. coli abundance. The high contribution of spatial factors in both shared and pure fractions indicates that E. coli abundance has a strong directional spatial structure, potentially reflecting distinct spatially structured processes (i.e., different hydrological, environmental and infrastructure patterns).
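The pure and shared fractions reported above follow from simple arithmetic on the (adjusted) R² values of models fitted with each predictor set alone and in combination. The sketch below shows that arithmetic for three sets, labelled E (environment), U (urban infrastructure) and S (spatial); the labels, function name and R² values are hypothetical and only illustrate the decomposition, not the fitted models of this study.

```python
def partition_three_sets(r2):
    """Split explained variance into pure and shared fractions.

    r2 maps "E", "U", "S", "EU", "ES", "US", "EUS" to the (adjusted) R²
    of the model fitted with that combination of predictor sets.
    """
    e, u, s = r2["E"], r2["U"], r2["S"]
    eu, es, us, eus = r2["EU"], r2["ES"], r2["US"], r2["EUS"]
    return {
        "pure_E": eus - us,                      # unique to environment
        "pure_U": eus - es,                      # unique to urban
        "pure_S": eus - eu,                      # unique to space
        "E_and_U": es + us - s - eus,            # shared by E and U only
        "E_and_S": eu + us - u - eus,            # shared by E and S only
        "U_and_S": eu + es - e - eus,            # shared by U and S only
        "E_U_S": e + u + s - eu - es - us + eus,  # shared by all three
        "residual": 1 - eus,                     # unexplained
    }


# Hypothetical R² values, chosen only to illustrate the arithmetic.
example_r2 = {"E": 0.30, "U": 0.24, "S": 0.53,
              "EU": 0.44, "ES": 0.60, "US": 0.59, "EUS": 0.64}

for fraction, value in partition_three_sets(example_r2).items():
    print(f"{fraction}: {value:.2f}")
```

Under this decomposition, the overall influence of one set is the sum of its pure fraction and every shared fraction in which it takes part, which is how a set with a small pure contribution can still account for a large share of the explained variance.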
To further assess the contribution of spatially structured environmental and urban infrastructure variables, a partial redundancy analysis (pRDA) was performed, in which each explanatory set was conditioned on the others (Fig. 4a). Results showed that both the environmental (F(7,31) = 4.48, P < 0.05) and urban (F(3,31) = 7.38, P < 0.05) matrices were significant in explaining E. coli abundance. Within the environmental set, the significant variables were macrophyte coverage (score = 0.43; F(1,31) = 12.38, P < 0.05) and DIN (score = 0.43; F(1,31) = 12.81, P < 0.05). Within the urban infrastructure matrix, sanitary sewer density had a significant contribution (score = 0.51; F(1,31) = 18.02, P < 0.05). In addition, several coarse- and fine-scale AEMs of the spatial matrix had significant pure contributions (Fig. 4a; see more details in Appendix A1 of the Supplementary Material, Table S1).
In contrast to the results obtained for abundance, spatial variability in phylogroup composition in the watershed was mostly explained by pure contributions of the three matrices (Fig. 4b). The contribution of spatial factors was the largest (52%), followed by environmental (28%) and urban infrastructure factors (15%). The partial RDA of environmental variables, controlling for the effect of urban infrastructure and spatial AEMs, was statistically significant (F(7,24) = 23.01, P < 0.05). The following variables were significant: DIN (F(1,24) = 16.07, P < 0.05), turbidity (F(1,24) = 28.61, P < 0.05), SRP (F(1,24) = 7.91, P < 0.05), depth (F(1,24) = 8.84, P < 0.05), flow velocity (F(1,24) = 11.30, P < 0.05) and macrophyte coverage (F(1,24) = 8.73, P < 0.05) (Fig. 4b). The first two pRDA axes were also significant: RDA1 accounted for an explained proportion of 50% (F(1,26) = 100.97, P < 0.05) and RDA2 for 19% (F(1,26) = 37.13, P < 0.05). RDA1 was mainly positively related to turbidity (score = 0.41) and negatively related to macrophyte coverage (score = −0.35) and flow velocity (score = −0.13) (Fig. 5a), while RDA2 was positively related to DIN (score = 0.40), SRP (score = 0.29) and depth (score = 0.27). Phylogroups D, E, F and B2 were positively correlated with RDA1, while B1 was strongly and negatively associated with it, followed by A, thus distinguishing between dominant and less frequent phylogroups. B1 was also positively associated with RDA2, while A was negatively related to it, suggesting slightly different optimal conditions for the two dominant phylogroups. Among the less frequent phylogroups, B2 and F/G showed a positive correlation with the RDA2 axis, suggesting their association with higher nutrient levels.
In addition, the partial RDA of urban infrastructure predictors, controlling for the effect of environmental and spatial variables, was statistically significant (F(3,24) = 24.18, P < 0.05), with a significant association of phylogroup composition with the densities of septic tanks (F(1,24) = 11.44, P < 0.05) and drainage (F(1,24) = 8.55, P < 0.05) (Fig. 4b). The significant axis RDA1 (F(1,24) = 57.12, P < 0.05), with an explained proportion of 59%, was positively related to drainage density (score = 0.34) and negatively associated with septic tank density (score = −0.22). Moreover, phylogroups B1 and B2 were positively associated with RDA1, while F, E and A were negatively associated with this axis (Fig. 5b). Axis RDA2, which accounted for 14% of the explained variance, was also significant (F(1,24) = 13.69, P < 0.05), and the two significant variables (septic tank density, score = −0.28, and drainage density, score = −0.27) were negatively related to this axis. Phylogroup A was positively related to RDA2, unlike phylogroups B2, F, E and B1.
Finally, partial RDA of spatial factors, constrained by environmental and urban infrastructure, showed that all the included AEMs (ranging from broad- to fine-scale resolution levels) were significant in explaining spatial variability (see Table S1 and Figure S2 in Appendix A1 of the Supplementary Material for more information).