Cultivation of Candidatus Pelagibacter ubique infected with HTVC031P, HTVC027P, and HTVC023P
The SAR11 strain Candidatus Pelagibacter ubique HTCC1062 was kindly provided by Professor Stephen Giovannoni, Oregon State University, USA. HTCC1062 was cultured in artificial seawater-based ASM1 medium supplemented with 1 mM NH4Cl, 100 µM KH2PO4, 1 µM FeCl3, 100 µM pyruvate, 50 µM glycine, and 50 µM methionine38. HTCC1062 cultures were incubated at 17 °C without shaking or light. Exponentially growing HTCC1062 cultures were infected independently with HTVC023P, HTVC027P, and HTVC031P at a phage-to-bacteria ratio of approximately 10:1, in triplicate. Cell mortality was monitored using the Guava EasyCyte flow cytometer (Millipore, USA). When cell mortality was detected, samples were fixed with formaldehyde (1% final concentration) for 1 h at room temperature and filtered onto 0.2 µm polycarbonate filters (Merck Millipore, Burlington, Massachusetts, USA). In a repeated experiment, Ca. P. ubique HTCC1062 was infected with HTVC027P and HTVC031P, as described above. Samples were taken at three time points (2 hours after infection; approximately 2 hours before cell lysis; approximately 6 hours after cell lysis) and fixed and filtered as described above.
Environmental sampling
Samples were collected during the 2020 phytoplankton spring bloom from ~1 m depth at the long-term ecological research station Helgoland Roads (54° 11.3′ N, 7° 54.0′ E), German Bight15 (Table S6). Samples from the Atlantic and Southern Ocean were collected using a Seabird SBE 911+ CTD in 2022 during the R/V Polarstern cruises PS13223 and PS133/124, respectively39. Samples from the Pacific Ocean were collected with a Seabird SBE 911+ CTD during the R/V Sonne cruise SO24525. SAR11 cell counts from SO245 were retrieved from Reintjes et al.40. During the cruises, samples were collected from surface water and the deep chlorophyll maximum (DCM; Table S6). Samples were fixed with formaldehyde (1% final concentration) for 1 h at room temperature. Cells were immobilized on 0.2 µm polycarbonate filters (Merck Millipore, Burlington, Massachusetts, USA), which were stored at -20 °C until further processing. The final sampling volume varied depending on total cell counts in the samples (Table S6).
Pelagiphage FISH probe design and synthesis
We designed direct-geneFISH41 probes based on alignments between each of the three isolates, namely HTVC027P, HTVC031P, and Greip, and PacBio Sequel II metagenomes from the 2020 spring phytoplankton bloom at Helgoland, North Sea18 (ENA PRJEB52999). Target probe regions were identified from assembled contigs (Supplementary material and methods). Subsequently, probes were designed manually within Geneious (v2022.1.1)42. We aimed for 10–13 probes per phage, each 156–318 bp long, with a GC content similar to that of the phages (22.0–43.2%, mean ± sd: 32.9 ± 4.9%). Further, reference genomes and metagenome data from Helgoland Roads needed to share a minimum of 90% nucleotide identity for usability during FISH43. We aimed to target genes encoding terminases, polymerases, or structural proteins, as we expect higher conservation in these genes. Probes that aligned ambiguously with any other sequence were excluded by searching them against the nr database using the NCBI BLAST web service (14 February 2023).
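The length, GC content, and identity criteria above can be illustrated with a minimal Python sketch. It is not the procedure used here (probes were designed manually in Geneious); the function names are hypothetical, and treating the reported GC range of 22.0–43.2% as a hard acceptance window is an assumption made only for illustration.

```python
def gc_content(seq: str) -> float:
    """Return the GC content of a nucleotide sequence in percent."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def passes_probe_criteria(seq: str, identity_pct: float,
                          min_len: int = 156, max_len: int = 318,
                          gc_range: tuple = (22.0, 43.2),
                          min_identity: float = 90.0) -> bool:
    """Check a candidate probe region against the stated design targets.

    identity_pct is the nucleotide identity (in percent) between the
    reference genome and the metagenomic contig over this region; how it
    is obtained (e.g. from an alignment) is outside this sketch.
    The GC window is an assumption, see the text above.
    """
    length_ok = min_len <= len(seq) <= max_len
    gc_ok = gc_range[0] <= gc_content(seq) <= gc_range[1]
    identity_ok = identity_pct >= min_identity
    return length_ok and gc_ok and identity_ok


# Hypothetical 160 bp candidate with 37.5% GC
candidate = "AT" * 50 + "GC" * 30
print(passes_probe_criteria(candidate, identity_pct=95.0))
```

Such a check could only pre-screen candidates; the specificity search against the nr database remains a separate step.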
Probes (Table S7) were ordered as “oPools” from Integrated DNA Technologies (IDT, Coralville, Iowa, USA) and resuspended in water as directed by the manufacturer. Probes were labelled with the ULYSIS Alexa 594 conjugation kit (Invitrogen, Waltham, Massachusetts, USA) with minor modifications as described in Zeugner et al.43 and subsequently purified using Micro Bio-Spin chromatography columns P-30 (Bio-Rad, Hercules, California, USA). Labelling efficiencies were calculated according to the manufacturer’s instructions for the ULYSIS kit, using a NanoDrop (Thermo Fisher Scientific, Waltham, Massachusetts, USA).
Fluorescence in situ hybridization
First, CARD-FISH targeting the 16S rRNA of SAR11 (“SAR11-mix”, Table S7) was conducted44. Second, samples were hybridized with equimolar amounts of the probes targeting HTVC027P, HTVC031P, and Greip, using direct-geneFISH as described earlier41,43 with minor modifications. Hybridization buffer with 25% formamide was used, and no ethanol washing was conducted after FISH, to prevent any loss of fluorescence signal. Hybridized filters were counter-stained with the DNA stain 4′,6-diamidino-2-phenylindole (DAPI; 1 µg ml−1). Samples were embedded in ProLong Glass Antifade (Invitrogen, Waltham, Massachusetts, USA) for microscopy. As negative controls, HTVC023P-infected samples were hybridized with the probe mix targeting all three phages. Additionally, environmental negative controls included samples that were not exposed to the phage-probe mix, to account for any autofluorescence within cells.
Microscopy
Samples were imaged on a Zeiss AxioImager.Z2m, equipped with a charge-coupled device (CCD) camera (Zeiss AxioCam MRm, Zeiss, Oberkochen, Germany), and illuminated with a Zeiss Colibri 7 LED (excitation: 385 nm for DNA, 469 nm for 16S rRNA CARD-FISH, and 590 nm for direct-geneFISH signals). The microscope was equipped with a multi-Zeiss 62 HE filter cube (Beam splitter FT 395 + 495 + 610). Images were recorded with a custom-built macro45,46 within the Zeiss AxioVision software (Zeiss, Germany). A total of 120 fields of view per sample were recorded with a 63x Plan Apochromat objective (1.4 NA, oil immersion). For high-resolution imaging, we used a Zeiss LSM 780 (Zeiss, Oberkochen, Germany), with an ELYRA PS.1 detector upgrade. The microscope was equipped with a 63x plan apochromatic oil immersion objective and the excitation lasers 405 nm (DAPI), 488 nm (16S rRNA CARD-FISH), and 591 nm (direct-geneFISH).
Image cytometry
Quality control and automated cell counting of 8-bit greyscale images was done within the Automated Cell Measuring and Enumeration tool (ACME, available from https://www.mpi-bremen.de/automated-microscopy.html)45,46 with channel-specific settings (Table S2). Cells for total cell counts were defined by a DNA (DAPI)-specific signal. SAR11 cells were defined with an overlapping DNA and 16S rRNA (CARD-FISH) signal and phage-infected cells needed an additional phage (direct-geneFISH) signal. Zombies were cells with a phage signal but no 16S rRNA signal. We calculated the frequency of dividing cells – a proxy for cell-division rate – as previously described15 using the MicrobeJ plugin47 within Fiji/ImageJ48. In principle, a cell containing two local DNA maxima was counted as a dividing cell.
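The cell categories defined above amount to a small decision rule over three per-cell signals. The following Python sketch illustrates that rule only; it is not part of ACME, the function and label names are hypothetical, and requiring a DNA signal for zombie cells is an assumption (the definition above mentions only the phage and 16S rRNA signals).

```python
from collections import Counter


def classify_cell(dna: bool, rrna_16s: bool, phage: bool) -> str:
    """Assign one detected object to a category from its three signals."""
    if not dna:
        return "not_counted"      # no DAPI signal: not part of total counts
    if rrna_16s and phage:
        return "infected_SAR11"   # DNA + 16S rRNA + phage signal
    if rrna_16s:
        return "SAR11"            # DNA + 16S rRNA signal only
    if phage:
        return "zombie"           # phage signal, no 16S rRNA (DNA assumed)
    return "other_cell"           # DNA signal only


def summarize(cells):
    """Count categories for an iterable of (dna, rrna_16s, phage) tuples."""
    counts = Counter(classify_cell(*cell) for cell in cells)
    total = sum(n for label, n in counts.items() if label != "not_counted")
    return {
        "total_cells": total,
        # infected SAR11 cells are also SAR11 cells
        "SAR11": counts["SAR11"] + counts["infected_SAR11"],
        "infected_SAR11": counts["infected_SAR11"],
        "zombie": counts["zombie"],
    }


# Hypothetical field of view with five detected objects
example = [(True, True, False), (True, True, True), (True, False, True),
           (True, False, False), (False, False, True)]
print(summarize(example))
```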
Metagenomic abundance estimates for SAR11 MAGs, 16S rRNA gene, and Pelagiphages
To determine the relative abundance of SAR11 metagenome-assembled genomes (MAGs) during the 2020 phytoplankton spring bloom, we performed a mapping analysis utilizing PacBio metagenomic reads obtained from the prokaryotic fraction (0.2–3 µm) across all 30 samples. The reference MAGs, classified under the order Pelagibacterales by GTDB-Tk (version 1.3.0, release 202)49, were initially derived from the same phytoplankton bloom metagenomes, described above18. Raw reads were mapped using the minimap2-pb50 algorithm, executed within the SqueezeMeta pipeline (version 1.3.1)51. The mapping outcomes were normalized using the reads per kilobase per million mapped reads (RPKM) metric, which considers both the length of the MAG and the library size of each sample. The RPKM value was determined using the formula \(\mathrm{RPKM} = \frac{\text{reads mapped to SAR11 MAG} \times 10^{6}}{\text{total reads in sample} \times \text{length of MAG in kilobase pairs}}\).
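As a worked illustration of the RPKM normalization, the following minimal Python sketch applies the formula to made-up numbers. The function name and the example values are hypothetical; the read counts themselves would come from the mapping step described above.

```python
def rpkm(mapped_reads: int, total_reads: int, length_bp: int) -> float:
    """Reads per kilobase per million reads for one MAG in one sample.

    mapped_reads: reads mapped to the MAG
    total_reads:  total reads in the sample (library size)
    length_bp:    length of the MAG in base pairs
    """
    if total_reads <= 0 or length_bp <= 0:
        raise ValueError("total_reads and length_bp must be positive")
    length_kb = length_bp / 1000.0
    return mapped_reads * 1e6 / (total_reads * length_kb)


# Hypothetical example: 5,000 reads map to a 1.25 Mbp MAG
# in a sample of 2,000,000 reads.
print(rpkm(5_000, 2_000_000, 1_250_000))  # 2.0
```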
For quantifying the abundance of the 16S rRNA gene, we extracted the full-length 16S rRNA sequences from metagenome assemblies using Barrnap (v0.9; https://github.com/tseemann/barrnap). Similar to the SAR11 MAGs, these sequences underwent mapping, and their relative abundance was computed using the RPKM method as described earlier.
To assess the relative abundance of the phages HTVC027P, HTVC031P, and Greip, metagenomic reads were mapped to the reference genomes in a similar fashion. To facilitate a comprehensive comparison with our microscopy data, and considering the specificity of these phages to SAR11, we calculated the abundance of these pelagiphages relative to the SAR11 community present during the spring phytoplankton bloom in 2020. Therefore, phage relative abundance was determined as \(\frac{\text{reads mapped to phage genome} \times 10^{6}}{\sum \text{reads mapped to all SAR11 MAGs} \times \text{length of phage in kilobase pairs}}\).
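The SAR11-normalized phage abundance differs from a library-size normalization only in its denominator, which uses the summed reads mapped to all SAR11 MAGs instead of the total reads in the sample. A minimal Python sketch with hypothetical names and made-up numbers:

```python
def phage_relative_abundance(phage_reads: int,
                             sar11_reads_per_mag: list,
                             phage_length_bp: int) -> float:
    """Phage abundance normalized to reads mapped to all SAR11 MAGs.

    phage_reads:         reads mapped to the phage reference genome
    sar11_reads_per_mag: reads mapped to each SAR11 MAG in the same sample
    phage_length_bp:     length of the phage genome in base pairs
    """
    sar11_total = sum(sar11_reads_per_mag)
    if sar11_total <= 0 or phage_length_bp <= 0:
        raise ValueError("SAR11 read sum and phage length must be positive")
    phage_length_kb = phage_length_bp / 1000.0
    return phage_reads * 1e6 / (sar11_total * phage_length_kb)


# Hypothetical example: 120 reads map to a 40 kbp phage genome in a sample
# where 100,000 reads in total map to three SAR11 MAGs.
print(phage_relative_abundance(120, [30_000, 50_000, 20_000], 40_000))  # 30.0
```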
Identification of defence systems within SAR11 genomes
All Pelagibacter ubique genomes available in the RefSeq database (n = 172; as of 22 October 2023)52 and MAGs from the 2020 phytoplankton spring bloom (n = 14)18 were screened for anti-phage defence mechanisms. DefenceFinder53 was used with the default database from 22 October 2023.
Statistics and modelling
Statistical analyses and corresponding visualizations were done in R (v4.2.2; for the packages used, see supplementary material and methods)54. Repeated measures ANOVA was used to test for the specificity of pelagiphages to SAR11. Zombies were detected in all bacteria using the 16S FISH probe EUB338 I-III and compared to zombies in SAR11 (SAR11-mix, Table S7). Because the samples originated from a time series and were not independent of each other, repeated measures ANOVA was chosen.
Bayesian beta regressions were applied to assess the relationship between (a) the abundance of phage-infected and zombie cells, using the model formula rel_infT ~ Zombie_cells, and (b) the abundance of zombie and SAR11 cells, using the model formula Zombie_cells ~ rel_SAR11_abundance. rel_infT is the relative infection rate, transformed by adding 0.001 because two of the 148 data points were originally 0, for which the beta distribution is not defined; Zombie_cells is the relative abundance of zombie cells; rel_SAR11_abundance is the relative SAR11 abundance; and FDC_percent is the frequency of dividing cells. The model predictions were back-transformed from the logit scale for the plots in Fig. S5.
We assumed a positive relationship between phage-infected cells and Zombie cells, as the 95% credible interval [4.083, 9.554] of the slope (on logit-scale) excluded 0 and all values of the posterior distribution for the slope were ≥ 0. The model predicts an intercept of -3.72 ± 0.1 and a slope of 6.92 ± 1.41 (on logit-scale; means ± SD).
A negative relationship was assumed between the relative abundances of zombie and SAR11 cells, as all values of the posterior distribution for the slope were < 0. The 95% credible interval was [-2.800, -0.953] (on logit-scale). The model predicted an intercept of -2.74 ± 0.1 and a slope of -1.87 ± 0.5 (on logit-scale; mean ± SD).
We assumed a positive relationship between phage-infected cells and Zombie cells, as the 95% credible interval [0.02, 0.05] of the slope (on logit-scale) excluded 0 and only 1 of the 8,000 values of the posterior distribution for the slope was ≤ 0. The model predicts an intercept of -3.72 ± 0.1 and a slope of 0.04 ± 0.01 (on logit-scale, mean ± SD).
The models used flat priors (brms default) and were run with 4 chains of 2000 sampling iterations each, following a warmup period of 2000 iterations per chain.
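The zero offset and the back-transformation from the logit scale can be illustrated with a minimal Python sketch. It is not the fitting code (the models were fitted with brms in R); the function names are hypothetical, the offset is assumed to be added to every value of the response, and the intercept and slope are the posterior means reported above for the relationship between phage-infected and zombie cells.

```python
import math


def add_offset(values, offset=0.001):
    """Shift proportions so that zeros fall inside the open interval (0, 1).

    Assumption: the offset is applied to all values, not only to the zeros.
    """
    return [v + offset for v in values]


def inv_logit(x: float) -> float:
    """Map a value on the logit scale back to a proportion in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_mean(x: float, intercept: float, slope: float) -> float:
    """Predicted mean of a beta regression with logit link at predictor x."""
    return inv_logit(intercept + slope * x)


# Posterior means reported for rel_infT ~ Zombie_cells (logit scale)
INTERCEPT, SLOPE = -3.72, 6.92

for zombie_fraction in (0.0, 0.05, 0.10):
    print(zombie_fraction, round(predict_mean(zombie_fraction, INTERCEPT, SLOPE), 4))

# Number of post-warmup posterior draws: 4 chains x 2000 iterations
n_draws = 4 * 2000
```

Because the inverse logit is monotonic, a positive slope on the logit scale always corresponds to an increasing predicted proportion, although the increase is not linear on the original scale.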