As much as 97% of a viral community (virome) from the phyllosphere, i.e. the aerial parts of different plant species, consists of novel viral species[1]. The number of uncultivated viral genomes in databases is rapidly increasing, creating a need for high-quality genomes from isolated phages in order to analyze viromes properly. Although the methodology for isolating single phages is time-consuming and biased towards specific, plaque-forming viruses, it is an important task that will further promote the study of metaviromes[2, 3]. Erwinia is a major bacterial genus whose members are often present as commensals and pathogens in the phyllosphere[4]. Erwinia billingiae is often identified as an epiphyte that can act as an antagonist against the plant pathogen E. amylovora, the causal agent of fire blight in apple and pear trees, but E. billingiae has also been observed to be pathogenic towards plants[5–9]. Currently, just 12 phage genomes targeting E. amylovora are published in NCBI, and only two target E. billingiae. High-quality genomes of isolated phages targeting representative and dominant bacteria from the phyllosphere, such as Erwinia, are scarce in the nucleotide databases[1, 10]. Hence, isolating and describing novel phages against Erwinia species will broaden taxonomic knowledge and improve the quality of the databases. In this study, we present three newly isolated E. billingiae phages together with a thorough comparative genomic analysis.
The two Erwinia strains, E. billingiae AM1 (isolated on Pseudomonas Isolation Agar[11]) and E. billingiae AM23 (isolated on King’s B medium[12]), were isolated from the bark of two horse chestnut trees (Aesculus hippocastanum) with bleeding canker symptoms (Fredens Park, Copenhagen, DK, 2020). DNA was extracted using the Genomic Mini AX bacteria kit (A&A Biotechnology, Gdańsk, PL). Illumina and Nanopore sequencing libraries were prepared as previously described[2]. Nanopore data were base-called with Guppy v3.6.0, and adapters were trimmed with Porechop v0.2.4. Assemblies were performed with Unicycler v0.4.8[13]. Both bacterial genomes were assigned to the species E. billingiae using the GTDB-Tk tool[14], with an average nucleotide identity (ANI) ≥ 95% (the same-species cut-off) and an alignment fraction (AF) ≥ 89% to the closest hit[15, 16]. These two strains were used to isolate the phages Snitter, Pecta, and Zoomie from common household organic waste (HCS A/S, Glostrup, DK, 2020). Phage isolation, DNA extraction, library preparation, sequencing, and genome assembly were done as previously described[17, 18]. Because the NEBNext® library preparation fragments DNA randomly, without the use of transposons, phage genomic termini can be predicted with the read start coverage tool in CLC Genomics Workbench v12.0.3. Gene calling and annotation were done as previously described[17, 18]. A tRNA search was performed with tRNAscan-SE[19]. Transmission electron microscopy (TEM) images were obtained as previously described[20]. Phylogenetic analysis (MEGAX v10.2.5, ClustalW alignment with neighbor-joining and 1000 bootstrap replications) of the protein sequence of the large terminase subunit (terL) was performed with the five closest relatives identified by BLASTp (identity ≥ 56.5%, query coverage ≥ 93%)[21].
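The idea behind termini prediction from read starts can be illustrated with a minimal sketch: with random fragmentation, read starts are spread roughly evenly along the genome, so a position where far more reads begin than expected is a candidate physical end. The function name `candidate_termini`, the `fold` threshold, and the toy data below are hypothetical and only illustrate the principle; this is not the algorithm implemented in CLC Genomics Workbench.

```python
from collections import Counter


def candidate_termini(read_starts, genome_length, fold=10.0):
    """Return (position, count) pairs with unusually many read starts.

    read_starts   : list of 1-based positions at which mapped reads begin
    genome_length : length of the assembled genome in bp
    fold          : hypothetical enrichment threshold relative to the
                    mean number of read starts per position
    """
    counts = Counter(read_starts)
    mean_per_position = sum(counts.values()) / genome_length
    threshold = fold * mean_per_position
    return sorted((pos, n) for pos, n in counts.items() if n >= threshold)


# Toy data (assumption): one read starts at every position of a 100 bp
# genome, plus 50 extra reads that all start at position 40.
toy_starts = list(range(1, 101)) + [40] * 50
print(candidate_termini(toy_starts, genome_length=100))  # [(40, 51)]
```

In practice the two strands would be analysed separately, and enriched positions would still need to be checked against coverage and against the assembly before termini are assigned.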
Nucleotide intergenomic similarities (NIS) were calculated using VIRIDIC with the closest relatives of each phage identified by BLASTn (identity ≥ 60.2%, query coverage ≥ 10%)[22]. Proteomic tree analysis was done using the ViPTree server (standard settings)[23]. Phage structural proteins were identified and analysed as previously described, by first making a tryptic digest, then analysing the resulting peptides with LC-MS, and finally identifying the proteins against an in-house protein database (Proteome Discoverer 2.2, Thermo Fisher)[24, 25].
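As a rough illustration of what an intergenomic similarity value expresses, the sketch below follows the formula given in the VIRIDIC publication as we understand it: the identical bases found when aligning genome A to B and B to A are summed and normalised by the combined genome length. The function name and the input numbers are hypothetical; VIRIDIC itself derives the identity counts from BLASTn alignments, which is not reproduced here.

```python
def intergenomic_similarity(id_ab, id_ba, len_a, len_b):
    """Percent similarity from identical bases in both alignment directions.

    id_ab, id_ba : identical bases when A is aligned to B, and B to A
    len_a, len_b : genome lengths in bp
    """
    return 100.0 * (id_ab + id_ba) / (len_a + len_b)


# Hypothetical example: two 50 kb genomes sharing 30 kb of identical bases
# in each alignment direction.
sim = intergenomic_similarity(30_000, 30_000, 50_000, 50_000)
print(f"similarity = {sim:.1f}%, distance = {100.0 - sim:.1f}%")
```

Because the value is normalised by the length of both genomes, a short region of high identity between otherwise unrelated genomes yields a low NIS.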
The genome size, GC content, tRNA, and coding DNA sequences (CDS) of the phages are summarized in Table 1. From TEM images, Snitter has siphovirus morphology with an approximately 168 nm flexible tail, a central tail spike, and six visible tail fibers with distal globular appendages. Pecta and Zoomie both have podovirus morphology with a short tail stop and no appendages (Fig. 1A). The following taxonomic classifications are based on the International Committee on Taxonomy of Viruses (ICTV) guidelines. The criteria for intermediary-level classification into family or order are proposed to be based on complete viral proteomes, as in the ViPTree proteomic analysis[26]. The NIS cutoff for phages to be clustered into the same genus is > 70%. Full lists of the BLASTn results and the ViPTree analysis are available in online resources 1 and 2.
Snitter has < 26% NIS with its closest relatives, Escherichia phage vB_EcoS_W011D (NC_054893.1, 25.8% NIS), Escherichia phage vB_EcoS_PHB17 (NC_054892, 25.6% NIS), and Escherichia phage vB_EcoS_Chapo (MT682715.1, 24% NIS) (Fig. 1B). All belong to the Drexlerviridae family, but W011D and PHB17 both belong to the Tempevirinae subfamily, while Chapo belongs to the Tunavirinae subfamily (Table S1.1). The highest translated sequence similarity (SG) from the proteomic analysis with ViPTree is 36.18%, between Snitter and PHB17 (Table S2.1), but the two do not form a clade in the phylogenetic proteome tree (Fig. S2.1) or in the phylogenetic tree based on the terL protein sequence (Fig. 1C). Pecta also has only distant relatives (NIS ≤ 22%) in the databases. The closest relatives of Pecta, Citrobacter phage CVT22 (NC_027988.2, 22% NIS), Pseudomonas phage GP100 (LT986460.2, 9.3% NIS), and Salinivibrio phage CW02 (JQ446452.1, 9.8% NIS), all belong to the Zobellviridae family (Table S1.2). Pecta has its highest proteomic SG, 35.24%, to CVT22 (Table S2.2); the two also form a monophyletic group in the proteome tree (Fig. S2.2), and their terL sequences are the most closely related (Fig. 1C). For Zoomie, the NIS to Pantoea phage LIMElight (NC_019454) is 60.2%, but only 16.3% to Enterobacter phage phiKDA1 (NC_027980.1) and 17.1% to Shigella phage HRP29 (NC_048174.1) (Fig. 1B). HRP29 and phiKDA1 are from the subfamily Slopekvirinae, and all three closest relatives belong to the Autographiviridae family; however, each belongs to a different genus within that family, and they infect widely different hosts (Table S1.3). The highest proteomic SG, 60.26%, is between Zoomie and LIMElight (Table S2.3); the two form a distinct monophyletic group in both the proteomic tree (Fig. S2.3) and the phylogenetic tree based on terL, the latter with a bootstrap value of 100 (Fig. 1C).
To sum up, Pecta, Snitter, and Zoomie each represent a new genus for which no other members currently exist in the NCBI database (Fig. 1B).
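The genus-level reasoning above can be written out as a small check, sketched here under the stated ICTV criterion (NIS > 70% for the same genus) and using only the NIS values reported in the text. The variable and function names are illustrative and not part of any tool used in the study.

```python
GENUS_NIS_CUTOFF = 70.0  # same genus if NIS > 70% (ICTV guideline cited in the text)

# NIS (%) between each new phage and its three closest relatives, as reported above.
closest_nis = {
    "Snitter": {"W011D": 25.8, "PHB17": 25.6, "Chapo": 24.0},
    "Pecta": {"CVT22": 22.0, "GP100": 9.3, "CW02": 9.8},
    "Zoomie": {"LIMElight": 60.2, "phiKDA1": 16.3, "HRP29": 17.1},
}


def same_genus_relatives(nis_by_relative, cutoff=GENUS_NIS_CUTOFF):
    """Return the relatives whose NIS exceeds the genus cutoff."""
    return [name for name, nis in nis_by_relative.items() if nis > cutoff]


for phage, relatives in closest_nis.items():
    shared = same_genus_relatives(relatives)
    best = max(relatives, key=relatives.get)
    status = "shares a genus with " + ", ".join(shared) if shared else "no relative above cutoff"
    print(f"{phage}: best hit {best} ({relatives[best]}% NIS) -> {status}")
```

None of the listed relatives exceeds the cutoff, which is consistent with each phage being placed in a genus of its own; the closest case is Zoomie and LIMElight at 60.2%.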
Putative functions could only be assigned to 28 out of 81 CDS in Snitter, 14 out of 87 CDS in Pecta, and 17 out of 54 CDS in Zoomie. No bacterial virulence or antibiotic resistance genes were predicted (VirulenceFinder 2.0[27] and ResFinder 4.0[28]). A full list of predicted putative functions for all three phages is available in online resource 3. Proteome analysis of all three phages supports the in silico prediction of genes encoding structure-related proteins (8–84% coverage and 1–61 peptides). Tables of the proteome data are available in online resource 4. No integrase, integrase-related, or recombinase-related genes were predicted in Pecta or Zoomie, indicating an exclusively lytic lifestyle, similar to CVT22 and LIMElight[29, 30]. However, CDS28 in Snitter is predicted to be a phage-associated recombinase. The protein has high similarity to the erf recombinase of the temperate Salmonella phage P22 (NP_059596.1)[31]. The presence of a recombinase similar to those found in temperate phages could indicate a temperate lifestyle. However, recombinases are also common in exclusively lytic phages, including the closest relative W011D, which does not appear to be temperate[32].
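The annotation counts at the start of the paragraph above are easier to compare as percentages. The short sketch below only does that arithmetic on the reported numbers; the names are illustrative.

```python
# (CDS with a predicted function, total CDS), as reported in the text.
annotated = {"Snitter": (28, 81), "Pecta": (14, 87), "Zoomie": (17, 54)}


def annotated_percent(with_function, total):
    """Percentage of CDS that received a putative function."""
    return 100.0 * with_function / total


for phage, (with_function, total) in annotated.items():
    pct = annotated_percent(with_function, total)
    print(f"{phage}: {with_function}/{total} CDS annotated ({pct:.1f}%)")
```

Pecta has both the lowest absolute number and the lowest proportion of CDS with an assigned function.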
The genomic synteny of gene clusters is conserved between the three phages and their closest relatives. Snitter has a large cluster of tail-related CDS (CDS32–40) and a smaller cluster of head-related CDS (CDS43, 48–50), with synteny shared across its closest relatives (W011D, PHB17, and Chapo). Furthermore, a cluster comprising a DNA N-6-adenine-methyltransferase (CDS20), a nuclease (CDS21), a helicase (CDS22), a primase/helicase (CDS24), an ssDNA-binding protein (CDS27), and an exonuclease (CDS29) follows the structural proteins. A predicted methylase (CDS5) lies relatively close to the lysis-related CDS (CDS10, spanin, and CDS11, lysin)[33]. Lastly, there are the terS (CDS51), the terL (CDS52), a polynucleotide kinase (CDS61), and a kinase (CDS60) (Fig. 2A).
Pecta has the fewest predicted functions (Table 1). It contains several CDS encoding structural and packaging-related proteins: a tail fiber-related protein (CDS1), a tail fiber assembly protein (CDS2), terL (CDS5), a portal protein (CDS6), a major capsid protein (CDS9), and a tail adaptor (CDS11). No other structure-related CDS were predicted. In addition, an endolysin (CDS19), a spanin (CDS21), an endonuclease (CDS22), two exonucleases (CDS32 and 43), a DNA polymerase (CDS49), and a ligase (CDS66) were predicted in Pecta (Fig. 2B). The DNA polymerase of Pecta has 54% translated sequence identity (Clinker) to the DNA polymerase of CVT22, but 0% to those of GP100 and CW02, yet the locus of the DNA polymerase is conserved in all four phages (blue dot with asterisk, Fig. 2B).
Zoomie encodes a DNA polymerase and an RNA polymerase (CDS33 and CDS20, respectively), with high translated sequence similarity to, and synteny with, its closest relatives (LIMElight, HRP29, and phiKDA1) (Fig. 2C). Podovirus morphology and an RNA polymerase used to be considered defining features of the Autographiviridae (auto-graphein), and Zoomie indeed possesses both[34, 35]. Zoomie has a structure-related cluster with a tail fiber (CDS8), an internal virion protein (CDS11), two tail tubular proteins (CDS12 and 13), a major capsid protein (CDS15), and a scaffolding-related protein (CDS16). With regard to predicted enzymes, Zoomie has a holin (CDS2), a spanin (CDS3), a kinase (CDS22), an endonuclease (CDS23), an exonuclease (CDS26), a helicase (CDS34), and a primase (CDS35) (Fig. 2C).
In conclusion, Zoomie, Pecta, and Snitter are all novel Erwinia phages with limited NIS to their closest relatives (≤ 60.2%). As this is below the ICTV genus demarcation threshold (70% NIS), each constitutes its own novel genus with no other known members currently in the NCBI database (Fig. 1B). They are assigned to existing families by whole-genome DNA sequence and proteomic analysis. Based on the closest relatives, we suggest that “Snittervirus” belongs to the subfamily Tempevirinae, family Drexlerviridae, that “Pectavirus” belongs to the Zobellviridae family, and that “Zoomievirus” clusters in a potential subfamily together with the genus Limelightvirus within the Autographiviridae family.