In this study, we identified viruses predicted to infect nitrogen-fixing symbionts of the genus Mesorhizobium from root nodules in agriculturally relevant, globally dispersed chickpea plants. We leveraged the fact that viruses infecting Mesorhizobium would likely be integrated into Mesorhizobium genomes as prophages or could be binned into Mesorhizobium metagenome-assembled genomes (MAGs) due to similar genomic signatures and/or similar relative abundance patterns [5]. To identify putative Mesorhizobium-infecting phages, we searched for viral contiguous sequences (contigs) within a total of 197 MAGs and isolate draft genomes of Mesorhizobium cultured from chickpea root nodules. The nodules were collected in India, Ethiopia, Morocco, Turkey, Australia, Canada, and the USA [9]. Of 798 contigs identified as viral, 364 were longer than 10 kbp, thus adhering to common guidelines [25, 26]. After clustering the >10 kbp viral contigs at 95% average nucleotide identity, we retained 106 viral operational taxonomic units (vOTUs). The number of recovered vOTUs is low compared to recent soil viral assemblage studies that have recovered thousands of vOTUs [27], probably because we focused only on viruses likely to infect Mesorhizobium. Moreover, we focused on root nodules, which are an intrinsically low-diversity environment [11] and thus likely to have fewer vOTUs than more diverse environments such as soil. Finally, we mined vOTU sequences from metagenomes dominated by bacterial sequences, whereas datasets from studies that target the viral size fraction via viromics are substantially enriched in viral sequences [28].
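The length filter and identity-based clustering described above can be illustrated with a minimal sketch. It assumes that pairwise average nucleotide identity (ANI) values have already been computed by an external tool and uses a simple greedy, longest-first centroid scheme; the function name, the data layout, and the toy contigs are hypothetical and do not reproduce the actual pipeline used in this study.

```python
from typing import Dict, FrozenSet, List

MIN_LENGTH_BP = 10_000  # contigs must be longer than 10 kbp
MIN_ANI = 95.0          # clustering threshold, percent identity

def cluster_votus(lengths: Dict[str, int],
                  ani: Dict[FrozenSet[str], float]) -> Dict[str, List[str]]:
    """Greedy centroid clustering (an assumed scheme, not the study's tool).

    Contigs are visited from longest to shortest. Each contig joins the
    first existing centroid it matches at >= MIN_ANI, otherwise it
    becomes the centroid (representative) of a new vOTU.
    """
    kept = [c for c, n in lengths.items() if n > MIN_LENGTH_BP]
    kept.sort(key=lambda c: (-lengths[c], c))
    clusters: Dict[str, List[str]] = {}
    for contig in kept:
        for centroid in clusters:
            # Missing pairs are treated as having no detectable identity.
            if ani.get(frozenset((contig, centroid)), 0.0) >= MIN_ANI:
                clusters[centroid].append(contig)
                break
        else:
            clusters[contig] = [contig]
    return clusters

# Toy input: four hypothetical contigs, one of them too short to keep.
lengths = {"c1": 42_000, "c2": 15_500, "c3": 12_000, "c4": 8_000}
ani = {
    frozenset(("c1", "c2")): 97.3,
    frozenset(("c1", "c3")): 81.0,
    frozenset(("c2", "c3")): 80.0,
}
votus = cluster_votus(lengths, ani)
print(votus)
```

In this toy case the short contig is discarded, two contigs collapse into one vOTU, and the third forms its own vOTU. Real workflows commonly add an alignment-coverage criterion alongside ANI; that is omitted here because the text specifies only the identity threshold.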
To determine whether the phages identified by our targeted approach are known to infect bacteria other than Mesorhizobium, we calculated a gene-sharing network, using all known virus genomes from RefSeq. Of 106 vOTUs, 61 were clearly grouped into 28 viral clusters (reflecting ICTV genus-level), 25 of which consisted only of vOTUs from our study. Four nodule phages were included in the remaining three clusters, alongside phages of Achromobacter, Rhizobium, Burkholderia, Loktanella, and Ruegeria (sup. Figure 1). The genus Rhizobium includes some nodule-forming legume symbionts, and Burkholderia species were previously identified in legume nodules [11]. Of the remaining 45 vOTUs that were not clearly clustered, 30 were outliers (likely related to a cluster but at a higher taxonomic level such as family), six were singletons (had no shared genes with other viral genomes) and the other nine were ambiguously clustered (unclear whether they are outliers or clustered, therefore their taxonomic level could not be determined). Therefore, in combination with the fact that these nodules were dominated by Mesorhizobium, we conclude that the majority of vOTUs we identified likely infect Mesorhizobium, and for simplicity, we refer to them as putative Mesorhizobium phages.
While all of the cultured symbionts belonged to the genus Mesorhizobium, both the diversity and the identity of Mesorhizobium strains had a geographic signature, with higher diversity in countries that employ fewer agricultural management practices and low diversity in agricultural soil from countries where the use of industrial inoculants of Mesorhizobium is widespread [9]. We hypothesized that there would be analogous biogeographical patterns in the composition of Mesorhizobium phage assemblages in nodules. To test this hypothesis, we determined phage assemblage composition per nodule by mapping reads from nodule metagenomes to the 106 vOTUs. Similarly, we also mapped reads from Mesorhizobium isolates to vOTUs to investigate whether these symbionts harbored associated viruses with genomes that were not assembled. In total, we characterized viral communities in 914 short read archives representing 648 nodule metagenomes and 266 cultured Mesorhizobium isolates [9]. Of the 266 Mesorhizobium isolates, 166 contained assemblages of 1 to 9 vOTUs per isolate. Mesorhizobium dominated 90% of the 648 nodule metagenomes. Of the 648 metagenomes, 414 (64%) contained Mesorhizobium phage assemblages represented by a range of 1 to 16 vOTUs per sample (mean 2.2), and 214 of these contained only a single vOTU (Fig. 1A). The largest number of vOTUs was found in samples collected in Ethiopia (38 vOTUs; N = 65), followed by India (13 vOTUs; N = 40). Twelve additional vOTUs were identified both in Ethiopia and in India (Fig. 1B). This pattern of a higher number of unique Mesorhizobium vOTUs is in accordance with higher phylogenetic diversity of Mesorhizobium in root nodules from the same countries [9], which is likely related to farming practices (Fig. 1C). Specifically, in Ethiopia and India, growers generally do not rely on the inoculation of commercial strains of Mesorhizobium, whereas in the USA, Canada, and Australia, they do [29, 30].
Inoculated commercial strains presumably outcompete local Mesorhizobium strains, thereby reducing the richness of Mesorhizobium strains in nodules and thus likely lowering the richness of Mesorhizobium phages; phages may even be cleared during commercial cultivation [9]. Finally, the high richness of putative Mesorhizobium vOTUs in Ethiopia and India could be influenced by a greater sampling effort in those countries (a total of 319 and 288 samples, respectively, relative to 299 total from all other countries, sup. table S1), a potential bias that we addressed with accumulation curves (Fig. 1D).
To further consider whether Mesorhizobium vOTU diversity was higher in countries that relied on natural, rather than commercial, Mesorhizobium inocula, we generated accumulation curves of the number of detected vOTUs as a function of sampling effort per country (Fig. 1D). While vOTU richness in Ethiopia continued to increase with sampling effort, that of India appeared to saturate, indicating that collecting more samples in India would be unlikely to yield more vOTUs. Finally, the number of expected unique vOTUs was consistently higher in India and Ethiopia compared to other countries (Fig. 1D). Therefore, we conclude that the higher number of unique vOTUs in India and Ethiopia represents higher viral richness rather than an artifact of sampling effort.
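The logic of a sample-based accumulation curve can be sketched as follows. This is an illustrative implementation under stated assumptions (random permutations of sample order, averaged cumulative richness); the function name, the number of permutations, and the toy samples are hypothetical, and the curves in Fig. 1D may have been computed with a different estimator.

```python
import random
from typing import List, Set

def accumulation_curve(samples: List[Set[str]],
                       permutations: int = 100,
                       seed: int = 0) -> List[float]:
    """Mean number of distinct vOTUs after 1, 2, ..., n samples.

    Sample order is shuffled `permutations` times, and the cumulative
    vOTU count at each position is averaged over all shuffles.
    """
    rng = random.Random(seed)
    totals = [0.0] * len(samples)
    for _ in range(permutations):
        order = samples[:]
        rng.shuffle(order)
        seen: Set[str] = set()
        for i, sample in enumerate(order):
            seen |= sample
            totals[i] += len(seen)
    return [t / permutations for t in totals]

# Toy input: vOTUs detected in four hypothetical samples from one country.
toy_samples = [{"v1"}, {"v1", "v2"}, {"v3"}, set()]
curve = accumulation_curve(toy_samples)
print(curve)
```

A curve that flattens as samples are added suggests that additional sampling would recover few new vOTUs, whereas a curve that is still rising suggests that richness has not yet been fully sampled.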
The number of vOTUs shared between multiple countries was generally low (39 total vOTUs were present in more than one country), and 74 of 106 (70%) vOTUs were found only in one chickpea species (sup. Figure 2). However, two vOTUs were present in more than five countries across three continents (Asia, Africa, and Europe). We compared these vOTUs to 14 full genomes of Mesorhizobium to verify that these vOTUs were not integrative conjugative elements (ICEs) of the Mesorhizobium symbiosis island misidentified as phages. ICEs are an important tool used by Mesorhizobium in its symbiosis with the chickpea plant, and they bear a resemblance to viral DNA [9]. The two relatively ubiquitous vOTUs were not located in the symbiosis island, and therefore we conclude that they likely represent real phages rather than artifacts. Assuming that the ubiquity of these viral populations does not stem from contamination in the laboratory or sequencing pipeline (we have no reason to believe that it does), these viruses may be descendants of a phage that infected an ancestor of the current Mesorhizobium strains (e.g., a symbiont of Cicer arietinum before the dispersion of chickpea plants from Turkey across continents). Indeed, these vOTUs were found both in C. arietinum and in C. reticulatum. It is also possible that the vOTUs dispersed (e.g., together with their hosts) more recently and experienced environmental selection to be associated with root nodules.
To determine whether the two ubiquitous vOTUs were detected in genomes of multiple strains of Mesorhizobium, we compared them to the 197 Mesorhizobium MAGs and draft genomes and searched for alignments longer than 5000 bp and with at least 95% nucleotide identity. The two vOTUs were identified in samples with multiple dominant Mesorhizobium strains from multiple clades [9] (specifically, clades 5, 6, 7, 8, and 9 and 1, 2, 5, 7, 8, and 9, respectively). The presence of vOTUs in strains from multiple clades may support the hypothesis of their ancestral origin, or they may otherwise have a broad host range.
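The alignment criteria stated above (longer than 5000 bp and at least 95% nucleotide identity) can be expressed as a simple filter. The sketch below assumes that alignments have already been produced by an external aligner and parsed into records; the `Alignment` record, its field names, and the toy hits are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Set

class Alignment(NamedTuple):
    votu: str            # query vOTU identifier
    genome: str          # MAG or draft genome identifier
    length_bp: int       # alignment length in base pairs
    identity_pct: float  # nucleotide identity in percent

MIN_LENGTH_BP = 5000     # alignments must be longer than this
MIN_IDENTITY_PCT = 95.0  # alignments must reach at least this identity

def genomes_carrying(alignments: Iterable[Alignment]) -> Dict[str, Set[str]]:
    """Map each vOTU to the genomes in which a qualifying alignment exists."""
    hits: Dict[str, Set[str]] = defaultdict(set)
    for aln in alignments:
        if aln.length_bp > MIN_LENGTH_BP and aln.identity_pct >= MIN_IDENTITY_PCT:
            hits[aln.votu].add(aln.genome)
    return dict(hits)

# Toy input: hypothetical alignments of two vOTUs against three genomes.
toy_alignments = [
    Alignment("vOTU_a", "genome_1", 12_400, 98.1),
    Alignment("vOTU_a", "genome_2", 4_800, 99.0),  # too short
    Alignment("vOTU_a", "genome_3", 7_300, 95.0),
    Alignment("vOTU_b", "genome_1", 9_100, 91.5),  # identity too low
]
carriers = genomes_carrying(toy_alignments)
print(carriers)
```

Mapping the resulting genome sets to their clade assignments would then show whether a vOTU is associated with strains from one clade or several.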
Beta-diversity of Mesorhizobium phage assemblages could be affected by biotic or abiotic parameters. Pairwise dissimilarity of phage assemblages was lower between nodules dominated by the same strain of Mesorhizobium than between nodules dominated by different strains, though it remained quite high in both cases (Fig. 2A; Wilcoxon rank sum test p < 0.001). We detected a stronger effect of the dominant Mesorhizobium strain on viral assemblage composition (Fig. 2B; ANOSIM p = 0.001, R2 = 0.25), compared to the weaker biogeographic signature of country of sample origin (Fig. 2C; ANOSIM p = 0.04, R2 = 0.08). Furthermore, environmental factors (soil pH, mean annual temperature, mean annual precipitation, altitude, latitude, and longitude) were significantly correlated with viral assemblage composition but explained very little of the variability (Mantel test, p = 0.02, r = 0.1). As the nodule is a specialized environment separate from the surrounding soil, viruses within it are more likely to be influenced by the internal nodule conditions and microbial assemblage than by external biotic and abiotic conditions.
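To make the comparison of within-group and between-group dissimilarities concrete, the following sketch computes a pairwise dissimilarity and the conventional ANOSIM R statistic, R = (mean between-group rank - mean within-group rank) / (n(n - 1)/4). It assumes Bray-Curtis dissimilarity on per-sample vOTU abundances, which is a common choice but is not specified in the text; all names and the toy data are hypothetical, and the permutation test that yields the p-value is omitted.

```python
from itertools import combinations
from typing import Dict, List

def bray_curtis(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Bray-Curtis dissimilarity between two vOTU abundance profiles."""
    keys = set(a) | set(b)
    num = sum(abs(a.get(k, 0.0) - b.get(k, 0.0)) for k in keys)
    den = sum(a.values()) + sum(b.values())
    return num / den if den else 0.0

def average_ranks(values: List[float]) -> List[float]:
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def anosim_r(profiles: List[Dict[str, float]], groups: List[str]) -> float:
    """ANOSIM R for samples labelled by group (e.g., dominant strain)."""
    pairs = list(combinations(range(len(profiles)), 2))
    ranks = average_ranks([bray_curtis(profiles[i], profiles[j]) for i, j in pairs])
    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    n = len(profiles)
    mean_between = sum(between) / len(between)
    mean_within = sum(within) / len(within)
    return (mean_between - mean_within) / (n * (n - 1) / 4)

# Toy input: four hypothetical nodules dominated by two strains.
toy_profiles = [
    {"v1": 10}, {"v1": 8, "v2": 2},  # strain A
    {"v3": 5}, {"v3": 5, "v4": 1},   # strain B
]
toy_groups = ["A", "A", "B", "B"]
r_stat = anosim_r(toy_profiles, toy_groups)
print(r_stat)
```

Values of R near 1 indicate that samples within a group are more similar to each other than to samples from other groups, whereas values near 0 indicate no grouping. The toy data are constructed to be perfectly separated by strain.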
Because nodules are formed by a single bacterial cell recruited from soil, we hypothesized that, in order to be transferred from the surrounding soil into the nodule, the majority of Mesorhizobium phages found in nodules should be lysogenic (i.e., capable of being incorporated into the host genome and brought to the nodule along with the host). To address this hypothesis, we searched for markers of a lysogenic lifestyle in the 106 vOTUs. While the presence of a viral genome within a MAG may imply lysogeny, oftentimes viral contigs are erroneously binned into MAGs based on genomic signatures and abundance patterns, in which case the entire contig could be viral without flanking host regions. Therefore, we considered only vOTUs that had flanking host regions, as well as an integrase gene, as putatively capable of lysogeny. Although we could only demonstrate that 39 out of 106 vOTUs (37%, Fig. 1B) were likely lysogenic (or 67% if requiring only an integrase gene but no flanking host regions), our methods for detection were relatively conservative and could have missed lysogenic lifestyles in some cases. In addition, viruses can be carried within a bacterial cell without integrating into its genome, for example through pseudolysogeny or chronic infection [31]. In addition to transport as a viral genome inside a host cell, other pathways by which viruses could enter the nodule could be as free viral particles through the plant phloem [32] or by “hitchhiking” (e.g., via non-specific attachment to the cell) on other bacteria that colonize the nodule after its creation.
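The two criteria for putative lysogeny described above can be written as a short decision rule. This sketch assumes that the presence of an integrase gene and of flanking host regions has already been determined for each vOTU; the `VotuEvidence` record and the toy entries are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable

@dataclass(frozen=True)
class VotuEvidence:
    name: str
    has_integrase: bool      # an integrase gene was annotated
    has_flanking_host: bool  # host sequence flanks the viral region

def is_putative_lysogen(v: VotuEvidence, require_flanks: bool = True) -> bool:
    """Conservative rule: integrase AND flanking host regions.

    With require_flanks=False, an integrase gene alone is sufficient.
    """
    if not v.has_integrase:
        return False
    return v.has_flanking_host or not require_flanks

def fraction_lysogenic(votus: Iterable[VotuEvidence],
                       require_flanks: bool = True) -> float:
    """Fraction of vOTUs that satisfy the chosen rule."""
    votus = list(votus)
    if not votus:
        return 0.0
    return sum(is_putative_lysogen(v, require_flanks) for v in votus) / len(votus)

# Toy input: four hypothetical vOTUs covering each combination of evidence.
toy_votus = [
    VotuEvidence("vOTU_1", has_integrase=True, has_flanking_host=True),
    VotuEvidence("vOTU_2", has_integrase=True, has_flanking_host=False),
    VotuEvidence("vOTU_3", has_integrase=False, has_flanking_host=True),
    VotuEvidence("vOTU_4", has_integrase=False, has_flanking_host=False),
]
strict = fraction_lysogenic(toy_votus)
relaxed = fraction_lysogenic(toy_votus, require_flanks=False)
print(strict, relaxed)
```

Relaxing the flanking-region requirement can only increase the fraction classified as putatively lysogenic, which mirrors the difference between the two percentages reported above.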
To determine whether the viruses might affect metabolism of Mesorhizobium, we examined annotations of auxiliary metabolic genes (AMGs) in vOTUs. To avoid spurious annotations, we searched for genes that appeared repeatedly in at least five phages. However, all of the annotations that were shared between more than five vOTUs were for encoded proteins that likely have a function in the viral life cycle, such as DNA replicases or lysozymes. Similarly, we hypothesized that Mesorhizobium phages may carry genes related to signaling between the host plant and nitrogen-fixing symbionts. Thus, we searched for the presence of NCR (nodule cysteine-rich) and LCR (low complexity region) peptides [33], which are used by legumes to modulate bacterial growth and development during symbiosis in the nodule [34]. However, we did not identify these genes in vOTUs. We therefore found no evidence for extensive AMG activity in these Mesorhizobium phages.
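The recurrence filter used to reduce spurious annotations can be sketched as a simple count of how many vOTUs carry each annotation. The threshold of five vOTUs follows the "at least five" wording above and is exposed as a parameter; the function name and the toy annotation labels are hypothetical.

```python
from collections import Counter
from typing import Dict, Set

def recurrent_annotations(annotations: Dict[str, Set[str]],
                          min_votus: int = 5) -> Dict[str, int]:
    """Return annotations found in at least `min_votus` distinct vOTUs.

    `annotations` maps each vOTU to the set of annotation labels on its
    genes, so each vOTU is counted at most once per annotation.
    """
    counts = Counter(label for labels in annotations.values() for label in labels)
    return {label: n for label, n in counts.items() if n >= min_votus}

# Toy input: six hypothetical vOTUs with placeholder annotation labels.
toy_annotations = {
    "vOTU_1": {"annotation_A", "annotation_B"},
    "vOTU_2": {"annotation_A", "annotation_B"},
    "vOTU_3": {"annotation_A", "annotation_B", "annotation_C"},
    "vOTU_4": {"annotation_A", "annotation_B"},
    "vOTU_5": {"annotation_A", "annotation_B"},
    "vOTU_6": {"annotation_A"},
}
recurrent = recurrent_annotations(toy_annotations)
print(recurrent)
```

Annotations that pass this filter still require manual interpretation to decide whether they represent host metabolism or functions of the viral life cycle.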
In conclusion, chickpea nodules contain phages that may provide a top-down control on the abundance of the nitrogen-fixing symbiont Mesorhizobium. Despite being initially identified within Mesorhizobium MAGs and being detected within nodules, most vOTUs were not definitively identified as likely lysogenic, implying the potential for other mechanisms of transport into the root nodule, either as a genome inside the host cell or as free viral particles. The diversity of viral communities was strongly affected by the dominant Mesorhizobium strain in the nodule, further supporting that these viruses infect Mesorhizobium. Two vOTUs were identified in nodules from five different countries and in Mesorhizobium MAGs from multiple clades, implying global dispersal of these phages and/or the possibility that they may be descendants of ancestral lysogenic Mesorhizobium phages that predated intercontinental dispersion of chickpea plants.