Species confirmation by 16S rRNA gene and MLSA
NCBI blast results show that B. glycinifermentans JRCGR-1 16S ribosomal RNA gene has 99.91% similarity with B. glycinifermentans BGLY (NZ_LT603683.1) and B. glycinifermentans SRCM103574 (NZ_CP035232.1). In the phylogenetic tree, both strains SRCM103574 and BGLY were present in the same branch (Fig. 1A). However, B. glycinifermentans JRCGR-1 was present in the same clade but as a single branch. Five strains of B. glycinifermentans (highlighted in blue) were present as two separate clades which were close to each other. B. licheniformis sp. was the closest, and in contrast, B. velezensis and B. subtilis sp. were farthest away from JRCGR-1 based on 16S RNA gene sequence analysis.
Fig. 1
As for strain, BGLY, SRCM103574, KJ-17-KR092216.1, KBN06P03352 and JRCGR-1, sequence identity with any species contained in the database was above 97%, suggesting that strain JRCGR-1 could represent as B. glycinifermentans sp. [27].
To further confirm the results of the 16S rRNA gene sequence analysis, we analyzed four housekeeping genes of B. glycinifermentans sp.: gyrA (Fig. 1B), gyrB (Fig. 1C), rpoB (Fig. 1D), and aptD (Fig. 1E). Each one of these genes was BLAST against the GenBank protein database (NCBI) [28]. The finding reveals that all the housekeeping genes of strain JRCGR-1 presented higher similarities (≥97%) with other B. glycinifermentans strains in the GenBank database. These results confirm that JRCGR-1 is classified as a B. glycinifermentans sp.
Characteristic features of strain JRCGR-1
The strain JRCGR-1 genome contained 4700692 bp with 45.5%. average G+C content and the final assembly contains 84 contigs and 83 scaffolds [9]. The annotation results revealed 5174 genes, 32 tRNA, 4 rRNA, 1 tmRNA and 92 misc_RNA. PlasmidSPAdes was used to assembled plasmid into 37 contigs. The plasmid size was found to be 1113267 bp and it contains tRNA: 27, rRNA: 4, gene: 1366, misc_RNA: 21, CDS: 1314 [9].
Whole-genome sequence comparisons
ANI (ANIm), TETRA, AAI, codon usage and DDH values were calculated for species identification. The strain JRCGR-1 ANIm (Fig. 2A), TETRA (Fig. 3A) and DDH (Fig. 3B) values with respect to three strains of B. glycinifermentans sp. (BGLY, KBNO6P03352 and SRCM103574) were in the range of 96.1-99.04%, 0.996-997, 73.5-84.7%, respectively. The strains of glycinifermentans sp. used in the current study are within the region of boundaries defined for genomic species;0.99 for TETRA, 70% for DDH and 95–96 for ANI [19]. These values qualify strain JRCGR-1 as members of a single genomic species and are different from any other Bacillus sp.
Fig. 2
Heat map analysis of AAI (Fig 2B) also shows that strain JRCGR-1 is glycinifermentans sp.
Among twenty strains of Bacillus sp., Bacillus sp. H15 (NZ_CP018249) and Bacillus sp. 1S-1 (NZ_CP022874) were not previously classified in any species. Our study (based on ANIb, ANIm, TETRA, DDH and AAI) shows that they should be considered as a B. licheniformis sp. from a genomic point of view.
Fig 3
Orthology analysis
Cluster analysis of the selected Bacillus sp. showed that 97.83% of genes were in orthogroups and only 2.2% of genes were unassigned to any group. Total orthogroups were found to be 6403 and the mean orthogroups size value was 13.5. The overall statics is given in Table S3.
Phylogenetic tree of selected Bacillus sp. based on ortholog clustering shows that strain JRCGR-1 and BGLY are closely related strains (Fig. 4A). It is interesting to note that, the highest number of genes were present in strain JRCGR-1, which was followed by B. sonorensis strain SRCM10139 and three other B. glycinifermentans species (Fig. 4B). For each Bacillus sp., more than 90% of the genes were present in the orthogroups and the number of orthogroups containing species was found to be greater than 60% (Fig. 4C). Moreover, very few genes were unassigned to any orthogroups. Specifically, for strain JRCGR-1, out of 5045 genes, 4670 genes were present in the orthogroups (92.6%) and 375 genes were not assigned to any group (7.4%) (Fig. 4B and 4C). Fig. 4D shows several genes in species-specific orthogroups and the number of species-specific orthogroups.
Fig. 4
Synteny analysis
Synteny allows us to study the evolution between genomes, functional conservation [29-31], detect genome rearrangements [31], aid genome annotation [32] and helps to detect the quality of the genome assembly. In the current study, we illustrate a comparison featuring four primate genomes; strain JRCGR-1 was used as reference genome and B. sonorensis strain SRCM101395 and B. licheniformis strain ATCC 14580 as a query genome. Fig. 6 clearly shows that there is significant chromosomal synteny between the reference genome and the query genomes. For instance, total out of 83 contigs of stain JRCGR-1, 57 contigs shared synteny blocks with stain B. glycinifermentans BGLY genomes (Fig. 5A). Similarly, the reference genome also exhibits a high level of synteny with multiple locations in B. sonorensis strain SRCM101395 and B. licheniformis strain ATCC 14580 genomes (Fig. 5B and Fig. 5C). But, in the case of B. licheniformis strain ATCC 14580, the arrangement of synteny was different with reference to B. glycinifermentans BGLY and B. sonorensis strain SRCM101395 genomes.
These results indicate that the core genomes of the B. glycinifermentans strains are conserved and B. sonorensis is the closest species to B. glycinifermentans sp.
Fig. 5
Pan and core gene
Pangenome analysis has been used for the evaluation of the genome diversity, species evolution, pathogenesis and other characters of microorganisms [33]. Therefore, to better understand the bacterial evolution and the phylogenetic relationship, we analyzed the Bacillus sp. pangenome.
Fig. 6A shows that the number of new genes does not converge to zero upon sequencing of additional genomes. The future rate of the discovery of novel genes in a species can be predicted by analyzing the increase/decrease in the pangenome size with the addition of a new genome sequence [34, 35]. Fig. 6B shows the evolution of the pan and core genomes. The pangenome size is computed at 8329 genes (n = 20) and shows an open genome characteristic: (i) with the addition of genomes, the trajectory of the pangenome increases unboundedly and (ii) Bpan was calculated as 0.16. In the pangenome, core and accessory genes account for 30.77% and 69.10%, respectively.
Phylogeny based on the pan (Fig. 6C) and core (Fig. 6C) genome demonstrated that all the B. glycinifermentans strains belong to the same clade and B. sonorensis strain SRCM101395 is closer to B. glycinifermentans species. This finding reveals a lack of diverse genetic evolution of the pan and core genome in the four-strain of B. glycinifermentans sp.
Fig. 6
Figure 7, showing the heat map and the phylogenetic tree created by hierarchical clustering using Manhattan distance matrix based on the presence (red) and absence (blue) genes. Clade I include species of B. subtilis and B. velezensis, which exhibits high diversity at genome level with reference to Clade II and Clade III. The genomes in clade III show less diversity compared to clade I and clade II.
Fig. 7
Comparison among B. glycinifermentans
We also computed the orthologues genes for six strains of B. glycinifermentans sp. including JRCGR-1, KBN06PO3352, BGLY, SRCM103574, KJ17 and GO13 (Fig.8). Fig. 8A shows the graphical representation of orthologues genes shared between strain JRCG-1 and other B. glycinifermentans strains; from the outer circle inwards; 83 scaffolds of strain JRCG-1 (1); genes (5074) of strain JRCG-1 (2); orthologues genes (4670) assigned to strain JRCG-1 (3); orthologues genes (4196) shared between JRCG-1 and BGLY strain (4); orthologues genes (4018) shared between JRCG-1 and KBN06PO3352 strain (5) orthologues genes (4274) shared between JRCG-1 and SRCM103574 strain (6). unassigned genes (375) of strain JRCG-1 to any orthogroups (7).
A total of 3148 genes were shared by these six strains and only 30 genes were strain-specific (Fig. 8B). Thirteen strain-specific genes were found for strain JRCG1, five for strain SRCM103574, three for strain BGLY, one for strain KJ17, four for both BN06P03352 and GO13 strains.
Fig. 8