Microorganisms harboring reductive dsrAB
We identified 50 DsrAB protein sequences in 982 genome bins recovered from 18 metagenomic datasets from the revegetated acidic mine wasteland [30]. Sixteen of these sequences belonged to the reductive bacterial-type DsrAB family (Figure S1). Accordingly, 16 reductive dsrAB-containing genome bins were retrieved, 12 from Acidobacteria and four from Deltaproteobacteria (Fig. 1 & Table S1).
The 12 Acidobacteria genome bins were all affiliated with subdivision 1 of Acidobacteria (Figure S2). Among them, three (UT3_2.bins.71, UT3_3.bins.87 and UT4_3.bins.137) formed a monophyletic clade and had an average AAI of 63% to their closest relative, Granulicella tundricola MP5ACTX9. Similarly, ULRT4_2.bins.48 formed a monophyletic clade and had AAIs of 56–62% to its closest relatives. These four genomes represented two genera not previously reported to contain SRMs, given that no SRMs from Acidobacteria have been successfully cultivated and that currently known Acidobacteria-related genome bins containing dsrAB [15] were affiliated with other genera (Figure S2).
Three of the four genome bins from Deltaproteobacteria were affiliated with the well-known SRM genus Desulfovibrio (Table S1). In contrast, the remaining one (i.e. ULRT4_2.bins.61) could only be assigned to Syntrophobacteraceae and formed a monophyletic clade (Figure S3). It had AAIs of 54–60% to its closest relatives; we therefore inferred that it belonged to a new genus.
The Dsr operon structures of Acidobacteria were different from those of Deltaproteobacteria (Fig. 1). Multiple alignments of DsrD and DsrT sequences with published references confirmed highly conserved residues (Figures S4 & S5), indicating that these proteins are likely active. According to the rules for determination of direction of dissimilatory sulfur metabolism for uncultivated microorganisms [14], 11 genome bins (eight from Acidobacteria and three from Deltaproteobacteria) recovered in our study encoded the complete pathway for reduction of sulfate to sulfide (Fig. 1 & Table S2). Notably, seven genome bins (six Acidobacteria and one Deltaproteobacteria, Table S1) had a completeness > 90% and contamination < 10%.
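The quality criterion quoted above (completeness > 90% and contamination < 10%) amounts to a simple filter over per-bin quality estimates. The following is a minimal sketch of that filter; the `GenomeBin` class, the bin names and the numeric values are hypothetical placeholders, not data from Table S1.

```python
from dataclasses import dataclass


@dataclass
class GenomeBin:
    name: str
    completeness: float   # estimated completeness, in percent
    contamination: float  # estimated contamination, in percent


def near_complete(bins, min_completeness=90.0, max_contamination=10.0):
    """Keep bins with completeness > 90% and contamination < 10% (strict bounds)."""
    return [
        b for b in bins
        if b.completeness > min_completeness and b.contamination < max_contamination
    ]


# Hypothetical example values, for illustration only.
bins = [
    GenomeBin("bin_A", 95.2, 1.8),   # passes both thresholds
    GenomeBin("bin_B", 88.0, 2.0),   # fails on completeness
    GenomeBin("bin_C", 93.1, 12.4),  # fails on contamination
    GenomeBin("bin_D", 90.0, 0.5),   # exactly 90% is not > 90%
]

selected = [b.name for b in near_complete(bins)]
print(selected)
```

Both bounds are strict here because the text states them with ">" and "<"; a bin sitting exactly on a threshold is therefore excluded.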
Glycoside hydrolysis of SRMs
A total of 71 GH families were encoded by the nine SRM genomes considered in this study (Table S3). Among them, only one (i.e. GH50) was not found in the six near-complete Acidobacteria-related genomes. This result is reasonable, as GH50 currently consists of β-agarase (EC 3.2.1.81), which is responsible for hydrolysis of (1→4)-β-D-galactosidic linkages in agarose (a polysaccharide produced by some aquatic red algae) [59]. In contrast, the three Deltaproteobacteria-related genomes encoded only 10 GH families. The numerical predominance of GH families observed for Acidobacteria-related genomes is remarkable even when compared with the results of Eichorst et al. [44], who identified 131 GH families in 24 non-SRB Acidobacteria-related genomes. According to the average number of GH genes per genome, GH3, GH13, GH23, GH2, GH31, GH29, GH28, GH27, GH92 and GH35 were the 10 most abundant families across all investigated genomes (Fig. 2). Except for GH23, these abundant GH families were largely represented by Acidobacteria-related genomes. A striking example was GH3, which was encoded by 9–13 genes in each Acidobacteria-related genome but by only one gene per Deltaproteobacteria-related genome (Fig. 2).
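The ranking of GH families by average gene count per genome can be illustrated with a short calculation. This is a sketch under stated assumptions: the function name `rank_gh_families`, the genome labels and the counts are hypothetical, and a family absent from a genome is assumed to contribute zero to the mean.

```python
from collections import defaultdict


def rank_gh_families(counts_by_genome, top_n=10):
    """Rank GH families by mean gene count per genome (absent family counts as 0)."""
    n_genomes = len(counts_by_genome)
    totals = defaultdict(int)
    for counts in counts_by_genome.values():
        for family, count in counts.items():
            totals[family] += count
    means = {family: total / n_genomes for family, total in totals.items()}
    # Sort by descending mean; break ties alphabetically for a stable order.
    return sorted(means.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Hypothetical toy counts, not values from Table S3.
toy_counts = {
    "genome_1": {"GH3": 12, "GH13": 6, "GH23": 2},
    "genome_2": {"GH3": 10, "GH13": 4, "GH23": 3},
    "genome_3": {"GH3": 1, "GH23": 4},
}

ranking = rank_gh_families(toy_counts, top_n=2)
for family, mean_count in ranking:
    print(f"{family}: {mean_count:.2f} genes per genome")
```

Dividing by the total number of genomes, rather than by the number of genomes that carry a family, keeps families found in only a few genomes from ranking artificially high.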
Hydrogen metabolism of SRMs
Genes encoding eight groups of hydrogenases (Groups A1 and A2 of [FeFe]-hydrogenase and Groups 1a, 1b, 1d, 3d, 4c and 4e of [NiFe]-hydrogenase) were identified in this study (Table S4). Among them, [FeFe]-hydrogenase was encoded only by D. vulgaris. These results agree with those of Hausmann et al. [15], who showed that genome bins of Acidobacteria-related SRMs harbored genes encoding Groups 1 (excluding 1h), 3 and 4 of [NiFe]-hydrogenase. The individual genomes differed considerably in their total number of hydrogenase genes (Table S4). The genome of D. vulgaris contained up to seven hydrogenase genes, whilst two Acidobacteria genomes (i.e. UT4_3.bins.137 and UT3_2.bins.71) lacked such genes. Despite this, seven of the eight genes encoding oxygen-tolerant hydrogenases (i.e. Groups 1d and 3d of [NiFe]-hydrogenase) [49] were identified in the remaining four Acidobacteria genomes (Fig. 2).
Respiratory chain of SRMs
All investigated genomes encoded the major components of the respiratory chain (Table S5). Specifically, the (near) complete operons for NADH dehydrogenase 1 (lacking in D. vulgaris and D. multivorans), NADH dehydrogenase 2 (lacking in UT4_3.bins.137 and UT3_2.bins.71), succinate dehydrogenase, quinol–cytochrome-c reductase, high-affinity terminal oxidase, low-affinity terminal oxidase (lacking in D. multivorans) and F-type ATP synthase were detected. Remarkably, the high-affinity bd-type terminal oxidase was present in all genomes, while the high-affinity cbb3-type terminal oxidase was detected only in ULRT3_2.bins.110 (Table S5).
MCP system of SRMs
Two Desulfovibrio-related genomes (i.e. ULRT4_3.bins.101 and D. vulgaris) encoded many more MCPs than the other seven genomes (Fig. 2 & Table S6). Surprisingly, no MCP genes were detected in ULRT3_2.bins.110 and D. multivorans. While the majority of MCPs encoded by the two Desulfovibrio genomes belonged to class Ia (clusters I and II; Table S7 & Figure S6), which contains two experimentally validated redox and oxygen sensors (i.e. DcrA and DcrH) [52], acidobacterial MCPs were mainly from classes IVa and IVb. More specifically, three of the five Acidobacteria genomes lacked genes encoding class Ia MCPs (Table S7).
The entire set of genes for core chemotaxis signaling complexes (CheB, CheR, CheW, CheA and CheY) [51] was detected in three acidobacterial genomes (i.e. ULRT4_1.bins.77, ULRT4_3.bins.75 and ULRT4_2.bins.48) and two Desulfovibrio genomes (i.e. ULRT4_3.bins.101 and D. vulgaris; Fig. 2 & Table S6). Additionally, the che operon structure in ULRT4_3.bins.101 was the same as that of cheA3 in D. vulgaris (Figure S7), suggesting it had the ability to sense electron acceptors like sulfate and lactate [60].
Flagellum system of SRMs
The entire set of 24 core flagellar genes [53] was identified in four acidobacterial (i.e. ULRT4_1.bins.77, ULRT4_3.bins.75, ULRT3_2.bins.110 and ULRT4_2.bins.48) and two Desulfovibrio-related genomes (i.e. ULRT4_3.bins.101 and D. vulgaris; Fig. 2 & Table S6). Genes coding for highly conserved components of the type IV pilus (e.g. pilA) were identified in UT4_3.bins.137 and UT3_2.bins.71 (Table S6), indicating that these two strains may move towards chemoattractants using pili-based “twitching” motility [54]. These results, together with those on the MCP system, suggested that five of the investigated genomes (i.e. ULRT4_1.bins.77, ULRT4_3.bins.75, ULRT4_2.bins.48, ULRT4_3.bins.101 and D. vulgaris) have the potential to use flagella-driven chemotaxis to sense surrounding chemoattractants and relocate towards favorable microenvironments.
Antioxidative enzymes of SRMs
Among the four known enzymes involved in oxygen reduction by SRMs [24], only cytochrome bd oxygen reductase (Cbo, EC 7.1.1.7) was encoded by all nine genomes (Fig. 2). The other three enzymes showed two contrasting patterns: (1) [Fe] hydrogenase (EC 1.12.7.2) and rubredoxin:oxygen oxidoreductase (ROO) were encoded largely by Deltaproteobacteria-related genomes; and (2) cytochrome c oxidase (Cco, EC 7.1.1.9) occurred mainly in Acidobacteria-related genomes (Fig. 2). Similarly, two opposite trends were observed for the two major enzymes responsible for eliminating superoxide anion radicals [24]: (1) all the investigated genomes encoded at least one SOD (EC 1.15.1.1), although the type of SOD differed between genomes; and (2) superoxide reductase (SOR, EC 1.15.1.2) genes were present only in Deltaproteobacteria-related genomes (Fig. 2). Note that the majority of Acidobacteria-related genomes lacked genes encoding catalase (EC 1.11.1.6) and thioredoxin peroxidase (Tpx, EC 1.11.1.15), whilst they harbored more genes encoding thioredoxin-dependent peroxiredoxin (BCP, EC 1.11.1.24), cysteine synthase (CysK, EC 2.5.1.47) and glutathione peroxidase (GPX, EC 1.11.1.9) than the Deltaproteobacteria-related SRMs did.
Viruses of SRMs
Six viruses (prophages) were identified across the 12 acidobacterial genome bins recovered in this study (Fig. 3 & Table S9), while no virus sequences were detected in the four deltaproteobacterial genome bins. Seven and three viruses were found in D. vulgaris and D. multivorans, respectively, which does not fully agree with previous reports of eight and no viruses in these two model SRMs [47, 61]. This discrepancy likely resulted from the use of different viral prediction methods. Notably, PFAM annotations revealed that 11 of the 16 viruses identified in this study harbored at least one virion-associated gene (Table S9), suggesting that these viruses still have the genetic potential to complete a lytic cycle [44].
Most of the 16 identified viruses could not be clustered with isolated viruses or with those identified in publicly available microbial genomes or metagenomes using a gene-content based classification (genus-level grouping) [57], although half of them could be tentatively assigned to the order Caudovirales (Table S9). Specifically, four acidobacterial viruses formed an exclusive cluster, while the remaining two acidobacterial viruses were not closely related at the nucleotide level to any previously sequenced bacteriophages (i.e. singletons [62]; Table S9 & Fig. 3). Similarly, among the viruses of D. vulgaris and D. multivorans, one was affiliated with the family Myoviridae, three were clustered exclusively, and the remaining six were singletons (Table S9 & Fig. 3). Additionally, the abundances of the viruses targeting acidobacterial SRMs were positively correlated with their host abundances (Figure S8).
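A positive virus–host abundance relationship of the kind described above can be quantified with a correlation coefficient. The text here does not specify which statistic underlies Figure S8, so the Pearson coefficient used in the sketch below is an assumption, and the abundance values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical relative abundances across four samples, for illustration only.
host_abundance = [1.0, 2.0, 3.0, 4.0]
virus_abundance = [2.1, 3.9, 6.2, 7.8]

r = pearson_r(host_abundance, virus_abundance)
print(f"Pearson r = {r:.3f}")
```

A rank-based statistic such as Spearman's coefficient would be a reasonable alternative when abundances are strongly skewed.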
Roles of viruses in glycoside hydrolysis of SRMs
Three genes encoding GHs were recovered from viral scaffolds, and their functions were further predicted via three-dimensional protein structural modeling (Fig. 4 & Table S10). Among them, one was from the virus infecting ULRT3_2.bins.110 and encoded D-4,5-unsaturated β-glucuronyl hydrolase (EC 3.2.1.172), which is able to release rhamnose from rhamnogalacturonan I oligomers (a major component of plant cell wall [63]; Fig. 4). The remaining two genes were identified on viral scaffolds D. vulgaris.2 and D. vulgaris.5, and both encoded endochitinase (EC 3.2.1.14). This enzyme can cleave chitin randomly at internal sites, generating soluble low molecular mass multimers of N-acetyl-D-glucosamine, such as chitotetraose and chitotriose (Fig. 4) [64].
Roles of viruses in chemotaxis and antioxidation of SRMs
Three MCPs were encoded by viral scaffolds D. vulgaris.1 and D. vulgaris.2 (Fig. 4 & Table S11). Among them, two belonged to class Ia (cluster II) with double cache-like sensor domains, while the other one belonged to class Ia (cluster I) with a single cache2 domain. According to previous findings [65] and the ligands confirmed in model protein structures, lactate and C2/C3 carboxylates (e.g. sodium acetate) could be the ligands for these MCPs (Fig. 4). On the other hand, one gene encoding Ni-containing SOD was identified on viral scaffold D. multivorans.2 (Fig. 4 & Table S11).