Identification of DEGs in mutated bacteria
Using several R packages and setting FDR<0.05 and a |logFC| >1, 806 DEGs were screened between the wild-type and ctrA deleted bacteria in GSE35625, including 428 up-regulated genes and 378 down-regulated genes. A density plot and a volcano plot were shown in Fig1. Additionally, 6 Unannotated ORFs were also screened. A deg.txt file of identified DEGs was created using R and can be found in Supplementary data FileS1.
(Fig1 approximately here)
GO and KEGG pathway enrichment analysis
759 DEGs were successfully converted to standard locus tags, and the rest were removed due to sequence redundancy in 2015 in the NCBI databases. All 759 DEGs excluding 6 unannotated ORFs were submitted to GSEA-Pro v3 to perform the GO and KEGG pathway enrichment analysis. There were three parts of the GO enrichment analysis: biological process (BP), cellular component (CC) and molecular function (MF). In general, the DEGs were strongly enriched in diverse BPs, such as translation (GO:0006412), protein repair (GO:0030091), ribosome assembly (GO:0042255), cellular respiration (GO:0045333), chemotaxis (GO:0006935), pathogenesis (GO:0009405), and bacterial-type flagellum-dependent cell motility (GO:0071913). The DEGs were strongly enriched in multiple CCs, such as cytoplasm (GO:0005737), large ribosomal subunit (GO:0015934), plasma membrane respiratory chain complex 1 (GO:0045272), and bacterial-type flagellum basal body (GO:0009425). The DEGs were also strongly enriched in many MFs, for instance rRNA binding (GO:0019843), endoribonuclease activity (GO:0004521), translation initiation factor activity (GO:0003743), structural constituent of ribosome (GO:0003735), ATP binding (GO:0005524), metal ion binding (GO:0046872), motor activity (GO:0003774), and etc..
We also submitted up-regulated and down-regulated genes respectively to GSEA-Pro v3. The up-regulated DEGs were strongly enriched in most of these aforementioned terms, such as translation (GO:0006412), ribosome (GO:0005840), and rRNA binding (GO:0019843). Moreover, the up-regulated DEGs were also particularly enriched in both energy conversion and molecule biosynthesis plus various enzyme activities, for instance UDP-glucose metabolic process (GO: 0006011), ribosome biogenesis (GO:0042254), undecaprenyl-phosphate 4-deoxy-4-formamido-L-arabinose transferase activity (GO:0099621), and UTP:glucose-1-phosphate uridylyltransferase activity (GO:0003983).
By contrast, the down-regulated DEGs were particularly enriched in several biological processes and transfer or transportation functions, such as DNA replication (GO:0006260), sodium ion transport (GO: 0006814), acetate transmembrane transporter activity (GO:0015123), glycolate transmembrane transporter activity (GO:0043879), 3-deoxy-manno-octulosonate cytidylyltransferase activity (GO:0008690). The down-regulated DEGs were also enriched in plasma membrane (GO:0005886) and bacterial-type flagellum basal body (GO:0009425).
KEGG pathway analysis showed that the identified 759 DEGs were strongly enriched in the multiple pathways of energy conversion, molecular biosynthesis and environmental sensory activity in general, such as glycerophospholipid metabolism (mag00564), RNA polymerase (mag03020), and two-component system (mag02020). Particularly, two enriched pathways were specifically identified for the up-regulated DEGs, including glyoxylate and dicarboxylate metabolism (mag00630), ubiquinone and other terpenoid-quinone biosynthesis (mag00130). These pathways are related to a diversity in biosynthesis of amino acids and molecular machines and energy conversions. Among the enriched pathways, bacterial flagellar assembly (mag02040) and chemotaxis (mag02030) are directly related to bacterial movement.
13 common magnetosome-related genes in the CtrA regulon were identified when we compared 759 DEGs with known lists of magnetosome-related genes, including mamF-like (amb0412), mamD-like (amb0400), mamJ (amb0964), mamS (amb0975), mamU (amb0977, amb2502), mamW (amb0981), mamQ2 (amb1005), ftsZm (amb1015), mamX (amb1017), mamY (amb1018), mms5 (amb1027), magA (amb3990). Except mamJ, mamW and mms5, other 10 genes were recognized and assigned to 5 GO terms (integral component of membrane, cytoplasm, phospholipid biosynthetic process, ATP binding, metal ion binding) using GSEA-Pro v3. Furthermore, 11 types of genes clusters (COG, GO, KEGG, Operon, PFAM, InterPro, Superfamily, Keywords, eggNOG_COG, ENOG, Regulon) were obtained via GSEA-Pro v3. We compared gene clusters which known magnetosome-related genes belong to, and identified multiple functional similar genes when these genes cooccurred in two or more clusters. These genes are shown in Table 1.
Table 1 Functional similar genes of known magnetosome-related genes.
Known Genes
|
Functional Similar Genes
|
mamF-like (AMB_RS02120, amb0412)
mamD-like (AMB_RS02045, amb0400)
mamS (AMB_RS04995, amb0975)
mamU (AMB_RS05005, amb0977)
mamU (AMB_RS12580, amb2502)
mamQ2 (AMB_RS05155, amb1005)
ftsZm (AMB_RS05200, amb1015)
mamX (AMB_RS05210, amb1017)
mamY (AMB_RS05215, amb1018)
magA (AMB_RS20195, amb3990)
|
amb0188, amb0273, amb0537, recA, amb0630, dnaA, amb0674, carA, kup1, amb0894, pstB2, amb1385, amb1428, amb1466, amb1551, amb1865, amb2097, amb2222, nrdR3, plsX, gyrA, coaD, nhaA, pyrH, lolD, amb2686, amb2770, gltX2, amb2976, amb3001, mnmA, hscA, anmK, ruvB, ruvA, hisG, amb3387, amb3618, amb3655, amb3712, amb3802, murF, murD, amb3924, amb4015, amb4061, amb4090, htpG, amb4368, amb4397, amb1062, amb2736
|
PPI network and cluster analysis
759 DEGs’ protein sequences were submitted to the STRING database, and a PPI network in AMB-1 was constructed to identify interesting genes and biological clusters that may play roles in magnetosome biogenesis in the CtrA regulon. Moreover, some nodes did not connect to other nodes in the networks. These nodes were removed or hided from the networks. This PPI network was visualized using Cytoscape as shown in Fig2. A total of 477 nodes and 4100 edges were screened from the PPI network. 9 of 13 known magnetosome-related genes were removed due to no connection with other nodes. Among 4 genes, mamJ has the highest node degree, while ftsZm is second highest, followed by 2 mamU nodes The cluster analysis of this PPI network using MCODE with default parameters indicated that most magnetosome-related genes do not belong to any cluster, except mamJ. A small cluster of mamJ, amb1158, amb1183, and amb0395 was found, and it is not known for magnetosome biogenesis. This result may suggest many magnetosome-related genes are not hub genes in a whole old network, and they can develop new abilities to interact with other genes during evolution, and consequently form new parts of the whole network (Fig3).
This network was analyzed using 6 methods (DMNC, MCC, Closeness, Radiality, Stress, Betweenness) supported by Cytoscape. The most important nodes are usually ribosome and translation-associated genes. Important genes having other types of functions can also be identified. Considering non-essential network status of magnetosome-related genes, we decided to search neighborhoods of known magnetosome genes in ranking lists. After searching and comparing 20 nearest neighbors of magnetosome-related gene, we identified 44 nodes, which are common nearest neighbors of magnetosome-related genes at least by 3 methods (Table 2). After being compared with the list of functional similar genes, 8 genes were commonly found in the two lists. These genes include dnaA, lolD, amb1551, amb0674, amb3712, amb4015, amb0273, and amb4397. Almost all genes have functions of ATP binding, whereas some of them have functions of regulating membrane properties. Besides, some genes also have functions of becoming cytoplasm components, metal ion binding, lipoprotein transport, and etc.
(Fig2 approximately here)
(Fig3 approximately here)
Table 2 Nearest neighbor nodes of known magnetosome-related genes in AMB-1. They were identified by at least 3 methods provided by CytoHubba.
Known Genes
|
Nearest Neighbors of Known Genes
|
mamJ (amb0964)
ftsZm (amb1015)
mamU (amb0977)
mamU (amb2502)
|
amb0049, amb0429, dnaA, amb0452, amb1551, amb3710, amb2771, amb2827, amb0475, purH, amb1609, amb3286, amb1381, amb3825, amb2831, lolD, amb1158, amb0674, amb0235, amb3235, msrQ, amb0086, amb0104, amb3712, amb0018, amb1483, amb3185, amb2878, amb3242, amb4015, amb0562, amb0140, amb0273, amb2859, amb2998, amb1127, amb1273, amb0365, amb1962, amb0495, amb3786, amb4397, amb1623, amb3910
|
Six methods include MCC, DMNC, Closeness, Radiality, Betweenness, and Stress.
WGCNA analysis and functional enrichment analysis in detected modules
To further search potential magnetosome-related genes, co-expression modules of 4508 genes in AMB-1 were also constructed using the WGCNA R software. When the soft-thresholding power value equals 15, the scale-free fit index is greater than 0.9 and the average connectivity degree wasn’t too low. Consequently, the WGCNA identified 21 co-expression modules using this power value. Some modules contain very large numbers of genes such as the blue and ivory modules. Known magnetosome-related genes were found not only in these two modules but also in other modules such as the steelblue, bisque4, darkred, and royalblue modules, highlighting a continuous time event of magnetosome formation. However, all 13 aforementioned magnetosome-related genes in the CtrA regulon were found in either the blue or ivory module.
The relationships among the co-expression modules are shown in a hierarchical clustering dendrogram of the eigengenes and a heatmap of the eigengene adjacency in Fig4 Panel (a) and Panel (b). The blue module has 15 known magnetosome-related genes in and outside the CtrA regulon, and this module has positive correlations with 8 modules such as the darkorange2, thistle1 and royalblue modules. However, this module are negatively related to the steelblue and ivory modules. By contrast, the ivory module is positively correlated with 8 modules such as the steelblue, salmon4, lightcyan1 and red modules. Most magnetosome-rated genes can be grouped into the blue-module genes (for example mamU, mamW, ftsZm, mamX, mamY, mamQ2, mamF-like) and ivory-module genes (for example mamJ, mamS, magA, mms5, mamD-like). Both the essential and non-essential genes can be found in these two groups.
(Fig4 approximately here)
GO and KEGG pathway enrichment analysis was also conducted using gene lists of the blue and ivory modules. However, no significant results were found for both modules, indicating extremely diverse functions of genes in these two modules. After comparing functional annotations of the genes in two modules, we found that they belong to completely same 19 GO classes including one or more classes for cellular process and signaling, information storage and processing, metabolism, and poorly characterized. However, multiple genes in the blue module have functions of binding ions of more diamagnetic metal species than those in the ivory module, although many genes in both modules have functions of binding to ferromagnetic, paramagnetic or diamagnetic metals. Furthermore, some genes in the blue module have functions of specifically binding to ADP and UTP, whereas more genes in both modules have functions of binding to ATP, GTP and cAMP. Thirdly, two genes in the blue modules have functions of binding to NADP+ or NADPH specifically, but a gene in the ivory module has functions of binding to NAD+. In addition, two genes in the blue module have functions of binding to small ribosomal subunit and RNA polymerase, and meanwhile two genes in the ivory module have functions of binding to 5S rRNA, an opponent of large ribosomal subunit, or mRNA.
There are significant differences in enzyme activities between the blue and ivory modules. Although two modules share 22 enzyme activities, the blue module has 48 unique enzyme activities, while the ivory module has 41 unique enzyme activities. Several kinase activities such as lipid kinase activity and riboflavin kinase activity were identified among these enzyme activities. Differences between the blue and ivory modules suggested energy conversion rate or related molecular movement might lead to boundaries between the blue and ivory modules. UTP can be biosynthesized from UDP by Nucleoside Diphosphate Kinase after using phosphate group from ATP. Conversion between NAD+ and NADP+ is catalyzed by NAD+ kinase or NADP phosphatase. An additional phosphate group is added or removed by these two enzymes. Small and large ribosomal subunits have different functions during the process of translation. The small subunit programs protein synthesis via binding to mRNA by hydrogen bonds, while the large one takes care of formation of peptides bonds.
GSEA-Pre v3 analysis was performed for closely correlated modules of the blue and ivory modules. Several similarities were found between the blue-close and ivory-close modules. For example, a common GO term between the thistle1 and royalblue modules is cytoplasm (GO:0005737), and this term is also shared among the steelblue, salmon4, and red modules. A common KEGG, i.e., two-component system (ko02020), was not only identified between the darkorange2 and royalblue modules but also among the steelblue, salmon4, red modules. In spite of these similarities, an obvious difference was also found between the blue-close and ivory-close modules using KEYWORDS classification. All blue-close modules shared a keyword, that is 3D-structure, whereas all ivory-close module shared not only the 3D-structure keyword but also an ATP-binding keyword. This finding highlights the importance of energy in defining boundaries of several co-expression modules as well.
Cytoscape input files of these co-expression modules were also created by the WGCNA, and the blue and ivory modules were visualized using Cytoscape as well. The evaluation of node properties showed that known magnetosome-related genes are likely ranked similarly by any method of CytoHubba. For example, amb1018 was ranked 979th while amb1005 was 981th out of 1345 genes by MCC method in ascending order in the blue module. Meanwhile, amb1027 was ranked 1053th whereas amb0964 ranked 1042th by Closeness method in the ivory module. We identified 104 genes who are nearest neighbors of all known magnetosome-related genes in the CtrA regulon in the blue and ivory modules totally (Table 3). We also compared this list with the aforementioned list of functional similar genes, and consequently identified 4 genes commonly shared between two lists. They are dnaA, plsX, amb0273, and amb3001. Both dnaA and amb0273 were also identified as commonly existing genes when the list of nearest neighbor nodes of magnetosome-related genes in PPI was compared with the same list of functional similar genes. They have functions in energy conversion or membrane structure. Finally, we got a list of 10 potential members of magnetosome-related genes based on the list of functional similar genes. They are dnaA, lolD, plsX, amb1551, amb0674, amb3712, amb4015, amb0273, and amb4397, amb3001. These genes are enriched in ATP binding (GO:0005524), plasma membrane (GO:0005886), and integral component of membrane (GO:0016021).
Table 3 Nearest neighbor nodes of known magnetosome-related genes in the CtrA regulon in WGCNA modules. They were identified by at least 3 methods provided by CytoHubba.
Known Genes
|
Nearest Neighbors of Known Genes
|
mamJ (amb0964)
ftsZm (amb1015)
mamU (amb0977)
mamU (amb2502)
mamW (amb0981)
mamX (amb1017)
mamY (amb1018)
mamS (amb0975)
mamQ2 (amb1005)
mms5 (amb1027)
magA (amb3990)
mamF-like (amb0412)
mamD-like (amb0400)
|
amb0081, amb0087, amb0137, amb0151, amb0163, amb0255, amb0257, amb0273, amb0353, amb0403, amb0444, amb0465, amb0492, dnaA, amb0677, amb0710, lipB, amb0736, amb0750, amb0756, amb0901, amb0903, amb0904, amb0929, amb0934, amb1031, amb1069, amb1074, amb1105, amb1113, amb1175, amb1282, amb1330, amb1358, amb1391, amb1445, amb1488, amb1489, amb1522, amb1531, amb1601, map, amb1640, amb1652, amb1719, amb1777, amb1857, amb1860, amb1869, amb1889, amb1967, amb1994, amb2026, amb2099, amb2179, amb2264, lipA, plsX, amb2426, amb2653, accC, amb2805, amb2925, amb2962, amb2975, amb2983, pxpB, amb3001, amb3006, msrP, rpsH, amb3273, amb3283, amb3284, amb3305, tnpB, amb3341, parE, amb3379, amb3449, miaA, amb3573, amb3580, amb3581, amb3688, amb3691, rsmA, amb3774, amb3814, dapE, amb3891, amb3928, accD, rplS, rlmH, amb4101, amb4256, amb4259, terL, amb4282, amb4317, amb4415, ald, amb4510
|
Six methods include MCC, DMNC, Closeness, Radiality, Betweenness, and Stress.
A known magnetosome-related gene, mamU (amb0977), in the CtrA regulon was identified as one of top 10 hub genes by the closeness, radiality, stress, and betweenness methods in the blue module. Furthermore, another magnetosome-related gene mamO (amb0969) outside the CtrA regulon was also one of top 10 hub genes identified by the betweenness method in the ivory module. Most magnetosome-related genes were not hub nodes in these two co-expression modules. We found that there was no significant GO term for top 10 hub genes in the blue and ivory modules, which were identified by at least 3 methods of CytoHubba. However, a term, ATP-binding (GO:0005524), was commonly found between two genes in the blue module and a gene in the ivory module. Some GO terms were comparable between the blue and ivory modules, such as cell wall (GO:0005618) and cell wall organization (GO:0071555), DNA binding (GO:0003677) and DNA repair (GO:0006281), or fatty acid biosynthetic process (GO:0006633) and polysaccharide biosynthetic process (GO:0000271).