Identification of bacteriocins by S. aureus genome-wide screening
The genome of S. aureus NCTC 8325 contains 487 small ORFs encoding mini-proteins of 100 amino acids or fewer. Bacteriocins are small peptides that have antimicrobial activity and carry signal sequences for extracellular secretion. Therefore, the amino acid sequences of the mini-proteins were screened in silico for both antimicrobial activity and subcellular localization. Screening the mini-protein sequences with the Antimicrobial Peptide Scanner vr.2 server identified 51 peptides with predicted antimicrobial activity. Sequence-based analysis of subcellular localization indicated that only 13 of the 51 antimicrobial peptides are secreted to the extracellular environment, as bacteriocins are. Further, motif prediction indicated that many of the extracellular peptides carry known functional motifs that differ from those of bacteriocins; these peptides were therefore removed from the list. The final list contains seven hypothetical mini-proteins (Table 1). Analysis of orthologous genes in the sequence database indicates that only three of the seven bacteriocins, SAOUHSC_01507, SAOUHSC_00977 and SAOUHSC_A02794, are conserved among eight, ten and nine other S. aureus strains, respectively, whereas the remaining bacteriocins, SAOUHSC_01111, SAOUHSC_01544, SAOUHSC_01118 and SAOUHSC_00839, are almost unique to S. aureus NCTC 8325.
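The first step of the screening funnel, selecting mini-proteins of 100 residues or fewer from a proteome, can be sketched as follows. This is only an illustration under stated assumptions: the input is taken to be protein FASTA text, and the function names `read_fasta` and `mini_proteins` are hypothetical, not part of any tool named in the text. The later steps (antimicrobial activity, localization and motif prediction) rely on external servers and are not reproduced here.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a dict of {identifier: sequence}."""
    records, header, chunks = {}, None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Store the previous record before starting a new one.
            if header is not None:
                records[header] = "".join(chunks)
            parts = line[1:].split()
            header = parts[0] if parts else ""
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records[header] = "".join(chunks)
    return records


def mini_proteins(records, max_len=100):
    """Keep only sequences of max_len residues or fewer (stop symbol ignored)."""
    return {
        name: seq
        for name, seq in records.items()
        if len(seq.rstrip("*")) <= max_len
    }
```

Applied to a complete proteome file, `mini_proteins(read_fasta(text))` would return the candidate set that is then passed to the activity and localization predictors.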
Table 1
List of bacteriocins identified by screening the S. aureus NCTC 8325 mini-protein library and their physicochemical properties.
KEGG Accession No. | NCBI Acc. No. | UniProt | Amino acid sequence | Sequence length | Molecular weight (Da) | pI | Net charge | Grand average of hydropathicity (GRAVY) | Instability index | Aliphatic index |
SAOUHSC_01111 | 3920753 | Q2FZC1 | MLKTVKSTRLLGIKKHTELNSCTPNVRKQC | 30 | 3428.13 | 10.21 | 6.25 | -0.57 | 31.24 | 84.33 |
SAOUHSC_01544 | 3920587 | Q2FYC0 | MNLMQHWKQTGLNPTARKERGVVSFVASMLHHQK | 34 | 3961.64 | 11.1 | 4.75 | -0.632 | 7.76 | 65.88 |
SAOUHSC_00839 | 3918951 | Q2FZZ5 | MLGPRQLAHYCKLTYHQLLCWGPAIIEKFVIGVFSFG | 37 | 4238.1 | 8.8 | 2.5 | 0.562 | 44.2 | 105.41 |
SAOUHSC_01507 | 3919049 | Q2FYE6 | MTIFTQLSDRIKKAISKINQANNIPNNARKSPMYHSPIT | 39 | 4443.16 | 10.56 | 5.25 | -0.626 | 39.22 | 77.69 |
SAOUHSC_00977 | 3920122 | Q2FZM3 | MRYVTMRQFIKRTVKTILVGYVIKFIRNKLSGKSSHPTDNKHN | 43 | 5107.08 | 11.17 | 9.5 | -0.551 | 28 | 81.4 |
SAOUHSC_01118 | 3920719 | Q2FZB6 | MGWGPTQKLAKSQHTKMCKLAGPQQREIGSPIPTNNASWRGPQHRSWRKVSLQKCASWRGPNIEKLEPQFLQTMQVGVGHR | 81 | 9188.61 | 11.04 | 10.75 | -0.941 | 45.29 | 54.2 |
SAOUHSC_A02794 | 3921713 | Q2FUZ5 | MIIMRGYNGRYCSWTNTAIQCDYFNKVRQVKKDGKSDRLSNNINSNIIYRVLEKCITLYNVCSRCKRALILSEHSKKNIFNIKKAV | 86 | 10095.84 | 9.89 | 12.25 | -0.433 | 29.33 | 88.37 |
The sequence-based physicochemical characterization of the bacteriocins is reported in Table 1. All the bacteriocins have a net positive charge. Most known bacteriocins are reported to carry a net positive charge, which is important for disrupting negatively charged microbial cell envelopes. The hydrophobicity of the bacteriocins was predicted with the grand average of hydropathicity (GRAVY) index. A negative GRAVY value indicates a hydrophilic, globular peptide, whereas a positive GRAVY value indicates a hydrophobic, membrane-associated peptide. By this measure, only SAOUHSC_00839, with a low positive GRAVY value (0.562), is predicted to be membrane associated; all the other bacteriocins have negative GRAVY values and are predicted to be globular. The in vitro and in vivo half-life of the bacteriocins was predicted using the instability index. All of the bacteriocins are predicted to have a long half-life, as their instability index is below or close to 40 [18]. Another physicochemical parameter, the aliphatic index, indicates the thermostability of globular proteins. The aliphatic index of a peptide is calculated as the relative volume occupied by amino acids with aliphatic side chains. Among the bacteriocins, SAOUHSC_01118 and SAOUHSC_01544 have an aliphatic index lower than 70, which indicates that they are comparatively less thermostable [19]. The remaining bacteriocins have a high aliphatic index and are comparatively thermostable.
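Three of the descriptors discussed above can be computed directly from a sequence, and a short sketch makes the definitions concrete. It assumes the Kyte-Doolittle hydropathy scale for GRAVY and the usual aliphatic index formula with coefficients 2.9 (Val) and 3.9 (Ile, Leu). The net charge rule, counting Lys and Arg as +1, Asp and Glu as -1 and His as +0.25, is an assumption: it reproduces the net charges tabulated for SAOUHSC_01111 and SAOUHSC_01544, but the section does not state which tool or rule was used. The function names are illustrative.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq):
    """Grand average of hydropathicity: mean hydropathy per residue."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq):
    """Aliphatic index from the mole percent of Ala, Val, Ile and Leu."""
    n = len(seq)

    def mole_percent(aa):
        return 100.0 * seq.count(aa) / n

    return (
        mole_percent("A")
        + 2.9 * mole_percent("V")
        + 3.9 * (mole_percent("I") + mole_percent("L"))
    )


def net_charge(seq, his_weight=0.25):
    """Simple residue-count net charge; the His weight is an assumption."""
    positive = seq.count("K") + seq.count("R")
    negative = seq.count("D") + seq.count("E")
    return positive - negative + his_weight * seq.count("H")
```

For SAOUHSC_01111 (MLKTVKSTRLLGIKKHTELNSCTPNVRKQC) these functions give a GRAVY of -0.57, an aliphatic index of 84.33 and a net charge of 6.25, in agreement with Table 1.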
Sequence alignment of the bacteriocins against the APD3 AMP database identified similarity with known bacteriocins or AMPs. SAOUHSC_00977 shows 35.5% similarity with PepG1, the toxin component of a staphylococcal type-I toxin-antitoxin system [29]. SAOUHSC_01111 and SAOUHSC_01544 show sequence similarity with Ranatuerin-1C (40.6%) and Hymenochirin-4B (35.3%), respectively, both found in the skin secretions of frogs [30], [31]. SAOUHSC_00839 shows 34.1% similarity with mutacin, a bacteriocin secreted by Streptococcus rattus BHT. SAOUHSC_01507 has 33.1% similarity with andropin, a male-specific AMP from fruit flies [32]. SAOUHSC_A02794 has around 30% sequence similarity with uberolysin, a cyclic bacteriocin from Streptococcus uberis [33]. SAOUHSC_01118 shows 29% sequence similarity with microcin M, a bacteriocin secreted by gram-negative bacteria such as E. coli MC4100.
Analysis of upstream promoter sequence indicates stress and starvation-induced gene expression of new bacteriocins
The expression of bacteriocin genes is mainly regulated by quorum sensing (QS) systems and is also induced by stress or starvation [34]. The upstream regions of the newly identified bacteriocin genes were analyzed to predict the presence of regulatory elements. The 200 bp DNA sequence upstream of each gene was analyzed to identify canonical AT-rich −35 and −10 promoter sequences and associated regulatory elements or transcription factor binding sites. Several transcription regulator binding sites were identified in the upstream promoter-operator regions of the staphylococcal bacteriocin genes, which indicates an active role of these regulators in gene expression (Table 2). LexA is a key regulator of the DNA damage response. A LexA binding site was identified in the upstream promoter-operator region of the SAOUHSC_00839 gene, which indicates LexA-regulated gene expression. LexA has previously been reported to regulate the expression of known bacteriocins such as the colicins in E. coli [35], [36]. The upstream promoter-operator region of SAOUHSC_00839 also contains binding sites for RpoH2 (σ32) and TyrR. RpoH2 (σ32) is an alternative sigma factor associated with the heat shock response, whereas TyrR is a transcription regulator of aromatic amino acid biosynthesis and transport. A binding site for RpoS (σ38), a starvation- and stationary-phase-responsive sigma factor, was identified in the promoter-operator region of SAOUHSC_00977. Binding sites for the transcription regulator ArgR were identified in the promoter-operator regions of the SAOUHSC_00839, SAOUHSC_01507, and SAOUHSC_A02794 genes. ArgR regulates the metabolism and transport of L-arginine and is well known for its role in adaptation to acid-induced stress [37] and oxidative stress [38].
The promoter-operator regions of SAOUHSC_A02794 and SAOUHSC_01111 contain a binding site for PhoB, which is a key player in phosphate homeostasis and is also reported to regulate gene expression under nutrient starvation and stress [39]. Binding sequences for MetR and GcvA were identified in the promoter-operator regions of SAOUHSC_01118 and SAOUHSC_01544, respectively. MetR and GcvA are LysR family transcriptional regulators that are also involved in the starvation and stress response [40]. MetR regulates the expression of genes associated with the mitigation of nitric oxide stress [41]. Therefore, the expression of all the identified bacteriocin genes is predicted to be regulated by stress- or starvation-associated transcription regulators.
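The search for −35 and −10 elements in a 200 bp upstream region can be illustrated with a simple consensus scan. The sketch below is not the prediction tool used for Table 2, which is not named in this section. It assumes the housekeeping sigma factor consensus hexamers TTGACA (−35) and TATAAT (−10), a spacer of 15 to 19 bp and a user-chosen mismatch tolerance; the function names are hypothetical.

```python
def mismatches(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def scan_promoter(upstream, box35="TTGACA", box10="TATAAT",
                  spacers=range(15, 20), max_mismatch=2):
    """Return (pos35, pos10, spacer, total_mismatches) candidates.

    Positions are 0-based offsets into the upstream sequence. Each box may
    differ from its consensus by at most max_mismatch bases.
    """
    seq = upstream.upper()
    hits = []
    for i in range(len(seq) - len(box35) + 1):
        m35 = mismatches(seq[i:i + len(box35)], box35)
        if m35 > max_mismatch:
            continue
        for gap in spacers:
            j = i + len(box35) + gap
            if j + len(box10) > len(seq):
                break
            m10 = mismatches(seq[j:j + len(box10)], box10)
            if m10 <= max_mismatch:
                hits.append((i, j, gap, m35 + m10))
    # Best candidates (fewest mismatches) first.
    return sorted(hits, key=lambda h: h[3])
```

Dedicated predictors score such candidates with position weight matrices and also search for regulator binding sites, so a consensus scan like this one is only a first-pass filter.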
Table 2
Analysis of upstream promoter sequence of the bacteriocin genes
KEGG Accession No. | Transcriptional regulator | Expression triggering factor |
SAOUHSC_01111 | narL, phoB3 | extracellular phosphate / nutrient starvation |
SAOUHSC_01544 | gcvA | lack of glycine / nutrient starvation |
SAOUHSC_00839 | lexA, argR2, tyrR, rpoH2 | heat shock, acid-induced stress, lack of aromatic amino acids / nutrient starvation |
SAOUHSC_01507 | rpoD17, argR2 | acid-induced stress |
SAOUHSC_00977 | rpoS17 | starvation / stationary phase |
SAOUHSC_01118 | metR | nitric oxide stress |
SAOUHSC_A02794 | argR, ihf, phoB | acid-induced stress, extracellular phosphate / nutrient starvation |
Secondary structure prediction and ab initio modeling show an amphiphilic helix-rich structure of the bacteriocins
Sequence-based secondary structure prediction of the identified bacteriocins revealed an abundance of α-helices (Fig. 1). Three-dimensional structure models of the peptides were predicted ab initio using the AlphaFold [23] and Rosetta [24] servers, or by template-guided modeling using the I-TASSER server [26]. The structures were validated with Ramachandran plots [27]. The modeled structures have around 80% of all residues in the most favored region of the Ramachandran plot and no outliers, which indicates good model quality.
The SAOUHSC_01544 structure obtained from the AlphaFold server has two α-helices connected by a flexible hinge region, 10TGLN13 (Fig. 2A). Many well-known AMPs consist of two amphipathic α-helices connected by a glycine-rich flexible hinge. An example is the cecropins, which show antimicrobial activity against gram-positive and gram-negative bacteria as well as fungi. Cecropins act by permeabilizing the cell membrane through the formation of channel-like pores [42], which results in osmotic imbalance across the membrane. Moreover, the amphiphilicity of the helices enhances the antimicrobial activity of the cecropins [43]. Both the N-terminal and C-terminal helices of SAOUHSC_01544 are amphiphilic, with hydrophobic moments (µH) of around 0.47 and 0.27, respectively. A partially modeled AlphaFold structure of SAOUHSC_01507 shows an N-terminal helix and a partially built C-terminal helix connected by a loop (Fig. 2B). The N-terminal α-helical part of SAOUHSC_01507 is amphiphilic, with a hydrophobic moment (µH) of 0.47, and carries a net positive charge, which may be important for interfering with the membrane. Secondary structure prediction also shows a coil structure in the C-terminal part of SAOUHSC_01507 (Fig. 1). SAOUHSC_01507 has sequence similarity with andropin, a male-specific AMP from fruit flies [32], which is active against gram-positive bacteria only. Although no experimentally determined structure of andropin is available, its AlphaFold-derived structure shows two α-helices connected by a flexible loop. Therefore, both SAOUHSC_01544 and SAOUHSC_01507 consist of helix-turn-helix structures, which may be further stabilized in the lipid membrane environment and cause membrane permeabilization through toroidal pore formation.
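The hydrophobic moment (µH) used above as a measure of helix amphiphilicity can be sketched as the length-normalized vector sum of residue hydrophobicities placed at 100° intervals around an ideal α-helix. The sketch below rests on two assumptions: the hydrophobicity values follow the Fauchère-Pliska scale as commonly tabulated, and the moment is divided by the number of residues. The section does not state which scale or tool produced the reported µH values, so this code is not expected to reproduce them exactly.

```python
import math

# Fauchere-Pliska hydrophobicity values as commonly tabulated (assumption:
# verify against the original scale before quantitative use).
FAUCHERE_PLISKA = {
    "A": 0.31, "R": -1.01, "N": -0.60, "D": -0.77, "C": 1.54,
    "Q": -0.22, "E": -0.64, "G": 0.00, "H": 0.13, "I": 1.80,
    "L": 1.70, "K": -0.99, "M": 1.23, "F": 1.79, "P": 0.72,
    "S": -0.04, "T": 0.26, "W": 2.25, "Y": 0.96, "V": 1.22,
}


def hydrophobic_moment(seq, angle_deg=100.0, scale=FAUCHERE_PLISKA):
    """Mean hydrophobic moment of a peptide modeled as an ideal helix.

    Each residue contributes a vector whose length is its hydrophobicity
    and whose direction rotates by angle_deg per residue (100 degrees for
    an alpha-helix). The result is the norm of the vector sum divided by
    the sequence length.
    """
    delta = math.radians(angle_deg)
    sin_sum = sum(scale[aa] * math.sin(i * delta) for i, aa in enumerate(seq))
    cos_sum = sum(scale[aa] * math.cos(i * delta) for i, aa in enumerate(seq))
    return math.hypot(sin_sum, cos_sum) / len(seq)
```

A segment with hydrophobic residues clustered on one helical face and charged residues on the other gives a large µH, whereas a uniform segment spanning whole helical turns, such as 18 identical residues, gives a µH of essentially zero.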
SAOUHSC_00977 is predicted by the AlphaFold server to be α-helical with an unstructured C-terminal part (Fig. 2C). SAOUHSC_00977 has sequence similarity with PepG1, the toxin component of a type-I toxin-antitoxin system from S. aureus. The NMR structure of S. aureus PepG1 (PDB ID: 7NS1) comprises a long transmembrane α-helix and a cytosolic C-terminal part with positively charged residues [44]. Structural alignment of PepG1 and SAOUHSC_00977 indicates that the two structures are similar. The N-terminal helical part of SAOUHSC_00977 is amphiphilic, with a hydrophobic moment (µH) of 0.31, and has a high net positive charge. The C-terminal part of SAOUHSC_00977 is predicted to be unstructured both by secondary structure prediction and by modeling. This twelve-residue unstructured C-terminal part contains positively charged residues at its end, as in PepG1, which are important for anchoring the peptide to the negatively charged membrane [45, p. 1].
SAOUHSC_01111 has sequence similarity with the antimicrobial peptide ranatuerin. The only available structure, that of ranatuerin-2CSa (PDB ID: 2K10), shows a helix-turn-helix fold [46]. The ab initio model of SAOUHSC_01111 shows a helix-turn-helix-like structure (Fig. 3A), which is similar to that of ranatuerin-2CSa (Fig. 3B). In a polar environment this peptide lacks any secondary structure, whereas in a nonpolar environment such as a membrane it adopts a helix-turn-helix structure. Moreover, the ranatuerin-2 family of peptides has a C-terminal cyclic region of six amino acids closed by a disulfide bridge. The presence of two cysteine residues (Cys22 and Cys30) in SAOUHSC_01111 suggests a similar C-terminal cyclic region of nine amino acids closed by a disulfide bridge. The ranatuerin-2 family peptides act through the 'carpet-like' model, in which the peptides arrange themselves parallel to the membrane plane and then destabilize the membrane in a detergent-like manner. The high net positive charge and amphiphilic nature of the ranatuerin-2 family peptides facilitate this carpet-like membrane interaction. SAOUHSC_01111 is also amphiphilic, with an N-terminal helix that has a hydrophobic moment (µH) of 0.23 and a C-terminal part that has a hydrophobic moment (µH) of 0.41, and it carries an overall net positive charge.
The AlphaFold-generated model of SAOUHSC_00839 contains an N-terminal β-hairpin and a C-terminal α-helix (Fig. 4A). The antiparallel strands of the β-hairpin are cross-linked by an intramolecular disulfide bond between Cys11 and Cys20. The C-terminal α-helix is amphiphilic (µH = 0.35). The pediocin-like bacteriocin leucocin-A (PDB ID: 1CW6) has a similar structure, with an N-terminal antiparallel β-sheet stabilized by a disulfide bridge and a C-terminal amphiphilic α-helix (Fig. 4B) [47]. The type-IIa bacteriocins have a conserved N-terminal YGNGV/L motif, which is completely absent in SAOUHSC_00839; SAOUHSC_00839 may therefore belong to a novel sub-group (Fig. 4C). However, the interface between the β-sheet and the α-helix contains a conserved WGXA motif in both leucocin-A and SAOUHSC_00839. The pediocin-like bacteriocins exert their antimicrobial activity by permeabilizing the membrane of the target cells. The positively charged N-terminal β-sheet interacts electrostatically with the negatively charged membrane of the target cell, and the C-terminal amphiphilic α-helix penetrates the hydrophobic core of the membrane and causes leakage. The amino acid sequence of the C-terminal α-helical part is highly variable among pediocin-like bacteriocins and is therefore believed to be important for target specificity.
SAOUHSC_01118 and SAOUHSC_A02794 have sequence similarity with type II circular or leaderless bacteriocins, namely microcins and uberolysin, respectively. Ab initio modeling of SAOUHSC_01118 with the Rosetta server shows an α-helical bundle structure (Fig. 5A). Many of the known circular or leaderless bacteriocins share a conserved structural feature called the saposin-like fold, found in bacteriocins such as AS-48 (PDB ID: 1E68) (Fig. 5B). Saposin D is a human peptide that interacts with anionic lipids. Proteins containing a saposin-like fold have a globular structure consisting of four or five helices. A characteristic feature of the saposin-like fold is a 'v-shape' formed by two helices, with a third helix perpendicular to the 'v-shape' [48]. SAOUHSC_A02794 modeled with Rosetta shows a unique structure consisting of both α-helices and a β-sheet. Three β-strands from the N-terminus and one β-strand from the C-terminus form an antiparallel β-sheet in SAOUHSC_A02794 (Fig. 5C). The N-terminal β-sheet and the C-terminal helix of SAOUHSC_A02794 are presumed to form intramolecular disulfide bridges, given the presence of two and three cysteine residues, respectively; such bridges would provide thermal stability to the bacteriocin and would be important for its function in an extreme extracellular environment.
The leaderless and circular bacteriocins may interact with the membrane directly or through membrane-bound receptors. Moreover, unlike other globular proteins, leaderless and circular bacteriocins have solvent-exposed aromatic residues at their N- or C-termini, which play an important role in membrane interaction [48], [49]. SAOUHSC_01118 has four surface-exposed tryptophan residues (Trp3, Trp39, Trp47 and Trp58), and SAOUHSC_A02794 has the surface-exposed N-terminal residues Tyr7, Tyr11, Trp14 and Tyr23 (Fig. 5A and 5C). These aromatic amino acids facilitate the interaction between the bacteriocin and the lipid membrane and may have an important role in membrane disruption.
A molecular dynamics simulation study identified the structural plasticity of the helix-turn-helix bacteriocins
The ab initio models of the bacteriocins were subjected to molecular dynamics simulation to validate their structural stability in water and to examine their structural dynamics. The RMSD values of all the simulations remained below 2 Å, which supports the accuracy of the initial models (Fig. 6A). All the modeled structures were stable in water and equilibrated within 2 to 4 ns of simulation. Apart from the helix-turn-helix bacteriocins, all the structures maintained a stable RMSD for up to 50 ns. The amphiphilic helix-turn-helix bacteriocins, SAOUHSC_00977, SAOUHSC_01507, SAOUHSC_01544, and SAOUHSC_01111, showed comparatively larger structural changes, as is apparent from their fluctuating RMSD values. Furthermore, no abrupt change in the radius of gyration (Rg) was observed for any of the bacteriocin models during the simulation (Fig. 6B). The Rg plots indicate that the globular nature of the structures was maintained throughout the simulation. During the simulation, the amphiphilic helix-turn-helix peptides went through cycles of partial unfolding and refolding, which appear as fluctuations in their Rg plots. The solvent-accessible surface area and overall hydrogen bonding of all the peptides were mostly stable during the simulation run, which agrees with the Rg plots and indicates the structural integrity of the bacteriocin models (supplementary Figures S1 and S2). Analysis of structural flexibility using RMSF values showed that bacteriocins with longer polypeptide chains, such as SAOUHSC_A02794 and SAOUHSC_01118, have fewer structurally disordered regions (Fig. 6C). In most of the helix-turn-helix bacteriocin structures, the flexible regions are located at the N- and C-termini. The hinge regions of the helix-turn-helix bacteriocin structures also have comparatively high RMSF values, indicating structural flexibility.
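The three trajectory metrics used above have simple definitions, sketched here for plain coordinate lists. The sketch assumes that every frame has already been superimposed on the reference structure (no fitting is performed), that coordinates are (x, y, z) tuples in Å, and that atoms have equal mass unless masses are supplied. The function names are illustrative and do not correspond to the analysis software used for Fig. 6.

```python
import math


def rmsd(coords, ref):
    """Root-mean-square deviation between a frame and a reference."""
    sq = sum((a - b) ** 2 for p, q in zip(coords, ref) for a, b in zip(p, q))
    return math.sqrt(sq / len(ref))


def radius_of_gyration(coords, masses=None):
    """Mass-weighted RMS distance of the atoms from their center of mass."""
    masses = masses or [1.0] * len(coords)
    total = sum(masses)
    center = [
        sum(m * p[k] for m, p in zip(masses, coords)) / total for k in range(3)
    ]
    sq = sum(
        m * sum((p[k] - center[k]) ** 2 for k in range(3))
        for m, p in zip(masses, coords)
    )
    return math.sqrt(sq / total)


def rmsf(frames):
    """Per-atom root-mean-square fluctuation about the mean position."""
    n_frames = len(frames)
    n_atoms = len(frames[0])
    result = []
    for i in range(n_atoms):
        mean = [sum(f[i][k] for f in frames) / n_frames for k in range(3)]
        msd = sum(
            sum((f[i][k] - mean[k]) ** 2 for k in range(3)) for f in frames
        ) / n_frames
        result.append(math.sqrt(msd))
    return result
```

RMSD and Rg are computed once per frame and plotted against time, whereas RMSF is computed once per atom (or per residue, using Cα atoms) over the whole trajectory, which is why it highlights flexible termini and hinge regions.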
The initial models of the bacteriocins were superimposed on the simulated, energy-minimized structures for structural comparison (Fig. 7). The bacteriocins SAOUHSC_00977, SAOUHSC_00839, SAOUHSC_01544, SAOUHSC_01507, and SAOUHSC_01111, which have amphiphilic helix-turn-helix structures, underwent significant structural change during the simulation, which indicates their conformational plasticity. Amphiphilic helical bacteriocins and antimicrobial peptides are reported to be unstructured in water and to adopt a helical or helix-turn-helix structure only at the water-lipid interface [50]. The hinges and the C-terminal helix of SAOUHSC_00839 have comparatively high structural flexibility. The loop from Thr14 to Leu18, which connects the strands of the β-hairpin, shows the highest RMSF value. However, the disulfide bond between the two strands stabilizes the β-hairpin structure. The structural stability of the β-hairpin indicates its importance for the functional activity of SAOUHSC_00839. SAOUHSC_01118 and SAOUHSC_A02794 retained their globular shape during the simulation, and the characteristic saposin-like fold remained intact. The N-terminal α-helix of SAOUHSC_01118 became more structured and extended outwards, increasing in length to include four N-terminal amino acid residues. With this elongated N-terminal helix, the Trp3 residue of SAOUHSC_01118, which seems important for interaction with the lipid membrane, extends out of the globular structure (Fig. 7).