The alkalotolerant Bacillus megaterium BHS1 can degrade many halogenated compounds, notably a high concentration of 2,2-DCP (40 mM) under alkaline conditions (pH 9) similar to those of a soda lake, and therefore merits further investigation. It is important to study the protein structure and the main catalytic residues of DehLBHS1 that allow the bacterium to survive and adapt to a high concentration of 2,2-DCP in this extreme environment. The DehLBHS1 enzyme maintains its function in a highly alkaline environment and remains sufficiently stable to bind its degradable substrates. In this work, the 3D structural details describing the catalytic action of DehLBHS1 against various substrates, together with its structural dynamics, were closely monitored. A 3D model of the enzyme was predicted and used to illustrate the potential alkalophilic adaptations of the novel DehLBHS1. Several characteristics of this enzyme demonstrate its structural stability. The data provide insights into how the DehLBHS1 structure is adapted, which may help to extend its catalytic action and substrate specificity in the future.
A detailed analysis of the novel protein DehLBHS1 is needed to understand the structure-function relationship of this enzyme, which was isolated from a highly alkaline environment. However, the absence of an X-ray structure of an alkaline-adapted dehalogenase from alkalophilic bacteria has made interpretation of the experimental data almost impossible. Therefore, the structure of DehLBHS1 was modelled in silico to clarify the characteristics of this alkaline-adapted protein.
Comparative Modeling of DehLBHS1
In principle, the fold of a protein should be predictable solely from its amino acid sequence, but this is extremely difficult in practice. Alternatively, template-based (comparative) modeling uses the experimentally determined structure of a sequence relative (e.g. a homolog) as a template for reconstructing the 3D structure of the target protein. In this work, we investigate the monomer subunit of the protein to explain the functional affinity of the enzyme towards haloacid compounds.
The sequence identity between DehLBHS1 and the available PDB templates is limited. When sequence identity is low, sequence-structure alignment (threading) has been shown to yield a more reliable model than the sequence-sequence alignment used in homology modeling. Similarly, using multiple templates increases alignment coverage and improves model accuracy and quality [42]. The 3D structure of DehLBHS1 was therefore built using the multi-template threading method of I-TASSER [29], which employs various threading algorithms to identify structural templates. A set of five models was generated, and the model with the highest confidence score (C-score) was selected. As shown in Figure 1, the optimal model for DehLBHS1 has a C-score of 1.23 (typical range: −5 to 2), which indicates a high-quality 3D structure. The model has an estimated TM-score of 0.88±0.07 and an estimated RMSD of 3.1±2.2 Å. The TM-score and the root mean square deviation (RMSD) are standard measures of the accuracy of structure modeling. The C-score correlates strongly with the TM-score and RMSD, and I-TASSER therefore uses it as a confidence score for gauging the quality of predicted models. Together, these values indicate that the 3D structure of DehLBHS1 folds in a favourable conformation (Figure 1).
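To make the TM-score mentioned above more concrete, the following is a minimal sketch of how the score is computed for one fixed superposition of a model onto a reference structure. It assumes the standard definition with the length-dependent scale d0 = 1.24(L − 15)^(1/3) − 1.8; the function name and inputs are illustrative and are not part of I-TASSER, which reports an estimated TM-score rather than one computed against a known native structure.

```python
def tm_score(distances, target_length):
    """TM-score for one fixed superposition.

    distances: distances (in Å) between aligned residue pairs of the
               model and the reference structure.
    target_length: number of residues L in the target protein.
    Note: the full TM-score is the maximum of this value over all
    superpositions; that search is not performed here.
    """
    if target_length <= 15:
        raise ValueError("d0 is undefined for target_length <= 15")
    # Length-dependent distance scale d0 (in Å).
    d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    if d0 <= 0:
        raise ValueError("target is too short for this d0 formula")
    # Each aligned pair contributes between 0 (far apart) and 1 (identical).
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    # Normalising by the full target length penalises unaligned residues.
    return total / target_length
```

Because the sum is normalised by the target length, a perfect superposition of every residue gives a score of 1, while residues left unaligned lower the score.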
In general, most dehalogenases have been reported in dimeric form. Homodimerisation confers several functional advantages, such as improved stability and substrate specificity [43]. In all experimentally determined 3D structures of L-2-haloacid dehalogenases, the functional domain or active site is located away from the dimer interface [5,44]. The monomeric form of the protein has also been used in in silico investigations to establish the function of several active-site residues [45]. In our study, DehLBHS1 was modelled in monomeric form using several low-similarity templates. Interestingly, the topology of the enzyme structure is highly similar to that of other related haloacid dehalogenases (Figure 2). Protein sequences are less conserved than their structures; hence, many proteins can share a similar fold even if they lack high sequence similarity.
Molecular Structure Refinement of DehLBHS1
Predicted models are not entirely reliable; therefore, energy minimisation is necessary to refine the enzyme model and bring the structure closer to its native state without major errors [46]. Molecular dynamics (MD) simulation is useful for such protein refinement [8,31]. Model refinement raises the accuracy of homology-based and ab initio models [31], and energy minimisation relieves structural problems such as steric clashes and strain in the static structure. In addition, refinement by MD simulation allows atoms to move within the constraints of a specified force field in search of the global energy minimum configuration. After energy minimisation of the optimal DehLBHS1 model (C-score 1.23), its stability during the MD simulation was gauged by the RMSD of the backbone atoms as a function of time, as illustrated in Figure 3. Following energy minimisation, the DehLBHS1 structure stabilised at an RMSD of 0.25 Å and remained equilibrated throughout the simulation. This indicates that the structure moved to some degree from the reference structure in order to reach stability. Equilibration was achieved within the plateau phase of the simulation, and molecular adaptation was seen over the simulation period. The 3D form of DehLBHS1 was evaluated on the basis of this RMSD, which measures the average displacement of the selected atoms in a given frame relative to the reference structure. MD simulations are used not only for refinement but also to analyse biological function as reflected in protein conformation, molecular shape, and structural features, by observing internal motion [47].
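The backbone RMSD used above as a stability measure can be sketched as follows. This is a minimal illustration, assuming that every frame has already been superposed onto the reference structure (MD analysis tools normally perform this least-squares fit first); the function names and the toy coordinates are hypothetical and do not come from the DehLBHS1 trajectory.

```python
import math

def rmsd(coords, reference):
    """RMSD (same units as the input) between two pre-superposed atom sets.

    coords, reference: equal-length lists of (x, y, z) tuples.
    """
    if len(coords) != len(reference) or not coords:
        raise ValueError("coordinate sets must be non-empty and equal in size")
    squared = sum(
        (x - rx) ** 2 + (y - ry) ** 2 + (z - rz) ** 2
        for (x, y, z), (rx, ry, rz) in zip(coords, reference)
    )
    return math.sqrt(squared / len(coords))

def rmsd_series(trajectory, reference):
    """RMSD of every frame relative to the reference (RMSD versus time)."""
    return [rmsd(frame, reference) for frame in trajectory]

# Toy example with made-up coordinates: two atoms, two frames.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
trajectory = [
    reference,                           # identical to the reference
    [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)],  # both atoms displaced by 2 units
]
print(rmsd_series(trajectory, reference))  # [0.0, 2.0]
```

A trajectory whose RMSD series levels off, as described for the plateau phase, is usually taken as equilibrated.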
The quality evaluation of the refined model is presented in Table 3. The Ramachandran statistics for the pre- and post-refinement models, generated with Procheck, are given in Table 4. Procheck also performed stereochemical checks of the model, which describe the distribution of the phi and psi angles of the amino acid residues. These showed that 89.2% of the residues are located in the most favoured regions and 10.8% in additional allowed regions, while no residues fall in the generously allowed or disallowed regions. Overall, 100% of the amino acid residues therefore reside in favoured or allowed regions, which suggests that the DehLBHS1 model is of acceptable quality (Figure 4). The Verify3D analysis showed that 83.5% of the amino acid residues had an average score of ≥0.2. Residues with a score of at least 0.2 in Verify3D can be considered reliably modelled, and a model is deemed satisfactory when more than 80% of its residues meet this criterion [48]. The fact that the Verify3D scores of the pre- and post-minimisation models exceeded zero denotes that adequate side-chain environments were achieved (Table 3). Lastly, Errat, which assesses non-bonded interactions, gave an overall model quality score of 96.17%. A protein model is generally accepted as successful when the Errat score is >50% [48]. All of the evaluation results for the refined DehLBHS1 model thus exceeded the minimum cut-off scores, which indicates that the model has acceptable stereochemical consistency and a good 3D structure.
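The Verify3D acceptance criterion described above amounts to a simple threshold count, sketched below. This is an illustrative sketch only: it assumes that per-residue averaged scores are already available as a list, and it treats a score of at least 0.2 as passing; the function names are hypothetical and are not part of Verify3D.

```python
def percent_passing(scores, cutoff=0.2):
    """Percentage of residues whose score is at or above the cutoff."""
    if not scores:
        raise ValueError("no scores supplied")
    passing = sum(1 for s in scores if s >= cutoff)
    return 100.0 * passing / len(scores)

def passes_verify3d(scores, cutoff=0.2, min_percent=80.0):
    """True when at least min_percent of residues reach the cutoff.

    The 0.2 cutoff and 80% threshold follow the criterion quoted in the
    text; whether the comparison is strict is an assumption made here.
    """
    return percent_passing(scores, cutoff) >= min_percent
```

For example, a hypothetical list in which nine of ten residues score 0.3 and one scores 0.1 gives 90% and would pass the check.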
Molecular Docking of DehLBHS1 with Haloalkanoic Acids
The role of the catalytic residues in the dehalogenation process can be elucidated from their interactions with halogenated compounds as substrates. Molecular docking is a computational approach that helps to explain the relationship between the enzyme and the ligand molecule and, in turn, offers some insight into the degradative mechanism of the enzyme [49]. DehLBHS1 docking was performed using the AutoDock Vina software. Four haloacid ligands were docked with DehLBHS1 of Bacillus megaterium BHS1: 3-CP, 2,2-DCP, L-2CP, and D-2CP. Docking predicts the preferred orientation of the ligand when it binds to the protein to form a stable protein-substrate complex. The lowest binding energy (kcal/mol) represents the highest affinity of the substrate for the enzyme [50].
Overall, in the DehLBHS1-ligand complexes the hydrogen bond distances remained within the accepted distance limit for the formation of intermolecular hydrogen bonds (< 3.5 Å) [51]. This reflects a high affinity between the enzyme and the substrate, and hence a greater tendency to catalyse the compound. Interestingly, the DehLBHS1-3CP complex showed the lowest binding energy (-3.8 kJ/mol), although no hydrogen bonds were formed. In contrast, DehLBHS1 formed one hydrogen bond with each of the other ligands (2,2-DCP, L-2CP, and D-2CP), as shown in Table 5. Table 5 lists the essential docking information for the DehLBHS1-ligand complexes, including hydrogen bond lengths and binding energies (kJ/mol). The docking of DehLBHS1 with 2,2-DCP, L-2CP, and D-2CP showed a similar level of interaction. The DehLBHS1-2,2-DCP complex showed a less negative binding energy of -2.5 kJ/mol, compared with -3.5 kJ/mol for D- and L-2CP and -3.8 kJ/mol for 3CP.
It should be highlighted that the DehLBHS1-3CP complex, which has the lowest binding energy, does not form any hydrogen bond. This indicates that 3CP is not capable of forming polar interactions with the nearby residues. However, Asp9, Tyr11, Ile44, Phe59, Asn118, Asn176, and Trp178 provide weak interactions with the substrate through hydrophobic contacts (Figure 5A). In the docking of DehLBHS1 with 2,2-DCP, a hydrogen bond is formed between N2 of the guanidine side-chain group of Arg40 and an oxygen atom of the carboxyl group of 2,2-DCP (1.8 Å), as shown in Figure 5B. This is the shortest hydrogen bond distance among the substrates, compared with the DehLBHS1-D2CP and DehLBHS1-L2CP complexes. Additionally, seven close-contact residues, namely Asp9, Tyr11, Phe59, Lys150, Asn118, Asn176, and Trp178, were shown to interact with 2,2-DCP. Therefore, the number of binding residues for 2,2-DCP is the highest compared with the other substrates. Correspondingly, the docking of the DehLBHS1-D2CP and DehLBHS1-L2CP complexes showed a hydrogen bond formed between the amino (N) group of Asn118 and an oxygen (O) atom of the carboxyl group of D- and L-2CP (1.9 Å). Both complexes showed substrate interactions with six common residues, namely Tyr11, Arg40, Phe59, Lys150, Asn176, and Trp178 (Figure 5C), and L-2CP has two additional interacting residues, Ile44 and Thr117 (Figure 5D). Thus, this study is consistent with the preliminary experimental study by Wahhab et al. [26] on the efficiency and specificity of DehLBHS1 of strain BHS1 in degrading 2,2-DCP (2,2-dichloropropionic acid) and its stereoselective preference for the L-2CP substrate. We also performed a multiple sequence alignment with the dehalogenase enzymes used as top threading templates by I-TASSER. Sixty-one conserved residues were identified in the alignment, and five of them were related to the binding interactions identified from the docking (Figure 6).
We conclude that Asn118 and Arg40 are important residues that form hydrogen bonds with D,L-2CP and 2,2-DCP, respectively, and that three residues, namely Phe59, Asn176, and Trp178, are involved in most of the interactions. All five residues are identical to those in the related dehalogenases from Pseudomonas sp. YL and Polaromonas sp.
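The docking comparison in this section can be summarised programmatically, as in the short sketch below. It simply ranks the four ligands by the binding energies quoted in the text and applies the < 3.5 Å hydrogen bond criterion to the reported bond lengths. The data structure and function names are illustrative assumptions; the values are copied from the text and are not re-derived from any docking run.

```python
HBOND_CUTOFF = 3.5  # Å, distance limit for an intermolecular hydrogen bond

# Binding energies (units as reported in Table 5) and hydrogen bond
# lengths (Å) quoted in the text; 3CP forms no hydrogen bond.
docking_results = [
    {"ligand": "3CP", "binding_energy": -3.8, "hbond_lengths": []},
    {"ligand": "2,2-DCP", "binding_energy": -2.5, "hbond_lengths": [1.8]},
    {"ligand": "L-2CP", "binding_energy": -3.5, "hbond_lengths": [1.9]},
    {"ligand": "D-2CP", "binding_energy": -3.5, "hbond_lengths": [1.9]},
]

def rank_by_binding_energy(results):
    """Sort from the lowest (most favourable) to the highest energy."""
    return sorted(results, key=lambda r: r["binding_energy"])

def hbonds_within_cutoff(result, cutoff=HBOND_CUTOFF):
    """Hydrogen bond lengths that satisfy the distance criterion."""
    return [d for d in result["hbond_lengths"] if d < cutoff]

ranked = rank_by_binding_energy(docking_results)
for r in ranked:
    n_hbonds = len(hbonds_within_cutoff(r))
    print(f'{r["ligand"]}: energy {r["binding_energy"]}, H-bonds {n_hbonds}')
```

Laid out this way, the ranking makes the point of the section explicit: the ligand with the lowest docking energy (3CP) is also the only one without a hydrogen bond, so binding energy alone does not identify the preferred substrate.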
Molecular Dynamics of DehLBHS1-Haloalkanoic Acid Complexes
MD simulation provides comprehensive information on the dynamics and flexibility of a protein when bound to various ligands or substrates. Furthermore, MD simulations reveal whether the protein-ligand complex can achieve structural stability [52]. This study concentrates on the molecular adjustments of DehLBHS1, in terms of stability and flexibility, in association with a degradable substrate (2,2-DCP) and a non-degradable substrate (3-CP). These are typically assessed by RMSD, RMSF, radius of gyration, and hydrogen bond formation.
The RMSD of the protein backbone was determined to explain the conformational changes induced by the two distinct ligands. In particular, a low RMSD value (~0.2 to ~0.3 nm) means that the protein complex is in a stable state [53]. In this research, the MD trajectories were closely monitored for the degradable (2,2DCP) and non-degradable (3CP) substrates for about 30 nanoseconds (ns), as shown in Figure 7.
In comparison, the apo form of DehLBHS1 shows an RMSD value of ~2.5 Å to ~3.0 Å, while the DehLBHS1-2,2DCP complex shows a lower deviation of ~1.6 Å, especially between 10 and 20 ns. The DehLBHS1-3CP complex shows a slightly higher deviation of about 2.2 Å, before all three systems reach a plateau at 23 ns. The results show that the MD trajectory of the DehLBHS1-3CP complex was rather irregular and fluctuated more often than that of the DehLBHS1-2,2DCP complex.
This indicates that 3CP is a less preferred ligand for DehLBHS1, even though it has the lowest binding energy according to the docking analysis. For better clarity, the RMSD of the substrates in the active site was also calculated; it clearly shows a more stable deviation pattern of about 1.1 Å for 2,2-DCP compared with 3CP (~1.3 Å), as shown in Figure 8. At this stage, 2,2DCP appears to be a more favourable substrate for interaction with DehLBHS1 when compared with the overall molecular motion of the 3CP substrate.
It is also necessary to examine the local flexibility and the residues that contribute to the protein dynamics. The root mean square fluctuation (RMSF) gauges the flexibility of specific residues, that is, the degree to which individual residues move (fluctuate) in the course of a simulation. The level of movement calculated by RMSF reflects the structural stability of the protein [54,55]. Structural fluctuation of the protein backbone and side chains can result in conformational adjustments and can also affect the preferred structural constraints of the substrates at the active site [56]. In our findings, the RMSF patterns in Figure 9 for the DehLBHS1-2,2DCP and DehLBHS1-3CP complexes had similar values and indicated reasonable stability throughout the 30 ns. The N-terminus has the highest fluctuation, indicating a highly flexible region. The RMSF plot contains five regions with peaks higher than 1.5 Å, which are probably connected with the greater flexibility of loop structures [57]. Residues in the core domain have low RMSF values compared with the exposed loops at the surface [58].
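As an illustration of the RMSF measure discussed above, the sketch below computes the per-atom fluctuation about the time-averaged position and flags atoms above a chosen threshold (1.5 Å in the text). It assumes a trajectory stored as plain coordinate lists with all frames already superposed; the function names and input layout are hypothetical.

```python
import math

def rmsf(trajectory):
    """Per-atom RMSF about the time-averaged position.

    trajectory: list of frames; each frame is a list of (x, y, z) tuples,
    with the same atom order in every frame.
    """
    if not trajectory:
        raise ValueError("trajectory is empty")
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    values = []
    for i in range(n_atoms):
        # Mean position of atom i over all frames.
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames
                for k in range(3)]
        # Mean squared displacement from that mean position.
        msf = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        values.append(math.sqrt(msf))
    return values

def flexible_indices(rmsf_values, threshold=1.5):
    """Indices of atoms (or residues) whose RMSF exceeds the threshold."""
    return [i for i, v in enumerate(rmsf_values) if v > threshold]
```

Unlike RMSD, which gives one value per frame, RMSF gives one value per atom or residue, which is why it is suited to locating flexible loops.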
It is also observed that the DehLBHS1-2,2DCP complex has a low RMSD value, which in turn indicates that the ligand was tightly bound to the enzyme. Thus, the stability of the halogenated ligand-bound DehLBHS1 complexes was noted in the MD simulations. Conversely, the higher the RMSD value, the weaker the binding of the substrate to DehLBHS1, as seen in the DehLBHS1-3CP complex.
The RMSF at the active site is just as important as the RMSF plot for the whole DehLBHS1-2,2DCP and DehLBHS1-3CP complexes. Therefore, the atomic fluctuations of the substrates 2,2DCP and 3CP alone were also calculated. These findings show that the chlorine (Cl) fluctuation for 3CP (0.5 Å) is greater than the Cl fluctuation for 2,2DCP, as shown in Figure 10. This may be because 2,2DCP has two chlorine atoms bound to the alpha-carbon, whereas 3CP has one chlorine atom at the beta-carbon, distant from the carboxyl group, which makes it more flexible. 2,2DCP has lower RMSF values than 3CP, indicating a more stable conformation in the vicinity of the binding site, even though both substrates show a minimal distance to the active-site residues. The chlorine in 2,2-DCP is well oriented and could readily be cleaved through a hydrolytic mechanism involving a water molecule.
This study also tracked the radius of gyration (Rg) of the DehLBHS1-substrate complexes to calculate the degree of compaction and to monitor the overall dimensions of the structures during the MD simulation. Rg corresponds to the mass-weighted root mean square distance of the atoms from their common centre of mass. A relatively constant Rg value represents a stably folded structure, whereas unfolding causes the Rg value to shift during the simulation [59]. The highest peak of an Rg plot corresponds to more loosely packed amino acids, whereas the lowest corresponds to tighter packing [60].
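The radius of gyration can be written as the mass-weighted root mean square distance of the atoms from their centre of mass, and the sketch below computes it for a single frame. The function name and argument layout are illustrative assumptions; when no masses are given, all atoms are weighted equally, which is a common simplification for a C-alpha-only selection.

```python
import math

def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration of one frame.

    coords: list of (x, y, z) tuples.
    masses: optional list of atomic masses; equal weights if omitted.
    """
    if not coords:
        raise ValueError("no coordinates supplied")
    if masses is None:
        masses = [1.0] * len(coords)
    if len(masses) != len(coords):
        raise ValueError("masses and coords differ in length")
    total_mass = sum(masses)
    # Centre of mass.
    com = [sum(m * c[k] for m, c in zip(masses, coords)) / total_mass
           for k in range(3)]
    # Mass-weighted mean squared distance from the centre of mass.
    weighted = sum(
        m * sum((c[k] - com[k]) ** 2 for k in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted / total_mass)
```

Applying the function to every frame of a trajectory gives the Rg-versus-time curve used to judge compactness.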
The plot of the Rg of the C-alpha atoms over the 30 ns simulation at 300 K shows DehLBHS1 bound to 2,2-DCP, bound to 3CP, and in apo form. The Rg values of DehLBHS1-2,2DCP, DehLBHS1-3CP, and apo DehLBHS1 range from 17.0 to 18.25 Å (Figure 11). In comparison with the apo form, the DehLBHS1-2,2DCP and DehLBHS1-3CP complexes are less compact, by about 1.25 Å (Figure 11), in order to accommodate the substrates through an induced-fit conformation. In the bound state, DehLBHS1-3CP is more tightly packed than DehLBHS1-2,2DCP according to Rg. The differences can be observed between the first and the last ten nanoseconds. In this time frame, the 2,2DCP complex has a higher Rg value than the 3CP complex and the apo form, indicating the adaptability of the active site in orienting the substrate within the functional domain.
Hydrogen bonds are thought to be primary constituents of biomolecular structure, as they shape the secondary, tertiary, and quaternary structures of proteins. The loss of hydrogen bonding can obstruct proper folding, which can have a major effect on structural integrity. Hydrogen bonds play an essential role in molecular recognition as well as in the overall stability of the protein structure [61]. In an aqueous environment, hydrogen bonds are essential for protein-ligand binding, particularly when hydrolysis is necessary to complete the reaction. In this study, the number of substrate-protein hydrogen bonds in the 2,2DCP-DehLBHS1 complex is more consistent than in the 3CP complex, indicating that hydrogen bonds are formed continuously, particularly through the last 15 ns of the 30 ns simulation (Figure 12). The hydrogen bond formation was supported by the short distances between the substrates and DehLBHS1: 2,2DCP and 3CP had similar distances in the range of 1.5 to 3.0 Å and interacted closely with the enzyme over the 30 ns (Figure 13).
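Counting substrate-protein hydrogen bonds per frame can be sketched with a purely distance-based criterion, as below. This is a simplified illustration: it assumes that donor and acceptor atom positions have already been extracted for each frame and applies only the 3.5 Å distance limit mentioned earlier, whereas MD analysis tools typically add a donor-hydrogen-acceptor angle criterion. All names are hypothetical.

```python
import math

def count_hbonds(donors, acceptors, cutoff=3.5):
    """Number of donor-acceptor pairs closer than the cutoff (Å).

    donors, acceptors: lists of (x, y, z) positions for one frame.
    Only the distance is tested; no angle criterion is applied.
    """
    return sum(
        1
        for d in donors
        for a in acceptors
        if math.dist(d, a) < cutoff
    )

def hbond_series(frames, cutoff=3.5):
    """Hydrogen bond count for each frame.

    frames: list of (donors, acceptors) tuples, one per frame.
    """
    return [count_hbonds(d, a, cutoff) for d, a in frames]
```

A series that stays at or above one for most frames would correspond to the continuous hydrogen bonding described for the 2,2DCP complex.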