In silico hydrolysis and prediction of AMP from the parent protein
Proteome sequences were retrieved from the UniProt database using the keyword “Bungarus caeruleus”. The sequences were then submitted to the Expasy PeptideCutter server and digested in silico with five enzymes: trypsin, chymotrypsin-high specificity (C-term to [FYW], not before P), chymotrypsin-low specificity (C-term to [FYWML], not before P), pepsin (pH 1.3), and pepsin (pH > 2). The resulting shorter peptide sequences were further evaluated for AMP properties. These enzymes cleave the proteome sequences at specific sites, producing peptides of different lengths; only peptides of 6–25 amino acids in length were retained [13]. The Database of Antimicrobial Activity and Structure of Peptides (DBAASP) server was used to predict AMPs from the collected peptide sequences. Using a machine learning algorithm and the Moon and Fleming scale, it predicts the antibacterial activity of linear peptides in general and against specific bacterial strains [14].
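As a rough illustration of the digestion and length-filtering step, the following minimal Python sketch applies simplified cleavage rules for trypsin and the two chymotrypsin specificities, then keeps fragments of 6–25 residues. The trypsin rule (cut after K or R, not before P), the function names, and the example sequence are assumptions made for illustration. PeptideCutter applies more detailed exception rules, and the pepsin rules are omitted here.

```python
import re

# Simplified rules: cut C-terminal to the listed residues unless the next
# residue is proline. These are illustrative, not PeptideCutter's full rules.
CLEAVAGE_RULES = {
    "trypsin": re.compile(r"(?<=[KR])(?!P)"),            # assumed simplification
    "chymotrypsin_high": re.compile(r"(?<=[FYW])(?!P)"),
    "chymotrypsin_low": re.compile(r"(?<=[FYWML])(?!P)"),
}


def digest(sequence, enzyme):
    """Split a protein sequence at every cleavage site of the given enzyme."""
    return [frag for frag in CLEAVAGE_RULES[enzyme].split(sequence) if frag]


def filter_by_length(peptides, min_len=6, max_len=25):
    """Keep only peptides inside the 6-25 residue window."""
    return [p for p in peptides if min_len <= len(p) <= max_len]


# Hypothetical input sequence, used only to exercise the functions.
fragments = digest("MKTAYIAKQRPGLSWEVNDKAA", "trypsin")
candidates = filter_by_length(fragments)
```

Splitting on zero-width matches requires Python 3.7 or later.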
Physicochemical properties of the predicted AMPs
Physicochemical properties of the predicted peptides were calculated using two web servers, APD3 and Expasy ProtParam, which report molecular weight, theoretical pI, amino acid composition, atomic composition, extinction coefficient, estimated half-life, instability index, aliphatic index, and grand average of hydropathicity (GRAVY) [15]. The toxicity of the predicted AMPs was assessed using ToxIBTL (https://server.wei-group.net/ToxIBTL/Server.html), a deep learning framework that combines the information bottleneck principle with transfer learning to predict peptide toxicity [16].
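Two of the listed descriptors are simple enough to sketch directly. The example below computes GRAVY as the mean Kyte-Doolittle hydropathy value and the aliphatic index from the mole percentages of Ala, Val, Ile, and Leu, using the commonly cited coefficients (2.9 and 3.9). It is a minimal sketch that assumes sequences contain only the 20 standard residues; the function names are illustrative, and it is not a reproduction of the APD3 or ProtParam implementations.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathicity: mean hydropathy over all residues."""
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def aliphatic_index(sequence):
    """Aliphatic index from mole percentages of Ala, Val, Ile and Leu."""
    n = len(sequence)

    def mole_pct(aa):
        return 100.0 * sequence.count(aa) / n

    return mole_pct("A") + 2.9 * mole_pct("V") + 3.9 * (mole_pct("I") + mole_pct("L"))
```

A positive GRAVY value indicates an overall hydrophobic peptide, and a negative value an overall hydrophilic one.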
ADMET properties of the predicted AMPs
To evaluate the ADMET properties of the predicted peptides with the ADMETlab 2.0 server (https://admetmesh.scbdd.com/), we first converted the amino acid sequences to SMILES (Simplified Molecular Input Line Entry System) format using the PepSMI server (https://www.novoprolabs.com/tools/convert-peptide-to-smiles-string) [17]. The key parameters inferred for the ADMET profile were human intestinal absorption (HIA), Caco-2 permeability, volume of distribution, mutagenicity, carcinogenicity, central nervous system (CNS) permeability, drug-induced liver injury (DILI), inhibition of and substrate status for different cytochrome enzymes, and skin sensitization. The AMPs with the best ADMET profiles were then prepared for molecular docking.
Evaluation of Cell Penetrating, Hemolytic, Toxicity, and Allergenicity Properties
The cell-penetrating property of the peptides was predicted using CellPPD (https://webs.iiitd.edu.in/raghava/cellppd/index.html), a support vector machine (SVM)-based web server. Peptide sequences were submitted to the amino acid residue scanning tool with the default SVM threshold, and peptides were classified as CPP or non-CPP on the basis of the prediction score (Gautam et al., 2013; Gautam et al., 2015). Hemolytic activity of the peptide sequences was predicted using the SVM-based web server HemoPI (https://webs.iiitd.edu.in/raghava/hemopi/design.php). The PROB score is the normalized SVM score and ranges between 0 and 1, where 1 indicates a peptide very likely to be hemolytic and 0 one very unlikely to be hemolytic [20]. Toxicity and allergenicity of the peptides were predicted using ToxinPred (https://webs.iiitd.edu.in/raghava/toxinpred/algo.php) and the AllerTOP server (https://www.ddg-pharmfac.net/AllerTOP) (Gupta et al., 2013; Dimitrov et al., 2014; Gupta et al., 2015). pH-dependent folded and unfolded states were predicted using the SVM-based DispHScan (http://disphscan.ppmclab.com/) [24].
Structure Prediction
The predicted antimicrobial peptide sequences were modelled using PEP-FOLD 2.0 (https://bioserv.rpbs.univ-paris-diderot.fr/services/PEP-FOLD2/) to predict their three-dimensional (3D) structures [25]. PEP-FOLD uses a Hidden Markov Model (HMM) to predict the 3D structure of peptides and requires a peptide length of at least 9 amino acid residues [26]. The best-ranked model for each peptide was validated using the PROCHECK server of SAVES v6.0, which uses a Ramachandran plot to assess the distribution of amino acid residues among the favoured, allowed, and disallowed regions [27].
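A Ramachandran plot places each residue according to its backbone dihedral angles phi and psi. The sketch below shows one common way to compute a signed dihedral angle from four atom positions; phi would use the atoms C(i-1), N(i), CA(i), C(i), and psi would use N(i), CA(i), C(i), N(i+1). The function name and the plain-tuple coordinate format are assumptions for illustration, and this is not the PROCHECK implementation.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points (x, y, z)."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)          # unit vector along the central bond
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))
```

Assigning a (phi, psi) pair to a favoured, allowed, or disallowed region requires the empirical region boundaries used by the validation tool, which are not reproduced here.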
Molecular Docking
HADDOCK (High Ambiguity Driven protein-protein DOCKing) is an information-driven flexible docking approach for the modelling of biomolecular complexes. The structure of the HTH-type transcriptional regulator EthR of Mycobacterium tuberculosis (UniProt ID: P9WMC1) was prepared for docking by removing water molecules and other heteroatoms from the protein file and adding hydrogen atoms and charges using AutoDock Tools. The peptides were likewise prepared using AutoDock Tools, with addition of hydrogens and charges followed by energy minimization. The prepared structures were then uploaded to the HADDOCK server, and protein-peptide docking was performed targeting the active site residues [28, 29]. The active site of the target protein was predicted using the Computed Atlas of Surface Topography of Proteins (CASTp, version 3.2) and from binding sites previously reported in the literature. The output was recorded for the following parameters: HADDOCK score, van der Waals energy, electrostatic energy, desolvation energy, and Z-score. Interactions in the protein-peptide complexes were visualized using BIOVIA Discovery Studio and LigPlot.
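The Z-score reported for a docking cluster expresses how many standard deviations its HADDOCK score lies from the mean score of all clusters, with more negative values indicating better-ranked clusters. A minimal sketch of that calculation follows; the cluster names, the example scores, and the use of the population standard deviation are assumptions for illustration, and the server's exact procedure may differ.

```python
from statistics import mean, pstdev


def cluster_z_scores(scores):
    """Map each cluster name to the Z-score of its docking score.

    `scores` is a dict of cluster name -> HADDOCK score. The population
    standard deviation is an assumption; at least two distinct scores
    are needed to avoid a zero standard deviation.
    """
    values = list(scores.values())
    mu = mean(values)
    sigma = pstdev(values)
    return {name: (score - mu) / sigma for name, score in scores.items()}


# Hypothetical cluster scores, used only to exercise the function.
z = cluster_z_scores({"cluster1": -100.0, "cluster2": -80.0, "cluster3": -60.0})
```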
Protein-peptide interface interaction analysis
To study the protein-peptide interface interactions, the PRODIGY web server was employed; it reports the binding energy of the docked protein-peptide complexes in kcal/mol. The binding energy reported by PRODIGY is calculated using the following equation [30]:
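\[
\Delta G_{\mathrm{calc}} = 0.09459\,\mathrm{ICs}_{\mathrm{charged/charged}} + 0.10007\,\mathrm{ICs}_{\mathrm{charged/apolar}} - 0.19577\,\mathrm{ICs}_{\mathrm{polar/polar}} + 0.22671\,\mathrm{ICs}_{\mathrm{polar/apolar}} - 0.18681\,\%\mathrm{NIS}_{\mathrm{apolar}} - 0.13810\,\%\mathrm{NIS}_{\mathrm{charged}} + 15.9433
\]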
where the ICs (inter-residue contacts) and NIS (non-interacting surface) terms capture the contribution of the different types of residue interactions to the overall binding energy of the complex.
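To make the arithmetic concrete, the equation can be written as a small Python function. This is a sketch that evaluates the expression exactly as written above, with the same coefficients and signs. The argument names and the example contact counts are hypothetical, and the sign convention of the value reported by the server is not assumed here.

```python
def prodigy_delta_g(ic_charged_charged, ic_charged_apolar,
                    ic_polar_polar, ic_polar_apolar,
                    nis_apolar_pct, nis_charged_pct):
    """Evaluate the linear binding-energy model term by term.

    IC arguments are counts of inter-residue contacts of each type;
    NIS arguments are percentages of the non-interacting surface.
    """
    return (0.09459 * ic_charged_charged
            + 0.10007 * ic_charged_apolar
            - 0.19577 * ic_polar_polar
            + 0.22671 * ic_polar_apolar
            - 0.18681 * nis_apolar_pct
            - 0.13810 * nis_charged_pct
            + 15.9433)


# Hypothetical interface composition, used only to exercise the function.
example = prodigy_delta_g(2, 5, 1, 10, 40.0, 25.0)
```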
Molecular dynamics (MD) simulation
Molecular dynamics simulations were performed for 100 ns on the protein alone, on the protein-peptide complex with the maximum binding energy after docking, and on the protein-INH complex. The simulations were carried out with GROMACS version 2023.1 (academic) to evaluate conformational variability and stability [31]. The peptide with the maximum binding energy against the target protein was prepared using the CHARMM-GUI force field to refine amino acid interactions during the simulation. The system was solvated in an orthorhombic box of TIP3P (transferable intermolecular potential with 3 points) water molecules, sized so that it completely enclosed the protein-peptide complex. Salt ions (Na+) were added to neutralise the solvated system. Energy minimization and equilibration were carried out following the procedure described in the Desmond suite. After minimization and equilibration of the solvated protein-peptide system, the final production run was performed for 100 ns in the NPT ensemble, with temperature relaxation at 300 K and a pressure of 1 bar for 24 psi. The protein-INH system was prepared in the same way and simulated for 100 ns. Stability and other properties of the interactions during the simulations were assessed using the root mean square deviation (RMSD), root mean square fluctuation (RMSF), solvent-accessible surface area (SASA), and radius of gyration profiles, among others.
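For reference, two of the stability measures can be written out in a few lines of Python. This is a minimal sketch that assumes the two coordinate sets are already superimposed and that atoms are given as (x, y, z) tuples; the function names are illustrative. In practice these quantities would normally be obtained from trajectory analysis tools such as gmx rms and gmx gyrate rather than from hand-written code.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two pre-aligned coordinate sets."""
    squared = sum((a - b) ** 2
                  for pa, pb in zip(coords_a, coords_b)
                  for a, b in zip(pa, pb))
    return math.sqrt(squared / len(coords_a))


def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration about the centre of mass."""
    total = sum(masses)
    com = [sum(m * p[i] for p, m in zip(coords, masses)) / total
           for i in range(3)]
    squared = sum(m * sum((p[i] - com[i]) ** 2 for i in range(3))
                  for p, m in zip(coords, masses))
    return math.sqrt(squared / total)
```

A plateau in the RMSD profile over time is usually read as a sign that the complex has stabilised, while a steady radius of gyration suggests the structure remains compact.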
Estimation of binding free energy
The interaction free energies of the protein-peptide complex were determined using gmx_MMPBSA, a tool for the quantitative calculation of binding free energies of biomolecular complexes. The binding energy was calculated for the known standard and for the predicted peptide bound at the active site of the target protein, using the trajectories from the 100 ns MD simulations. The following energy components were also analysed: van der Waals energy (ΔVDWAALS), electrostatic energy (ΔEEL), polar contribution to the solvation energy (ΔEPB), nonpolar contribution of repulsive solute–solvent interactions to the solvation energy (ΔENPOLAR), total gas-phase molecular energy (ΔGGAS), and total solvation energy (ΔGSOLV).
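The relationship between these components can be sketched as follows. The example assumes a single-trajectory calculation in which the bonded terms cancel, so that the gas-phase energy is the sum of the van der Waals and electrostatic terms and the solvation energy is the sum of the polar and nonpolar terms, with no entropy correction and no additional dispersion term. The function name and the numerical values are hypothetical and are not results from this study.

```python
def combine_mmpbsa_terms(vdwaals, eel, epb, enpolar):
    """Combine per-component energy differences (kcal/mol) into totals.

    Assumes bonded terms cancel (single-trajectory protocol) and that
    no entropy or extra dispersion term is included.
    """
    ggas = vdwaals + eel        # total gas-phase energy
    gsolv = epb + enpolar       # total solvation energy
    return {"GGAS": ggas, "GSOLV": gsolv, "TOTAL": ggas + gsolv}


# Hypothetical component values, used only to exercise the function.
energies = combine_mmpbsa_terms(vdwaals=-50.0, eel=-120.0, epb=130.0, enpolar=-6.0)
```

A more negative total indicates more favourable binding under this model.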