3.1. Molecular docking
Figure 1 provides a flow chart showing the stepwise screening of African derived terpenoids for potential inhibitors of membrane-mediated SARS-CoV-2 cell entry proteins.
The result from the docking analysis of the reference inhibitors and bioactive terpenoids with the human ACE2, TMPRSS2 and SARS-CoV-2 spike protein is shown in table S3 (supplementary material). The top 20 terpenoids with the highest binding affinity for the ACE2 were further analyzed for binding interactions with SARS-CoV-2 chimeric Receptor-Binding Domain complexed with its human receptor ACE2 (ACE2-RBD) and the S protein of SARS-CoV and MERS-CoV as shown in table 2.
The docking analysis revealed that the reference inhibitor (MLN-4760) to the human ACE2 protein had binding energy of -7.7 Kcal/mol respectively, while Camostat an inhibitor of TMPRSS2 had a binding energy of -7.6 Kcal/mol as represented in figure 3. It was further observed that the topmost docked terpenoids to the ACE2 had higher binding affinity for the S protein of SARS-CoV and MERS-CoV than SARS-CoV-2. More than 10 terpenoids had higher binding affinity than the 3 inhibitors used in this study table S1 (supplementary material). The top 20 docked compounds to SARS-CoV-2 S-proteins had higher binding affinity than nelfinavir mesylates S2 (supplementary material).
From the binding scores generated by the interacting terpenoids with the ACE2 and TMPRSS2 proteins, the 2 best docked terpenoids with the highest binding affinity are: 24-methylene cycloarteno and isoiguesterin with the corresponding binding energy of -9.7, and -9.5 Kcal/mol respectively. The 2 best docked terpenoids to SARS-CoV-2 S protein are 3-benzoylhosloppone and cucurbitacin with binding energies of -9.4 and -9.3 Kcal/mol respectively. 3-benzoylhosloppone had the highest binding affinity for SARS-CoV-2 S protein and the second top binding affinity to MERS-CoV S protein (Figure 3).
3.1.1 Amino acid interaction of selected terpenoids with target proteins.
The amino acid interactions of the human target proteins (ACE2 and TMPRSS2) with reference inhibitors and plant derived terpenoids that demonstrated the highest binding tendencies are represented in table 1. In the same way, the amino acid residues of the coronaviruses S protein that interacted with reference inhibitors and terpenoids with the highest binding affinity are shown in table 2. The interacting residues of ACE2 and TMPRSS2 with respective ligand groups were majorly through hydrophobic interactions and H-bond. Few H-bonding below 3.40 Å were observed with coronaviruses S protein (table 4).
The binding of MLN-4760 to ACE2 showed that it was docked into the N terminus and zinc-containing subdomain I of ACE2 (figure 4a). MLN-4760 exhibited several types of hydrophobic interactions (Pi-Sigma, Pi-Pi T-Shaped, Pi-Alkyl and Alkyl) with TYR510 PHE504 MET360, LYS363 and CYS344, a Salt and attractive charges to ARG514, ARG518 and ARG278 and hydrogen bond to TYR515, THR371, PRO346 and ARG273 (figure 4a).
24-methylene cycloartenol the best docked terpenoid was docked into the C terminus-containing subdomain II of ACE2 but interacted with different residue as with the case of N-acetyl-D-glucosamine (figures 4b). 24-methylene cycloartenol interacted via H-bond to TRP163, SER170 and TYR497.A Pi-Alkyl interaction was also observed with TYR613, PRO492 and VAL491. Isoiguesterin interacted via H-bond to ASP350, TYR385 and ASN394. A Pi-Alkyl and Alkyl interactions was observed with the ALA99, PHE40, PHE390 and LEU73, TRP69 residues respectively in a similar binding pattern with MLN-4760 (figure 4c).
Camostat was docked into the S1-specificity pocket of TMPRSS2 (figure 5a). It interacted via conventional H-bond to five amino residues (ARG41, SER195, TRP215, ALA190 and ASP189) and via carbon hydrogen bond to GLN192 of TMPRSS2. The conventional H-bond was formed in the direction of the guanidine group in this order: first ester bond, second ester bond, while the last three residues interacted with amidino nitrogen of guanidine group respectively. The phenyl ring was responsible for the carbon-hydrogen bond with GLN192 (figure 5a). T3 and T4 were docked into S1-specificity pocket of TMPRSS2 in a similar binding pattern as in the case of camostat (figure 5b & 5c). The only difference observed between the binding pattern of T3 and T4 was an additional H-bond between T3 with ARG41 (figure 5b).
Nelfinavir mesylates an inhibitor of SARS-CoV and MERS-CoV S protein interacted in its best docked conformation to the S protein of SARS-CoV-2 in a different manner. Nelfinavir mesylates was docked into the S2 Subunit of SARS-CoV S protein (figure 7a). The same inhibitor was docked into to the N-terminal domain (NTD) region of the S1 subunit of SARS-CoV-2 and MERS-CoV S protein (figure 6a & 8a).
3-benzoylhosloppone with the highest binding affinity for SARS-CoV-2 S protein interacted via H-bond to THR547; Alkyl interaction to PHE541 and Pi-Alkyl interaction to PRO589 and LEU546. The region of interaction was between the CTD and SD1 region of S1 subunit of SARS-CoV-2 S protein. Cucurbitacin B was docked to the S2 subunit of SARS-CoV-2 S protein but interacted with different amino acid residue. The interaction of cucurbitacin B to the protein was via H-bond to ARG1091, ASN914, THR912 and GLN1113; Pi-Sigma bond to PHE1121 and Alkyl interaction to ILE1114 and GLY1124 (figure 6c).
The same pattern of interaction was observed in both 7-Deacetoxy-7-oxogedunin and 3-friedelanone to the S2 subunit of SARS-CoV S protein. Both terpenoids interacted via a H-bond to ARG982 and GLY726 of the S2 subunit. While 7-deacetoxy-7-oxogedunin interacted with the upstream helix and central helix, 3-friedelanone interacted with the connecting region of the S2 subunit. A hydrophobic interaction via Pi-Alkyl and alkyl bonds was observed with the remaining amino acid residue (table 2; figure 7b & 7c)
7-Deacetoxy-7-oxogedunin interacted via H-bond to the SER51 residue of N-terminal domain of the S1 subunit of MERS-CoV S protein. A Pi-Pi T-shaped interaction was formed between 7-deacetoxy-7-oxogedunin and PHE354; HIS670 of MERS-CoV S protein. Other hydrophobic interactions via Pi-Alkyl and Pi-Sigma bonds were observed to with the remaining amino acid residues (table 4; figure 8a & 8b). 3-benzoylhosloppone interacted via: Pi-Sigma interaction to (PHE341) of NTD; Pi-Pi Stacking to (MET698) of SD2; Pi-Alkyl interaction to (LYS689) of SD2; and an Alkyl interaction to (LEU344 and ILE337) of NTD with the S1 subunit (figure 8c).
In summary, the binding of ligands to various proteins revealed eight terpenoid with remarkable binding affinities. Those with very good interactions with ACE2 and TMPRSS2 are 24-methylene cycloartenol; isoiguesterin; 11-Hydroxy-2 - (3,4-dihydroxybenzoyloxy) abieta-5,7,9(11),13-tetraene-12-one; and 11-Hydroxy-2 -(4-hydroxybenzoyloxy)-abieta-5,7,9(11),13-tetraene-12-one. Similarly, 3-benzoylhosloppone, and cucurbitacin B interacted well with SARS-CoV-2 spike protein, while 7-deacetoxy-7-oxogedunin, and 3-friedelanone interacted well with SARS-CoV and MERS-CoV spike protein.
3.1.2 Energy profile of best docked terpenoids to respective proteins
The overall energy profiles of terpenoid-protein complexes in the selected clusters with the best docked poses are shown in figures 9-11. Figure 9a-11a shows the breakdown of the binding energy of the selected cluster into different contributions. Gauss 1 (blue) and 2 (leaf green) bars: represent the non-bonding interactions, red bar: repulsion, light blue bar: Hydrophobic, Purple bar: Hydrogen bonds, light green bar: rotational forces, while the black bar represents total binding affinity which is a representative contribution of all bonding and non-bonding interactions between the terpenoids and the protein residues. The contributions of the various type of interaction as presented in graph (figure 9a-11a) shows that of the total binding energy of -9.7 Kcal/mol exhibited by the binding of 24-methylene cycloartenol to the ACE2, -2.1 and 1.8 Kcal/mol of hydrophobic and H-bond energies respectively was contributed, while the rest was contributed by non-bonding interaction mainly van der Waals, repulsive and rotational forces. A H-bond, Hydrophobic interaction and repulsive energy of -2.8 -0.8 and +2.3 Kcal/mol respectively was contributed to the total binding energy of -10.0 Kcal/mol between T3 and TMPRSS2. A hydrophobic interaction energies of -2.1, -0.6 and -1.5, a H-bond energies of 0.3, -0.6, -0.3 Kcal/mol was contributed to the total binding energy of the spike protein of SARS-CoV-2, SARS-CoV and MERS-CoV with respective terpenoids. The rest of the energy was contributed by non-binding interactions.
Figures 9b-11b shows the overall energy profile of the ligand-receptor complex of the selected cluster, showing the individual energetic contributions for each atom in the ligand. The colour indication is similar to (a) above.
3.2 Molecular Dynamics Simulation
Four compounds which are, camostat, 11-hydroxy-2 - (3,4-dihydroxybenzoyloxy) abieta-5,7,9(11),13-tetraene-12-one, 24-methylene cycloartenol, and 3-benzoylhosloppone, were analysed for their interactions with SARS-CoV-2 Spike glycoprotein (S protein), Angiotensin-converting enzyme 2 (ACE2), and Transmembrane protease serine 2 (TMPRSS2) proteins. Molecular dynamics simulation was done on each of the protein-ligand complexes and the trajectories were analyzed. The Radius of Gyraion (RoG), Root Mean Square Deviation (RMSD), Root Mean Square Fluctuation (RMSF), and Surface Accessible Surface Area (SASA) results were calculated for each trajectory and are shown in figure 11. There was no observed difference between the RoG of TMPRSS2_camostat and TMPRSS2_(11-hydroxy-2 - (3,4-dihydroxybenzoyloxy) abieta-5,7,9(11),13-tetraene-12-one) complexes, while that of ACE2 is larger and S protein has the largest values because it is a monomer of the SARS-CoV-2 spike trimer. All of them are fluctuating about certain values. The RMSD values of TMPRSS2_(T3), ACE2-24_methylene cycloartenol), TMPRSS2_camostat, and S protein_(3-benzoylhosloppone) complexes are around 2.13 Å, 3.6 Å, 2.14 Å, and 16.78 Å, respectively. While the RMSF values for TMPRSS2_(11-hydroxy-2 - (3,4-dihydroxybenzoyloxy) abieta-5,7,9(11),13-tetraene-12-one), ACE2_24-methylenecycloartenol), TMPRSS2_camostat, and S protein_(3- benzoylhosloppone) complex are fluctuating around 0.68 Å, 1.29 Å, 0.73 Å, and 7.36 Å, respectively.
TMPRSS2_(11-hydroxy-2 - (3,4-dihydroxybenzoyloxy) abieta-5,7,9(11),13-tetraene-12-one), ACE2_(24-methylene cycloartenol), and TMPRSS2_camostat complexes have a spike in the end of their RMSF results indicating the motion of the terminals. The spikes in the middle and the start of the RMSF of ACE2_(24-methylene cycloartenol) complex between amino acid 265 and amino acid 443 and spikes in S protein_(3-benzoylhosloppone) complex corresponds to the loops in the two protein respectively (Figure 12). The values of SASA can be found to be nearly stable for each complex but differ from each other.
Molecular Mechanics/Generalized Born Surface Area (MMGBSA) algorithm in AmberTools 20 was utilized to calculate the ligand binding free energy. All frames (~1000 frame) were used in this calculation for each protein-drug complex. Figure 14 shows the binding affinity in Kcal/mol from MMGBSA analysis with Standard Deviation as error bars for each protein-drug complex. The best binding affinity (more negative) is for TMPRSS2_camostat (-53.5059 Kcal/mol) which indicates the strong binding between them.
Table 4 shows the number of clusters, representative frame produced for each trajectory, and the interaction types using PLIP webserver. Hydrophobic, H-bond, salt-bridges, pi-cation and pi-stacking are the types of interactions found by PLIP webserver. Most of complexes have H-bond and hydrophobic interactions, with TMPRSS2_camostat having the largest number of bonds in each cluster compared to other complexes. Figure 15 shows the protein-drug cluster representatives for the protein-ligand complexes and the mode of interaction in the enlarged part of the image. Images were generated using PyMol software V 2.2.2.
3.3 Drug likeness and Pharmacokinetic properties of selected terpenoids.
The result generated from the Lipinski and ADMET filtering analyses are represented in table 4 and figure S1 (supplementary file).
Four terpenoids T1, T3, T5 and T6 fulfilled the requirement for Lipinski analysis of the rule of-five with corresponding favourble predicted ADMET parameters.
The in silico predictive pharmacokinetic and ADMET properties from the filtering analyses suggested T1, T3, T5 and T6 with a high probability of absorption, subcellular distribution, low toxicity. Though pharmacokinetic analysis indicated T1 (Table 4) to be less soluble while the ADME/tox analysis indicated high aqueous solubility, ability to pass the high human intestinal absorption, low acute oral toxicity with a good bioavailability score as exhibited by T3, T5 and T6 (Table 4).