2.1. Accurate ADME property prediction
Predicting ADME properties before costly experimental procedures such as high-throughput screening (HTS) avoids testing compounds that will ultimately fail, and it can focus lead optimization on improving the desired properties of a candidate drug. Incorporating ADME predictions into the development process therefore yields lead compounds with improved ADME profiles for use in clinical trials.
The ADME properties of the standard drug (empagliflozin), the selected natural compounds (580, 1131, 212, 357, and 822), and three approved SGLT2 inhibitors (SGLT2i; ertugliflozin, dapagliflozin, and canagliflozin) were calculated with two programs: QikProp from the Schrödinger Suite and the SwissADME online server. According to Table 1S (a-e), compounds 306 and 580 fell within the range covering 95% of known pharmaceuticals, as did the three approved drugs. The colored area in Figure 1S is reported to be the favorable physicochemical region for oral bioavailability. The figure shows that compounds 306 (A) and 580 (B) were more unsaturated than the standard drug (C). Compound 580 was predicted to be soluble and compound 306 poorly soluble, whereas the standard was moderately soluble in water. In addition, 580 was more polar than empagliflozin. The other characteristics in Figure 1S were similar to those of the standard.
2.2. Docking study
Empagliflozin is an SGLT2i used to treat type 2 diabetes 16-23. To validate the docking parameters, the co-crystallized ligand from the RCSB entry with PDB ID: 7vsi was re-docked into the active site of the SGLT2-MAP17 complex protein. Re-docking reproduced essentially the same pose, and the RMSD value (0.90 Å) demonstrated the reliability of the docking procedure (Table 2). Four SGLT2i and two ligands were then docked with the MOE program to obtain a comprehensive picture of their binding modes (Table 1). The docking scores of the target compounds (-8.63 to -8.29 kcal/mol) were close to that of the lead compound, x-ray pioglitazone (-8.57 kcal/mol). Compounds 306, 580, and empagliflozin shared an H-acceptor bond with GLN 457 in the active site. For compounds 306 and 580, this H-acceptor interaction with GLN 457 in the active-site pocket of the SGLT2-MAP17 complex protein was reported as more favorable (distances of 3.25 and 3.28 Å, respectively) than that of the standard compound (2.95 Å). Additionally, compounds 306 and 580 formed pi-H bonds with residues PHE_453 and PHE_98, respectively. MD simulations were used to validate the docking results. Figure 1 and Table 2 show the docking maps of the three ligands (306, 580, and empagliflozin) in the active site of the SGLT2-MAP17 complex protein (PDB code: 7vsi).
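The re-docking check described above reduces to an RMSD between the crystallographic and re-docked ligand poses. The following is a minimal sketch of that calculation, assuming both poses list the same atoms in the same order and share the receptor's coordinate frame, so no superposition is applied. The function name `pose_rmsd` and the three-atom coordinates are hypothetical and are not taken from the study or from MOE.

```python
import math

def pose_rmsd(ref_coords, docked_coords):
    """RMSD between two ligand poses with identical atom ordering.

    No superposition is performed: both poses are assumed to be in the
    receptor's coordinate frame, as in a re-docking validation.
    """
    if not ref_coords or len(ref_coords) != len(docked_coords):
        raise ValueError("poses must be non-empty and contain the same atoms")
    sq_sum = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(ref_coords, docked_coords):
        sq_sum += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(sq_sum / len(ref_coords))

# Hypothetical three-atom ligand (coordinates in angstroms), shifted by 1 A along z.
crystal_pose = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
redocked_pose = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (1.5, 1.5, 1.0)]
rmsd = pose_rmsd(crystal_pose, redocked_pose)
```

For symmetric ligands a symmetry-aware atom matching would be needed before this comparison; that step is outside the scope of this sketch.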
Figure 1.
Table 1.
Table 2.
2.3. Molecular dynamics (MD) simulation
MD simulation was used to better understand the stability of SGLT2, the ligand-receptor interactions, and ligand binding. The dynamic stability of the secondary structures and the conformational changes of the protein alone were compared with those of the complexes using plots of the root-mean-square deviation (RMSD), root-mean-square fluctuation (RMSF), and radius of gyration (Rg). As illustrated in Figure 2, the RMSD of the complexes and the apo-form stabilized after approximately 20 ns of simulation. The mean RMSD values of the apo-form, complex 306, and complex 580 were 0.264 ± 0.0323, 0.243 ± 0.023, and 0.220 ± 0.014 nm, respectively. The comparatively low RMSD of complex 580 was likely due to the reduced conformational flexibility of SGLT2 upon ligand binding.
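Summary values of the form "mean ± standard deviation" can be obtained from an RMSD time series as sketched below. This is only an illustration: the text does not state whether the averages cover the whole trajectory or only the portion after the roughly 20 ns equilibration, so the cutoff is exposed as a parameter. The function name `summarize_after` and the sample series are hypothetical.

```python
import statistics

def summarize_after(times_ns, values, t_start_ns=0.0):
    """Return (mean, sample standard deviation) of values with time >= t_start_ns."""
    kept = [v for t, v in zip(times_ns, values) if t >= t_start_ns]
    if len(kept) < 2:
        raise ValueError("need at least two points after the cutoff")
    return statistics.mean(kept), statistics.stdev(kept)

# Hypothetical RMSD series (nm) sampled every 10 ns; not data from the study.
times_ns = [0, 10, 20, 30, 40]
rmsd_nm = [0.10, 0.18, 0.22, 0.24, 0.26]

# Discard the first 20 ns as equilibration (assumed cutoff).
mean_rmsd, sd_rmsd = summarize_after(times_ns, rmsd_nm, t_start_ns=20)
```

The same helper applies unchanged to RMSF, Rg, hydrogen-bond counts, or potential energy series.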
Figure 2.
Figure 3.
The RMSF is a dynamic measure calculated for each atom or residue; it reflects macromolecular flexibility and reveals regional changes in protein structure 24,25. Figures 3 and 4 show the RMSF plots of the protein in the complexes and in the apo-form. The average RMSF values of the apo-form, complex 306, and complex 580 were 0.134 ± 0.063, 0.127 ± 0.0869, and 0.128 ± 0.0741 nm, respectively. The three systems shared two features: (a) the same RMSF distribution and (b) the same trend in dynamic characteristics. The lower RMSF values of some residues in the complexes may be explained by their location in the binding site (residues present in the active site for compound 306: Leu_283, Tyr_290, Phe_453, Asp_454, Tyr_526, His_80, and Tyr_290; residues present in the active site for compound 580: Phe_98, Val_157, Gln_457, Asn_75, Lys_321, Ser_393, and His_80). Overall, ligand binding did not noticeably alter protein flexibility.
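As an illustration of the quantity itself, the RMSF of an atom is the square root of its mean squared displacement from its time-averaged position. The sketch below assumes the frames have already been superimposed on a reference structure; the function name `rmsf` and the two-atom trajectory are hypothetical.

```python
import math

def rmsf(frames):
    """Per-atom RMSF from a list of frames.

    Each frame is a list of (x, y, z) tuples with the same atom ordering.
    Frames are assumed to be pre-aligned to a common reference.
    """
    n_frames = len(frames)
    n_atoms = len(frames[0])
    result = []
    for i in range(n_atoms):
        # Time-averaged position of atom i.
        mean_pos = [sum(f[i][k] for f in frames) / n_frames for k in range(3)]
        # Mean squared displacement from that average.
        msd = sum(
            sum((f[i][k] - mean_pos[k]) ** 2 for k in range(3)) for f in frames
        ) / n_frames
        result.append(math.sqrt(msd))
    return result

# Hypothetical two-atom, two-frame trajectory (nm): atom 0 is rigid, atom 1 moves.
trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.2)],
]
per_atom_rmsf = rmsf(trajectory)
```

Per-residue values are typically obtained by averaging the per-atom values within each residue or by using only the C-alpha atoms.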
Figure 4.
The Rg indicates the compactness of a protein structure and reflects the equilibrium conformation of the bound and unbound systems 26. The Rg values of the apo-form, complex 306, and complex 580 were 2.422 ± 0.011, 2.426 ± 0.009, and 2.440 ± 0.009 nm, respectively (Fig. 5). Because a larger Rg corresponds to a less compact structure, the slightly higher values of the protein-ligand complexes indicate that ligand binding did not increase compaction; the differences between the systems are small (at most 0.018 nm).
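For reference, the Rg is the mass-weighted root-mean-square distance of the atoms from the center of mass. The following sketch computes it for a single frame; the function name `radius_of_gyration`, the masses, and the two-atom coordinates are hypothetical and serve only to show how the value responds to the spread of the atoms.

```python
import math

def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration of one structure."""
    if not coords or len(coords) != len(masses):
        raise ValueError("coords and masses must be non-empty and equal in length")
    total_mass = sum(masses)
    # Center of mass.
    com = [
        sum(m * c[k] for m, c in zip(masses, coords)) / total_mass
        for k in range(3)
    ]
    # Mass-weighted sum of squared distances from the center of mass.
    weighted_sq = sum(
        m * sum((c[k] - com[k]) ** 2 for k in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted_sq / total_mass)

# Hypothetical two-atom systems (nm) with equal masses.
masses = [12.0, 12.0]
close_pair = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
spread_pair = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
rg_close = radius_of_gyration(close_pair, masses)
rg_spread = radius_of_gyration(spread_pair, masses)
```

In this toy case the atoms that lie farther apart give the larger Rg, which is the relationship to keep in mind when comparing the apo-form with the complexes.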
Figure 5.
Hydrogen bonding stabilizes the protein-ligand complex. As shown in Figure 2S, the ligand held in the active site of the protein is responsible for the stability of the system. The average numbers of intermolecular hydrogen bonds for empagliflozin (standard), 306, and 580 were 2.913 ± 1.1742, 0.411 ± 0.6548, and 5.980 ± 1.1352, respectively. Thus, ligand 580 formed on average about three more hydrogen bonds with the receptor protein than the standard ligand. The protein had an average of 451.251 ± 9.756 intramolecular hydrogen bonds, which stabilize the secondary structure of SGLT2 (Fig. 3S). The potential energy, a straightforward measure for assessing protein-ligand interaction, remained constant over the 100 ns of simulation (306: -789401.799 ± 1106.722 kcal/mol; 580: -788706.174 ± 1119.781 kcal/mol), which confirmed the stability of the systems (Fig. 4S).
Protein function depends on folded structures known as secondary structures (SS) 27,28. VMDSS was used to calculate the secondary structure during the MD simulation. The proportion of each protein secondary structure was determined from the SS% trajectory (Fig. 5S) 29.
Secondary structures were identified by different colors and remained unaltered during the simulation 29. After 100 ns of simulation, the secondary structure of the complex was displayed as a three-dimensional column chart (Fig. 6S, lower part) and a pie chart (Fig. 6S, upper part). To analyze the simulation results, the ligand-bound protein structure was extracted from the trajectories every 50 ns. The snapshots in Figure 7S show that the orientation of the ligand in the active site of the SGLT2 protein remained constant throughout the simulation. After 100 ns of simulation, compounds 306 and 580 formed two and six hydrogen bonds, respectively, according to Figure 9S and Table 1, whereas the standard ligand, empagliflozin, formed four. These results indicate that compound 580 fitted well into the binding pocket and that the configuration of the complex remained stable. Figure 8S shows 3D images of the hydrogen bonds between the three ligands and the residues in the active site, and Figure 8S (B) shows the hydrogen bonds between compound 580 and six residues located in the pocket. Because ligand 580 forms more interactions with the receptor after MD than the standard drug, it can be inferred that 580 binds more stably in the active site of the receptor.
2.4. Metabolic process prediction in the presence of cytochrome P450
SMARTCyp 3.0 is an online server for predicting metabolic reactivity toward three well-known cytochrome P450 isoforms, namely CYP3A4, CYP2D6, and CYP2C9. This server was used to predict the sites of metabolism of the two selected compounds and empagliflozin. For these three molecules, Table 2S lists the ranking, atom type and number, score, energy, 2DSASA, span2end, and similarity, and the three atoms with the highest metabolic priority are highlighted in the left and right images of Table 2S. Each compound has an energy below 999, a span2end value less than 8, a similarity of 0 to 1, etc. 30. The data therefore indicate that all three compounds have acceptable and comparable values for assessing metabolic reactivity toward the cytochrome isoforms. According to the SMARTCyp 3.0 analysis, the three sites with the lowest scores for each of the isoforms 3A4, 2D6, and 2C9 have the highest likelihood of metabolism. For the standard drug, empagliflozin, the sites with the lowest scores, and hence the highest likelihood of metabolism, are C.14, C.16, and C.29. Site C.14 is also a key adduct formation site for all three isoforms. In compound 580, the most probable site of metabolism is atomic site C.6, which has the lowest score, and the most metabolically reactive sites are C.6, C.3, and C.9. In compound 306, atomic positions C.29, N.28, and C.27 are the major adduct formation sites for isoform 2D6. Atomic positions C.15, N.13, and C.29 are the major adduct formation sites for isoform 2C9, and positions C.15, N.13, and C.10 are those for isoform 3A4. Atomic site C.15 has the lowest score, which makes it the most likely site of reaction for isoforms 3A4 and 2C9. Likewise, given its lowest score, atomic site C.29 is the most likely site of reactivity for isoform 2D6 (Table 2S).
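The ranking rule used above (a lower score means a higher likelihood of metabolism) can be sketched as a simple sort. The function name `top_sites`, the atom labels, and the score values below are hypothetical placeholders; they are not SMARTCyp output and do not correspond to the compounds in Table 2S.

```python
def top_sites(site_scores, n=3):
    """Return the n atom sites with the lowest scores, most likely site first."""
    return sorted(site_scores.items(), key=lambda item: item[1])[:n]

# Hypothetical per-atom scores for one isoform (placeholder labels and values).
example_scores = {"C.1": 72.5, "C.2": 55.1, "N.3": 60.3, "C.4": 90.0}
ranked = top_sites(example_scores)
```

Applying the same function to the score column of each isoform separately gives one ranked list per isoform, which is how the per-isoform sites in this section are reported.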
2.5. Pred-hERG tool for cardiotoxicity prediction
Predicting cardiotoxicity is crucial for establishing a reasonable level of safety before a compound is administered. The Pred-hERG results show that compound 580 is not cardiotoxic and does not block hERG. For compound 306, the probability map shows denser green contour lines than for the standard ligand, which indicates cardiotoxicity (Table 3). Modifying its structural features could yield structural analogues of compound 306 with reduced cardiotoxicity.
Table 3.