PfAP2-I structure prediction
Because no experimental 3D structure of PfAP2-I is available in the Protein Data Bank (PDB) [18] or the UniProt Knowledgebase (UniProtKB) [19], the structure was modeled computationally. The protein ID of the target (Plasmodium falciparum Apetala 2-Invasion, 3D7 strain) was retrieved from the National Center for Biotechnology Information (NCBI) under the accession number PF3D7_1007700.
The protein ID was then submitted to the SWISS-MODEL web server [20] to build a homology model with sufficient query sequence coverage and sequence identity. Because the most confident match to a protein of known structure scored below 40%, comparative modeling of PfAP2-I was not pursued. The 3D structure of PfAP2-I was instead modeled on both the I-TASSER server (http://zhanglab.dcmb.med.umich.edu/I-TASSER) [21] and the ROBETTA Baker server (http://robetta.bakerlab.org) [22] using RoseTTAFold.
RoseTTAFold, a deep learning-based modeling method, is the default option on the ROBETTA Baker server and is reported to outperform the other structure modeling methods available there. The most reliable 3D structure was selected based on the confidence value, which ranges from 0.00 (bad) to 1.00 (good); a higher value indicates a more reliable predicted structure.
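The selection step described above can be sketched in a few lines of Python. This is only an illustration: the model names and confidence values below are hypothetical placeholders, not results from this study, and the function name is an assumption introduced for the example.

```python
from typing import Dict, Tuple


def select_most_reliable(confidences: Dict[str, float]) -> Tuple[str, float]:
    """Return the (model name, confidence) pair with the highest confidence.

    Confidence values are expected to lie between 0.00 (bad) and 1.00 (good).
    """
    if not confidences:
        raise ValueError("no models supplied")
    for name, value in confidences.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"confidence for {name} is outside [0, 1]: {value}")
    best = max(confidences, key=confidences.get)
    return best, confidences[best]


# Hypothetical confidence values for illustration only.
example = {"model_1": 0.62, "model_2": 0.71, "model_3": 0.55}
best_name, best_confidence = select_most_reliable(example)
```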
Structure validation of modeled protein
PROCHECK [23] and ERRAT [24], run on the UCLA-DOE LAB – SAVES v6.0 server, were used to check the quality of the modeled 3D structure of PfAP2-I generated on the ROBETTA Baker server. The .pdb file of the modeled PfAP2-I was uploaded to the server to obtain the overall quality factor from ERRAT and the Ramachandran plot and its statistics from PROCHECK. The overall quality factor is expressed as the percentage of the protein for which the calculated error value falls below the 95% rejection limit. Good high-resolution structures usually produce values around 95% or higher.
The Ramachandran plot is used to assess the quality of a modeled protein or an experimental structure, while the Ramachandran plot statistics report the number of amino acid residues found in the favorable, allowed, and disallowed regions [23].
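The two summary statistics described above reduce to simple percentages. The sketch below shows the arithmetic only; it does not reimplement ERRAT or PROCHECK. The per-window error values, the 95% limit, and the region counts are assumed to be taken from the servers' output, and all function and key names are hypothetical.

```python
from typing import Dict, Sequence


def overall_quality_factor(error_values: Sequence[float], limit_95: float) -> float:
    """Percentage of evaluated windows whose error value is below the 95% limit."""
    if not error_values:
        raise ValueError("no error values supplied")
    below = sum(1 for value in error_values if value < limit_95)
    return 100.0 * below / len(error_values)


def ramachandran_percentages(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert residue counts per Ramachandran region into percentages."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no residues supplied")
    return {region: 100.0 * n / total for region, n in counts.items()}


# Made-up numbers for illustration only (not results from this study).
demo_factor = overall_quality_factor([1.0, 2.0, 3.0, 20.0], limit_95=10.0)
demo_regions = ramachandran_percentages(
    {"favorable": 90, "allowed": 8, "disallowed": 2}
)
```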
Active site prediction of the modeled PfAP2-I structure and PfBDP1
The crystal structure of PfBDP1 was retrieved from the Protein Data Bank (PDB) (www.rcsb.org/structure/7M97) [18]. The active sites of the modeled PfAP2-I 3D7 structure and the PfBDP1 structure were predicted using the Computed Atlas of Surface Topography of proteins (CASTp) 3.0 [25] and ConCavity [26]. CASTp is an online service for identifying, delineating, and quantifying geometric and topological features of protein structures such as surface pockets, interior cavities, and cross channels (Dundas et al., 2006). ConCavity is an online service that predicts protein-ligand binding sites by combining evolutionary sequence conservation with 3D structure, and it reports a confidence score (C-score) for each predicted binding site. C-score values range from 0 to 1, where a higher score indicates a more reliable prediction. The modeled PfAP2-I structure and the PfBDP1 structure were submitted to both servers. The amino acids predicted by the two servers to be necessary for binding interactions were compared to determine the similarity between the two predicted active sites.
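One simple way to compare the residue lists returned by the two servers is a set intersection together with an overlap ratio. The sketch below uses the Jaccard index as the similarity measure; this choice is an assumption made for illustration, since the text does not state which measure was used, and the residue labels are hypothetical placeholders.

```python
from typing import Iterable, List, Tuple


def compare_sites(site_a: Iterable[str], site_b: Iterable[str]) -> Tuple[List[str], float]:
    """Return the residues shared by two predictions and their Jaccard index."""
    a, b = set(site_a), set(site_b)
    shared = a & b
    union = a | b
    # Jaccard index: shared residues divided by all residues predicted by either server.
    jaccard = len(shared) / len(union) if union else 0.0
    return sorted(shared), jaccard


# Hypothetical residue labels for illustration only.
castp_like = ["RES1", "RES2", "RES3"]
concavity_like = ["RES2", "RES3", "RES4"]
shared_residues, similarity = compare_sites(castp_like, concavity_like)
```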
Pharmacophore modeling
A pharmacophore model was designed from the prepared modeled 3D structure of the PfAP2-I TF using the Pharmit server [28]. Pharmit provides access to built-in databases such as MolPort, ChEMBL, ZINC, and PubChem, which together contain millions of chemical compounds that can be screened for drug-like candidates against a given protein [29]. Pharmit combines pharmacophore-based searching with the AutoDock Vina scoring function [30]. A control ligand (3W7 from the COACH server) was selected for the screening [31], and both the modeled protein and the control ligand were loaded into the Pharmit server. The pharmacophore model was built using six features: one hydrogen bond donor, two hydrogen bond acceptors, one aromatic, and two hydrophobic. The Pharmit hit-screening filters were set according to the Lipinski rule of 5 to narrow the results substantially and retain the best candidate inhibitors from millions of drug-like compounds.
Pharmit filter hit screening based on Lipinski rule of 5
0 ≤ Molecular weight ≤ 500
0 ≤ Rotatable bonds ≤ 10
0 ≤ LogP ≤ 5
0 ≤ Polar Surface Area ≤ 140
0 ≤ Hydrogen Bond Acceptor ≤ 10
0 ≤ Hydrogen Bond Donor ≤ 5
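The filter ranges listed above can be expressed as a small property check. This is a minimal sketch that mirrors those ranges; it is not Pharmit's implementation, and the dictionary keys and the example property values are hypothetical.

```python
from typing import Dict, Tuple

# Inclusive (minimum, maximum) ranges taken from the filter list above.
FILTERS: Dict[str, Tuple[float, float]] = {
    "molecular_weight": (0, 500),
    "rotatable_bonds": (0, 10),
    "logp": (0, 5),
    "polar_surface_area": (0, 140),
    "hbond_acceptors": (0, 10),
    "hbond_donors": (0, 5),
}


def passes_filters(properties: Dict[str, float]) -> bool:
    """Return True if every listed property lies inside its allowed range."""
    return all(
        low <= properties[name] <= high for name, (low, high) in FILTERS.items()
    )


# Hypothetical compound properties for illustration only.
candidate = {
    "molecular_weight": 342.4,
    "rotatable_bonds": 4,
    "logp": 2.1,
    "polar_surface_area": 78.0,
    "hbond_acceptors": 5,
    "hbond_donors": 2,
}
too_heavy = dict(candidate, molecular_weight=612.7)
```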
Protein and ligand preparation
The modeled protein structure was defined as the receptor, and the complexed ligands were removed using Chimera software [32]. The protein was then prepared by computing Gasteiger charges, adding polar hydrogens, and merging the nonpolar hydrogens using AutoDockTools 4.2.6 [33]. In addition, OpenBabel software [34] was used to convert the .pdb files to the AutoDock docking format (.pdbqt), which was then used for the docking simulation.
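The format conversion can be scripted by calling the Open Babel command-line tool. The sketch below assumes that the `obabel` executable is installed and on the PATH; the file names are hypothetical, and the optional `-xr` write option (rigid output without a torsion tree, typically used for receptors) is an assumption about how the conversion might be run, not a setting reported in this study.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def obabel_pdbqt_command(pdb_path: str, rigid: bool = False) -> List[str]:
    """Build the obabel command that converts a .pdb file to .pdbqt."""
    out_path = str(Path(pdb_path).with_suffix(".pdbqt"))
    command = ["obabel", pdb_path, "-O", out_path]
    if rigid:
        # PDBQT write option: output as a rigid molecule (no torsion tree).
        command.append("-xr")
    return command


def convert_to_pdbqt(pdb_path: str, rigid: bool = False) -> None:
    """Run the conversion; raises if obabel is missing or the command fails."""
    if shutil.which("obabel") is None:
        raise RuntimeError("obabel executable not found on PATH")
    subprocess.run(obabel_pdbqt_command(pdb_path, rigid), check=True)


# Hypothetical file name; the command is only built here, not executed.
receptor_command = obabel_pdbqt_command("receptor.pdb", rigid=True)
```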
Virtual screening analysis
The virtual screening of compounds was carried out using AutoDock Vina, an open-source molecular docking program [35]. The grid box was constructed with 80, 80, and 91 points in the x, y, and z directions, respectively, with a grid point spacing of 0.375 Å. The grid box was centered at 108.636 Å, 73.665 Å, and 158.751 Å (x, y, and z) around THR508A, TRP510A, LYS512A, THR514A, THR515A, GLU516A, GLU520A, TYR521A, LEU522A, GLN535A, VAL554A, LYS555A, TYR557A, GLY558A, GLN561A, ALA562A, HIS585A, VAL586A, HIS587A, GLY588A, ARG590A, LYS591A, VAL593A, ASP594A, and THR598A. These amino acids were selected based on the CASTp and ConCavity results. The top ten (10) hits against PfAP2-I were then generated and ranked according to their binding affinities to verify the ligand-binding sites. The top six (6) best-docked compounds from the PfAP2-I docking analysis were also docked against the PfBDP1 active site. Post-screening analyses were conducted using AutoDockTools and LigPlot [36].
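As a worked example of the grid-box arithmetic, 80, 80, and 91 points at 0.375 Å spacing correspond to box edges of 30.0, 30.0, and 34.125 Å. The sketch below performs that conversion and writes the values into an AutoDock Vina configuration text, since Vina takes box sizes in Å. It assumes that the reported point counts follow the AutoDock convention in which edge length equals points times spacing; the receptor and ligand file names are hypothetical.

```python
from typing import Tuple

Vector = Tuple[float, float, float]


def box_size_angstrom(points: Tuple[int, int, int], spacing: float) -> Vector:
    """Convert grid points per axis into box edge lengths in angstroms."""
    return tuple(n * spacing for n in points)


def vina_config(center: Vector, size: Vector, receptor: str, ligand: str) -> str:
    """Build the text of an AutoDock Vina configuration file."""
    lines = [f"receptor = {receptor}", f"ligand = {ligand}"]
    for axis, value in zip("xyz", center):
        lines.append(f"center_{axis} = {value}")
    for axis, value in zip("xyz", size):
        lines.append(f"size_{axis} = {value}")
    return "\n".join(lines) + "\n"


# Grid parameters as reported in the text; file names are hypothetical.
size = box_size_angstrom((80, 80, 91), spacing=0.375)
config_text = vina_config(
    center=(108.636, 73.665, 158.751),
    size=size,
    receptor="PfAP2-I.pdbqt",
    ligand="ligand.pdbqt",
)
```

The resulting text can be saved to a file and passed to Vina through its `--config` option.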
In silico drug-likeness and toxicity predictions
The in silico drug-likeness and toxicity predictions of the designed ligands were carried out using the SwissADME predictor [37] and OSIRIS Property Explorer [38]. SwissADME provides information on the oral bioavailability, physicochemical properties, lipophilicity, water solubility, pharmacokinetics, drug-likeness, and medicinal chemistry of the compounds [39]. OSIRIS Property Explorer, on the other hand, provides information on a compound's toxicity and determines parameters such as molecular weight, calculated lipophilicity (cLogP), topological polar surface area (TPSA), solubility, drug-likeness, and drug score, as well as the mutagenic, tumorigenic, irritant, and reproductive risks [40].
Drug-likeness is a criterion for determining whether a pharmacological substance possesses the characteristics of an orally active drug [41]. The Lipinski rule of five is an established concept on which drug-likeness is based. The rule states that, for a compound to exhibit drug-likeness and avoid poor absorption or permeation, it must have no more than 5 H-bond donors, no more than 10 H-bond acceptors, a molecular weight no greater than 500, and a calculated LogP (cLogP) no greater than 5 [41].
Another parameter used to select compounds as drug candidates is the drug score. A high drug score signifies a high probability that the compound will be considered a drug candidate [43].