3.1. Immuno-informatics analyses
3.1.1. Protein sequence retrieval
The protein sequences of PhtD, PsaA, and PspC were retrieved from the NCBI. The C-terminal of PhtD (PhtD-C, 383–853), PsaA, and PspC amino acid chains were applied in the project to predict T helper and B cell epitopes.
3.1.2. Prediction of transmembrane regions and signal peptide
The transmembrane topology investigation showed that the mentioned antigens lacked any transmembrane helices (Supplementary Figure 1). According to the results of the SignalP server, the selected proteins except PsaA did not have a signal peptide (Supplementary Figure 2).
3.1.3. Tertiary structure prediction, refinement, and validation
The 3D structures of PhtD-C and PspC obtained from I-TASSER showed better quality than the models generated by the other servers. Considering the C-score, the model number one of PhtD-C (with C-score: -0.64) and PspC (with C-score: -0.65) were selected as the most suitable 3D-structure among five generated models for each of the proteins. The selected models of PhtD-C and PspC were refined using GalaxyRefine (Supplementary Figure 3). The structure quality of the models was improved following treatment with the server as represented via ProSA Z-score and Ramachandran plot. The results of validation of the protein models before and after refinement are presented in Supplementary Table 1. The ProSA server revealed that refined models of PhtD-C and PspC had the Z-score of -9.1 and -0.04, respectively, which are within the range of scores for similar size natural proteins (Supplementary Figure 4 A and B). According to the assessment of the Ramachandran plot of refined models, there were 81.7%, 16.9%, and 1.4% residues of the PhtD-C, 91.5%, 6.6%, and 1.9% residues of the PspC in the most favored, allowed, and disallowed regions, respectively (Supplementary Figure 4 a and b).
3.1.4. Identification of linear and discontinuous B-cell epitopes
As B-cell epitopes have an essential role in humoral responses, the sequential and discontinuous B-cell epitopes of the mentioned proteins were predicted using LBTope, ABCpred, IEDB Emini tool, and Ellipro (for the sequential epitopes), Ellipro and DiscoTope servers (for discontinuous epitopes). The prediction results obtained by the servers are provided in Supplementary Tables 2, 3, 4, 5, 6, and 7, the first 3 tables are for linear and the last 3 are for discontinuous B-cell epitopes.
3.1.5. Identification of MHCII epitopes
T-helper epitopes from PhtD-C, PsaA and PspC sequences were predicted using IEDB (percentile rank<10) and NetMHCIIpan (%Rank≤1). The different MHCII epitopes from the protein sequences were derived via the servers according to eight HLA-DRB1 alleles (DRB1*01:01, DRB1*03:01, DRB1*04:01, DRB1*07:01, DRB1*08:01, DRB1*11:01, DRB1*13:01, and DRB1*15:01) and three mouse alleles (H2-IAb, H2-IAd, and H2-IEd). The results of predictions are listed in Supplementary Tables 8, 9, and 10.
3.2. Analysis of the interaction of N-PepO with TLR2/4
3.2.1. Retrieval of sequence and modeling of 3D structure
The protein sequence of PepO was fetched from the NCBI. The tertiary structure of N-PepO obtained from I-TASSER showed better quality than the models generated through the other servers. The 3D model number one of N-PepO (with C-score: 1.42) was chosen as the most proper structure among the generated models. The selected model of N-PepO was refined using GalaxyRefine (Supplementary Figure 5A). The results of validation before and after refinement represented that the structure quality of the model was improved. The ProSA server revealed that the refined model had the Z-score of -2.62, which is within the range of scores for similar size natural proteins (Supplementary Figure 5B). According to the assessment of the Ramachandran plot of the refined model, there were 94.4%, 4.5%, and 1.1% residues in the most favored, allowed, and disallowed zones, respectively (Supplementary Figure 5C).
3.2.2. A more detailed identification of interaction of N-PepO with TLR2/4 by molecular docking
To identify the residual interactions between N-PepO and TLR2/4, the refined modeled structure of N-PepO from the previous step and the crystal structures of TLRs 2 and 4 were docked by the ClusPro 2.0 online server. Following docking, it was seen that the N-PepO possessed good interaction with TLR2/4. The model.000.00 for N-PepO-TLR2 and model.000.02 for N-PepO-TLR4 were chosen with the lowest energy docking mode -859.4 and -967.1, respectively. The Discovery Studio Visualizer and DIMPLOT from LigPlot+v2.2.4 were used for visualizing and analyzing the chosen complexes (Figures 1 and 2). The N-PepO-TLR2 complex formed 14 hydrogen bonds and 4 salt bridges, Arg354-Glu264, Lys357-Glu264, Asp34-Arg321, Phe8-Tyr323, Tyr9-Tyr323, Asn13-Tyr323, Gln5-Leu328, Arg3-Tyr332, Gln5-Tyr332, Asp6-Tyr332, Glu15-Lys347, and Arg3- Leu350 (Table 1). Comprehensive insights into intermolecular interactions of the N-PepO-TLR2 showed that all the hydrogen bonds except Asn13-Tyr323 formed at distances below 3 A° (Table 1). Likewise, the N-PepO-TLR4 complex made 10 hydrogen bonds and 2 salt bridges, Thr19-Arg67, Glu21-Arg67, Ala11-Arg87, Ile12-Arg87, Glu15-Arg87, Asn13-Gln91, Tyr4-Ser184, Tyr4- Lys186, Gln5-Lys186, and Gln5-Glu266. All the hydrogen bonds, excluding Ile12-Arg87, formed at distances less than 3 A° (Table 1). Considering the obtained results, a truncated fragment of N-PepO from amino acids 1 to 112, which was potentially involved in the interactions between N-PepO and TLRs, was selected as a potential TLR2/4 agonist candidate for further stage of the research.
3.3. Designing of epitope-based vaccine
Based on the results of the immuno-informatics investigation, epitopes of B-cell and MHC-II with the best scores were considered. The final construct of the vaccine consisting of four domains: the truncated N-PepO (1-112), PhtD-C (196-256), PsaA (100-187), and PspC (276-363) was designed. The chosen domains of PhtD, PsaA and PspC were connected together via GGSSGG linkers. The selected fragment of N-PepO as a potential candidate adjuvant was joined to the N-terminal of the considered domains with the EAAAK linker. Following attachment of the His6 (HHHHHH) tag to the C-terminal of the construct, the epitope-based vaccine (ODAC) was finally designed to be 372 residues in length. The ODAC sequence is as follows:
MTRYQDDFYDAINGEWQQTAEIPADKSQTGGFVDLDQEIEDLMLATTDKWLAGEEVPEDAILENFVKYHRLVRDFDKREADGITPVLPLLKEFQELETFADFTAKLAEFELAEAAAKAAQAYAKEKGLTPPSTDHQDSGNTEAKGAEAIYNRVKAAKKVPLDRMPYNLQYTVEVKNGSGGSSGGSDGVDVIYLEGQNEKGKEDPHAWLNLENGIIFAKNIAKQLSAKDPNNKEFYEKNLKEYTDKLDKLDKESKDKFNKIPAEKKLIVTSEGGSSGGRRNYPSNTYFSLELEISESDVEVKKAEFELVKEEAKEPRNEEKVKQAKAKVESKKAVATRLENIKTDRKKAEEEAKRKAAEEDKVKEKGHHHHHH.
The details of the chosen epitopes in the designed construct and schematic diagram of the vaccine are presented in Table 2 and Figure 3.
3.3.1 Evaluation of the sequence properties of the designed vaccine
The sequence antigenicity probability was estimated to be 0.748 and 0.926 using the VaxiJen and the ANTIGENpro, respectively. The engineered sequence was predicted to be non-allergenic by Alg Pred and Allertop online servers. According to SOLpro results, the solubility of the candidate vaccine upon overexpression in Escherichia coli was 0.893. ToxinPred was employed for predicting the toxicity of the designed sequence and it was deemed as non-toxic. Furthermore, the ProtParam results indicated that theoretical isoelectric point value (pI) and molecular weight (Mw) of the construct were 5.29 and 41.939 kDa, respectively. Half-life was computed to be 30 hours in mammalian reticulocytes, > 20 hours in yeast, and more than 10 hours in E. coli. Aliphatic index (AI) and grand average of hydropathicity (GRAVY) values were defined as 68.01 and -0.954, respectively. The instability index (II) of the designed sequence was estimated to be 36.83, indicating that it is stable. The total number of positive (+R) and negative (-R) residues were computed 60 and 70, respectively. The details of the sequence properties of the designed vaccine are presented in Table 3.
3.3.2. Prediction, refinement, and validation of tertiary structure of the vaccine
Five 3-dimensional structures of the designed vaccine sequence were generated by the Robetta server, and the best model was chosen. The R-plot revealed that 92.5, 6.9, and 0.6 percent of amino acids were located in the favored, allowed, and disallowed zones (Figure 4A). Moreover, the ProSA Z-score of -6.78 was inside the area of scores for natural proteins, confirming the overall quality and reliability of the designed model (Figure 4B). The ERRAT score of the chosen model was 94.92 percent (Figure 4C). Following the refinement of the structure using Galaxy Refine, the model properties were improved. The primary and the refined structures are compared in Figure 4 and Table 4. The R-plot of the refined model showed that 94.00, 5.1, and 0.9 percent of residues are located in favored, allowed, and disallowed zones, respectively (Figure 4a). Furthermore, the Z-score (Figure 4b), and ERRAT score (Figure 4c) were improved to -6.96 and 96.11 in the refined model, respectively (Table 4). The 3-dimensional structure of the refined model was visualized via Discovery Studio Viewer (Figure 5).
3.3.3. Discontinuous B-cell epitope prediction of the engineered construct
Given the significance of conformational epitopes in immunization, discontinuous B cell epitopes of the 3D structure of engineered construct were predicted based on the interaction of protein and antibody using the Ellipro server. Conformational peptides with a score of >0.5 were chosen, and the scores showed the surface atoms of the construct that are responsible for antibody binding. Table 5 summarizes the number of amino acids, the amino acid compositions, the score values, and the sequence location.
3.3.4. Molecular docking of TLR2/4 and ODAC vaccine
As mentioned above, the first 112 residues of the N-PepO may act as a potential binding site for TLR2/4. The construct of the vaccine was engineered in an effective way to interact with the TLR2 and TLR4 receptors. The process of protein-protein docking between TLR2/4 receptor and vaccine was performed by Cluspro 2.0 online server after eliminating the ligand and water molecules from the crystallographic structure of receptor. Several complexes were generated by the docking, and it was observed that the engineered vaccine could interact well with the TLR2/TLR4. The model.000.00 and model.000.01 were chosen for ODAC-TLR2 and ODAC-TLR4 complexes with the lowest energy docking mode -922.1 and -798.3, respectively. For displaying and evaluating the considered complexes, the DS Visualizer and DIMPLOT tools from LigPlot+v2.2.4 were used (Figures 5 and 6). According to a detailed review of the results, the designed vaccine and TLR2/4 interacted properly via the truncated region of N-PepO. The engineered construct-TLR2 formed 15 hydrogen bonds and 3 salt bridges, Met1-Pro320, Trp50-Arg321, Thr2-Arg321, Trp50-Tyr323, Asp6-Lys347, Glu40-Glu374, Tyr9-Gln396, Asp10-Gln396, Asp34-Lys404, Gln37-Lys404, Asp36-Ser424, Asn13-Arg447, and Glu58-Arg321. Analysis of ODAC-TLR2 intermolecular interaction revealed that only Glu40-Glu374 hydrogen bond formed at distance above 3 A° (Table 5). Also, the designed vaccine-TLR4 made 12 hydrogen bonds and 4 salt bridges, Thr2-Asp95, Arg3-Leu119, Asp59-Asn143, Arg70-Glu169, Trp16-Asn173, His368-His199, Arg70-Gln200, His368-Gln200, His369- Glu225, His372-Arg227, Asp10-Lys150, and Glu15-Lys150. Investigation of ODAC-TLR4 intermolecular interaction exhibited that all the hydrogen bonds formed at distances between 2.59 and 2.99 A° (Table 6). Both chosen complexes were subjected as the initial structures in MD simulation processes.
3.3.5. MD Simulations of selected complexes
The chosen docked structures of the designed vaccine-TLR2/4 were subjected to simulations of MD to calculate the level of their structural stability. On the basis of the GROMOS96 54A7 force field, we conducted a separate 50 ns MD simulation for each docked complex.
3.3.5.1. Investigation of stability
The RMS Deviation was applied for evaluating the vaccine-TLR2/-TLR4 complex stability. The backbone deviations of chosen complexes of the initial conformations were plotted as a function of time. The ODAC-TLR2 and ODAC-TLR4 showed RMSD values in the range of 0.6-0.8 nm after 20 ns and 0.57-0.72 nm after 15 ns, respectively, as they are displayed in Figure 8.
3.3.5.2. Analysis of flexibility
The RMS fluctuation is another way to anticipate the dynamic stability (DS) of the docked complex that evaluates fluctuations of the residues. The RMSF values showed that PepO residues in ODAC had almost the same degree of flexibility as TLR2 amino acids, while the rest of ODAC residues had a greater degree of flexibility than TLR2 residues (Figure 9A). Likewise, as shown in Figure 9B, the RMSF value of residues in ODAC and TLR4 revealed no notable flexibility.
3.3.6. Reverse translation, codon adaptation, and in silico cloning
JCat tool was employed for reverse translation and codon adaptation of the devised vaccine candidate in order to maximal expression of it in E. coli K-12. The vaccine sequence after the reverse translation was 1116 nucleotides long. According to the codon optimization findings, the CAI value and CG content were 0.99 and 46.41%, respectively, which are accepted since they are within the permitted limit. These outcomes confirm a proper expression of the engineered vaccine in E. coli K-12. Ultimately, the ODAC vaccine sequence was successfully cloned into the pET-28a(+) between NdeI and XhoI enzymes by SnapGene software free-trial (Figure 11).