Monkeypox viral proteins retrieval and screening. Monkeypox protein sequences, including structural and non-structural proteins retrieved from MPXV strain Congo_2003_358 on NCBI database (DQ011154.1) were predicted for antigenicity and allergenicity via VaxiJen 2.0 and AllergenFP1.0. From 200 proteins retrieved from the completed genome, only 24 proteins were selected based on non-allergenicity and antigenicity with a score higher than 0.4 as demonstrated in Table 1A and 1B. These selected 24 proteins were further utilized for epitope prediction.
Prediction of epitope candidates and assessment of allergenicity, antigenicity, and toxicity parameters.NetMHCpan 4.1 BA and NetMHCpan 4.1 EL server predicted 35 CTL epitopes which demonstrated strong binding affinity on MHC class I for HLA Class I alleles (HLA-A0101, HLA A0201, HLA-A0301, HLA-A2402, HLA-A2601, HLA-B0702, HLA-B0801, HLA-B2705, HLA-B3901, HLA-B4001, and HLA-B5801). All selected 35 CTL epitopes demonstrated positive immunogenicity score via IEDB Class I immunogenicity tool, antigenicity with a score higher than 0.4, non-allergenicity, and non-toxicity.
HTL epitopes were predicted through NetMHCpanII 4.1 BA and NetMHCpanII 4.1 EL for MHC class II. These servers predicted 24 potential HTL epitopes with strong binding affinity against MHC class II based on HLA alleles DRB1_0101, DRB1_0301, DRB1_0401, DRB1_0701, DRB1_0801, DRB1_0901, DRB1_1001, DRB1_1101, DRB1_1201, DRB1_1301, DRB1_1501, and DRB1_1602. Additionally, those HTL epitope candidates were tested with IFNepitope server to selectively filter those sequences that can stimulate IFN-g. All 24 HTL epitopes could trigger IFN-g secretion. Out of 24 epitopes, only one epitope showed low antigenicity score (less than 0.4) and thus 23 HTL epitopes were reserved with high antigenicity, non-allergenicity, and non-toxicity
In case of LBL, BepiPred Linear Epitope Prediction 2.0 server was deployed to predict the LBL epitopes with the default threshold of 0.5. The server yielded a total of 30 epitopes predicted to show high antigenic score, non-allergenicity, and non-toxicity. The repertoires of CTL, HTL, and LBL epitopes were utilized for further analysis to select final candidates for vaccine construction.
Transmembrane and signal peptide screening. CTL, HTL, and LBL predicted epitopes obtained from previous experiments were further screened for transmembrane region and signal peptide by utilizing DeepTMHMM and SignalP 6.0. DeepTMHMM was used to predict transmembrane regions of epitopes which should be removed to avoid hydrophobic effect that is prone to aggregate, and thus leading to low-yielding and unstable protein synthesis 35–37 (Supplementary Dataset 1). CTL, HTL, and LBL epitopes demonstrating potential transmembrane region were retracted from vaccine construction. Our CTL, HTL, and LBL epitopes contained no transmembrane region. Finally, SignalP 6.0 was used to predict the signal peptide regions of CTL, HTL, and LBL epitopes and no signal peptides were found in our CTL, HTL, and LBL epitopes.
Epitope conservancy for the board spectrum coverage of vaccine over different clades. To generate broad spectrum vaccine against various clades of Monkeypox virus, all predicted CTL, HTL, and LBL epitopes were analyzed for the epitope conservancy across 7 clades of Monkeypox virus. Nearly all epitopes could cover Clade I (Zaire-96-l-16), Clade IIa (USA 2003), Clade llb-A (Nigeria 2018), Clade IIb-A.2 (USA 2002), Clade IIb-A.1 (UK P2), Clade IIb-A.1.1 (USA2021), Clade IIb-B.1(France 2022) with 100% of conservancy. However, a HTL epitope derived from F8L protein (SMVFEYRASTVIKGP) showed only a single amino acid variation from valine to isoleucine. One valine in Clade I Congo_2003_358 and Zaire-96-l-16 was replaced by isoleucine in Clade IIa (USA 2003), Clade llb-A (Nigeria 2018), Clade IIb-A.2 (USA 2002), Clade IIb-A.1 (UK P2), Clade IIb-A.1.1 (USA2021), and Clade IIb-B.1(France 2022). In other words, SMVFEYRASTVIKGP in Clade I became SMVFEYRASTIIKGP in Clade II (Supplementary S1 Table). Considering amino acid property, both valine and isoleucine are categorized in hydrophobic amino acid group, and thus valine to isoleucine substitution should produce negligible effect to the epitope.
Epitopes homologous analysis with human proteome. To prevent autoimmune induced by epitopes, Peptide Search server from Uniprot database was employed to investigate epitopes which showed homologous sequences to human proteome. According to Peptide Match tool, predicted epitopes were checked against homo sapiens proteome (Taxonomy ID: 9606). The results found that all epitopes were not homologous to the human proteome.
Final epitopes selection and worldwide population coverage. Epitopes showing non-homologous sequence to human proteome were further selected for the vaccine construct. To ensure broad HLA coverage, epitopes were chosen to target different alleles for both CTL and HTL. These candidates were selectively recruited, based on HLA alleles coverage for CTL and HTL epitopes, and high antigenicity for all three types (CTL, HTL, LBL). Following this method, 9 CTL, 4 HTL, and 4 LBL epitopes were selected as illustrated in Table 2A, 2B and 2C.
Afterwards, those selected CTL and HTL epitopes were analyzed by IEDB population coverage tool. The distribution of HLA alleles targeted by predicted MHC class I and II epitopes combined. The results showed that they could cover 94.87% of world population across 115 nations and 21 ethnicities within 16 geographic regions as shown in Fig. 1.
Multi-epitope subunit vaccine construct. The design of vaccine construct was formulated by combination of 5 elements, including adjuvant, 9 CTL epitopes, 4 HTL epitopes, 4 LBL epitopes, and 6xHis tag, with the help of linkers, as illustrated in Fig. 2. Linkers have been reported to provide stability to vaccine construct and prevent cross-interruption of each epitope, thus rendering them to independently function upon vaccination 38,39. Five different adjuvants (Cholera toxin B subunit (CTB), PADRE, Beta-defensin 3, 50S ribosomal protein L7/12, and RS-09) were used in this study, resulting in five vaccine constructs. Each adjuvant was N- terminally fused to the vaccine by using EAAAK linker. The CTL epitopes were linked with AAY linkers and GPGPG linkers were used to connect the HTL epitopes. The B-cell epitopes were linked with the help of KK linkers. HEYGAEALERAG linkers were used to connect the first HTL epitope and the last CTL epitope as well linking the first B-cell-epitopes the last HTL epitope. Additionally, 6xHis tag was fused to the C-terminal of the vaccine construct by the RVRR linker for vaccine purification by affinity chromatography and immunodetection of vaccine expression. The schematic of each vaccine construct was depicted in Fig. 2. All five vaccine constructs were subjected to be predicted for antigenicity, allergenicity, toxicity, solubility, the presence of signal peptide, and physicochemical properties compared to a control vaccine without adjuvant.
Antigenicity, Allergenicity, Toxicity and Physicochemical properties. Five vaccines constructs were evaluated and compared for antigenicity, allergenicity, toxicity, and physicochemical properties as shown in Table 3. All five vaccine constructs demonstrated antigenicity score higher than 0.50 calculated by Vaxijen 2.0 server which indicated robust antigenic property of all multi-epitope vaccine constructs. Additionally, four vaccines were immunogenic and non-toxic; however, the vaccine with beta-defensin adjuvant was toxic. The physicochemical properties predicted from ProtParam server showed the molecular weights of all multi-epitope vaccine constructs were in range from ~29 to ~40 kDa. The theoretical isoelectric point (PI) values of all constructs ranged from 5.46 to 8.71. The estimated half-life in in-vitro mammalian reticulocytes was 30 hours, >20 hours in in-vivo yeasts, and >10 hours in in-vivo E. coli. The instability index of Cholera toxin B-vaccine, PADRE-vaccine, and 50S Ribosomal protein L7/12 ranged from ~28 to ~35 which indicated these constructs were stable at various temperatures 40. However, the instability index of RS09-vaccine and beta-defensin-vaccine was higher than 65, indicating their instability property 40. Gravy scores of all vaccines ranged from -0.427 to -0.699, showing hydrophilic nature 40,41. In line with Gravy scores, solubility scores calculated by SOLpro server were in range of ~0.72 to ~0.98, indicating high solubility 42. Aliphatic index scores of Cholera-vaccine, PADRE-vaccine, RS09-vaccine, and 50S Ribosomal protein L7/12 were in range of 65.88 to 77.81, representing high thermostability whereas the aliphatic index of beta-defensin-vaccine was 35.21, indicating low thermostability 43. Finally, vaccine was also tested for the signal peptide sequences; however, there was no signal peptide sequence found within all tested vaccines. Finally, vaccine was also tested for the signal peptide sequences; however, there was no signal peptide sequence found within all tested vaccines. Collectively, based on the all above predictions, PADRE-vaccine demonstrated overall improvement in terms of antigenicity, solubility, and stability without compromised allergenicity and toxicity. Thus, PADRE-vaccine was recognized as the lead vaccine construct for immunological translation and chosen for further analysis.
Secondary structure prediction, three-dimensional Structure Predictions, refinement, and validation. PADRE-vaccine showed the overall improvement in both immunological properties and physicochemical properties; therefore, it was used to predict secondary structure and three-dimensional structure. The secondary structure of vaccine was predicted by using PSIPRED 4.0 server to identify the features. The predicted secondary structure resulted in 44.6% Coils, 33.7% α-Helix, and 21.8% β-Strands as shown in Fig. 3. Additionally, the three-dimensional structure of PADRE-vaccine was predicted using Alphafold 2. Five there-dimensional structure models were generated by Alphafold 2, and the model with the highest pLDDT score was selected for further analysis. To improve the accuracy of the predicted three-dimensional structure model, the lead model obtained from Alphafold 2 was further refined by GalaxyWEB server for the refinement, resulting five refined models with improved quality demonstrated by quality assessment parameters as shown in Table 4. Model 2 was selected based on improved Molprobity and Rama favored scores of 1.055 and 98.6, respectively, which indicated the high quality of the model. PyMol was used to visualize the three-dimensional model of the initial and refined three-dimensional structure models (Fig. 4).
Model 2 was further analyzed for validation with SAVES 6.0 server by ERRAT and PROCHECK. The selected model exhibited a high quality score with 86.9281% according to ERRAT. According to PROCHECK for evaluation of Ramachandran plot, there were 97.5% residues found in preferred region, 1.6% in allowed region, 0.4% in generously allowed regions and only 0.4% in forbidden regions as shown in Fig. 5a. Additionally, ProSa-Web server was used to compute the z-score of the model. The z-score calculated by ProSa-web server was -4.62 as shown in Fig. 5b. Collectively, the results from ERRAT, PROCHECK, and ProSa-web represented excellent quality of the predicted three-dimensional model.
Molecular docking of TLRs and selected vaccine construct. The ClusPro 2.0 server was used to predict the binding affinity between PADRE-vaccine and TLR4. The results, which included cluster size, center energy score, and lowest energy score. Out of thirty best fit docking models, the models with the largest clusters size model for the complex were prioritized for further analysis. This model showed the score of -1375.4 for both center energy score and lowest energy as shown in Table 4. The lead model obtained from the ClusPro 2.0 server was further refined via REFINEMENT tool on the HADDOCK 2.4 server. The refined model was analyzed for Gibb’s free energy(ΔG) and dissociation constant (KD) as well as other aspects including HADDOCK score, Cluster size, RMSD from the overall lowest-energy structure, Van der Waals energy, electrostatic energy, desolvation energy, restraints violation energy, buried surface area, z-score as shown in the Table 4. The predicted model of vaccine-TLR4 complex was selected. Binding energy (ΔG) of this model was -18.7 kcal/mol and dissociation constant (KD) was 2.00 x 10-14 M. With Van der Waals energy, electrostatic energy, desolvation energy, restraints violation energy, and buried surface area, these parameters contributed HADDOCK score of -808.4 +/- 7.9 for vaccine-TLR4 complex. Additionally, the cluster size of the complex was 20. Root Mean Square Deviation (RMSD) from the overall lowest-energy structure was 0.7 +/- 0.4 Moreover, a z-score of 0 showed that this model was exactly on the average score within the cluster. Overall, these parameters indicated that the selected model was stable and energetically favorable interaction.
In silico immune stimulation. C-IMMSIM v10.1 server was used to predict the immunization of candidate vaccine. The results from stimulation using the candidate vaccine included B-cell, helper T-cell, cytotoxic T-cell, natural killer cell (NK), macrophage (MA), dendritic cell (DC), and certain cytokines. PADRE-vaccine could trigger IgM and IgG production since primary immunization and dramatically elevated the production of IgM, IgG1, and IgG2 after the secondary and tertiary response as shown in Fig. 6a and 6c. In line with the elevated antibody titer, total B-cell populations, and particularly B isotype IgM and memory B-cell were induced since the first injection (Fig. 6b). Those total B-cell, IgM, memory B-cell, and IgG1 populations were elevated dramatically, while B isotype IgG2 could also be induced after the second and third injections as shown in Fig. 6b. Noticeably, a decrease in the antigen was detected as shown in Fig. 5a, and these phenomena resulted from the generation of memory B-cell population (Fig 6b). Also, Fig. 6c indicated that plasma B lymphocyte (isotype: IgM+IgG) could be noticeably elevated. The active B-cell population per state increased significantly as well as became consistently high after the immunization as demonstrated by Fig. 6d. According to Fig. 6e, the total help T-cell population was increased significantly from first, second and third injections. Particularly, both memory and non-memory helper T-cell were dramatically elevated. Likewise, active, and resting helper T-cell population per state could be elevated significantly as indicated by Fig. 6f. Although resting cytotoxic T-cell population per state went down after the first immune response, it dramatically elevated over time after the tertiary immune response as shown in Fig. 6g. Total NK cell, total MA population per state, especially resting MA, and DC population per state (for both active and resting) were also generated by the immunization as demonstrated by Fig. 6H, 8I, and 8J. Regarding Fig. 6K, the titer of interferon-γ (IFN-g) was induced significantly by all three injections. The increased transforming growth factor-β (TGF- β), interleukin 10 (IL10) and interleukin 12(IL 12) were observed. The unelevated danger level D on the same figure indicated that risk was extremely low. This represented a favorable reaction to immunization.
Codon optimization and in silico cloning. Codon optimization was performed using Java Codon Adaptation Tool (JCAT) for protein expression in E. coli. The protein sequence of PADRE-vaccine was reverse-translated to DNA sequence. The DNA sequence was submitted to JCAT server for codon optimization. The Codon Adaptation Index (CAI) and GC content of the codon optimized sequence was 1.0 and 51.11% (Supplementary Dataset 2), respectively. These scores indicated the optimized nucleotide sequence was found within the range of optimum CAI (0.8 - 1.0) and GC Content (30 - 70). Finally, the optimized nucleotide sequence above was in-silico cloned by inserting into pET21a vector digested with NdeI and XhoI restriction enzymes using SnapGene 7.2.1, resulting a recombinant plasmid (pET-21a(+)-PADRE Mpox vaccine-6xHis tag) shown in Fig. 7.
NMA via iMODS server. The iMODS server and GROMACS software were utilized to perform molecular dynamic simulation (MD). iMODS was employed to analyze the vaccine-TLR complexes using normal mode analysis (NMA) to investigate the binding affinity of the vaccine-TLR4 complex and to investigate the stability and dynamic movements over time. The main-chain deformity was performed by measuring the capability of molecular deformation at each residue. High peaks represented the deformable regions of the vaccine-TLR complex, indicating flexible locations, while rigid regions were indicated by low values, as shown in Fig. 8. According to Fig. 8a, the vaccine-TLR4 complex demonstrated low distortion.
B-factor charts, shown in Fig 8b, indicated the average RMSD values of the vaccine-TLR complex by relating the mobility of computed complex in NMA and to the corresponding PDB field. The eigenvalues of the docked, illustrated in Fig. 8c, explained the motion stiffness. The required energy value for structural deformation was indicated by the eigenvalue, with the lower eigenvalue representing easier deformation. The eigenvalue of the interaction between vaccine and TLR4 6.239649 X 10-7. According to Fig. 8d, the graphs of variance, which showed an inverse association with the eigenvalue, were shown. The purple color represented individual variance, while the green color represented cumulative variance for each normal mode of the vaccine-TLR complex. The covariance map of the vaccine-TLR complex illustrated coupling between two molecules of the complex (Fig. 8e). Red color represented correlated atomic motions, white color represented uncorrelated atomic motion, and blue color represented anticorrelated atomic motion in the regions of MD depicted in Fig. 8e. The elastic network map of vaccine-TLR4 complex shown in Fig. 8f depicted each gray dot as the formation of springs connecting corresponding pair of atoms. Flexible regions were represented by lighter dots, whereas stiffer regions were indicated by darker dots. Collectively, all these results highlighted the stable interaction between vaccine and TLR4.
MD analysis via GROMACS. To measure the stability of vaccine-TLR4 complex, the backbone atom RMSD was computed from initial point of time. According to Fig. 9a, the RMSD of the complex rapidly increased from the beginning to approximately 5 ns. After that, the structure stabilized until around 20 ns, with the range of approximately 0.6 to 0.5 nm. However, there was an increase after 20 ns until it became substantially stabilized after approximately 24 ns with the range of around 0.6 - 0.8 nm throughout the remaining simulation time. Additionally, RMSF was analyzed to assess the flexibility of vaccine-TLR complex. The high fluctuation of RMSF indicated highly flexible regions of the complex. The peak of fluctuation was 2.1589 nm of residue 636, as shown in Fig. 9b. Several highly fluctuation regions were identified, including the region between residue 409 – 519 (with the range of 0.8203 to 1.287nm), the region between residues 996 - 1222 (with the range of 0.7031 -1.2779 nm), the region between residue 628 – 653 (with the range of 1.0358 – 2.1589 nm), and the region between residue 1267 – 1558 with a very wide range of fluctuations between 0.2887 – 1.5996 nm. The last region contained several high and highest fluctuations residues, such as residue 1267 with 1.5996 nm, residue 1414, with 1.4612 nm, and residue 1558 with 1.5258 nm.
Rg was analyzed to determine the structural compactness of vaccine-TLR complex. As illustrated in Fig. 9c, the overall Rg was substantially stable throughout the simulation time. Furthermore, the stability of the complex could be measured with the number of hydrogen bonds. According to Fig. 9d, the result of the MD analysis illustrated the stability of the number of hydrogen bonds with a very narrow range between 10 and 11 throughout the simulation time.
Finally, the SASA could indicate the surface area of a vaccine-TLR complex exposed to solvent molecules. The results could imply the flexibility and amenability of protein-protein interaction. Illustrated by Fig. 9e, there was a fluctuation from the beginning of simulation time to around 20 ns, with the range of ~760 to ~720 nM2. After 20 ns until 34 ns, SASA became more stable. Then, there was a sudden drop at around 36 ns; however, SASA maintained it stability during ~36 to 42 ns. Eventually, it increased again at simulation time ~43 ns, after that SASA became stable until the end of simulation time.