Monkeypox viral protein retrieval and screening Monkeypox protein sequences, including structural and nonstructural proteins retrieved from the MPXV strain Congo_2003_358 in the NCBI database (DQ011154.1), were predicted for antigenicity and allergenicity via VaxiJen 2.0 and AllergenFP1.0. From the 200 proteins retrieved from the completed genome, only 24 proteins were selected based on non-allergenicity and antigenicity, with a score higher than 0.4, as demonstrated in Table 1A and 1B. These 24 selected proteins were further utilized for epitope prediction.
Prediction of epitope candidates and assessment of allergenicity, antigenicity, and toxicity parameters.The NetMHCpan 4.1 BA and NetMHCpan 4.1 EL servers predicted 35 CTL epitopes that strongly bind to MHC class I for HLA class I alleles (HLA-A0101, HLA-A0201, HLA-A0301, HLA-A2402, HLA-A2601, HLA-B0702, HLA-B0801, HLA-B2705, HLA-B3901, HLA-B4001, and HLA-B5801). All 35 selected CTL epitopes demonstrated a positive immunogenicity score via the IEDB Class I immunogenicity tool, antigenicity with a score higher than 0.4, non-allergenicity, and non-toxicity.
HTL epitopes were predicted through NetMHCpanII 4.1 BA and NetMHCpanII 4.1 EL for MHC class II. These servers predicted 24 potential HTL epitopes with strong binding affinities against MHC class II based on the HLA alleles DRB1_0101, DRB1_0301, DRB1_0401, DRB1_0701, DRB1_0801, DRB1_0901, DRB1_1001, DRB1_1101, DRB1_1201, DRB1_1301, DRB1_1501, and DRB1_1602. Additionally, these HTL epitope candidates were tested with the IFNepitope server to selectively filter those sequences that can stimulate IFN-γ. All 24 HTL epitopes could trigger IFN-γ secretion. Out of the 24 epitopes, only one epitope had a low antigenicity score (less than 0.4); thus, 23 of the remaining HTL epitopes had high antigenicity, no allergenicity, and no toxicity.
In the case of LBL, the BepiPred linear epitope prediction 2.0 server was used to predict the LBL epitopes with the default threshold of 0.5. The server yielded a total of 30 epitopes predicted to have high antigenic scores, no allergenicity, and no toxicity. The CTL, HTL, and LBL epitope repertoires were utilized for further analysis to select final candidates for vaccine construction.
Transmembrane and signal peptide screening The CTL, HTL, and LBL predicted epitopes obtained from previous experiments were further screened for transmembrane regions and signal peptides by utilizing DeepTMHMM and SignalP 6.0. DeepTMHMM was used to predict the transmembrane regions of epitopes that should be removed to avoid hydrophobic effects that are prone to aggregation, thus leading to low yields and unstable protein synthesis 35–37 (Supplementary Dataset 1). The CTL, HTL, and LBL epitopes demonstrating potential transmembrane regions were retracted from the constructed vaccines. Our CTL, HTL, and LBL epitopes contained no transmembrane region. Finally, SignalP 6.0 was used to predict the signal peptide regions of the CTL, HTL, and LBL epitopes, and no signal peptides were found in our CTL, HTL, or LBL epitopes.
Epitope conservancy for the board spectrum coverage of vaccines across different clades. To generate broad-spectrum vaccines against various clades of Monkeypox virus, all the predicted CTL, HTL, and LBL epitopes were analyzed for epitope conservation across 7 clades of Monkeypox virus. Nearly all the epitopes covered Clade I (Zaire-96-l-16), Clade IIa (USA 2003), Clade llb-A (Nigeria 2018), Clade IIb-A.2 (USA 2002), Clade IIb-A.1 (UK P2), Clade IIb-A.1.1 (USA2021), and Clade IIb-B.1 (France 2022), with 100% conservancy. However, an HTL epitope derived from the F8L protein (SMVFEYRASTVIKGP) showed only a single amino acid variation from valine to isoleucine. One valine in Clade I, Congo_2003_358 and Zaire-96-l-16 was replaced by an isoleucine in Clade IIa (USA 2003), Clade llb-A (Nigeria 2018), Clade IIb-A.2 (USA 2002), Clade IIb-A.1 (UK P2), Clade IIb-A.1.1 (USA2021), and Clade IIb-B.1 (France 2022). In other words, the SMVFEYRASTVIKGP in Clade I became the SMVFEYRASTIIKGP in Clade II (Supplementary Table S1). Considering the amino acid properties, both valine and isoleucine are categorized as hydrophobic amino acid groups; thus, the valine to isoleucine substitution should have a negligible effect on the epitope.
Epitope homology analysis with the human proteome. To prevent autoimmune induction by epitopes, the Peptide Search server from the UniProt database was used to investigate epitopes that showed homologous sequences to the human proteome. According to the peptide match tool, the predicted epitopes were checked against the Homo sapiens proteome (Taxonomy ID: 9606). The results showed that all the epitopes were not homologous to the human proteome.
Final epitope selection and worldwide population coverage Epitopes showing nonhomologous sequences to the human proteome were further selected for the vaccine construct. To ensure broad HLA coverage, epitopes were chosen to target different alleles for both CTL and HTL. These candidates were selectively recruited based on HLA allele coverage for CTL and HTL epitopes and high antigenicity for all three types (CTL, HTL, LBL). Following this method, 9 CTL, 4 HTL, and 4 LBL epitopes were selected, as illustrated in Table 2A, 2B and 2C.
Afterwards, the selected CTL and HTL epitopes were analyzed via the IEDB population coverage tool. The distribution of HLA alleles targeted by the combination of the predicted MHC class I and II epitopes. The results showed that 94.87% of the world population across 115 nations and 21 ethnicities were covered within 16 geographic regions, as shown in Fig. 1.
Multi-epitope subunit vaccine construct. The design of the vaccine construct was formulated by combining 5 elements, including an adjuvant, 9 CTL epitopes, 4 HTL epitopes, 4 LBL epitopes, and a 6xHis tag, with the help of linkers, as illustrated in Fig. 2. Linkers have been reported to provide stability to vaccine constructs and prevent cross-interruption of each epitope, thus allowing them to independently function upon vaccination 38,39. Five different adjuvants (cholera toxin B subunit (CTB), PADRE, beta-defensin 3, 50S ribosomal protein L7/12, and RS-09) were used in this study, resulting in five vaccine constructs. The adjuvants were N-terminally fused to the vaccine by using the EAAAK linker. The CTL epitopes were linked with AAY linkers, and GPGPG linkers were used to connect the HTL epitopes. The B-cell epitopes were linked with the help of KK linkers. HEYGAEALERAG linkers were used to connect the first HTL epitope and the last CTL epitope as well as to link the first B-cell epitope to the last HTL epitope. Additionally, the 6xHis tag was fused to the C-terminus of the vaccine construct by the RVRR linker for vaccine purification by affinity chromatography and immunodetection of vaccine expression. A schematic of each vaccine construct is depicted in Fig. 2. The antigenicity, allergenicity, toxicity, solubility, presence of signal peptide, and physicochemical properties of all five vaccine constructs were predicted and compared to those of a control vaccine without an adjuvant.
Antigenicity, allergenicity, toxicity and physicochemical properties The five vaccine constructs were evaluated and compared for antigenicity, allergenicity, toxicity, and physicochemical properties, as shown in Table 3. All five vaccine constructs demonstrated antigenicity scores higher than 0.50 according to the Vaxijen 2.0 server, which indicated the robust antigenic properties of all the multi-epitope vaccine constructs. Additionally, four vaccines were immunogenic and nontoxic; however, the beta-defensin adjuvant was toxic. The physicochemical properties predicted from the ProtParam server showed that the molecular weights of all the multi-epitope vaccine constructs ranged from ~29 to ~40 kDa. The theoretical isoelectric points (PIs) of all the constructs ranged from 5.46 to 8.71. The estimated half-lives of in vitro mammalian reticulocytes were 30 hours, >20 hours in in vivo yeasts, and >10 hours in in vivo E. coli. The instability indices of the cholera toxin B-vaccine, PADRE-vaccine, and 50S ribosomal protein L7/12 ranged from ~28 to ~35, which indicated that these constructs were stable at various temperatures 40. However, the instability indices of the RS09-vaccine and beta-defensin-vaccine strains were greater than 65, indicating their instability 40. The gravy scores of all the vaccines ranged from -0.427 to -0.699, indicating a hydrophilic nature 40,41. In line with the Gravy scores, the solubility scores calculated by the SOLpro server ranged from ~0.72 to ~0.98, indicating high solubility 42. The aliphatic indices of cholera-vaccine, PADRE-vaccine, the RS09-vaccine, and the 50S ribosomal protein L7/12 ranged from 65.88 to 77.81, indicating high thermostability, whereas the aliphatic index of the beta-defensin-vaccine was 35.21, indicating low thermostability 43. Finally, the signal peptide sequence of the vaccine was also tested; however, no signal peptide sequence was found within any of the tested vaccines. Finally, the signal peptide sequence of the vaccine was also tested; however, no signal peptide sequence was found within any of the tested vaccines. Collectively, based on the above predictions, the PADRE-vaccine demonstrated overall improvement in terms of antigenicity, solubility, and stability without compromising allergenicity or toxicity. Thus, the PADRE-vaccine was recognized as the lead vaccine construct for immunological translation and was chosen for further analysis.
Secondary structure prediction, three-dimensional structure predictions, refinement, and validation The PADRE-vaccine showed overall improvements in both immunological and physicochemical properties; therefore, it was used to predict secondary and three-dimensional structures. The secondary structure of the vaccine was predicted by using the PSIPRED 4.0 server to identify features. The predicted secondary structure included 44.6% coils, 33.7% α-helices, and 21.8% β-strands, as shown in Fig. 3. Additionally, the three-dimensional structure of the PADRE-vaccine was predicted using Alphafold 2. Five three-dimensional structure models were generated by Alphafold 2, and the model with the highest pLDDT was selected for further analysis. To improve the accuracy of the predicted three-dimensional structure model, the lead model obtained from Alphafold 2 was further refined by the GalaxyWEB server, resulting in five refined models with improved quality demonstrated by quality assessment parameters, as shown in Table 4. Model 2 was selected based on improved Molprobity and Rama favored scores of 1.055 and 98.6, respectively, which indicated the high quality of the model. Pymol was used to visualize the 3D model of the vaccine construct (Fig. 4).
Model 2 was further analyzed for validation with the SAVES 6.0 server by ERRAT and PROCHECK. The selected model exhibited a high quality score of 86.9281% according to the ERRAT. According to the plot of PROCHECK for evaluation via Ramachandran, 97.5% of the residues were found in the preferred region, 1.6% were found in the allowed region, 0.4% were found in generously allowed regions, and only 0.4% were found in forbidden regions, as shown in Fig. 5a. Additionally, the ProSa-Web server was used to compute the z-score of the model. The z-score calculated by the ProSa-web server was -4.62, as shown in Fig. 5b. Collectively, the results from ERRAT, PROCHECK, and ProSa-Web indicated the excellent quality of the predicted three-dimensional model.
Molecular docking of TLRs and selected vaccine constructs. The ClusPro 2.0 server was used to predict the binding affinity between the PADRE-vaccine and TLR4. The results included cluster size, center energy score, and lowest energy score. Among the thirty best-fit docking models, the model with the largest cluster size was prioritized for further analysis. This model yielded a score of -1375.4 for both the CES and lowest energy consumption, as shown in Table 4. The lead model obtained from the ClusPro 2.0 server was further refined via the REFINEMENT tool on the HADDOCK 2.4 server. The refined model was analyzed for Gibb’s free energy (ΔG) and dissociation constant (KD), as well as other aspects, including the HADDOCK score, cluster size, RMSD from the overall lowest-energy structure, van der Waals energy, electrostatic energy, desolvation energy, restraint violation energy, buried surface area, and z score, as shown in Table 4. The predicted model of the vaccine-TLR4 complex was selected. The binding energy (ΔG) of this model was -18.7 kcal/mol, and the dissociation constant (KD) was 2.00 × 10-14 M. With respect to the van der Waals energy, electrostatic energy, desolvation energy, restraint violation energy, and buried surface area, these parameters contributed to an HADDOCK score of -808.4 +/- 7.9 for the vaccine-TLR4 complex. Additionally, the cluster size of the complex was 20. The root mean square deviation (RMSD) from the overall lowest-energy structure was 0.7 +/- 0.4. Moreover, a z score of 0 showed that this model was exactly the average score within the cluster. Overall, these parameters indicated that the selected model was stable and energetically favorable.
In silico immune stimulation The C-IMMSIM v10.1 server was used to predict the immunization efficacy of the candidate vaccine. The results from stimulation using the candidate vaccine included B cells, helper T cells, cytotoxic T cells, natural killer (NK) cells, macrophages (MAs), dendritic cells (DCs), and certain cytokines. The PADRE-vaccine triggered IgM and IgG production after primary immunization and dramatically elevated the production of IgM, IgG1, and IgG2 after the secondary and tertiary responses, as shown in Fig. 6a and 6c. In line with the elevated antibody titer, total B-cell populations, particularly B isotype IgM and memory B-cell populations, were induced after the first injection (Fig. 6b). The total B cell, IgM, memory B cell, and IgG1 populations were dramatically elevated, while the B isotype IgG2 was also induced after the second and third injections, as shown in Fig. 6b. Noticeably, a decrease in the antigen concentration was detected, as shown in Fig. 5a, and these phenomena resulted from the generation of a memory B-cell population (Fig. 6b). Additionally, Fig. 6c indicates that the plasma B lymphocyte count (isotype: IgM+IgG) was noticeably elevated. The active B-cell population per state increased significantly and became consistently high after immunization, as demonstrated in Fig. 6d. According to Fig. 6e, the total help T-cell population increased significantly after the first, second and third injections. In particular, both memory and nonmemory helper T cells were dramatically elevated. Likewise, the active and resting helper T-cell populations per state could be elevated significantly, as indicated by Fig. 6f. Although the resting cytotoxic T-cell population per state decreased after the first immune response, it dramatically increased over time after the tertiary immune response, as shown in Fig. 6g. Total NK cells, the total MA population per state, especially the resting MA population, and the DC population per state (for both active and resting states) were also generated by immunization, as demonstrated in Fig. 6H, 8I, and 8J. As shown in Fig. 6K, the titer of interferon-γ (IFN-γ) was significantly increased by all three injections. Increased transforming growth factor-β (TGF-β), interleukin 10 (IL10) and interleukin 12 (IL12) levels were observed. An elevated danger level D in the same figure indicates that the risk is extremely low. This represented a favorable reaction to immunization.
Codon optimization and in silico cloning Codon optimization was performed using the Java Codon Adaptation Tool (JCAT) for protein expression in E. coli. The protein sequence of the PADRE-vaccine was reverse-translated to the DNA sequence. The DNA sequence was submitted to the JCAT server for codon optimization. The codon adaptation index (CAI) and GC content of the codon-optimized sequence were 1.0 and 51.11%, respectively (Supplementary Dataset 2). These scores indicated that the optimized nucleotide sequence was within the range of the optimum CAI (0.8 - 1.0) and GC content (30 - 70). Finally, the optimized nucleotide sequence above was in silico cloned by insertion into the pET21a vector digested with the NdeI and XhoI restriction enzymes using SnapGene 7.2.1, resulting in the recombinant plasmid pET-21a(+)-PADRE Mpox vaccine-6xHis tag, as shown in Fig. 7.
NMA via the iMODS server. The iMODS server and GROMACS software were utilized to perform molecular dynamic (MD) simulations. iMODS was used to analyze the vaccine-TLR complexes via normal mode analysis (NMA) to investigate the binding affinity of the vaccine-TLR4 complex and to investigate its stability and dynamic movements over time. The main-chain deformity was determined by measuring the extent of molecular deformation at each residue. High peaks represented the deformable regions of the vaccine-TLR complex, indicating flexible locations, while rigid regions were indicated by low values, as shown in Fig. 8. According to Fig. 8a, the vaccine-TLR4 complex demonstrated low distortion.
B-factor charts, shown in Fig. 8b, indicate the average RMSD values of the vaccine-TLR complex obtained by relating the mobility of the computed complex in NMA to the corresponding PDB field. The eigenvalues of the docked materials, illustrated in Fig. 8c, explained the motion stiffness. The required energy for structural deformation was indicated by the eigenvalue, with a lower eigenvalue representing easier deformation. The eigenvalue of the interaction between the vaccine and TLR4 was 6.239649 X 10-7. According to Fig. 8d, the graphs of variance, which showed an inverse association with the eigenvalue, are shown. The purple color represents individual variance, while the green color represents cumulative variance for each normal mode of the vaccine-TLR complex. The covariance map of the vaccine-TLR complex illustrated coupling between two molecules in the complex (Fig. 8e). Red represents correlated atomic motions, white represents uncorrelated atomic motion, and blue represents anticorrelated atomic motion in the regions of MD depicted in Fig. 8e. The elastic network map of the vaccine-TLR4 complex shown in Fig. 8f depicts each gray dot as the formation of springs connecting the corresponding pair of atoms. Flexible regions are represented by lighter dots, whereas stiffer regions are indicated by darker dots. Collectively, these results highlighted the stable interaction between the vaccine and TLR4.
MD analysis via GROMACS. To measure the stability of the vaccine-TLR4 complex, the backbone atom RMSD was computed from the initial point in time. According to Fig. 9a, the RMSD of the complex rapidly increased from the beginning to approximately 5 ns. After that, the structure stabilized at approximately 20 ns, within the range of approximately 0.6 to 0.5 nm. However, there was an increase after 20 ns until it became substantially stable after approximately 24 ns, with a range of approximately 0.6-0.8 nm throughout the remaining simulation time. Additionally, the RMSF was analyzed to assess the flexibility of the vaccine-TLR complex. The high fluctuation in the RMSF indicated highly flexible regions of the complex. The peak corresponding to residue 636 was at 2.1589 nm, as shown in Fig. 9b. Several highly fluctuating regions were identified, including the region between residues 409 and 519 (with a range of 0.8203 to 1.287 nm), the region between residues 996 and 1222 (with a range of 0.7031 to 1.2779 nm), the region between residues 628 and 653 (with a range of 1.0358 to 2.1589 nm), and the region between residues 1267 and 1558, with a very wide range of fluctuations between 0.2887 and 1.5996 nm. The last region contained several highly fluctuating residues, such as residue 1267 (with a length of 1.5996 nm), residue 1414 (with a length of 1.4612 nm), and residue 1558 (with a length of 1.5258 nm).
Rg was analyzed to determine the structural compactness of the vaccine-TLR complex. As illustrated in Fig. 9c, the overall Rg was substantially stable throughout the simulation. Furthermore, the stability of the complex could be measured by the number of hydrogen bonds. According to Fig. 9d, the results of the MD analysis illustrated the stability of the number of hydrogen bonds within a very narrow range between 10 and 11 throughout the simulation time.
Finally, the SASA could indicate the surface area of a vaccine-TLR complex exposed to solvent molecules. These results could imply the flexibility and amenability of protein‒protein interactions. As illustrated in Fig. 9e, there was a fluctuation from the beginning of the simulation to approximately 20 ns, with a range of ~760 to ~720 nM2. After 20 ns until 34 ns, the SASA became more stable. Then, there was a sudden drop at approximately 36 ns; however, the SASA maintained its stability from ~36 to 42 ns. Eventually, it increased again at a simulation time of ~43 ns, after which the SASA became stable until the end of the simulation time.