Primary analysis of the retrieved sequences
In this study, spike (S) protein, nucleocapsid (N), envelope (E), membrane (M) protein, nsp3 and nsp8 antigens (totally 6 protein) were used for bioinformatics analysis to find CTL, HTL, IFN-γ and MHC I binder's epitopes. The highly conserved regions of β-defensin 2 were employed as an adjuvant because the vaccine requires a strong induction of T cell immune responses. The signal sequences were excluded from all constructed structures. The workflow of the designed work representing the overall procedures of vaccine design is shown in Fig. 1.
Cytotoxic T lymphocyte (CTL) prediction
CTL epitopes were selected according to the highest specificity and lowest sensitivity for their adaptive immune receptor. Top three high score epitopes were chosen from spike S, N, E, M, nsp3 and nsp8 antigens to get a total of 18 final CTL epitopes (Table 1). On the other hand, we utilized the Immune Epitope Database (IEDB) tool to predict MHC-I peptide binders to human alleles HLA-B*57:01, HLA-A*26:01, HLA-A*11:01, HLA-B*35:01, HLA-A*03:01, HLA-B*15:01, HLA-A*01:01, HLA-A*11:01, HLA-A*30:02, HLA-A*23:01, HLA-B*38:01, HLA-B*07:02, HLA-C*07:01, HLA-A*03:01, HLA-A*25:01, HLA-A*31:01 and HLA-A*29:02. In the end, CTL epitopes were analysed and IFN-γ inducing epitopes were applied in final construct.
Helper T-lymphocyte (HTL) epitope prediction
The HTL epitope prediction was done with the help of the IEDBonline server for human HLA-II alleles including HLA-DPA1*01:03, HLA-DPB1*02:01, HLA-DQA1*01:01, HLA-DQB1*05:01,HLA-DRB1*09:01, HLA-DRB1*13:02, HLA-DRB1*13:21, HLA-DRB1*04:26, HLA-DRB1*13:02, HLA-DRB1*01:01, HLA-DQA1*05:01, HLA-DQB1*02:01, HLA-DPB1*02:01, HLA-DRB1*09:01, HLA-DRB1*11:14, HLA-DQA1*01:02, HLA-DQB1*06:02, HLA-DPA1*03:01, HLA-DPB1*04:02, HLA-DPA1*03:01, HLA-DPB1*04:02, HLA-DRB1*07:03 and HLA-DRB1*01:01 for antigens. Finally, 18 epitopes were chosen from the predicted HTL epitopes according to their lowest percentile rank (Table 2).The lowest percentile rank reflecting their good binding affinity for the MHC-II receptor.
Construction of the subunit vaccine
To design the final vaccine construct, the top-scored epitopes of CTL and HTL were joined together using HEYGAEALERAG linker.To facilitate cell penetration of the proposed vaccine, the TAT peptide of HIV (GRKKRRQRRRPPQ) was added at the N-terminal of the construct. In addition, for increasing the immunogenicity of the vaccine, a 41 amino acid-long sequence of β-defensin 2 was added after TAT peptide as an adjuvant and then followed by the PADRE sequence (AGLFQRHGEGTKATVGEPV). The TAT peptide and β-defensin 2 were linked to the PADRE sequence throughGPGPG and EAAAK linkers, respectively. The order of the finalconstruct is summarized in Fig. 2.
Population coverage
The predictionof MHC-I and MHC-IIbased coverage of the applied epitopes in the constructed vaccine accomplished by IEDB analysis resources for the world population.As SARS-CoV-2 is global pandemic, hence, worldwide analysis has been accomplished. The world population coverage of MHC-I and MHC-II found to be 88.77%, and 97.91%, respectively, that showed in Fig. 3.
Optimized codon and mRNA structure of the chimeric vaccine
We use the JCAT online server to obtain the codon-optimized DNA sequence of the proposed vaccine to verify for cloning and expressing in E. coli strain K12 as a host. The optimized codon sequences had 2091 nucleotides long, GC-content was reached to 52.9 % and its codon adaptation index was raised to 0.98. The graphical illustration of the optimized gene was shown in Fig. 4A. In addition, the mfold server was used to get free energy related to the whole mRNA structure and its 5′ end. Its details are shown in Table 3. As shown in Fig. 3B, C, and D, the minimum free energy of the predicted secondary RNA structure was ΔG = -595.30 kcal/mol at 37 °C without hairpin nor pseudoknot at the 5′ side.
Evaluation of allergenicity, antigenicity, physiochemical parameters and solubility of the vaccine construct
Unveiling allergenicity of the proposed vaccine was a crucial consideration to avoid allergic reactions. The resulting output of the Algpred online server determined the non-allergenic behavior of the vaccine constructs with a score of -0.59 (Threshold=-0.4).Likewise, the antigenicity of the construct was predicted by the VaxiJen v2.0 online server, and it confirmed that the protein was a probable antigen with an overall prediction score of antigen 0.4996.Furthermore, the physicochemical parameters were evaluated, and the molecular weight of vaccine protein was found to be 74.8 kDa which is good to support the antigenic nature of the vaccine construct.The predicted instability index of the proposed vaccine was 39.09, which is below 40 hence, classifying the protein as stable.The isoelectric point (pI) of the construct was 9.30 and it shows the basic nature of the vaccine.
The estimated half-life (in-vitro) in mammalian reticulocytes was found to be 30 h, while invivo studies in Yeast and E. coli more than 20 and 10 h, respectively. In addition, the aliphatic index was 80.26, which determines the thermostable nature of the proposed vaccine, and the grand average of hydropathicity obtained to be 0.063, which proves the hydrophilic nature of the constructed vaccine.The value for the solubility tendency of the proposed vaccine upon overexpression in E. coli host was 0.55.
Secondary and tertiary structure of subunit vaccine
PSIPRED server was employed to predict the secondary structure of the proposed vaccine (Fig.5). This server showed that the vaccine is made of 43.76% alpha-helix, 16.93% extended strand, and 39.31% random coil.The tertiary structure was predicted as 3 domain structures using the galaxyWEB web server which is a template-based protein structure modeling web server that resulted in a modeled protein structure (Fig. 6). The 1FD3_C, 5MKE_A, and 5MKF_A were used as the best-suited template to model the tertiary structure. All 697 amino acid residues were modeled, while the 70-92, 681-688, and 690-697 amino acids of the residues were unreliable local regions.
Refinement and validation of the 3D structure
The generated 3D structure was submitted for the refinement process by the GalaxyRefine online server to enhance the quality of predicted 3D modeled structure beyond the accuracy.Furtheremore, the refined model from GalaxyRefine was analyzed by the ramachandran plot.Validation of 3D structure by ramachandran plot before refinement demonstrated 540(92%), 37 (6.3%), and 9(1.5%) of residues were placed in favored, additionally allowed and disallowed regions, respectively (Fig. 7A). In the refined model, 550 (93.7%), 26 (4.4%), and 7 (1.2%) of residues were placed in favored, additionally allowed and disallowed regions, respectively (Fig. 7B).
Prediction of intrinsic protein disorder
DisEMBL online server identified and annotated disordered regions. Amino acids in the input sequence are considered disordered by remark-465 definition in the following regions: 1-23, 464-478, 689-697.Based on IUPred results, amino acids were considered disordered when the confidence score is higher than 0.5. Disordered regions were also presented in Fig. 8.
Predicted B-cell epitopes
Ellipro software and the Discotope online server were utilized to predict the linear B-cell and discontinuous B cell epitopes, respectively. The linear B-cell and discontinuous B cell epitopes on the 3D structure are shown in Fig. 9A-G and Fig. 9H-L, respectively. Moreover, epitopes predicted in modeled vaccine according to various parameters were listed in Table 4.
Vaccine protein disulfide engineering
Disulfide engineering leads to an increase in vaccine stability with mutating the residues present in high mobility areas with cysteine. 76 residue pairs were found to have the ability of disulfide bond formation. But, when all these residue pairs were more analyzed on the other parameters like energy on chi3 value and B-factor, only 1 pairs of residues provided the disulfide bond formation. Those residue pairs were GLY360-SER368 and ASP580- GLY601 (Fig. 10). These two residues were mutated to the cysteine. The energy value analyzed for the residue screening was less than 2.2 and the value of chi3 was between -87 to +97.
Molecular docking of vaccine construct with the TLR-3
Protein-protein docking was done with the help of the Hdock online server between the final vaccine construct as a ligand and the TLR-3 as a receptor. We have procured 20 models and the model with the lowest binding energy was selected. The Docking score for the best acceptable model was -326.86, whereas the ligand RMSD (Å) was 205.96.PatchDock server was used to reaffirm the docking between the final vaccine construct and TLR-3. Top 20 models were provided by the server and the model with the top-ranked and lowest binding energy was chosen.The obtained output encompasses the score (17080), Area (2493.80), ACE (408.17), and transformation (-2.83 -0.57 -1.83 -49.29 3.75 160.90). The docked complex achieved from the server has been shown in Fig. 11A and B.
Molecular dynamics simulation results
MD simulation was used to analyze the physical movements of the atoms and molecules of the proposed vaccine. At the initial step of MD simulation, energy minimization was done and several parameters including pressure and temperature were measured.Regular mode investigation allowed the demonstration of the stability of proteins and the large scale mobility. The iMODS server performed such an investigation based on internal coordinates of the receptor-ligand complex (Fig. 12A). In 3D model, the direction of each residue was given through arrows and the length of the line represented the extent of mobility. The proposed vaccine protein and TLR3 receptor were oriented towards each other. The eigenvalue found for the complex was 4.953620e-08 (Fig. 12B). The B-factor values deduced from normal mode investigation was analogous to root mean square (Fig. 12C). Hinges in the chain indicated the feasible deformability of the complex calculated through the complication of each residue (Fig. 12D). The variance related to each normal mode was inversely related to the eigenvalue. Covariance matrix described the occlusion between pairs of residues was correlated, uncorrelated or anti-correlated motions were represented with red, white, and blue colors, respectively (Fig. 12 E).The result also provided an elastic network model (Fig. 12F) that identified the pairs of atom connected by springs. Each dot in the diagram was indicated one spring between the corresponding pair of atoms and colored based on the grade of rigidity.
In silico cloning
To examine the expression of the recommended vaccine into a vector, cloning was done by utilizing the SnapGene server.As the expression system of the human and E. coli K12 differ from each other and the codon requires the adaptation as per the host expression system, the codon sequence that was optimized by JCat was reverse transcribed and modified as per the E. coli K12. The restriction enzym sitesNot I and Xho I were introduced at the N and C-terminal, respectively. The adapted codon was entered into the pET21b (+) vector and a recombinant plasmid with the length of 7514 base pair was obtained as displayed in Fig. 13.
Immune simulation
C-ImmSim server results revealed a noticeable increase in the generation of immune responses. The humoral immune response was identified by high levels of IgM along with IgG + IgM (Fig. 14A). Estimation of the B cell population uncovered an increase in B memory cells and B-cell isotype IgM with a corresponding reduction in antigen concentration (Fig. 14B). In addition, the high response was observed in the TH1 and TC cell populations with corresponding memory development (Fig. 10C, D). This profile identified the development of immune memory and consequently enhanced clearance of the antigen upon subsequent exposures. In contrast, regulatory T cells were estimated in low level (Fig. 14E). Moreover, these predictions were consistent with the induced level of IFN-γ produced after immunization with the proposed vaccine (Fig. 14F).