Data retrieval, sequence alignment, and selection of proteins
A total of 3100 envelope protein sequences of DENV-1, DENV-2, and DENV-3 and 3149 polyprotein sequences of DENV-4 were collected from NCBI. Three envelope protein sequences, one each from DENV-1, DENV-2, and DENV-4, and one polyprotein sequence from DENV-3 were selected on the basis of high sequence similarity among them using MEGA 11 (7). NCBI protein BLAST searches against the nr database were performed to assess worldwide coverage (8).
Prediction of epitopes
The Immune Epitope Database (IEDB) Linear Epitope Prediction Tool v2.0 was used to predict B cell epitopes and T cell (MHC-I and MHC-II) epitopes from the selected protein sequences using default parameters (9). This tool, which uses only epitope data from crystallized structures, is considered specialized, high-quality, accurate, and more powerful than others (10). BepiPred Linear Epitope Prediction 2.0 was the method used for B cell epitope prediction, and NetMHCpan 4.1 BA was the method used for MHC-I and MHC-II epitope prediction. All available human alleles were selected (Supplementary Table 1).
Profiling and selection of epitopes
The antigenicity, allergenicity, toxicity, and homology of the epitopes were predicted with the VaxiJen 2.0 server (11), AllerTOP (12), the ToxinPred server (13), and the PIR (UniProt) server (14), respectively. B cell epitopes with fewer than 5 amino acids, an antigenicity score < 0.4, or a probable-allergen prediction were excluded, and the two epitopes with the highest antigenicity scores from each sequence were selected for vaccine construction (Supplementary Table 2). T cell epitopes with an IC50 value > 100, an antigenicity score < 0.4, a probable-allergen prediction, or fewer than 4 interacting alleles for DENV-1, DENV-2, and DENV-4 (fewer than 5 for DENV-3) were excluded. From the remaining epitopes in each list, the top one or two, ranked by the number of interacting alleles, were selected for vaccine construction (Supplementary Table 2, Supplementary Table 3, and Supplementary Table 4).
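The T cell exclusion criteria described above can be expressed as a simple filter. The following is a minimal sketch, not the authors' pipeline: the class name, field names, and every candidate record are hypothetical placeholders, and the scores are assumed to have been collected beforehand from the prediction servers.

```python
from dataclasses import dataclass


@dataclass
class TCellEpitope:
    sequence: str            # placeholder label, not a real peptide
    ic50: float              # predicted IC50 value
    antigenicity: float      # antigenicity score
    probable_allergen: bool  # True if predicted to be a probable allergen
    n_alleles: int           # number of interacting alleles


def passes_t_cell_filter(ep, min_alleles):
    """Return True only if none of the exclusion criteria applies."""
    return (
        ep.ic50 <= 100
        and ep.antigenicity >= 0.4
        and not ep.probable_allergen
        and ep.n_alleles >= min_alleles
    )


def select_top(epitopes, min_alleles, top_n=2):
    """Filter, then rank the survivors by number of interacting alleles."""
    kept = [ep for ep in epitopes if passes_t_cell_filter(ep, min_alleles)]
    return sorted(kept, key=lambda ep: ep.n_alleles, reverse=True)[:top_n]


# Invented records, for illustration only.
candidates = [
    TCellEpitope("EPITOPE_A", 45.0, 0.62, False, 6),
    TCellEpitope("EPITOPE_B", 150.0, 0.80, False, 7),  # IC50 too high
    TCellEpitope("EPITOPE_C", 80.0, 0.35, False, 8),   # antigenicity too low
    TCellEpitope("EPITOPE_D", 20.0, 0.90, True, 9),    # probable allergen
    TCellEpitope("EPITOPE_E", 90.0, 0.50, False, 4),   # passes only if min_alleles <= 4
    TCellEpitope("EPITOPE_F", 10.0, 0.70, False, 9),
]

selected = select_top(candidates, min_alleles=4)
```

Calling `select_top` with `min_alleles=5` instead applies the stricter allele threshold stated for DENV-3.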
Multi-epitope vaccine construct design
All T cell and B cell epitopes that passed the filtration criteria were incorporated into the multi-epitope vaccine constructs. Linkers were added to join the epitopes effectively and to allow them to act as independent immunological units (15). Adjuvant molecules, namely 50S ribosomal protein L7/L12 (16), LT-IIc (17), cholera toxin B (18), RS09 (19), CpG 1018 (20), PADRE (21), and beta-defensin (22), were used to significantly boost the immunogenicity and longevity of the peptide vaccine (23). All vaccines were designed using standard scientific methods (24). All components of the constructs were identical except for the adjuvant. The general structure of the vaccine constructs and the study design are presented in Fig. 1.
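The idea that the constructs share one epitope core and differ only in the adjuvant can be sketched as follows. This is an illustration under stated assumptions: the linker sequences (EAAAK and GPGPG are commonly used in such designs), the component order, and all adjuvant and epitope sequences below are placeholders, since the text above does not specify them.

```python
# Assumed linkers; the source does not name the linkers actually used.
ADJUVANT_LINKER = "EAAAK"
EPITOPE_LINKER = "GPGPG"


def build_construct(adjuvant, epitopes):
    """Join adjuvant and epitopes: adjuvant + linker + epitope(+linker+epitope)..."""
    return adjuvant + ADJUVANT_LINKER + EPITOPE_LINKER.join(epitopes)


# Placeholder sequences, not real adjuvants or epitopes.
adjuvants = {"adjuvant_A": "MAAA", "adjuvant_B": "MKKK"}
epitopes = ["AAAAAAAAA", "CCCCCCCCC", "DDDDDDDDD"]

# One construct per adjuvant; everything after the adjuvant is identical.
constructs = {name: build_construct(seq, epitopes) for name, seq in adjuvants.items()}
```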
Prediction of biophysical and biochemical features of vaccine constructs and selection of appropriate vaccine constructs
The antigenicity, allergenicity, toxicity, and homology of all vaccine constructs were predicted with the VaxiJen 2.0 server (11), AllerTOP (12), the ToxinPred server (13), and the PIR (UniProt) server (14), respectively. Constructs predicted to be toxic were excluded. The remaining constructs were filtered on three criteria: a low protein solubility score predicted by SoluProt v1.0 (25), the presence of transmembrane regions (TMRs) predicted by the DeepTMHMM server (26), and the presence of signal peptides and the location of their cleavage sites (Sec/SPI, Sec/SPII, Tat/SPI, Tat/SPII, and Sec/SPIII) examined with SignalP 6.0 (27). The physicochemical properties of the finally selected constructs, including molecular weight, number of amino acids, theoretical isoelectric point (pI), estimated half-life, instability index, aliphatic index, and grand average of hydropathicity (GRAVY), were predicted with the ProtParam tool (28) (Table 1).
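Two of the physicochemical indices listed above have simple closed-form definitions and can be checked by hand. The sketch below computes GRAVY as the mean Kyte-Doolittle hydropathy value and the aliphatic index as X(Ala) + 2.9·X(Val) + 3.9·(X(Ile) + X(Leu)), where X is the mole percent of each residue. It is an independent illustration, not the ProtParam implementation, and the function names are chosen here for convenience.

```python
# Kyte-Doolittle hydropathy scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq):
    """Grand average of hydropathicity: mean hydropathy over all residues."""
    seq = seq.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq):
    """Aliphatic index from the mole percentages of Ala, Val, Ile, and Leu."""
    seq = seq.upper()

    def mole_percent(aa):
        return 100.0 * seq.count(aa) / len(seq)

    return (
        mole_percent("A")
        + 2.9 * mole_percent("V")
        + 3.9 * (mole_percent("I") + mole_percent("L"))
    )
```

A negative GRAVY value indicates an overall hydrophilic sequence, and a higher aliphatic index is generally read as a sign of greater thermostability.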
Secondary and tertiary structure prediction and validation of vaccine constructs
All vaccine constructs that passed the filtration criteria were submitted to the PSIPRED 4.0 server to obtain their secondary structure, because it predicts protein secondary structure from position-specific scoring matrices with a reported accuracy of over 84% (29). The tertiary structure of the vaccine models was predicted with the trRosetta server because of its fast and accurate protein structure prediction algorithm (30). trRosetta generated five models of each vaccine construct and reported the per-residue LDDT, the estimated TM-score, and the predicted inter-residue distances and orientations (Contact, Distance, Omega, Theta, and Phi). Model 1 of each vaccine construct was selected and submitted to the GalaxyWEB server, which rebuilds unreliable loops or termini of the initial model with an optimization-based refinement method and generates five refined models of each crude model (31). Only refined model 1 of each construct was then submitted to the ERRAT and PROCHECK tools for validation of the protein structure (32).
Molecular docking
Chain A of TLR2, TLR4, and HLA-DRB1*01:01 and chain D of HLA-A*02:01 were downloaded from the RCSB PDB with accession numbers 1fyx (33), 2z62 (34), 8euq (35), and 7m8s (36), respectively. TLR2 and TLR4 can induce antiviral immune responses by detecting viral coat proteins, and HLA class I and class II MHC molecules help CD4+ and CD8+ T cells recognize foreign antigens such as vaccines (37, 38). Receptor molecules were cleaned in Discovery Studio (39), and docking was performed with the ClusPro 2.0 server, which is regarded as one of the best servers for protein-protein docking (40). The coefficient weights followed the formula $E = 0.40E_{\mathrm{rep}} - 0.40E_{\mathrm{att}} + 600E_{\mathrm{elec}} + 1.00E_{\mathrm{DARS}}$. Thirty models were generated in each docking case, and the best model from each case was submitted to the HADDOCK 2.4 server to refine the complex structure (41). The docked models were then submitted to the PRODIGY server to predict the binding energy (42). The PDBsum server was used to identify the interacting residues between the docked receptor chains and the vaccine construct (43). ChimeraX was used to visualize the vaccine-receptor complexes (Supplementary Table 5).
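The weighted-sum scoring formula given above can be written out as a small function to make the role of each coefficient explicit. This is a minimal sketch: the function and parameter names are chosen here for illustration, and the energy terms passed in are made-up numbers, not values from any docking run in this study.

```python
def weighted_docking_energy(e_rep, e_att, e_elec, e_dars,
                            w_rep=0.40, w_att=-0.40, w_elec=600.0, w_dars=1.00):
    """Weighted sum of repulsive, attractive, electrostatic, and DARS terms.

    The default weights are the coefficients quoted in the text.
    """
    return w_rep * e_rep + w_att * e_att + w_elec * e_elec + w_dars * e_dars


# Hypothetical term values, for illustration only.
example_energy = weighted_docking_energy(e_rep=10.0, e_att=25.0, e_elec=-0.5, e_dars=-100.0)
```

With these made-up inputs the terms contribute 4.0, -10.0, -300.0, and -100.0, which shows how strongly the electrostatic coefficient scales its term relative to the others.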
Molecular dynamics simulation
The iMODS online tool was used to carry out the molecular dynamics simulation studies (44). This analytical tool, which works in torsional-angle space, reports RMSD values, the covariance among individual residues, the eigenvalues of interacting residues, and the deformability of the structure in order to assess the stability of the complexes.
Prediction of population coverage of epitopes
The population coverage of the T cell epitopes, used as a measure of the effectiveness of the epitopes in the vaccine constructs, was predicted with the IEDB population coverage tool using default parameters, with all regions of the world selected (45).
Immune simulation
The potential immune efficacy of the vaccines was assessed with the online C-ImmSim server, which predicts immune epitopes and immune interactions using a position-specific scoring matrix (PSSM) (46). All parameters were left at their default settings except for the number of time steps in the immune simulation. The number of simulation time steps was set to 1050 (one time step equals 8 h), the default dose of 1,000 was used for each antigen injection, and the interval between doses was 30 days, that is, injections at days 1, 30, and 60, which is the recommended interval between injections for most commercial vaccines. A novel multi-epitope vaccine against Echinococcus granulosus was used as a control (47).
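The relationship between simulation time steps and calendar days stated above is a short piece of arithmetic. The sketch below converts the 1050 steps to days and maps the injection days to step indices. The mapping convention (day 1 corresponds to step 1) is an assumption made here for illustration and may differ from the values actually entered on the server.

```python
HOURS_PER_STEP = 8
STEPS_PER_DAY = 24 // HOURS_PER_STEP  # three 8-hour steps per day


def steps_to_days(steps):
    """Total simulated time in days for a given number of time steps."""
    return steps * HOURS_PER_STEP / 24


def day_to_step(day):
    """First time step of a given day, assuming day 1 starts at step 1."""
    return (day - 1) * STEPS_PER_DAY + 1


total_days = steps_to_days(1050)
injection_steps = [day_to_step(d) for d in (1, 30, 60)]
```

Under these assumptions the simulation spans 350 days, so all three injections and a long follow-up period fall inside the simulated window.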
Codon optimization of designed vaccine peptide for expression analysis
The vaccine sequences were reverse-translated with the backtranseq program of EMBOSS 6.0.1 so that the chimeric protein could be expressed in an expression vector (48). VectorBuilder was used for codon optimization for Escherichia coli because it can optimize sequences with extreme GC content and simple repeats for efficient gene synthesis and DNA cloning. The output reports the GC content percentage and the codon adaptation index (CAI) of the optimized sequence. Expression quality was assessed by the CAI, where a score of 0.8-1.0 is considered good and 1.0 is considered ideal (49); transcriptional and translational performance is affected if the GC content falls outside the 30%-70% range (50). SnapGene was used for in silico cloning and PCR. The E. coli pET-28a(+) plasmid was selected as the expression vector, and BamHI and XhoI restriction sites were added at the start and end of the constructs so that they could be cloned into the vector. SnapGene was also used to amplify the constructs by in silico PCR.
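The acceptance checks described above (GC content within 30%-70%, CAI within 0.8-1.0, and flanking BamHI and XhoI sites) can be sketched in a few lines. This is an illustrative helper, not the VectorBuilder or SnapGene workflow: the function names are hypothetical, the CAI is assumed to be supplied by the optimization tool, and placing BamHI at the 5' end and XhoI at the 3' end is an assumption, since the text does not say which site goes where. The recognition sequences GGATCC (BamHI) and CTCGAG (XhoI) are palindromic, so scanning one strand is sufficient.

```python
SITES = {"BamHI": "GGATCC", "XhoI": "CTCGAG"}


def gc_content(seq):
    """GC content of a DNA sequence as a percentage."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def gc_in_range(seq, low=30.0, high=70.0):
    """True if the GC content lies within the acceptable range."""
    return low <= gc_content(seq) <= high


def cai_is_good(cai, low=0.8, high=1.0):
    """True if an externally computed CAI falls in the range considered good."""
    return low <= cai <= high


def internal_sites(seq):
    """Names of cloning enzymes whose sites already occur inside the insert."""
    seq = seq.upper()
    return [name for name, site in SITES.items() if site in seq]


def add_flanking_sites(insert):
    """Add BamHI at the 5' end and XhoI at the 3' end (assumed orientation)."""
    return SITES["BamHI"] + insert.upper() + SITES["XhoI"]
```

Checking `internal_sites` before adding the flanking sites is a useful precaution, because an internal BamHI or XhoI site would cause the insert itself to be cut during cloning.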
List of servers, tools, and software used
Links to all servers and tools used in this study are listed in Supplementary Table 5.