Acquisition and analysis of new mutations in target variants
The tracing of variants HV.1, HK.3, BA.2.86, and JN.1 led to the identification of 15 noteworthy variants, accompanied by the compilation of newly emerged mutations extracted from Cov-Lineages (https://cov-lineages.org/lineage_list.html) and outbreak.info (https://outbreak.info/).
Retrieval of target proteins sequence
The animo acid sequence of S protein (Accession No.: YP_009724390.1), M protein (Accession No.: YP_009724393.1), N protein (Accession No.: YP_009724397.2), ORF1a protein (Accession No.:YP_009725295.1), ORF1b (Accession No.: BCN86436.1), ORF7b protein (Accession No.: YP_009725318.1) and ORF10(Accession No.: YP_009725255.1) were retrieved from NCBI protein database (https://www.ncbi.nlm.nih.gov/) in FASTA format.
Epitopes prediction
Cell-mediated immunity plays a pivotal role in providing resistance to diseases and promoting vaccine-induced protection against COVID-19 [30]. In humans, T-cell activation relies heavily on antigen presentation by the human leukocyte antigen (HLA) [31]. Alleles of HLA-A, HLA-B, HLA-C and HLA-DRB1 with frequency definitions exceeding 0.2 worldwide were selected from the Allele frequency net database (http://www.allelefrequencies.net/). To identify potential CD8 + epitopes interacting with HLA Ⅰ class alleles, the NetMHCpan4.1 (https://services.healthtech.dtu.dk/services/NetMHCpan-4.1/) server was employed. Epitopes of 9 and 10 amino acids in length were predicted, with binding thresholds set at 0.5% and 2% for strong and weak binding respectively. Additionally, for the prediction of CD4 + T epitopes, the NetMHCIIpan 4.3 (https://services.healthtech.dtu.dk/services/NetMHCIIpan-4.3/) server was utilized to forecast the binding affinity and percentile grade of 15 amino acids in length interacting with HLA Ⅱ class alleles. Strong and weak binder identification was based on binding strength thresholds of 1% and 5%, respectively. Moreover, B-cell epitopes play a crucial role in inducing B-lymphocytes to produce antibodies [32]. The ABCPred (https://webs.iiitd.edu.in/raghava/abcpred/) server was used to predict linear B-cell epitopes spanning 16 amino acids, with scores above 0.6 included in the candidate list.
Epitopes processing
The predicted epitopes were further selected for antigenicity using the Vaxijen v.2.0 server (http://www.ddg-pharmfac.net/vaxijen/VaxiJen/VaxiJen.html), with a threshold for viruses antigens set at 0.4 [33]. Additionally, allergenicity assessment was conducted using AllerTOP v.2.0 server (https://www.ddg-pharmfac.net/AllerTOP/), considering amino acid properties such as hydrophobicity, size, abundance, and tendencies to form helix and β-strand [34]. Furthermore, ToxinPred (http://crdd.osdd.net/raghava/toxinpred/) was employed to predict toxicity of the protein sequences. Moreover, a range of different physicochemical characteristics of these epitopes were elucidated through ProtParam tool (https://web.expasy.org/protparam/) available on the ExPASy platform. For T cell epitopes, the immunogenicity was further assessed using Class I Immunogenicity module of the Immune Epitope Database (IEDB, http://tools.iedb.org/immunogenicity/) to assess class Ⅰ epitopes. Additionally, the convolutional neural network model DeepNeo (https://deepneo.net/) was utilized to predict the immunogenicity of class Ⅱ epitopes. To determine the similarity of potential epitopes to host proteins, a BLASTP analysis (https://blast.ncbi.nlm.nih.gov/Blast.cgi) was conducted against the nonredundant protein sequence database, with Homo sapiens (NCBI taxid: 9606) selected as the target organism [35]. An e-value cutoff of 0.01 was applied to filter the results [36].
Population coverage
To ensure the effective performance of a multi-epitope vaccine, it is imperative that the epitopes possess broad HLA binding specificity, enabling to produce robust immune response across diverse populations worldwide. To assess the population coverage of the selected HLA Ⅰ and HLA Ⅱ epitopes for vaccine construction, the IEDB (http://tools.iedb.org/population/) population coverage tool was employed.
Designing of multi-epitope vaccine
The immunogenicity drawback of epitopes vaccines can be surmounted by combining antigenic epitopes with appropriate adjuvants to construct a multi-epitopes vaccine [37]. In the construction process, CTL epitopes were linked with AAY linker, while HTL epitopes were conjoined both to each other and to CTL epitopes using GGPPG linker. Subsequently, B cell epitopes followed HTL epitopes with KK linker. Additionally, a 6xHis tag (HHHHHH) was attached to the carboxyl terminus of the vaccine construct. In this study, two different adjuvants, 50s Ribosomal L7/12 (Accession No.: WP_000028878.1) [38] and Human Beta-defensin 3 (Accession No.: AAV41025.1) [39] were used, each attached at the N-terminus of the vaccine. The fusion between the adjuvant and the vaccine construct was achieved using an EAAAK linker [40]. These two vaccines with different adjuvants underwent comprehensive validation once again by Vaxijen v.2.0, AllerTOP v.2.0, ToxinPred, ProtParam and Protein-sol (https://protein-sol.manchester.ac.uk/patches) for antigenicity, allergy, toxicity, physicochemical properties and solubility, respectively.
Predicting and verifying of multi-epitope vaccine
The secondary structure elements of the complete vaccine component, such as alpha-helix, extended strand, and random coil, were predicted using the PSIPRED server (http://bioinf.cs.ucl.ac.uk/psipred/) [41]. These elements form the structural scaffold of the protein, which is crucial for predicting its function, identifying active sites, determining interaction regions and ensuring stability [42]. Subsequently, the three-dimensional structure was modelled using the I-TASSER (https://zhanggroup.org/I-TASSER/). This server employs multiple threading alignments to search for template PDB structures for modeling, and through iterative fragment assembly simulations, ultimately establishes the three-dimensional structure of the target sequence [43]. Follow this, the three-dimensional structure underwent further refinement using the Refine tool in the GalaxyWEB server (https://galaxy.seoklab.org/) to enhance model accuracy. To assess the refined tertiary structure, the ProSA-web server provided a z-score indicating discrepancies between the tertiary vaccine structure and models derived from nuclear magnetic resonance (NMR) and X-ray diffraction (XRD). Additionally, the Ramachandran plot of the vaccine structure was generated by RamachanDraw library in Python.
Prediction of B-cell conformational epitopes
The folding of a protein spatially aligns distant residues, resulting in the creation of discontinuous B-cell epitopes. Unlike linear epitopes, conformational epitopes account for a larger proportion of B-cell responses, making their prediction pivotal in vaccine design [44]. Discontinuous B-cell epitopes within the developed construct were predicted using the ElliPro server (http://tools.iedb.org/ellipro/) [45].
Disulfide engineering of multi-epitope vaccine
Disulfide bonds are integral to the function and stability of many naturally occurring proteins [46]. For multi-epitope vaccines, structural stability is essential to maintain immunogenicity and efficacy. The Disulfide by Design 2.13 server (http://cptweb.cpt.wayne.edu/DbD2/) was employed to evaluate and design potential disulfide bonds. Given that 90% of natural disulfide bonds exhibit energy values below 2.2 kcal/mol and χ3 dihedral angles peaking between − 87°and + 97°, residue pairs satisfying these criteria were selected to form disulfide bonds [47].
Molecular docking analysis of multi-epitope vaccine
Molecular docking is employed to estimate the binding affinity between vaccine construct and immune cell receptors, aiming to determine the optimal vaccine conformation to generate a protective immune response. Four immune receptors, namely TLR3 (PDB ID: 1ZIW), TLR4 (PDB ID: 3FXI), TLR5 (PDB ID: 3J0A) and TLR7 (PDB ID: 7CYN) were used in this study. The docking process was performed using the protein-protein docking panel of Schrödinger software (LLC, NY, USA, 2023-1), which is based on PIPER algorithm for sampling. The ranking of docked conformations was established by identifying the centers of highly populated clusters of the low-energy conformations [48]. The most populated clusters were selected for subsequent structural analysis.
Molecular dynamics simulation
Molecular dynamic simulation analysis is a computational approach commonly employed to analyze of the dynamic behavior of molecules within a specific environment and time frame. In the context of evaluating the binding affinity and conformational stability of vaccine construct and immune cell receptors compound, molecular dynamics simulations were conducted using Schrödinger software (LLC, NY, USA, 2023-1) over a duration of 100 nanoseconds under NPT ensemble conditions, which maintained constant particle number, pressure and temperature. During the simulation, the root mean square deviation (RMSD), root mean square fluctuation (RMSF) and radius of gyration (Rg) values of vaccine construct and receptor were monitored and analyzed. The resulting trajectory from the simulation provides insights into the conformational changes occurring complex structure over the course of the simulation period. Furthermore, the binding free affinity and dissociation constant (KD) were calculated using the PRODIGY web server (https://wenmr.science.uu.nl/prodigy/), which relies on intermolecular contacts and properties extracted from the non-interface surface [49].
Immune simulation
The analysis of immune response elicited by the vaccine construct was conducted using the C-ImmSim immune stimulator (https://kraken.iac.rm.cnr.it/C-IMMSIM/index.php). In this study, the simulation parameters were set as follows: a randomized seed of 12345, a simulated volume of 10, and a simulated step number of 1000. The vaccine was administered every four weeks for a total of 3 doses to assess the capacity of immune cells to generate specific antibodies and various cytokines. Given that the injection time step per unit of this server corresponds to 8 hours in the real-world time, the time steps for the three injections were designated as 1, 84 and 168, respectively. This enabled the simulation to accurately simulate the temporal dynamics of the immune response following vaccination.
Codon Adaptation and In Silico Cloning
To ensure efficient expression and production in the specified host organism, protein sequences were back-transcribed using the Sequence Manipulation Suite (https://sites.ualberta.ca/~stothard/javascript/rev_trans.html) to produce DNA sequences representing the most probable non-degenerate coding sequences [50]. The optimal sequences were evaluated based on the codon adaptive index (CAI) and GC content score, which were calculated using GenScript (https://www.genscript.com/tools/rare-codon-analysis). CAI offers valuable codon usage information, with a score above 0.8 being considered favorable. Meanwhile, an optimal GC content should full within the range of 30%-70% [51]. E. coli strain was chosen as the host organism due to its favorable characteristics, such as ease of handling and suitability for mass production [52]. Finally, SnapGene software version6.0.2 (https://www.snapgene.com/free-trial) was used to adapt the sequence, which was then inserted between XhoI and SalI sites of the pET28a (+) vector.
Acquisition of neutralizing antibody structures and analysis of hotspot mutations
To assess the impact of the newly added L455S mutation on the S protein of the JN.1 variant, a comprehensive evaluation of neutralizing antibody efficacy was conducted by combining it with the adjacent F456L mutation. The antibody structure was obtained from the CoV-AbDab (https://opig.stats.ox.ac.uk/webapps/covabdab/) database. The effective neutralizing antibodies against SARS-CoV-2-Omicron XBB variant were selected, with a focus on the receptor binding domain (RBD) region of S protein. The L455S and F456L mutations analysis of neutralizing antibody was calculated using Residue Scanning Calculation module of Schrödinger software (LLC, NY, USA, 2023-1). The results post-mutation was visualized using PyMOL software.