Severe life-threatening signs and symptoms have been identified in numerous studies worldwide in patients with P. vivax 6,7. Populations of P. vivax based on the genetic variations are important factors in understanding the dynamics of vivax malaria transmission 47. In order to study patterns of genetic variation, well-characterized polymorphic regions of both the pre-erythrocytic and erythrocytic stages of P. vivax have been analyzed. CSP is required in exo-erythrocytic cycle and an important molecular marker for understanding the diversity of P. vivax in population genetics studies31. In terms of antigenicity and polymorphism, CSP is a primary target for anti-infection vaccines and has been extensively studied. Pvcsp, a single copy gene, codes for the highly immunogenetic main sporozoite surface protein. For the Pvcsp gene, three different genotypes, VK210, VK247, P. vivax-like, were identified on the basis of variation in the number of the peptide repeat motifs (PRMs) and sequences in the PvCSP central repeat domain 48. It is found that the various genotypes are globally distributed with regional biases, where VK210 predominates in the endemic regions, while VK247 is reported from the regions with mixed infections cases 49,50.
P. vivax CSP protein sequences retrieval, antigenicity and transmembrane helices prediction.
To identify the potential vaccine candidates, the complete sequences of VK210 and VK247 variant types of Pakistani PvCSP gene have been obtained from the NCBI nucleotide database in FASTA format with accession numbers MT222296.1, protein-id= QNU13124.1 and MT222299.1, protein-id QNU13127.1, respectively. To verify the homology in protein database, the retrieved sequences were submitted to BLASTx (https://blast.ncbi.nlm.nih.gov/ Blast.cgi). The antigenic nature of both the selected proteins, PvCSP210 and PvCSP247, was reported using ANTIGENpro. For both the selected proteins the computed VaxiJen antigen score was ≥0.823, i.e. 0.9366 and 0.8409, respectively. Additionally, taking into account the reported fact that protein transmembrane nature is vital in expressing the antigenicity and accessibility to the immune system51, the transmembrane topology was conducted using TMHMM server. Topology analysis indicated that both the selected proteins secured outer membrane locations (Fig.2).
B-cell epitopes selection
For the peptide-based vaccines design, B-cells epitope prediction is commonly used to classify the sections of protein sequences as peptides that have resulted in the strong antibody response 52. B-cells have been reported to secrete the antibodies, also known as immunoglobulin (soluble), that induce adaptive humoral immunity 42. To predict the B-cells epitopes, the primary sequences of PvCSP210 and PvCSP247 were scanned using the BCpred server. The screened epitopes have been shortlisted on the basis of the prediction score. For the final vaccine constructs, the BCpred server calculated one linear B-cell epitope against each strain with the BCpred scores of 1.5 and 1.3 (> 0.8), VaxiJen scores 1.2889, 1.1245 (> 0.4) and exomembrane topology (Table 1).
Table1
Predicted B-cell epitopes for PvCSP210 and PvCSP247by BCpred server, antigenicity, Vaxigen score and membrane topology.
Protein Name
|
B-Cell Epitope
|
Start-End Position
|
BCPred Score
|
Antigenicity
|
Vaxigen Score
|
Topology
|
PvCSP210
|
QSASRGRGLGENPDDEEGDAKKKKD
GKKAEPKNPRENKLKQPGDRADGQPA
GDRADGQPAGDRAAGQPAGDRADGQ
PAGDRAAGQPAGDRADGQPAGDRAAG
QPAGDRADGQPAGDRAAGQPAGDRAD
GQPAGDRAAGQPAGDRAAGQPAGDRA
DGQPAGDRAAGQPAGDRAAGQPAGDR
AAGQPAGDRAAGQPAGNGAGGQAAG
GNAANKKAEDAGGNAGGQGQNNEGA
NAPNEKS
|
54-290
|
1.947
|
Antigen
|
1.2889
|
Exomembrane
|
PvCSP247
|
QSASRGRGLGENPDDEEGDAKKKKD
GKKAEPKNPRENKLKQPEDGAGDQP
GANGAGNQPGANGAGNQPGANGAGN
QPGANGAGNQPGANGAGNQPGANGA
GNQPGANGAGDQPGANGAGNQPGAN
GAGDQPGANGAGNQPGANGAGNQPG
ANGAGDQPGANGAGNQPGANGAGDQ
PGANGAGNQPGANGAGNQPGANGAG
NQPGANGAGNQPGANGAGGQAAGG
NAANKKAGDAGAGQGQNNEGAN
APNEKS
|
54-305
|
2.089
|
Antigen
|
1.1245
|
Exomembrane
|
T-cell epitopes selection
Cytotoxic T-lymphocyte epitopes play a crucial role in the development of new vaccines. Most notably, relative to wet lab studies, it is cost effective and time saving 53. On the surface of an antigen presenting cell (APC), T-cell epitopes are presented where they are connected to key histocompatibility (MHC) molecules in order to induce immune response 54. Considering the importance of T-cell epitopes, the IEDB server was used to predict the MCH I and II T-cells epitopes for both the selected malarial proteins. Using the aforementioned B-cell epitopes, the server predicted the T-cell (9-mer) epitopes with the lowest percentile score. For each protein, a comparison of the predicted MHC (I and II) sequences was made in order to choose only the overlapping epitopes. As a result of this comparison, six T-cell epitopes for PvCSP210, while seven for PvCSP247, which showed interactions with DRB1*0101 allele with an IC50 value ranging from 7.6 to 97.5 nM were prediction. The DRB*0101 allele is common in human population and its binding to epitopes is an upshot in successful recognition of the host immune system 55. Furthermore, the screening of all the selected DRB1*0101 binders was complete by re-conducting the antigenicity as well as allergenisity testing (Table 2). All the selected epitopes are also non-toxic and greatly soluble in water.
Vaccines construction
MEVs are more advantageous in comparison with single-epitope vaccines or conventional vaccines 56, 57. MEVs are cost effective, time saving, stable and specific and are even safer as they don’t contain the entire pathogen 58. Additionally, they are believed to induce significant and broad spectrum humoral and cellular immune responses simultaneously because of the presence of diverse T-cell as well as B-cell epitopes. These vaccines are often conjugated with adjuvant, they believed to developed long-lasting immune responses and increased immunogenicity also unwanted components that can induce either pathological immune reactions or adverse effects are minimized. For current study, the MEV constructs for each of the two proteins were developed by joining the selected T-cells. Synthetic TLR-4 agonist known as RS-09 (Sequence: APPHALS) was selected as an adjuvant for both sequences 66. B-cell overlapping epitopes were fused using the AAY linker, while the adjuvant was attached joined to the N terminals of the both of the vaccines constructs using the EAAAK linker. Linkers have been reported to play an important role in boosting the designed constructs to act as an individual immunogen and to generate higher titer of antibody compared to single immunogen 59. The final vaccine sequences for PvCSP210 and PvCSP247 were 81 and 93 amino acids, respectively.
Allergenicity, antigenicity and physiochemical properties prediction
A vaccine that is non-allergenic in nature and doesn’t induce any allergic reaction in the body is considered as a potential vaccine. Thus, we conducted an allerginicity and safety test for our designed vaccine constructs using Allertop server. The Allertop server predicted the both the vaccine sequences are non-allergen. ANTIGENpro also affirms the non-allergic nature of vaccines with predicted probability of Antigenicity 0.10 and 0.13, respectively. The antigenic potential of the vaccine sequences was determined by using the VaxiJen server. For both the constructs the calculated antigenicity was 0.86 and 0.89 at a threshold 0.4, suggesting that malarial vaccines reported in current study are antigenic and capable of triggering an immune response. Further to ensure the stability of vaccine sequences, various physiochemical parameters were computed using ExPASyProtParam server. The physiochemical parameters along with the predicted probability of antigenicity and antigen scores are shown in Table 3.
Table 3
Allergenicity, Antigenicity and Physiochemical Properties of Designed Vaccines.
Vaccine Construct
|
Allergenicity
|
Predicted Probability of Antigenicity
|
VaxiJen Score
|
Physiochemical Parameters
|
PvCSP210
|
Non-Allergen
|
0.108906
|
0.8620
|
Molecular Weight: 8852.88
Aliphatic index: 42.72
Theoretical pI: 9.37
Instability index: 34.05
GRAVY : -1.522
Estimated half-life in human reticulocytes:
>4.4 hours (mammalian reticulocytes, in vitro).
>20 hours (yeast, in vivo).
>10 hours (E. coli, in vivo).
|
PvCSP247
|
Non-Allergen
|
0.136894
|
0.8968
|
Molecular Weight: 10186.31
Aliphatic index: 42.72
Theoretical pI: 9.16
Instability index: 44.20
GRAVY: -1.511
Estimated half-life in human reticulocytes:
>4.4 hours (mammalian reticulocytes, in vitro).
>20 hours (yeast, in vivo).
>10 hours (E. coli, in vivo).
|
Tertiary structures prediction and validation
Using the 3Dpro of Scratch Protein server, 3D structures of MEVs were constructed ab initio since no appropriate template was available for the structure modeling and optimized using the Galaxy Refine server. To ensure that the modeled structures are similar to the actual ones, the generated models have been validated using Prosa-web, ERRAT and PROCHECK servers. The z-scores suggested that the constructed models are closet to natural proteins as estimated scores (-2.01 and -2.15) lies within the range specified for natural proteins. The stability of the predicted models was further confirmed by the ERRAT scores of 72.69 and 97.6. Ramachandran plot produced using the PROCHECK server indicated that most of the residues for PvCSP210 MEV model were in the most favored regions (77.9 %), 20.6% were in the additional allowed region, 1.5% were in disallowed regions while no residue was found in the generously allowed regions. Similarly, the percentages for the PvCSP247 MEV model were: 91.4% (most favored regions), 7.4% (additional allowed region) and 1.2% (disallowed regions). Like PvCSP210 model, no residues were found in the generously allowed regions of PvCSP 247 model (Fig. 3, 4).
CABS-Flex and aggregation-prone regions analysis
PvCSP210 and PvCSP247 were further subjected to CABS-Flex analysis which produced 10 models after simulation. PvCSP210 maximum RMSF found is 6.23 Å and minimum RMSF is 1.44 Å. PvCSP247 lowest RMSF score is 0.20 Å whereas highest RMSF is 4.07Å. Compared to PvCSP210, PvCSP247 has less residual flexibility and more stability (Fig. 5). Additionally, both the vaccine candidates do not have any aggregation prone regions (Fig.6).
Disulfide engineering
To further improve the stability of predicted 3D structures, disulfide bonds using Disulfide by design v2.0have been incorporated in the models. It was found that there are total 14 pairs of residues for the PvCSP210 and 12 pair of residues for PvCSP247 which can serve the purpose of disulfide engineering (S-Table 1 and 2).
Docking interaction analysis
Appropriate conformational docking of the vaccine candidate to the immune receptor molecules is a key to activating of host immune responses. The innate immune system starts to function immediately after the antigen appears in the body. Toll like receptors (TLRs) play a significant role in the innate immunity by recognizing pathogen derived structurally conserved molecules. This recognition marks several molecular events, allowing the activation of innate immunity leading to acquired immunity specific to the antigen. A blind docking method was employed to forecast the ideal binding mode of the MEVs with respect to the TLR4 receptor. This was made possible by providing the full receptor surface to the constructs for binding that is efficient and stable. Both PvCSP210 and PvCSP247 vaccine ensembles docked deep inside the binding cavities of the TLR4 receptor. In case of PvCSP210, the global binding energy of the top solution is -6.34 kJ/mol (Table 4), where the vaccine construct enjoys maximum stability at the cavity between the cavity of Chain A and C of the receptor (Fig.7a). To this energy, major energy contribution was observed from attractive van der Waal energy of - 6.39 kJ/mol, followed by -1.14 kJ/mol of hydrogen bonding energy. The repulsive van der Waal as well as atomic contact energy (ACE) contributed unfavorably to the complex formation and intermolecular interactions. At the molecular level, the vaccine construct interacts with Phe408, His431, Val475, Val524, Phe573, Asp550 and Asn575 of Chain A, Lys435, Gln436, Met437, His458, Thr459, and Arg460 of Chain B, Tyr75, Gln73, Pro88, Lys89 Arg90, Lys91, Glu92, Val93, Pro142, and Gly143 of Chain C within 3 Å both by strong hydrogen bonding and weak hydrophobic interactions. For PvCSP247, the global binding energy in formation of complex with TLR4 receptor is -2.3 kJ/mol. Likewise, the first complex, the attractive van der Waals energy dominates the intermolecular overall interaction energy with net contribution of -22.46 kJ/mol. The hydrogen bonding energy that governs the complex is -3.53 kJ/mol. The PvCSP247 favors binding in cavity of Chain B and D (Fig.7b), where it interacts both hydrophobically and hydrophilically with Chain B residues (Trp26, Glu27, Cys29, Val30, Val32, and Gln39) and Chain D residues (Lys39, Met40, Tyr42, Arg69, Glu136 and Glu144) in diameter of 3 Å.
Table 4
Molecular docking solutions of the both vaccine constructs with TLR4.
Solution Rank
|
Solution Iteration
(PvCSP210 /PvCSP247)
|
Docking Global Energy (PvCSP210)
|
Docking Global Energy (PvCSP247)
|
1
|
9 /6
|
-6.34
|
-2.3
|
2
|
10 /1
|
16.45
|
6.09
|
3
|
7 /2
|
38.96
|
6.77
|
4
|
5 /8
|
50.45
|
12.91
|
5
|
1 /10
|
275.6
|
22.36
|
6
|
2 /5
|
597.43
|
74.07
|
7
|
6 /9
|
839.79
|
290.53
|
8
|
3 /3
|
1124.17
|
894.19
|
9
|
8 /4
|
1605.63
|
1240.5
|
10
|
4 /7
|
2438.86
|
2074.63
|
MD simulation analysis
A long run of MD simulation was performed on the docked solutions to decipher atomic level interactions dynamics, thus structural integrity and stability of the complexes. Two main analyses were performed on the simulation trajectories. First, RMSD of the TLR4 receptor in the presence of vaccine ensemble of both complexes was determined. RMSD allows measuring the average distance between atoms of the superimposed reference and dynamics protein conformation. For PvCSP210-TLR4 system, maximum RMSD is 6.16 Å and mean RMSD is 4.96 Å. The RMSD plot of the backbone atoms was noticed in increasing deviations in the first phase in the time period of 0-12.4 ns, followed by somewhat uniform fluctuations till 82.4 ns. After, small structural adjustments were observed towards the end period. Upon visual examination, it was revealed that the minor structural changes in the TLR4 are the attribute of larger receptor size comprising multiple chains. The chains small dynamics moves does not affect overall TLR4 structural conformation and stability of the docked PvCSP210. Superimposition of the first and last snapshot of the PvCSP210-TLR4 complex does not infer any structural deviations and the estimated RMSD (1.12 Å) is within high conformation stability range. In contrast, PvCSP247-TLR4 system, TLR4 maximum RMSD is 6.77 Å whereas average RMSD is 4.49 Å (Fig.8a). Both the vaccine systems reported to have the same dynamics behavior demonstrating the vaccine construct binding mode and interaction favorable. Thus, the complex stability gives a proof of docking prediction, the good association between the vaccine candidates and TLR4 might provoke strong and specific immunological response against the pathogen. Additionally, we carried out RMSF analysis to highlight residues that contribute significantly in intermolecular interactions and overall stability of the complex. RMSF analysis found that in case of PvCSP210-TLR4, maximum and mean RMSF calculated is 10.87 Å and 1.67 Å, respectively. Similarly, PvCSP247-TLR4, the maximum RMSD calculated is 7.07 Å, whereas the mean RMSF is 1.75 Å (Fig.8b). The latter system is more stable compared to the former. Residue fluctuations of the PvCSP210 vaccine construct with TLR4 is more than the PvCSP247 thus contributing more to dynamics and less to stability. However, these local fluctuations seem not to disturb overall binding of the vaccine and do not allow escape of the vaccine from its initial docked site. One possible reason of the PvCSP210 conformational deviations is a way to succeed in achieving the best binding conformation.
MMPB/GBSA binding free energy analysis
Further, the binding affinity of the vaccine ensembles was validated using MMGB/PBSA methods. The complete simulation trajectories were used in the binding free energy estimation of complexes, where 100 frames were picked 100 frames apart. As shown in supplementary Tables S-Table 3, S-Table 4, the net binding energy of PvCSP210-TLR4 complex and PvCSP247-TLR4 complex is -50.49/-117.15 kcal/mol (MMGBSA/MMPSA) and -52.94/-96.26 kcal/mol (MMGBSA/MMPSA), respectively. The MMPBSA net binding energy demonstrates both systems interaction energies highly favorable and stable compared to the MMGBSA system. MMGB/PBSA gas phase energy for PvCSP210-TLR4 complex is -1853.86/-1853.86 kcal/mol whereas for PvCSP247-TLR4 complex, this energy is -52.94/-96.26 kcal/mol. For both vaccines, beneficial contribution was observed to the gas phase energy majorly from electrostatic energy and minor from van der Waals. On the other hand, the MMGB/PBSA solvation energy is 1803.81/-1736.70 kcal/mol for PvCSP210-TLR4 complex and 1402.00/1358.68 kcal/mol for PvCSP247-TLR4 complex. The solvation energy in both cases is dominated by non-favorable polar solvation energy. Detail contribution of each energy term in interaction of the vaccines with TLR4 is tabulated in S-Table3 and S-Table 4.
Binding free energy decomposition analysis
Decomposition of the net binding energy was achieved further to estimate each residue contribution in the binding and to understand the impact of residues substitution on intermolecular affinity. The amino acids net binding energy consists of solvation and molecular mechanics energy but not entropy changes. Generally, amino acids with energy contribution < -1.0 believed to play significant role in overall binding. These hotspot residues in both PvCSP210-TLR4 and PvCSP247-TLR4 complexes are presented in S-Table 5 and S-Table 6 along with their net binding energy. Out of these, majority of the residues reside within or to close proximity of the vaccine binding. Further, these key residues are involved in hydrophobic interactions.
Cloning and codon optimization
The primary objective of performing in silico cloning was to express the MEVs of P. vivax in E. coli host. At first place the MEVs codons were modified as per codon utilization of E. coli expression system. Subsequently, the modified codons of MEVs were configured in accordance with E. coli strain K12 using JCAT server. The GC content of the optimized PvCSP210 and PvCSP247 was found to be 48.97 and 47.67, whereas 1 was the value for codon adaptive index for both vaccine candidates. The values measured were close to 1.0 and were acceptable 60. The next step was the creation of XhoI and NdeI restriction sites followed by the cloning in the pET28a (+) vector (S-Fig.1). MEVs sequences were also integrated between six histidine residues on both ends to make the purification process effortless. 123kbp and kbp was the total lengths of the clones.
Computational immune simulation
Computational immune simulation findings showed that the vaccines single dose was sufficient to potentially provoke different immunoglobulins. In contrast to the primary response, both secondary and tertiary responses were distinguished by increased levels of IgM, IgG1 + IgG2, and IgG + IgM antibodies (Fig. 9a and c). Additionally, an increased level activation of B-cells, predominantly IgM and IgG1, has been noted. This activation was evident in the major development of memory cell. Similarly, the number of active T-cells increased significantly during the secondary and tertiary responses and progressively decreased later on. During the introduction of MEVs, increased activity of macrophages, significant levels of T-cells and dendritic cells were observed. Increased cytokins levels (IFN-γ and IL-2) have also been seen as a result of subsequent exposure (Fig.9b and d). The aforementioned observations signify that the proposed multi-epitope vaccine constructs resulted in robust immune responses.