To identify candidate CD4+ T-cell epitopes in the immunogenic Leishmania proteins NH, SMT, CPA, CPB, and CPC, using L. panamensis as the template, each linear sequence was divided into overlapping 15-mer peptides along the protein (for example, positions 1–15, positions 2–16, and so on). In total, 1,856 peptides were evaluated, each against 25 HLA-DR alleles with both NetMHCIIpan and MixMHC2pred; the sequences identified as strong binders for several alleles by both tools (30 sequences) were then run through the MARIA tool. A total of 93,500 predictions were made; most sequences did not bind to any allele with any tool, and some showed good results with only one of the three tools (results not shown). Table 1 shows the best CD4+ T-cell epitopes predicted simultaneously by the three tools: 13 sequences were found to be strong binders to several alleles and to have a > 90% probability of being presented.
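The windowing step described above can be sketched in a few lines of Python. This is only an illustration: the function name `sliding_peptides` and the 20-residue `toy_protein` are hypothetical and are not taken from the study. The last lines reproduce the prediction count implied by the text, assuming every peptide was scored against all 25 alleles by two tools and the 30 shortlisted sequences were scored against the same alleles by MARIA.

```python
def sliding_peptides(sequence: str, length: int = 15) -> list[tuple[int, int, str]]:
    """Return (start, end, peptide) for each overlapping window, 1-based positions."""
    return [
        (i + 1, i + length, sequence[i:i + length])
        for i in range(len(sequence) - length + 1)
    ]


# Hypothetical 20-residue sequence, used only to illustrate the windowing.
toy_protein = "MKPLVRKVRTAPQIHGSAHF"
windows = sliding_peptides(toy_protein)
for start, end, peptide in windows:
    print(f"{start}-{end}: {peptide}")

# Prediction count implied by the figures in the text (an interpretation, not
# a number taken from the authors' pipeline).
n_peptides, n_alleles, n_shortlisted = 1856, 25, 30
total_predictions = n_peptides * n_alleles * 2 + n_shortlisted * n_alleles
print(total_predictions)
```

A protein of length n yields n - 14 such 15-mers, and the count computed this way (93,550) is in line with the roughly 93,500 predictions reported.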
The %Rank score normalizes the predicted binding score of a peptide to a given HLA-DR allele against the scores predicted for a set of random peptides; the lower the value, the stronger the binding. MARIA, in contrast, expresses its results as the probability that the peptide is presented by the given HLA-DR allele, so the higher the value, the more likely the peptide is a true epitope. In this study, the tools differed in which alleles they identified as strong binders; to select the best sequences, we therefore retained those that all tools identified in common as promiscuous natural Leishmania CD4+ T-cell epitopes in humans. This strategy reduced the candidates from 1,856 to 13 promising sequences at this point. Considering the cutoff values set and the need to identify promiscuous T-cell epitopes presented by HLA-DR class II molecules and predicted in common by three different algorithms, 6 sequences were selected for further analysis (bold letters in Table 1).
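The consensus selection can be illustrated with a minimal Python sketch. It assumes a simplified rule that is not confirmed by the text: a peptide is kept when each %rank tool calls at least `MIN_ALLELES` strong binders (%rank ≤ 1) and the mean MARIA probability exceeds 90%. The record layout, the `MIN_ALLELES` parameter, and the `PEP_A`/`PEP_B`/`PEP_C` values are hypothetical and do not come from Table 1.

```python
from collections import defaultdict
from statistics import mean

RANK_CUTOFF = 1.0    # %rank <= 1 marks a strong binder (lower is stronger)
MARIA_CUTOFF = 90.0  # mean presentation probability in % (higher is better)
MIN_ALLELES = 1      # hypothetical minimum of strong-binder alleles per tool


def select_consensus(records, min_alleles=MIN_ALLELES):
    """records: (peptide, allele, netmhciipan_rank, mixmhc2pred_rank, maria_prob)."""
    net, mix, maria = defaultdict(set), defaultdict(set), defaultdict(list)
    for peptide, allele, net_rank, mix_rank, maria_prob in records:
        if net_rank <= RANK_CUTOFF:
            net[peptide].add(allele)
        if mix_rank <= RANK_CUTOFF:
            mix[peptide].add(allele)
        maria[peptide].append(maria_prob)
    # Keep peptides supported by both %rank tools and by MARIA.
    return sorted(
        p for p in maria
        if len(net[p]) >= min_alleles
        and len(mix[p]) >= min_alleles
        and mean(maria[p]) > MARIA_CUTOFF
    )


# Hypothetical scores, invented only to exercise the filter.
toy_records = [
    ("PEP_A", "04:01", 0.4, 0.8, 97.0),
    ("PEP_A", "07:01", 12.0, 0.9, 96.0),
    ("PEP_B", "04:01", 0.5, 15.0, 98.0),   # strong in one %rank tool only
    ("PEP_B", "07:01", 0.7, 22.0, 97.5),
    ("PEP_C", "04:01", 0.2, 0.3, 70.0),    # strong binder, low MARIA score
    ("PEP_C", "07:01", 0.6, 0.5, 80.0),
]
selected = select_consensus(toy_records)
print(selected)
```

Note that the two score types point in opposite directions, so the filter uses `<=` for %rank and `>` for the MARIA probability.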
Table 1
CD4+ T-cell epitopes predicted by bioinformatic tools.
# | Code | Peptide sequence | NetMHCIIpan: HLA-DRβ alleles as strong binders (%rank ≤ 1) | NetMHCIIpan: average %rank (min–max) | MixMHC2pred: HLA-DRβ alleles as strong binders (%rank ≤ 1) | MixMHC2pred: average %rank (min–max) | MARIA: HLA-DRβ alleles with probability of being presented (≥ 95%) | MARIA: average % probability (min–max) |
1 | NH69-83 | KPLVRKVRTAPQIHG | 04:02, 04:04, 04:03 | 7.83 (0.13–24.94) | 04:07, 04:04, 04:03, 04:11, 04:01, 11:04, 04:02 | 9.80 (0.009–47.3) | 12:01, 13:01, 11:04, 14:02, 11:01, 01:01, 01:02, 14:01, 07:01, 04:02, 13:03, 10:01, 04:11, 13:02, 16:02, 04:04, 09:01, 15:01, 04:05, 04:03, 04:01, 03:01, 04:07, 08:02, 03:02 | 97.8 (97.3–98.6) |
2 | NH226-240 | TEVYETQRNTYAKVH | 04:07, 04:03, 01:01, 04:01, 10:01, 16:02 | 4.04 (0.07–10) | 04:01, 16:02 | 14.51 (0.49–50.5) | - | 92.2 (90.3–94.0) |
3 | NH244-258 | AVAYVIDPTVMTTNR | 04:07, 03:01, 03:02, 04:01, 13:02, 13:03, 14:02 | 5.64 (0–20.82) | 04:01, 13:03 | 24.51 (0.004–79.6) | - | 93.4 (92.0–94.5) |
4 | SMT133-148 | NNDYQITRARRHDAS | 07:01, 08:02 | 6.40 (0.4–16.57) | 07:01 | 16.46 (0.38–42.7) | 12:01, 13:01, 11:04, 14:02, 01:01, 11:01, 03:01, 01:02, 14:01, 04:02, 13:02, 07:01, 13:03, 04:11, 04:04, 10:01, 04:05, 04:03, 04:01, 16:02, 08:02, 15:01, 09:01, 04:07, 03:02 | 98.5 (98.2–99.1) |
5 | CPA39-54 | SAHFMHFKKQHGKSF | 08:02, 11:01, 11:04 | 13.53 (0.33–28.63) | 11:01, 13:01, 13:02 | 22.70 (0.27–75.9) | 11:04, 12:01, 13:01, 14:02, 11:01 | 93.2 (90.6–95.8) |
6 | CPA72-87 | TAVYLNAQNPHAHYD | 04:01, 04:03, 04:07, 16:02 | 5.39 (0.05–24.98) | 04:01, 04:07, 16:02 | 10.65 (0.08–38.2) | - | 91.1 (88.2–93.4) |
7 | CPA301-316 | KPPYWIVKNSWGTSW | 04:01, 04:03, 04:04, 04:05, 04:07, 08:02, 16:02 | 5.46 (0.13–32.57) | 04:01, 04:05, 16:02 | 28.92 (0.22–87.4) | 12:01, 01:02, 01:01, 07:01, 13:01, 04:11, 14:02, 11:04, 04:04, 10:01, 11:01, 04:02, 09:01, 04:05, 04:03, 16:02, 04:01, 14:01, 15:01, 13:03, 13:02, 04:07, 08:02, 03:02, 03:01 | 98.8 (98.1–99.2) |
8 | CPB42-57 | KQTYKRVYATLAEEQ | 01:01, 01:02, 04:01, 07:01, 08:02, 09:01, 10:01, 11:01, 16:02 | 6.73 (0.03–44.83) | 01:01, 04:07, 16:02 | 19.47 (0.06–59.2) | 01:01, 01:02, 12:01, 09:01, 07:01, 04:11, 10:01, 04:04, 04:01, 11:04, 04:07, 04:05, 13:01, 04:03, 14:02, 04:02, 11:01, 16:02, 14:01, 13:03, 08:02, 13:02, 15:01, 03:02 | 96.5 (94.5–97.4) |
9 | CPB102-117 | ATHFAKAKKFASQHY | 08:02, 11:01, 14:02 | 14.52 (0.09–34.99) | 08:02, 11:01, 13:03 | 20.54 (0.27–71.7) | - | 91.2 (88.5–93.8) |
10 | CPB113-128 | SQHYRKVGADLSTAP | 04:01, 04:07, 10:01 | 10.57 (0.25–40.1) | 04:07, 10:01 | 26.11 (0.29–71.1) | - | 94 (92.1–94.9) |
11 | CPB258-273 | NGPIAIAVDASAFMS | 04:01, 04:02, 04:03, 04:04 | 6.35 (0–19.8) | 04:01, 04:02, 04:03, 04:04, 04:07, 04:11 | 10.30 (0.005–47.8) | - | 90.9 (89.1–92.9) |
12 | CPC37-52 | SNRFVAEINLKAKGQ | 01:01, 01:02, 03:02, 04:01, 04:02, 04:03, 04:04, 04:05, 04:07, 04:11, 08:02, 10:01, 11:01, 13:02, 14:02, 16:02 | 1.43 (0–6.8) | 01:01, 04:01, 04:04, 04:05, 04:07, 11:01, 16:02 | 6.06 (0.004–24.6) | 01:01, 12:01, 01:02, 11:04, 13:01, 11:01, 14:02, 10:01, 09:01, 04:04, 04:01, 07:01, 04:02, 16:02, 04:11, 14:01, 04:07, 13:03, 04:05, 13:02, 04:03, 08:02, 15:01, 03:02, 03:01 | 97.5 (96.3–98.0) |
13 | CPC263-277 | YADFVSYKSGVYSHT | 15:03, 16:02 | 9.86 (0.4–24.8) | 13:02 | 21.40 (0.96–86.3) | 01:01 | 93.8 (91.7–95.0) |
The next step was to determine the percent identity of the sequences with other New World LC species, so each peptide from the L. panamensis sequence was compared with L. braziliensis and L. guyanensis, and with human homologs. Figure 1 shows that most peptides are 100% identical among the Leishmania species assessed, except for CPA42-57 and CPB42-57, with 93.3% and 86.6% identity, respectively. Additionally, when the sequences were compared with human homologs, only CPA301-316 showed substantial identity (60%) with H. sapiens.
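For ungapped 15-mers, percent identity reduces to counting matching positions, which a short Python sketch can make concrete. The function name `percent_identity` and the `hypothetical_ortholog` sequence (two substitutions introduced by hand) are illustrative assumptions; only the reference peptide is taken from Table 1.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions between two pre-aligned, ungapped peptides."""
    if len(a) != len(b):
        raise ValueError("peptides must be pre-aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


reference = "KQTYKRVYATLAEEQ"              # CPB42-57 sequence from Table 1
# Hypothetical ortholog with two substitutions, invented to show the arithmetic.
hypothetical_ortholog = "KQTYRRVYATLAEDQ"

identity = percent_identity(reference, hypothetical_ortholog)
print(f"{identity:.2f}% identity")
```

With 15 positions, one mismatch gives 14/15 (about 93.3%) and two mismatches give 13/15 (about 86.67%), which presumably corresponds to the 93.3% and 86.6% values reported above.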
Figure 1
Multiple alignments between selected peptides and Leishmania species or human homologs.
Footnote Fig. 1. L.p: L. panamensis; L.b: L. braziliensis; L.g: L. guyanensis; H.s: Homo sapiens; NSS: no significant similarity found; §: Cathepsin AAC78838.1; †: Cathepsin AAC78839.1; ‡: Cathepsin B NP_001371643. An * (asterisk) indicates positions that have a single, fully conserved residue. A : (colon) indicates conservation between groups of strongly similar properties. A . (period) indicates conservation between groups of weakly similar properties. Red: Small (small + hydrophobic (incl. aromatic -Y)); Blue: Acidic; Magenta: Basic - H; Green: Hydroxyl + sulfhydryl + amine + G.
To confirm the previous prediction results, molecular docking was performed with the 6 selected sequences against the binding pocket of HLA-DR4, using the MDockPeP server to simulate and analyze the interaction between the HLA-DR4 receptor and each peptide sequence (ligand). Several docking structures (an average of 10 per peptide sequence) were generated and ranked mainly by binding energy (kJ/mol). For each sequence, the best model was the protein-peptide complex with the lowest binding energy.
The total binding energy (kJ/mol) results from the combination of the different intra- and intermolecular interactions between the peptide (ligand) and the DR protein (receptor), and it is used predictively to identify the peptide sequences with the highest affinity for the MHC class II complex in the search for and design of vaccines against various pathogens. Favorable interactions contribute to the strength of binding between the pocket (alpha and beta chains) of the DR4 protein and the peptide. These forces are modulated by hydrogen bonds (HB), which are frequent in biological systems and contribute to binding specificity, and by hydrophobic interactions, which play an important role in the stability of molecular complexes and in the total binding energy. All these parameters were considered when ordering the peptide sequences by binding energy (kJ/mol).
The selected models showed interaction energies from -185.22 kJ/mol to -233.90 kJ/mol. Using the Discovery Studio Visualizer and AutoDock tools, the intra- and intermolecular interactions of the best models were visualized, showing a predominance of favorable interactions (15 to 33 interactions), hydrogen bonds (11 to 20 HB), and hydrophobic interactions (3 to 12 bonds). Unfavorable interactions (0 to 9 interactions) and salt bridges (0 to 4 bridges) were less frequent (Table 2).
Table 2
Molecular docking analysis between MHC class II (DR4) and selected peptide sequences of Leishmania spp.
Protein | Peptide | Binding energy with DR4 (kJ/mol) | Favorable interactions | Unfavorable interactions | Total HB | Total hydrophobic interactions | Salt bridges |
NH | NH69-83 | -191.88 | 29 | 3 | 13 | 10 | 2 |
SMT | SMT133-148 | -233.90 | 33 | 6 | 16 | 12 | 2 |
CPA | CPA39-54 | -229.14 | 21 | 2 | 11 | 8 | 0 |
CPA | CPA301-316 | -228.65 | 28 | 1 | 18 | 8 | 2 |
CPB | CPB42-57 | -210.19 | 27 | 9 | 16 | 8 | 0 |
CPC | CPC37-52 | -185.22 | 20 | 5 | 11 | 9 | 1 |
Binding energy: total binding energy (kJ/mol). HB: hydrogen bond. Data were obtained from the MDockPeP server, the Discovery Studio Visualizer, and AutoDock tools.
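The ordering of the peptides by binding energy can be reproduced from the values in Table 2 with a short Python sketch. The tuple layout and variable names are illustrative assumptions; the numbers are copied from the table. More negative energies indicate stronger predicted binding, so an ascending sort places the best candidate first.

```python
# (peptide, binding energy in kJ/mol, favorable, unfavorable, HB, hydrophobic, salt bridges)
table2 = [
    ("NH69-83",    -191.88, 29, 3, 13, 10, 2),
    ("SMT133-148", -233.90, 33, 6, 16, 12, 2),
    ("CPA39-54",   -229.14, 21, 2, 11,  8, 0),
    ("CPA301-316", -228.65, 28, 1, 18,  8, 2),
    ("CPB42-57",   -210.19, 27, 9, 16,  8, 0),
    ("CPC37-52",   -185.22, 20, 5, 11,  9, 1),
]

# Ascending sort on energy: the most negative (strongest) complex comes first.
ranked = sorted(table2, key=lambda row: row[1])

for position, (peptide, energy, favorable, *_rest) in enumerate(ranked, start=1):
    print(f"{position}. {peptide}: {energy:.2f} kJ/mol, {favorable} favorable interactions")
```

Sorting on a single column is a simplification; the text states that hydrogen bonds and hydrophobic interactions were also considered when ordering the sequences.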
The analysis of intra- and intermolecular interactions showed that the lowest binding energy corresponded to the peptide derived from the SMT protein (-233.90 kJ/mol), consistent with its number of favorable interactions and total hydrophobic interactions (Fig. 2). This peptide interacts with several amino acids of the A and B chains of MHC class II through conventional hydrogen bonds (green circles), carbon bonds (orange circles), and hydrophobic bonds (pink circles) (Fig. 2A).
Figure 2
2D diagrams of protein-peptide molecular docking.
Footnote Fig. 2. A. HLA-DR4–SMT133-148 peptide. B. HLA-DR4–CPA39-54 peptide. C. HLA-DR4–CPA301-316 peptide. D. HLA-DR4–CPB42-57 peptide. E. HLA-DR4–NH69-83 peptide. F. HLA-DR4–CPC37-52 peptide. Intra- and intermolecular interactions: Green: conventional hydrogen bonds. Pink: hydrophobic interactions. Orange: carbon bonds. Red: unfavorable bonds. Shadows: solvent-accessible surface.
The CPA-derived peptides have binding energies similar to that of the SMT-derived peptide. The 2D models show mostly conventional hydrogen bonds, followed by attractive charge and hydrophobic interactions. More solvent-accessible surfaces are also present (Fig. 2B-C). These results are consistent with the binding energies of sequences CPA39-54 (-229.14 kJ/mol) and CPA301-316 (-228.65 kJ/mol) and with their few unfavorable interactions (Table 2).
Among the selected CPB protein sequences, peptide 42–57 had a binding energy of -210.19 kJ/mol. The analysis of inter- and intramolecular interactions showed a predominance of hydrogen bonds and hydrophobic bonds between the peptide and the protein, as well as few carbon bonds and solvent-accessible surfaces (Fig. 2D). The 69–83 sequence of the NH protein also showed a low binding energy (-191.88 kJ/mol), with favorable interactions related to conventional hydrogen bonds, carbon bonds, and hydrophobic bonds within the peptide-protein complex (Fig. 2E).
The peptide derived from the CPC protein showed a binding energy of -185.22 kJ/mol with MHC class II. This was the highest energy among the sequences analyzed, correlating with the smallest number of favorable interactions (Table 2). The 2D analysis shows conventional hydrogen bonds but few carbon bonds; additionally, unfavorable interactions (red circles) and the absence of solvent-accessible surfaces are evident (Fig. 2F).