3.1. Bioinformatics studies
3.1.1. The overall procedure used to design the multi-epitope construct is shown in Fig. 1. The sequences of the nine antigenic proteins were retrieved from the NCBI server.
3.1.2. Prediction of MHC-II epitopes
HLA-DR was selected as the MHC-II locus, and DRB1*01:01 and DRB1*01:02 were selected as the alleles based on frequency data for the Iranian population from the Allele Frequency Net Database (AFND; http://www.allelefrequencies.net/hla6006a.asp) (González-Galarza, Takeshita et al. 2014). To predict MHC-II epitopes, three different servers (IEDB, ProPred, and RANKPEP) were used; the results are shown in Table 1.
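Most of the MHC-II epitopes reported here are 15 residues long and are given with 1-based start and end positions. As a minimal sketch of how such candidate peptides can be enumerated before scoring, the following Python snippet slides a fixed-length window along a protein sequence. The function name and the toy sequence are hypothetical and serve only as an illustration; the scoring performed by IEDB, ProPred, and RANKPEP is not reproduced.

```python
def sliding_peptides(sequence, length=15):
    """Yield (start, end, peptide) for every window, using 1-based inclusive positions."""
    for i in range(len(sequence) - length + 1):
        yield i + 1, i + length, sequence[i:i + length]


# Hypothetical toy sequence for illustration only (not one of the nine antigens).
toy_sequence = "MKTAYIAKQRQISFVKSHFSRQ"
windows = list(sliding_peptides(toy_sequence))

for start, end, peptide in windows[:3]:
    print(f"{start} TO {end}\t{peptide}")
```

Each window could then be submitted to a prediction server, and the highest-ranking windows kept as candidate epitopes.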
Table 1: Prediction of MHC II epitopes related to 9 antigenic proteins by 3 different servers
Target Antigens | Start–end position | Sequences
GP63 | 15 TO 29 | AARLVRLAAAGAAVI
GP63 | 235 TO 249 | PAVGVINIPAANIAS
GP63 | 296 TO 310 | INSSTAVAKAREQYG
GP63 | 76 TO 90 | LPYVTLDTAAAADRR
GP63 | 610 TO 624 | LGMVLSLMALVVVWL
KMP-11 | 20 TO 34 | NRKMQEQNAKFFADK
KMP-11 | 48 TO 62 | YEKFERMIKEHTEKF
KMP-11 | 70 TO 84 | SEHFKQKFAELLEQQ
KMP-11 | 74 TO 88 | KQKFAELLEQQKAAQ
CPB | 18 TO 32 | VLRILSLTSRRAAAV
CPB | 21 TO 35 | ILSLTSRRAAAVKDR
CPB | 88 TO 102 | AGALVMGTALLTESA
CPB | 102 TO 116 | ADEGATTTSHSHASH
CPB | 474 TO 488 | NSPSSAQSPQRRVLS
H1 | 4 TO 18 | DSAVAALSAAMTSPQ
H1 | 27 TO 41 | KTAAKKAAAKKAGAK
H1 | 32 TO 46 | KAAAKKAGAKKAGAK
H1 | 73 TO 87 | KVAKKVAKKPAKKAA
A2 | 16 TO 30 | VAAVLALSASAEPHK
A2 | 150 TO 164 | PQSVGPLSVGPLSVG
A2 | 174 TO 188 | GPQSVGPLSVGPLSV
A2 | 473 TO 487 | PLSVGLQAVDVSPVS
HASPB | 26 TO 40 | ANHRGAAGVPPKHAG
HASPB | 133 TO 147 | KEDGRTQKNDGDGPK
HASPB | 201 TO 223 | GPKEDENLQQNDGNAQQNDGNAQEKNEDGH
HASPB | 223 TO 237 | HNVGDGANGNEDGND
K39 | 68 TO 82 | DSEALRGQLEEANAE
K39 | 105 TO 119 | EALRGQLEEANAEKE
K39 | 360 TO 374 | KEDSEALRGQLEEAN
K39 | 682 TO 696 | KEDNEALRGQLEKTT
LACK2 | 69 TO 84 | SCVSLAHATDYALTAS
LACK2 | 101 TO 115 | RKFLKHTKDVLAVAF
LACK2 | 247 TO 271 | RFWMCVATERSLSVYDLESKAVIAE
PSA | 23 TO 40 | NDMVITDMNAAAVAFFGW
PSA | 64 TO 78 | NTVTLMAHLANSTDV
PSA | 264 TO 278 | CDAFGTVHRMTASMT
PSA | 378 TO 392 | RDLMDLAKARKVRLI
3.1.3. Prediction of B-Cell epitopes
Linear B-cell epitopes were predicted using three servers (BCpred, IEDB, and ABCpred); the results are shown in Table 2.
Table 2: Prediction of B-cell epitopes related to 9 antigenic proteins by 3 different servers
Target Antigens | Start–end position | Sequences
GP63 | 63 TO 72 | RHHTAPGAVS
GP63 | 84 TO 98 | AAAADRRPGSAPTVV
GP63 | 192 TO 203 | KVPPAHITEGFS
GP63 | 300 TO 311 | TAVAKAREQYGC
GP63 | 319 TO 330 | IEDQGGAGSAGS
KMP-11 | 18 TO 24 | EFNRKMQ
KMP-11 | 20 TO 26 | NRKMQEQ
KMP-11 | 30 TO 47 | FFADKPDESTLSPEMKEH
KMP-11 | 48 TO 73 | YEKFERMIKEHTEKFNKKMHEHSEHF
CPB | 29 TO 88 | AAAVKDRAKAAAAAATPSGLSKKFSHPSLSSSFERSGAGGTLSKRGSPESTAGACDSDGA
CPB | 100 TO 112 | ESADEGATTTSHS
CPB | 114 TO 124 | ASHMLHAPGGC
CPB | 457 TO 483 | SSSWRPISSWRPIPAASERATSANSPSSAQSPQ
H1 | 14 TO 47 | MTSPQKSPRSSPKKTAAKKAAAKKAGAKKAGAKK
H1 | 30 TO 53 | AKKAAAKKAGAKKAGAKKAVRKVA
H1 | 68 TO 79 | KKPAKKVAKKVA
H1 | 78 TO 94 | VAKKPAKKAAKKPAKKA
A2 | 25 TO 35 | SAEPHKAAVDV
A2 | 37 TO 40 | PLSV
A2 | 42 TO 67 | VGPLSVGPQSVGPLSVGPQSVGPLS
A2 | 86 TO 183 | VGPLSVGPQSVGPLSVGPQAVGPLSVGPQSVGPLSVDVGPQAVGPQSVGPLSVGPQSVGPLSVGPQSVGPLSVGPLSVGPQSVGSLSVGPQSVGPLSV
HASPB | 77 TO 88 | KEDGHTQKNDGD
HASPB | 121 TO 134 | DGRTQKNDGDGPKE
HASPB | 130 TO 152 | DGPKEDGRTQKNDGDGPKEDGRT
K39 | 41 TO 73 | LEEANAEKERLQSELEEKGSEAEAAKEDSEALR
K39 | 335 TO 367 | LEEANAEKERLQSELEEKGSEAAAAKEDSEALR
K39 | 370 TO 382 | LEEANAEKERLQS
LACK2 | 32 TO 57 | TSRDGTAISWKANPDRHSVDSDYGLP
LACK2 | 70 TO 104 | CVSLAHATDYALTASWDRSIRMWDLRNGQCQRKFL
LACK2 | 259 TO 269 | SVYDLESKAVI
LACK2 | 272 TO 281 | LTPDGAKPSE
PSA | 83 TO 90 | QTTRDPHA
PSA | 89 TO 102 | HATVVAWTILPIRL
PSA | 180 TO 190 | LGRPKKPNANQ
PSA | 186 TO 206 | PNANQSLKRILPRLQEVLEKE
3.1.4. Multi-epitope antigens construction
Eighteen epitopes from the 9 antigenic proteins were selected as the regions with maximum overlap between B-cell and MHC-II epitopes (Table 3). The epitopes selected from each protein were joined by the linkers GGSG, SSAG, GGGS, and GGAG. The final recombinant antigen construct contains 461 amino acid residues and is shown in Fig. 2.
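The overlap criterion can be illustrated with a short Python sketch that intersects the predicted position ranges of the two epitope classes. The GP63 ranges are taken from Tables 1 and 2; the function name is hypothetical, and the additional judgment used to arrive at the final 21-residue epitopes is not reproduced here.

```python
def overlaps(mhc_ranges, bcell_ranges):
    """Return (start, end) of every region shared by an MHC-II and a B-cell epitope."""
    shared = []
    for m_start, m_end in mhc_ranges:
        for b_start, b_end in bcell_ranges:
            start, end = max(m_start, b_start), min(m_end, b_end)
            if start <= end:  # the two ranges share at least one residue
                shared.append((start, end))
    return sorted(shared)


# GP63 start-end positions from Table 1 (MHC-II) and Table 2 (B-cell).
gp63_mhc = [(15, 29), (235, 249), (296, 310), (76, 90), (610, 624)]
gp63_bcell = [(63, 72), (84, 98), (192, 203), (300, 311), (319, 330)]

gp63_shared = overlaps(gp63_mhc, gp63_bcell)
print(gp63_shared)  # [(84, 90), (300, 310)]
```

Both shared regions fall inside the two GP63 epitopes listed in Table 3 (70-90 and 300-320).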
Table 3: The eighteen final epitopes selected from the 9 antigenic proteins based on the results of several servers.
Antigen | Start–end position | Sequences
GP63 | 70-90 | AVSAVGLPYVTLDTAAAADRR
GP63 | 300-320 | NSSTAVAKAREQYGCDTLEYL
CPB | 100-120 | ESADEGATTTSHSHASHMLHA
CPB | 460-480 | WRPIPAASERATSANSPSSAQ
A2 | 20-50 | LALSASAEPHKAAVDVGPLSV
A2 | 140-160 | PQSVGPLSVGPQSVGPLSVGP
HASPB | 130-150 | DDGGPKEDGHTQKNDGDGPKE
HASPB | 200-220 | DGDGPKEDGRTQKNDGDGPKE
K39 | 50-70 | RLQSELEEKGSEAEAAKEDSE
K39 | 360-380 | KEDSEALRGQLEEANAEKERL
KMP-11 | 20-40 | NRKMQEQNAKFFADKPDESTL
KMP-11 | 50-70 | KFERMIKEHTEKFNKKMHEHS
LACK2 | 70-90 | CVSLAHATDYALTASWDRSIR
LACK2 | 260-280 | VYDLESKAVIAELTPDGAKPS
PSA | 80-100 | VVIQTTRDPHATVVAWTILPI
PSA | 180-200 | LGRPKKPNANQSLKRILPRLQ
H1 | 30-50 | AKKAAAKKAGAKKAGAKKAVR
H1 | 70-90 | PAKKVAKKVAKKPAKKAAKKP
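The joining of epitopes with linkers can be sketched in Python as follows. The four linkers are those named in the text, and the example epitopes are the GP63 and CPB entries of Table 3. The order of the epitopes and the rotation of linkers between neighbouring epitopes are assumptions made for illustration, since the text does not state which linker joins which pair; the full 461-residue construct of Fig. 2 is not reproduced.

```python
from itertools import cycle

LINKERS = ["GGSG", "SSAG", "GGGS", "GGAG"]  # linkers named in the text


def join_with_linkers(epitopes, linkers=LINKERS):
    """Concatenate epitopes, inserting linkers in rotation (assumed scheme)."""
    if not epitopes:
        return ""
    linker_iter = cycle(linkers)
    parts = [epitopes[0]]
    for epitope in epitopes[1:]:
        parts.append(next(linker_iter))  # one linker between each pair of neighbours
        parts.append(epitope)
    return "".join(parts)


# Four of the eighteen epitopes from Table 3, in an assumed order.
subset = [
    "AVSAVGLPYVTLDTAAAADRR",  # GP63 70-90
    "NSSTAVAKAREQYGCDTLEYL",  # GP63 300-320
    "ESADEGATTTSHSHASHMLHA",  # CPB 100-120
    "WRPIPAASERATSANSPSSAQ",  # CPB 460-480
]
construct = join_with_linkers(subset)
print(construct)
print(len(construct))
```

Short glycine- and serine-rich linkers of this kind are commonly used to keep neighbouring epitopes flexible and spatially separated.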
3.1.5. Allergenicity and antigenicity evaluation
The AlgPred and AllerTOP 2.0 servers predicted that the designed construct is non-allergenic. ANTIGENpro predicted an antigenicity score of 0.931519 for the multi-epitope antigen, which suggests that it can stimulate humoral and cellular immune responses.
3.1.6. Physicochemical parameters and protein solubility evaluation
The molecular weight, number of amino acids, theoretical pI, aliphatic index, grand average of hydropathicity (GRAVY), and the total numbers of negatively charged (Asp + Glu) and positively charged (Arg + Lys) residues were obtained using the ProtParam server. The probability of solubility of the multi-epitope antigen was determined using the SOLpro server. The results are shown in Table 4.
Table 4: Physicochemical properties and solubility of designed structure of multi-epitope antigens
Physicochemical properties | Result
Number of amino acids | 461
Molecular weight (Da) | 46355.10
Theoretical pI | 8.96
Total number of negatively charged residues (Asp + Glu) | 56
Total number of positively charged residues (Arg + Lys) | 62
Total number of atoms | 6451
Instability index (II) | 38.51
Aliphatic index | 56.88
Grand average of hydropathicity (GRAVY) | -0.734
Solubility | 0.939126
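Two of the quantities in Table 4 follow simple published formulas and can be recomputed from a sequence. The Python sketch below uses the Kyte-Doolittle hydropathy scale for GRAVY and the Ikai formula for the aliphatic index (mole percent of Ala, plus 2.9 times that of Val, plus 3.9 times that of Ile and Leu). It is an independent illustration, not ProtParam's own code, and the example input is a single epitope from Table 3 rather than the full construct.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathicity: mean hydropathy value per residue."""
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def mole_percent(sequence, residue):
    """Percentage of positions in the sequence occupied by the given residue."""
    return 100.0 * sequence.count(residue) / len(sequence)


def aliphatic_index(sequence):
    """Aliphatic index: X(Ala) + 2.9 * X(Val) + 3.9 * (X(Ile) + X(Leu))."""
    return (
        mole_percent(sequence, "A")
        + 2.9 * mole_percent(sequence, "V")
        + 3.9 * (mole_percent(sequence, "I") + mole_percent(sequence, "L"))
    )


epitope = "KEDSEALRGQLEEANAEKERL"  # K39 360-380 from Table 3
print(f"GRAVY = {gravy(epitope):.3f}")
print(f"Aliphatic index = {aliphatic_index(epitope):.2f}")
```

A negative GRAVY value, such as the -0.734 reported for the construct, indicates an overall hydrophilic protein.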
3.1.7. Secondary structure analysis
The secondary structure predicted by the PSIPRED server is shown in Fig. 3. The GOR IV server predicted that the designed protein consists of 34.06% alpha helix, 13.45% extended strand, and 52.49% random coil, the main elements of secondary structure.
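As a quick consistency check, the GOR IV percentages can be converted into approximate residue counts for the 461-residue construct. The counts in this Python sketch are derived here by rounding and are not values reported by the server.

```python
n_residues = 461  # length of the construct reported in the text

# Percentages reported by GOR IV.
percentages = {"alpha helix": 34.06, "extended strand": 13.45, "random coil": 52.49}

# Derived residue counts (rounded to the nearest whole residue).
counts = {name: round(pct / 100 * n_residues) for name, pct in percentages.items()}

for name, count in counts.items():
    print(f"{name}: {count} residues")
print("total:", sum(counts.values()))
```

The three rounded counts add up to the full length of the construct, so the reported percentages are mutually consistent.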
3.1.8. Tertiary structure analysis
Five models were proposed by the I-TASSER server. C-scores typically range from -5 to 2, and a higher C-score indicates a model of higher confidence; therefore, model 1, with a C-score of -0.82, was selected for further assessment (Fig. 4).
3.1.9. 3D structure validation and refinement of the 3D structure
The selected model was refined with the GalaxyRefine server, and Ramachandran analysis was performed before and after refinement. In the initial model, the Ramachandran plot showed 256 (55.8%), 132 (28.8%), and 71 (15.5%) residues in the favored, allowed, and outlier regions, respectively. After refinement, the corresponding numbers were 371 (80.8%), 63 (13.7%), and 25 (5.4%). The potential errors and quality of the initial and refined 3D models were also evaluated with the ERRAT and Verify 3D servers. For the initial model, ERRAT gave an overall quality factor of 77.6699, and according to Verify 3D, 69.85% of residues had an average 3D-1D score greater than 0.2. After refinement, the ERRAT and Verify 3D values were 61.017% and 61.17%, respectively (Fig. 5).
According to the Ramachandran results above, the quality of the 3D structure improved after refinement.
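The Ramachandran percentages can be recomputed from the reported residue counts, as in the following Python sketch (the function name is hypothetical). The counts sum to 459 rather than 461, presumably because the two terminal residues each lack one of the backbone dihedral angles and are excluded from the plot; this explanation is an assumption, not a statement from the servers.

```python
def region_percentages(favored, allowed, outlier):
    """Return the total residue count and the percentage in each region (1 decimal)."""
    total = favored + allowed + outlier
    percentages = tuple(round(100 * n / total, 1) for n in (favored, allowed, outlier))
    return total, percentages


initial = region_percentages(256, 132, 71)  # counts before refinement
refined = region_percentages(371, 63, 25)   # counts after refinement

print("initial:", initial)  # (459, (55.8, 28.8, 15.5))
print("refined:", refined)  # (459, (80.8, 13.7, 5.4))
```

The recomputed values match the percentages given in the text, and the shift from the outlier to the favored region is the basis of the reported improvement.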
3.1.10. Codon optimization and in silico cloning
The protein sequence was reverse-translated with the SMS server, and the resulting DNA sequence was submitted to the JCAT server to assess the main factors that influence the level of protein expression in the E. coli host. According to the JCAT results, the codon adaptation index (CAI) of the optimized nucleotide sequence was 1.0, the optimum value for expression in the E. coli host strain. The average GC content of the DNA sequence was 52.5%, which lies within the optimal range of 30% to 70%; values outside this range adversely affect protein expression. The codon frequency distribution (CFD) of the complete gene sequence was evaluated with the GenScript server and was 100%; CFD values below 30% reduce the efficiency of transcription and translation. Overall, these findings indicate that the DNA sequence was optimized for high-level expression in the host.
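The GC-content criterion can be checked with a few lines of Python. The sketch below is a minimal illustration: the function names and the short DNA fragment are hypothetical, the 30% to 70% window is the range quoted in the text, and the CAI and CFD calculations performed by JCAT and GenScript are not reproduced because they require host codon usage tables.

```python
def gc_content(dna):
    """Return the percentage of G and C bases in a DNA sequence."""
    dna = dna.upper()
    return 100.0 * (dna.count("G") + dna.count("C")) / len(dna)


def in_optimal_range(gc_percent, low=30.0, high=70.0):
    """Check whether a GC percentage lies within the range quoted in the text."""
    return low <= gc_percent <= high


# Hypothetical fragment for illustration only (not the optimized gene).
fragment = "ATGGCGGTTAGCGCGGTTGGTCTG"
print(f"GC content: {gc_content(fragment):.1f}%")
print("reported value within range:", in_optimal_range(52.5))
```

Applying `gc_content` to the full optimized sequence would be expected to give a value close to the 52.5% reported by JCAT.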
3.2. Experimental studies
3.2.1. Confirmation of subcloning of the multi-epitope antigens fusion in pET-26b expression vector
In the first step, one of the clones carrying the puc57R recombinant plasmid was cultured. The plasmid extracted from this culture was digested with the BamHI and HindIII enzymes. After digestion, separation of the recombinant gene fragment from the cloning vector was observed by agarose gel electrophoresis (Fig. 6). The separated fragment was purified from the gel using a purification kit. In the second step, a culture carrying the pET-26b plasmid was grown; the plasmid extracted from this culture was digested with BamHI and HindIII and purified. In the third step, the recombinant gene fragment was ligated into the digested pET-26b plasmid with T4 DNA ligase. The ligation product was transformed into competent cells of E. coli strain TOP10, and the transformed cells were cultured to propagate the recombinant plasmids.
3.2.2. Colony PCR and enzymatic digestion to confirm cloning
The recombinant gene construct was successfully subcloned into the pET-26b expression vector. The presence of the recombinant gene in the pET-26b vector was confirmed by colony PCR and enzymatic digestion.
In colony PCR, the plasmid in the colonies served as the template, and the recombinant gene was amplified with primers specific to it. Electrophoresis of the PCR product showed a 400 bp fragment, confirming the presence of the insert in the pET-26b vector (Fig. 7).
Because the BamHI and HindIII recognition sites flank the recombinant gene, digestion of the recombinant plasmid with these two enzymes should release the recombinant gene fragment. The recombinant plasmids verified by colony PCR were therefore digested with BamHI and HindIII. The digestion products were visible on an agarose gel as a 1383 bp band (insert) and a band corresponding to the linearized pET-26b plasmid (Fig. 8).
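The size of the released insert is consistent with the length of the designed protein, since each amino acid is encoded by three nucleotides. The Python sketch below makes this arithmetic explicit; the function name is hypothetical, and the assumption that the 1383 bp band corresponds to the coding sequence alone, without a stop codon or flanking bases, is inferred from the numbers rather than stated in the text.

```python
def coding_length_bp(n_residues, include_stop=False):
    """Length in base pairs of a coding sequence for a protein of n_residues."""
    codons = n_residues + (1 if include_stop else 0)
    return 3 * codons


n_residues = 461  # length of the multi-epitope construct
expected_bp = coding_length_bp(n_residues)
print(expected_bp)  # 1383
```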
3.2.3. Expression of multi-epitope antigens
After cloning of the recombinant gene in the pET-26b plasmid vector was verified, the recombinant vector was transformed into the E. coli BL21 expression host. Among the colonies obtained, two colonies carrying pET-26b with the recombinant gene, two colonies carrying pET-26b without the recombinant gene, and two colonies of untransformed BL21 cells were assessed. After cultivation of the colonies and induction with IPTG (1 mM final concentration), expression of the recombinant gene before and after induction was examined on a 12.5% SDS-PAGE gel. The recombinant protein was successfully expressed at 37 °C with 1 mM IPTG, 16 hours after induction. SDS-PAGE analysis showed a 46 kDa band corresponding to the recombinant protein. Fig. 9 shows the expression results of the six colonies before and after induction. In addition, western blot analysis with an anti-His antibody confirmed the identity of the recombinant protein (Fig. 10).