The primary aim of this study was to identify novel therapeutic targets or vaccine candidates against H. pylori using a structural genomics approach. We analyzed a set of unreviewed hypothetical proteins with several bioinformatics databases and computational biology tools. The overall findings of the subtractive proteomics study are summarized in Figure 1.
Candidate protein for vaccine design
Analysis of the H. pylori hypothetical proteins
A preliminary functional annotation was carried out using the GO FEAT platform. From the H. pylori 26695 reference proteome (1,115 proteins), a set of 944 unreviewed hypothetical proteins was used for the analysis. Of these, the proteins with known domains and/or families and assigned GO terms (542 proteins) were selected for further analysis (Supplementary File 1). These functionally annotated proteins may play important roles in the cell and are hereafter referred to as hypothetical proteins (HPs). Functional annotation of HPs helps in understanding the structures, functions and pathways that contribute to the pathogenesis of the bacterium and is thus crucial for identifying novel therapeutic targets. Pathogen proteins homologous to human proteins, and proteins in metabolic pathways shared by pathogen and host, were determined using various web-based bioinformatics resources.
Selection of non-homologous human proteins
All hypothetical protein sequences with an annotated functional domain/family were then screened to retain only those non-homologous to human proteins. BLASTp was performed against H. sapiens sequences from NCBI with an e-value threshold of 10⁻³. This yielded a set of 412 non-homologous proteins from H. pylori (Supplementary File 1).
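The screening step above can be sketched as a filter over BLAST tabular output. This is a minimal illustration, not the pipeline used in the study: it assumes the standard 12-column tabular format (e-value in the eleventh column), assumes that a hit at or below the cutoff marks a query as homologous, and uses hypothetical protein identifiers and toy hit lines.

```python
import csv
import io


def non_homologous(query_ids, blast_tabular, evalue_cutoff=1e-3):
    """Return the query IDs that have no hit at or below the e-value cutoff.

    blast_tabular is text in the 12-column BLAST tabular layout:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    homologous = set()
    reader = csv.reader(io.StringIO(blast_tabular), delimiter="\t")
    for row in reader:
        # Skip blank lines and comment lines.
        if not row or row[0].startswith("#"):
            continue
        query_id, evalue = row[0], float(row[10])
        if evalue <= evalue_cutoff:
            homologous.add(query_id)
    # Queries with no hits at all are kept as non-homologous.
    return [q for q in query_ids if q not in homologous]


# Hypothetical toy data: HP_A has a strong human hit, HP_B only a weak one,
# and HP_C has no hit at all.
toy_hits = "\n".join([
    "HP_A\thuman_1\t45.0\t200\t110\t2\t1\t200\t5\t204\t2e-30\t120",
    "HP_B\thuman_2\t28.0\t60\t43\t1\t10\t69\t3\t62\t0.5\t25.0",
])

print(non_homologous(["HP_A", "HP_B", "HP_C"], toy_hits))
```

Queries without any reported hit are treated as non-homologous, which is why the full list of query identifiers is passed in separately from the hit table.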
Essential protein analysis
A total of 77 essential genes were identified by performing BLASTp against the DEG database (e-value < 0.0001). The corresponding proteins, considered crucial for H. pylori survival, are unique (Supplementary File 1) and are believed to be useful in identifying species-specific drug targets/vaccine candidates (44).
Human Gut microbiota analysis
Inadvertent blockage of gut flora proteins that are homologous to pathogen proteins may lead to adverse effects (31). To avoid this, essential H. pylori proteins with homologs among gut microbial proteins were to be omitted from further analysis. This step was carried out using BLASTp, with the human gut metagenome 16S ribosomal RNA chosen as the search set and an expect threshold of 0.05. No significant matches were found, suggesting that all of the essential proteins are unique to the pathogen.
Analysis of metabolic pathways
We retrieved a total of 342 human metabolic pathways and 95 H. pylori-specific metabolic pathways from the KEGG database (Supplementary File 2). For screening novel therapeutic targets, only the proteins exclusively involved in pathways specific to the pathogen were considered. The set of 77 essential proteins was submitted to the KAAS server for assignment of KEGG Orthology (KO) identifiers and specific metabolic pathways. A total of fifty-five proteins were assigned a KO, and forty-two proteins were involved in pathways common to H. sapiens. These forty-two proteins were omitted from further screening to circumvent cross-reactivity with other human pathogens. Finally, twelve unique proteins were found to be involved in pathogen-specific pathways (Table 1).
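The pathway subtraction described above amounts to a set comparison. The sketch below is illustrative only: the protein names are hypothetical, the pathway map numbers are examples, and it assumes that a protein is kept only when it has at least one assigned pathway and none of its pathways occur in the host.

```python
def pathogen_specific_proteins(protein_pathways, host_pathways):
    """Return proteins whose assigned pathways are all absent from the host.

    protein_pathways maps a protein ID to the pathway IDs assigned to it.
    Proteins with no assigned pathway are dropped, since nothing can be
    concluded about them at this step.
    """
    host = set(host_pathways)
    selected = []
    for protein, pathways in protein_pathways.items():
        assigned = set(pathways)
        if assigned and assigned.isdisjoint(host):
            selected.append(protein)
    return sorted(selected)


# Hypothetical toy assignment; pathway map numbers are illustrative examples.
toy_assignment = {
    "prot_1": ["00540"],           # only in a pathway absent from the host
    "prot_2": ["00010", "00550"],  # shares one pathway with the host
    "prot_3": [],                  # no pathway assigned
    "prot_4": ["00550"],
}
toy_host_pathways = {"00010"}

print(pathogen_specific_proteins(toy_assignment, toy_host_pathways))
```

Requiring every assigned pathway to be absent from the host is the stricter reading of "exclusively involved"; a looser filter would keep any protein with at least one pathogen-only pathway.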
Analysis of subcellular location
Prediction of protein localization is a vital parameter in identifying therapeutic targets because many pathogens can span multiple locations (45). Among the 12 proteins, two were identified as inner membrane proteins, seven as cytoplasmic and the remaining three as periplasmic proteins (Table 2).
Analyzing druggability of hypothetical protein
The novelty of the membrane proteins as drug targets was analyzed using the DrugBank database (Supplementary File 3). Proteins with significant similarity to entries in the database were excluded; penicillin-binding protein 1A exhibited similarity above the threshold under consideration and was therefore excluded. However, the second protein, lipid A 4'-phosphatase (LpxF), showed no similarity to any of the current drug targets in the database.
‘Anti-target’ analysis of the novel drug target
Because of inadvertent side effects, various drug candidates have been withdrawn or had their use restricted to extreme situations. Checking for cross-reactivity and carcinogenesis is crucial in selecting an effective drug molecule (46-48). Toxicity caused by unintended binding of drugs to host ‘anti-targets’ instead of pathogen targets must be avoided. In this regard, the analysis revealed no similarity between lpxF and any of the human ‘anti-target’ proteins (Supplementary File 4), and lpxF is thus considered a host ‘non-anti-target’ protein.
Antigenicity and allergenicity prediction
Reverse vaccinology is considered one of the most powerful approaches for designing a candidate vaccine (49-50). Small antigenic protein sequences are considered for developing a safe recombinant vaccine with the potency to fight infectious diseases (51). The identified novel drug target of H. pylori, LpxF, was submitted to the VaxiJen v2.0 server. The results indicated that LpxF is a probable antigen of H. pylori, with a score of 0.5232 (threshold: 0.4). This membrane protein can therefore be used to identify highly immunodominant peptides for developing an efficient subunit vaccine against H. pylori.
Conservancy analysis of H. pylori 26695 lpxF sequence with other strains
A very high degree of conservation of lpxF was found among various strains of H. pylori. This broad conservation supports the predicted lpxF as a potential drug and/or vaccine target against H. pylori (52). The BLASTp results showing lpxF conservancy across H. pylori strains from diverse geographical locations are tabulated in Supplementary File 5. Hence, the lpxF protein might be a potential vaccine candidate/therapeutic target against H. pylori.
Virulence factors of pathogenic H. pylori
The virulence mechanism of the non-homologous, essential protein can be explored by submitting it to the virulence factor database. The virulence factors retrieved for the query protein lpxF are listed in Table 3.
Peptide vaccine discovery
B-cell epitope prediction.
Three linear B-cell epitopes, each 20 amino acids long, were predicted by BCPred with a score of ≥0.90. The B-cell epitope prediction server from IEDB was also used to identify epitopes with five different methods, all run with default parameters; the results are listed in Table 4.
The surface accessibility of H. pylori lpxF was predicted using a threshold value of >1; amino acids scoring above this value are likely to be present on the protein surface. The maximum surface probability score was 11.053 for FTSRYKPKRWML165-176. Figure 2A depicts the predicted surface accessibility of H. pylori lpxF, while Table S1 (Supplementary Material) lists the maximum and minimum accessibility scores. The Karplus and Schulz analysis of the surface flexibility of H. pylori lpxF revealed both ordered and disordered regions, indicated by low and high B-factor values, respectively. The maximum predicted surface flexibility score was 1.107 for FKGSSRY184-190. The predicted surface flexibility of H. pylori lpxF is shown graphically in Figure 2B, and the minimum and maximum scores are given in Table S6 (Supplementary Material).
The Parker approach was used to predict the hydrophilicity of the predicted epitopes from H. pylori (53), as illustrated in Figure 2C. Across all the predicted peptides for lpxF, the maximum and minimum scores were 5.329 and -7.071, at residue positions STAHKDG79-85 and FLSLLLW8-14, respectively; these peptides were predicted to act as active B-cell epitopes.
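Propensity-scale predictions of this kind are typically computed as a sliding-window average over the sequence. The sketch below shows the general idea under stated assumptions and does not reproduce the scores reported above: it swaps in the Kyte-Doolittle hydropathy scale (where positive values indicate hydrophobic residues, the opposite sign convention to a hydrophilicity scale such as Parker's), assumes a window of seven residues to match the 7-mer segments reported, and uses a toy sequence built by joining the two reported segments rather than the real lpxF sequence.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic). This scale is
# swapped in for illustration; it is not the Parker scale used in the study.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_score(segment, scale=KYTE_DOOLITTLE):
    """Average scale value over the residues of a segment."""
    return sum(scale[aa] for aa in segment) / len(segment)


def window_profile(sequence, scale=KYTE_DOOLITTLE, window=7):
    """Return (start_position, segment, mean_score) for every full window.

    Positions are 1-based, following the residue numbering used in the text.
    """
    profile = []
    for i in range(len(sequence) - window + 1):
        segment = sequence[i:i + window]
        profile.append((i + 1, segment, mean_score(segment, scale)))
    return profile


# Toy sequence: the two reported 7-mers joined end to end (not real lpxF).
toy_sequence = "FLSLLLW" + "STAHKDG"
profile = window_profile(toy_sequence)
most_hydrophobic = max(profile, key=lambda row: row[2])

print(most_hydrophobic[1])
print(round(mean_score("FLSLLLW"), 3), round(mean_score("STAHKDG"), 3))
```

On this substitute scale the segment FLSLLLW averages well above zero and STAHKDG well below, which is consistent in direction with the two segments sitting at opposite extremes of the reported profile.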
Prediction of Cytotoxic and HTC epitopes
NetCTL 1.2 was used to predict cytotoxic T-lymphocyte (CTL) epitopes for the lpxF protein. A total of eight CTL epitopes were predicted based on the defined criteria and specific MHC binding score (Table 5). These potential epitopes for MHC class I molecules were predicted against the HLA-A*24:02 allele using the SMM method. MHC-I binding and proteasome-dependent C-terminal cleavage of lpxF were also predicted using weight matrices and artificial neural networks. Finally, the prominent epitopes were selected based on MHC binding affinity, TAP score and C-terminal cleavage score.
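The way the three scores can be combined into a single ranking is sketched below. The weights (0.15 for C-terminal cleavage, 0.05 for TAP transport) and the selection threshold of 0.75 are assumptions taken to be the NetCTL defaults; the source does not state the values actually used. The peptide names and scores are hypothetical placeholders, not results from this study.

```python
def combined_score(mhc, cleavage, tap, w_cleavage=0.15, w_tap=0.05):
    """Weighted sum of MHC binding, C-terminal cleavage and TAP scores.

    The default weights are assumed values, not taken from the source.
    """
    return mhc + w_cleavage * cleavage + w_tap * tap


def select_epitopes(candidates, threshold=0.75):
    """Return (name, score) pairs at or above the threshold, best first."""
    scored = [
        (name, combined_score(mhc, cleavage, tap))
        for name, (mhc, cleavage, tap) in candidates.items()
    ]
    kept = [item for item in scored if item[1] >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Hypothetical placeholder candidates: (MHC score, cleavage score, TAP score).
toy_candidates = {
    "pep_x": (0.60, 0.90, 2.0),
    "pep_y": (0.40, 0.50, 1.0),
}

for name, score in select_epitopes(toy_candidates):
    print(name, round(score, 3))
```

Because the MHC term carries a weight of one, it dominates the ranking, while cleavage and TAP act as smaller adjustments.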
Structural modeling of peptides and molecular docking studies
A total of thirty-five epitopes were shortlisted by combining B-cell and T-cell epitopes from the lpxF protein (Table S7, Supplementary Material). The top five epitopes were re-docked with the exhaustiveness parameter set to 100 for a more thorough conformational search. This led to the selection of two peptides with the best binding conformations towards the TLR2 target protein, and the interaction patterns of the peptide-TLR2 complexes were identified. The epitope AGFVYYR (PEP_19) bound TLR2 with a docking energy of -6.9 kcal/mol. Residues Phe3, Tyr5, Tyr6 and Arg7 of PEP_19 interacted with Leu119, Asn137, Phe144 and Asn143 of TLR2. Similarly, the docking energy for the epitope FAYLFTSRY (PEP_31) with TLR2 was also -6.9 kcal/mol. Residues Phe1, Tyr3, Phe5, Thr6 and Ser7 of PEP_31 interacted with Asn265, Ser184, His159 and Val154 of the TLR2 receptor protein. Hydrogen bonding and hydrophobic interactions in the docked complexes were analyzed using the Discovery Studio tool. The interaction patterns are shown in Figures 3 and 4.