Analysis of mutation patterns
The UK strain (20I/501Y.V1, lineage B.1.1.7) is defined by the replacement of asparagine with tyrosine at position 501 (N501Y) in the receptor-binding domain (RBD) of the spike protein. Several other mutations, including the 69/70 deletion and P681H near the S1/S2 furin cleavage site, were also observed. The South Africa strain (20H/501Y.V2, lineage B.1.351) carries numerous mutations in the spike protein, including K417N, E484K, and N501Y. The first Brazilian lineage, P.1, contains 10 spike protein mutations, including N501Y, E484K, and K417T, along with 17 other unique mutations. The second Brazilian lineage, P.2, was also reported to carry three spike protein mutations, namely E484K, N501Y, and K417T. The E484K mutation has been reported in the South African variant but not in the UK variant (10).

The Indian B.1.617 variant carries two mutations in the spike glycoprotein, E484Q and L452R, both of which were already circulating globally. Given the current surge, there is concern that B.1.617 is not merely a double mutant and may carry additional mutations, including E154K, P681R, and Q1071H. The three mutations E484Q, L452R, and P681R, or closely related substitutions, have been found in other variants of concern, including those from the UK, South Africa, and Brazil. E484Q is similar to the E484K mutation previously seen in the Brazilian and South African variants. L452R is also present in the California variant, an immune-escape strain, and may therefore affect vaccine efficacy. P681R resembles the P681H mutation seen in the UK variant. Other strains present in many countries exhibit different mutations in their spike glycoproteins, as shown in Table 1.
Table 1: Presence (+) or absence (-) of mutations in the spike proteins of various emerging strains of SARS-CoV-2
| Mutation in spike protein | B.1.1.7 UK strain | P.1 Brazil | B.1.429 California | B.1.351 South Africa | B.1.427 California | B.1.526 New York | B.1.526.1 New York | B.1.526.2 New York | P.2 Brazil | B.1.617 India |
|---|---|---|---|---|---|---|---|---|---|---|
| DEL69/70 | + | - | - | - | - | - | - | - | - | - |
| DEL144/144 | + | - | - | - | - | - | + | - | - | - |
| N501Y | + | + | - | + | - | - | - | - | - | - |
| A570D | + | - | - | - | - | - | - | - | - | - |
| D614G | + | + | + | + | + | + | + | + | + | + |
| P681H | + | - | - | - | - | - | - | - | - | - |
| T716I | + | - | - | - | - | - | - | - | - | - |
| S982A | + | - | - | - | - | - | - | - | - | - |
| D1118H | + | - | - | - | - | - | - | - | - | - |
| L18F | - | + | - | - | - | - | - | - | - | - |
| T20N | - | + | - | - | - | - | - | - | - | - |
| P26S | - | + | - | - | - | - | - | - | - | - |
| D138Y | - | + | - | - | - | - | - | - | - | - |
| R190S | - | + | - | - | - | - | - | - | - | - |
| K417T | - | + | - | - | - | - | - | - | - | - |
| E484K | - | + | - | + | - | - | - | - | - | - |
| H655Y | - | + | - | - | - | - | - | - | - | - |
| T1027I | - | + | - | - | - | - | - | - | - | - |
| S13I | - | - | + | - | - | - | - | - | - | - |
| W152C | - | - | + | - | - | - | - | - | - | - |
| L452R | - | - | + | - | + | - | + | - | - | + |
| K417N | - | - | - | + | - | - | - | - | - | - |
| A701V | - | - | - | + | - | - | - | - | - | - |
| D253G | - | - | - | - | - | - | - | + | + | - |
| T95I | - | - | - | - | - | + | - | + | + | - |
| F157S | - | - | - | - | - | - | + | - | - | - |
| T859N | - | - | - | - | - | - | + | - | - | - |
| D950H | - | - | - | - | - | - | + | - | - | - |
| D80G | - | - | - | - | - | - | + | - | - | - |
| D253G | - | - | - | - | - | + | - | - | - | - |
| S477N | - | - | - | - | - | - | - | + | + | - |
| L5F | - | - | - | - | - | - | - | + | + | - |
| Q957R | - | - | - | - | - | - | - | + | + | - |
| P681R | - | - | - | - | - | - | - | - | - | + |
| E484Q | - | - | - | - | - | - | - | - | - | + |
| E154K | - | - | - | - | - | - | - | - | - | + |
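The comparison of mutation patterns described above can be sketched in a few lines of Python. The block below is only an illustration, not part of the original analysis: it encodes the "+" entries of Table 1 for four of the lineages as sets, then asks which mutations are shared and which spike positions carry alternative substitutions (for example E484K versus E484Q). The dictionary `VARIANTS` and all function names are hypothetical helpers introduced here.

```python
import re
from collections import defaultdict

# Spike mutations marked "+" in Table 1 for four of the listed lineages.
VARIANTS = {
    "B.1.1.7": {"DEL69/70", "DEL144/144", "N501Y", "A570D", "D614G",
                "P681H", "T716I", "S982A", "D1118H"},
    "P.1": {"N501Y", "D614G", "L18F", "T20N", "P26S", "D138Y", "R190S",
            "K417T", "E484K", "H655Y", "T1027I"},
    "B.1.351": {"N501Y", "D614G", "E484K", "K417N", "A701V"},
    "B.1.617": {"D614G", "L452R", "P681R", "E484Q", "E154K"},
}

# One-letter reference residue, position, one-letter replacement (e.g. N501Y).
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label):
    """Split 'N501Y' into ('N', 501, 'Y'); return None for deletions etc."""
    match = SUBSTITUTION.match(label)
    if match is None:
        return None
    ref, pos, alt = match.groups()
    return ref, int(pos), alt


def carriers(mutation):
    """Names of the encoded lineages that carry the given mutation."""
    return sorted(name for name, muts in VARIANTS.items() if mutation in muts)


def shared_by_all():
    """Mutations present in every encoded lineage."""
    return set.intersection(*VARIANTS.values())


def sites_with_alternative_substitutions():
    """Positions where the encoded lineages carry different substitutions."""
    by_position = defaultdict(set)
    for muts in VARIANTS.values():
        for label in muts:
            parsed = parse_substitution(label)
            if parsed is not None:
                by_position[parsed[1]].add(label)
    return {pos: sorted(labels)
            for pos, labels in by_position.items() if len(labels) > 1}


if __name__ == "__main__":
    print(carriers("N501Y"))
    print(shared_by_all())
    print(sites_with_alternative_substitutions())
```

Grouping by position rather than by exact label is what lets E484Q be related to E484K, and P681R to P681H, as discussed in the text.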
Generation of 3D Protein Structures: Validation and Analysis
Homology modelling is a key procedure for predicting protein structures in relation to their functions. The SWISS-MODEL server was used to build the protein structures by homology modelling. Based on query coverage, sequence identity and GMQE scores, the BLAST and HHblits searches identified PDB IDs 6ZWV and 7KRS as the best template matches for the spike glycoproteins of the Brazil P.1 and Indian B.1.617 strains, each with 100% query coverage. The sequence identity (sequence similarity) was 99.29% (0.62) for the Brazil P.1 strain and 99.14% (0.62) for the Indian B.1.617 strain. Furthermore, the quaternary structure quality estimate (QSQE) values were 0.93 for the Brazil P.1 strain and 0.78 for the Indian B.1.617 strain. These values are listed in Table 2.
Table 2: Properties of the generated 3D structures of the spike glycoproteins of the new strains
| Properties | Brazil Strain P.1 | Indian Strain B.1.617 |
|---|---|---|
| Template | 7krs.1.A | 6zwv.1.A |
| Oligostate | Homo-trimer | Homo-trimer |
| QSQE | 0.93 | 0.78 |
| Found by | HHblits | HHblits |
| Sequence identity | 99.29 | 99.14 |
| Sequence similarity | 0.62 | 0.62 |
| Range | 14-1160 | 2-1151 |
| Coverage | 1.00 | 1.00 |
| GMQE | 0.72 | 0.66 |
| QMEAN | -1.51 | -2.25 |
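To make the "sequence identity" row concrete, the following minimal sketch computes percent identity between two pre-aligned sequences as the share of ungapped columns with identical residues. This is one common definition and an assumption on our part; SWISS-MODEL may compute its reported value differently. The function name `percent_identity` and the short test strings are hypothetical and are not real spike fragments.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of ungapped alignment columns in which both residues match.

    Both inputs must already be aligned (equal length, gaps as `gap`).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # columns containing a gap are not counted
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy ten-residue strings differing at one position.
    print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
```

With this definition, a model that differs from its template at only a handful of positions over more than a thousand aligned residues gives an identity above 99%, which is the range reported in Table 2.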
The overall molecular structures of the Brazil P.1 and Indian B.1.617 strains were inconsistent with the crystal structure of the spike glycoprotein of wild-type SARS-CoV-2 (PDB ID: 3M3V).
Analysis of the Ramachandran plot also revealed that 99.9% of residues were in the allowed region, while only 0.1% of residues were in the outlier region (Figure 1).
Evaluation and Analysis of docking studies
Four new strains, namely the Indian B.1.617, UK B.1.1.7, South Africa B.1.351 and Brazil P.1 variants, have emerged through mutations in the spike glycoprotein of SARS-CoV-2, which interacts with ACE2, a transmembrane receptor protein on human cells. The interaction between the S protein and ACE2 is the key step in viral entry into host cells, after which the viral genome is replicated. The spike and ACE2 proteins are therefore therapeutic targets for anti-COVID-19 inhibitors, and it is essential to search for novel compounds that inhibit the spike glycoprotein. In continuation of our research on phytochemicals as SARS-CoV-2 inhibitors, we prepared a library of 100 phytochemicals with potential biological activity against different types of viruses and screened them against the spike glycoproteins of the aforementioned new strains.

Several recent repurposing studies have suggested that some already FDA-approved drugs (remdesivir, favipiravir, nafamostat etc.) could be used to treat COVID-19. We selected nafamostat from among them as a control for comparison because it is a spike glycoprotein inhibitor. We performed a preliminary docking screen of all 100 phytochemicals using Glide v8.8 (Schrödinger, LLC, New York) and identified several that are predicted to bind within the receptor-binding pockets of the four new strains (Table 3; Fig. 2). As discussed above, a docking score threshold of -7.00 was applied, and compounds scoring -7.00 or lower (more negative) were retained for further analysis. The top three phytochemicals that passed this threshold for each new strain (Indian B.1.617, UK B.1.1.7, South Africa B.1.351 and Brazil P.1) are summarized in Table 3.
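The selection step can be illustrated with a short Python sketch. It assumes that "passing the threshold" means a docking score of -7.00 or more negative, since lower scores indicate better predicted binding, and it uses only the B.1.617 scores reported in Table 3 rather than the full 100-compound library. The names `THRESHOLD`, `top_hits` and `b1617_scores` are hypothetical and do not come from Glide.

```python
THRESHOLD = -7.00  # docking-score cut-off quoted in the text


def top_hits(scores, threshold=THRESHOLD, n=3):
    """Return up to n (ligand, score) pairs at or below the threshold.

    Results are ordered from the most negative (best) score upwards.
    """
    passing = [(name, score) for name, score in scores.items()
               if score <= threshold]
    passing.sort(key=lambda item: item[1])
    return passing[:n]


# Docking scores against the B.1.617 spike glycoprotein, as listed in Table 3.
b1617_scores = {
    "Rutin": -8.160,
    "EGCG": -7.993,
    "Hesperidin": -7.873,
    "Nafamostat": -5.665,  # control compound
}

if __name__ == "__main__":
    for name, score in top_hits(b1617_scores):
        print(f"{name}: {score:.3f}")
```

Under this reading of the threshold, the control nafamostat is filtered out while the three phytochemicals are kept in ranked order.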
For the spike protein of the B.1.617 Indian variant, the docking scores gave the binding order rutin > EGCG > hesperidin. Rutin docked with a score of -8.160, forming seven hydrogen bonds with Ser50, Cys299, Thr300, Asp735, Thr737, Gly755 and Thr759 and two hydrophobic bonds with Leu51 and Leu52. EGCG docked with a score of -7.993, forming seven hydrogen bonds with Thr272, Thr300, Thr313, Asn315, Thr737, Thr759 and Asn762 and two hydrophobic bonds with Leu751 and Leu752. Hesperidin docked with a score of -7.873, forming hydrogen bonds with Gln52, Ser314, Gly755 and Thr759 and hydrophobic bonds with Leu301, Leu752 and Phe757 at the active site.

For the spike protein of the B.1.1.7 UK strain, hesperidin, withanolide G and rosmarinic acid docked with scores of -8.993, -8.766 and -8.761, respectively. Hesperidin formed seven hydrogen bonds with Thr547, Thr573, Asp745, Asn856, Leu977, Asn978 and Arg1000 of the target protein, and ten hydrophobic bonds with Val320, Pro322, Phe541, Leu546, Ile587, Pro589, Cys590, Met740, Val976 and Leu977. Withanolide G interacted with Thr549, Asn856, Asn978, Pro322, Val320, Phe541, Leu546, Ile587, Pro589, Cys590, Phe592, Met740 and Val976, whereas the hydroxyl and phenyl moieties of rosmarinic acid interacted with Thr547, Thr549, Thr573, Tyr741, Asn856, Asn578, Arg1000, Phe541, Leu546, Ile587, Pro589, Cys590, Met740, Ile742, Cys743, Leu966, Val976 and Leu977.
Table 3: Detailed account of the phytochemicals, their docking scores and their molecular interactions with amino acid residues at the binding sites
| Strain (spike glycoprotein) | Phytochemical | Docking score | Hydrogen bonds (interacting amino acids) | Hydrophobic bonds (interacting amino acids) |
|---|---|---|---|---|
| B.1.617 Indian variant | Rutin | -8.160 | Ser50, Cys299, Thr300, Asp735, Thr737, Gly755, Thr759 | Leu51, Leu52 |
| B.1.617 Indian variant | EGCG | -7.993 | Thr272, Thr300, Thr313, Asn315, Thr737, Thr759, Asn762 | Leu751, Leu752 |
| B.1.617 Indian variant | Hesperidin | -7.873 | Gln52, Ser314, Gly755, Thr759 | Leu301, Leu752, Phe757 |
| B.1.617 Indian variant | Nafamostat | -5.665 | Leu301, Asn315, Asp735, Asn762 | - |
| B.1.1.7 UK strain | Hesperidin | -8.993 | Thr547, Thr573, Asp745, Asn856, Leu977, Asn978, Arg1000 | Val320, Pro322, Phe541, Leu546, Ile587, Pro589, Cys590, Met740, Val976, Leu977 |
| B.1.1.7 UK strain | Withanolide G | -8.766 | Thr549, Asn856, Asn978 | Pro322, Val320, Phe541, Leu546, Ile587, Pro589, Cys590, Phe592, Met740, Val976 |
| B.1.1.7 UK strain | Rosmarinic acid | -8.761 | Thr547, Thr549, Thr573, Tyr741, Asn856, Asn578, Arg1000 | Phe541, Leu546, Ile587, Pro589, Cys590, Met740, Ile742, Cys743, Leu966, Val976, Leu977 |
| B.1.1.7 UK strain | Nafamostat | -5.340 | Val320, Asp571, Leu966 | Pro322, Pro589, Cys590, Met740, Leu966, Val976, Leu977 |
| B.1.351 South Africa strain | EGCG | -8.369 | Thr549, Thr573, Asp574, Met740, Gly744, Asn856 | Phe541, Ile569, Ala570, Ile587, Pro589, Tyr741, Phe855, Val976 |
| B.1.351 South Africa strain | Diosmetin | -8.200 | Thr549, Thr573, Ile587, Met740, Asn856 | Phe541, Ala570, Pro589, Tyr741, Cys743, Leu966, Val976, Leu977 |
| B.1.351 South Africa strain | Myricetin | -8.102 | Thr549, Thr573, Met740, Gly744 | Phe541, Ile587, Pro589, Tyr741, Cys743, Leu966, Val976, Leu977 |
| B.1.351 South Africa strain | Nafamostat | -4.260 | Asp574, Asp745, Asn978, Phe855 | Ile587, Pro589, Leu977, Leu981 |
| P.1 Brazil strain | Rosmarinic acid | -9.235 | Tyr369, Phe374, Phe377, Lys378, Asp405, Glu406, Gln409, Thr415, Thr417 | Leu368, Ala372, Tyr495 |
| P.1 Brazil strain | Epicatechin | -8.248 | Tyr369, Phe374, Arg408, Gln409, Thr415 | Ala372, Phe377, Pro384 |
| P.1 Brazil strain | Quercetin | -7.925 | Tyr369, Ser371, Phe374, Phe377, Lys378, Arg408, Gln409, Thr417 | Leu368, Ala372 |
| P.1 Brazil strain | Nafamostat | -5.325 | Tyr369, Ser371, Phe374, Lys378, Glu988 | Ala372, Pro384, Phe377, Val987 |
In the case of the B.1.351 South Africa strain, the phytochemicals EGCG, diosmetin and myricetin showed strong binding affinities. EGCG docked with a score of -8.369, attributed to six hydrogen bonds with Thr549, Thr573, Asp574, Met740, Gly744 and Asn856 and eight hydrophobic bonds with Phe541, Ile569, Ala570, Ile587, Pro589, Tyr741, Phe855 and Val976. Diosmetin docked with a score of -8.200, interacting with Thr549, Thr573, Ile587, Met740, Asn856, Phe541, Ala570, Pro589, Tyr741, Cys743, Leu966, Val976 and Leu977, whereas myricetin also displayed a significant docking score (-8.102) through interactions with Thr549, Thr573, Met740, Gly744, Phe541, Ile587, Pro589, Tyr741, Cys743, Leu966, Val976 and Leu977.

A further docking study revealed that rosmarinic acid binds with the best docking score (-9.235) at the active site of the spike protein of the P.1 Brazil strain, followed by epicatechin (-8.248) and quercetin (-7.925). Rosmarinic acid formed nine strong hydrogen bonds with Tyr369, Lys378, Asp405, Glu406, Gln409, Thr415, Phe374, Phe377 and Thr417, as well as three hydrophobic bonds with Leu368, Ala372 and Tyr495. Epicatechin interacted with Tyr369, Phe374, Arg408, Gln409, Thr415, Ala372, Phe377 and Pro384, whereas quercetin interacted with Tyr369, Ser371, Phe374, Phe377, Lys378, Arg408, Gln409, Thr417, Leu368 and Ala372 (Table 3).

Compared with the reference drug nafamostat, which docked with scores of -5.665, -5.340, -4.260 and -5.325 against the B.1.617 Indian variant, B.1.1.7 UK strain, B.1.351 South Africa strain and P.1 Brazil strain, respectively, the top three phytochemicals for each strain showed better (more negative) docking scores and therefore more promising predicted binding affinities.
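The comparison with the control can be made explicit with a small worked calculation. The sketch below takes, for each strain, the best phytochemical score and the nafamostat score from Table 3 and reports the difference, where a negative difference means the phytochemical scored better than the control. The names `TABLE3_BEST` and `margin_over_control` are hypothetical, and docking-score differences are only a rough indicator of relative predicted affinity.

```python
# strain -> (best phytochemical, its docking score, nafamostat docking score),
# all values taken from Table 3.
TABLE3_BEST = {
    "B.1.617": ("Rutin", -8.160, -5.665),
    "B.1.1.7": ("Hesperidin", -8.993, -5.340),
    "B.1.351": ("EGCG", -8.369, -4.260),
    "P.1": ("Rosmarinic acid", -9.235, -5.325),
}


def margin_over_control(table=TABLE3_BEST):
    """Best-ligand score minus control score per strain, rounded to 3 places.

    A negative margin means the phytochemical has the more negative
    (better) docking score.
    """
    return {strain: round(best - control, 3)
            for strain, (_ligand, best, control) in table.items()}


if __name__ == "__main__":
    for strain, margin in margin_over_control().items():
        ligand = TABLE3_BEST[strain][0]
        print(f"{strain}: {ligand} vs nafamostat = {margin:+.3f}")
```

On these figures the best phytochemical scores roughly 2.5 to 4.1 units lower than nafamostat for every strain.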
The binding pose of the best phytochemical within the binding pocket of the spike protein of each new strain is shown in Fig. 3. The present work thus indicates that rutin, hesperidin, EGCG and rosmarinic acid are potential inhibitors of the spike proteins of the B.1.617 Indian variant, B.1.1.7 UK strain, B.1.351 South Africa strain and P.1 Brazil strain, respectively, and may serve as promising leads for further optimisation and drug development to manage COVID-19.