By applying the graph clustering algorithm DPClusO to the sequence similarity network of spike proteins, we found five non-overlapping clusters (Fig. 2). This result implies that, in terms of sequence, the spike proteins of COVID-19 viruses can be classified into five major groups, which may need to be treated in different ways, possibly with different drugs. Cluster 1 is formed by Spike glycoprotein and Collagen alpha-1(I) chain type proteins. Cluster 2 is formed by S1 type Spike proteins. Cluster 3 is formed by S2 type Spike proteins. Cluster 4 is formed by S2 and S2' types of Spike proteins and pan-CoVs inhibitor EK1 type proteins. Cluster 5 is formed by S2' type Spike proteins. We did not find any predicted secondary metabolites for Cluster 3 or Cluster 5. The medicinal properties and biological significance of some of the secondary metabolites corresponding to Clusters 1, 2, and 4 are described below. For almost all of these metabolites, I(S) ≥ 0.5 and T(D, M) = 1, and therefore W(S, D, M) ≥ 0.5.
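The relation between the three scores can be illustrated with a short sketch. It assumes that W(S, D, M) is the product of I(S) and T(D, M) and that both inputs lie in [0, 1]. This section does not state the definition; the product rule is only consistent with the values reported here (for example, I(S) = 0.6 and T(D, M) = 1 giving W(S, D, M) = 0.6). The function name `attachment_weight` is hypothetical.

```python
def attachment_weight(i_s: float, t_dm: float) -> float:
    """Combine I(S) and T(D, M) into W(S, D, M).

    ASSUMPTION: W(S, D, M) = I(S) * T(D, M), with both inputs in [0, 1].
    This rule is inferred from the score triples quoted in the text and
    is not a definition taken from the study itself.
    """
    for name, value in (("I(S)", i_s), ("T(D, M)", t_dm)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    return i_s * t_dm


# Score pairs (I(S), T(D, M)) quoted in this section.
quoted = {
    "SB-203207 (Cluster 4)": (0.6, 1.0),
    "Cluster 2 predictions": (0.5, 1.0),
    "Bicluster 1, lower T(D, M) bound": (0.3, 0.93),
}

for label, (i_s, t_dm) in quoted.items():
    print(f"{label}: W = {attachment_weight(i_s, t_dm):.3f}")
```

Under this assumed rule, a metabolite with T(D, M) = 1 has W(S, D, M) equal to I(S), which is why the two thresholds of 0.5 coincide in the statement above.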
Cluster 1
C00035257 (Chichipegenin) is found in Myrtillocactus geometrizans, a cactus of central and northern Mexico. The Otomi and Mixtec ethnic groups used this medicinal plant as an anti-inflammatory remedy31. In a mouse model, Chichipegenin inhibited TPA-induced edema in a dose-dependent manner, with ED50 values less than or equal to that of indomethacin32. C00003684 (Cucurbitacin C) and C00003683 (Cucurbitacin B) are triterpenes found in members of the Cucurbitaceae and several other plant families, and they have immense pharmacological importance33,34. Cucurbitacin C has anti-tumor effects35. Cucurbitacin B inhibits cyclooxygenase (COX)-2 enzymes with no effect on COX-1 enzymes and thereby exerts an anti-inflammatory effect36. It also inhibits telomerase by down-regulating the expression of both human telomerase reverse transcriptase and c-Myc in breast cancer cells34. Compounds of the same class, i.e., Cucurbitacin I, Cucurbitacin C, Cucurbitacin D, and Cucurbitacin A, also have anti-inflammatory, antitumor, anti-atherosclerotic, and antidiabetic activities33,34. C00001849 (Emetine) is an alkaloid found in several plant species: Hedera helix, Alangium longiflorum, Cephaelis ipecacuanha, Psychotria burucana, Psychotria ipecacuanha, Psychotria klugii, Carapichea affinis, Cephalis acuminate, Cephalis ipecacuanha, and Uragoga ipecacuanha. It inhibits both ribosomal and mitochondrial protein synthesis and interferes with the synthesis and activities of DNA and RNA. Owing to these properties, it has been used as an antiviral, anticancer, antiparasitic, and contraceptive agent37. It was used to treat Spanish influenza in the last stage of the pandemic38. Recently published literature suggests it as a potential drug to inhibit the SARS-CoV-2 virus38,39. C00003211 (Helenalin Acetate), found in the plant Balduina angustifolia, has anti-inflammatory and anti-cancer activities.
Helenalin Acetate reduces phospholipase A2 activity and inhibits human neutrophil migration, chemotaxis, and platelet aggregation40,41. The inhibitory effect of helenalin on the biosynthesis of leukotrienes contributes to its anti-inflammatory action40. The CCAAT box/enhancer-binding protein β (C/EBPβ) is an interesting target for the development of small-molecule inhibitors. Helenalin acetate, which is known to inhibit NF-κB, also inhibits C/EBPβ by binding to its N-terminal part, thereby disrupting the cooperation of C/EBPβ with the co-activator p30042. C00021118 (Bigelovin), found in the plant species Dittrichia graveolens (L.) Greuter and Helenium scorzoneraefolia, has been used as a cancer treatment in Yuan province of China43. Bigelovin can suppress the proliferation of human peripheral blood mononuclear cells (PBMCs) and their production of Th1 cytokines (IFNγ, IL-2, and IL-12)43. In zebrafish embryos, it can inhibit the growth of subintestinal vessels and down-regulate the expression of Ang2 and Tie243. Another study suggested Bigelovin as a colorectal cancer (CRC) treatment that can suppress cell proliferation and colony formation and induce apoptosis in human CRC HT-29 and HCT 116 cells44. C00032703 (Amphidinolide B), found in Amphidinium sp., has shown potent toxicity against tumor cell lines45. C00011307 (Chaetoglobosin A), found in the fungi Chaetomium globosum, Chaetomium mollipilium, Chaetomium rectum, Chaetomium subaffine, and Penicillium expansum, has a broad range of biological activities, including antitumor, antifungal, antibacterial, phytotoxic, nematicidal, and fibrinolytic activity46. C00032469 (Uzarigenin), found in the plant species Asclepias curassavica L., Calotropis gigantea, and Nerium odorum, has been used as an anti-diarrheal and anticancer drug in Africa47,48. Uzarigenin is a Na(+)/K(+)-ATPase inhibitor47.
C00001736 (Harmane) is found in both plant and animal species. The plant species include Carex brevicollis DC, Elaeagnus angustifolia L., Crocus sativus, Banisteriopsis caapi (Spr. Ex Briesb.), and Grewia bicolor Juss. Harmane is a potent neurotoxin that shows 1000-fold selectivity for the I1-imidazoline receptor (IC50 = 30 nM) over the α2-adrenoceptor (IC50 = 18 µM). It is also a selective inhibitor of monoamine oxidase (MAO) (IC50 = 0.5 and 5 µM for human MAO A/B, respectively)49–52. C00006925 (Isoliquiritigenin), one of the important metabolites in our list, is found in numerous plant species, including Allium chinense, Crinum bulbispermum Milne, Pancratium maritimum L., Dahlia variabilis, Glycyrrhiza uralensis, Glycyrrhiza glabra, Dalbergia sericea, and Oxytropis pseudoglandulosa. The dried roots and rhizomes of Glycyrrhiza uralensis or G. glabra have been used for centuries in traditional medicine for the treatment of coughs and influenza and for detoxification53. The antibacterial activity of Isoliquiritigenin against Mycobacterium tuberculosis, Mycobacterium bovis, Staphylococcus aureus, Staphylococcus epidermidis, and Staphylococcus hemolyticus has been reported53–56. The anti-asthma formula ASHMI contains Isoliquiritigenin, which can suppress the CA-stimulated synthesis of Th2 cytokines and the levels of IL-4 and IL-5 in D10 cell culture supernatants in a concentration-dependent manner without affecting cell viability57. Besides these activities, this metabolite shows anti-inflammatory, estrogen receptor signaling, anti-periodontitis, anti-diabetic, anti-osteoporosis, hepatoprotective, anti-mutagenic, and anti-cancer activities53.
Cluster 2
C00042440 (Cytidine) belongs to the pyrimidine nucleosides, a class of organic compounds consisting of a pyrimidine base attached to a ribosyl or deoxyribosyl moiety. The plant species Fritillaria cirrhosa, Morinda citrifolia, and Capsicum annuum produce this metabolite. Cytidine controls neuronal-glial glutamate cycling, affecting cerebral phospholipid metabolism, catecholamine synthesis, and mitochondrial function58. C00027138 (Demecolcine) is an alkaloid found in the plant species Androcymbium palaestinum, Androcymbium melanthioides var. stricta Baker., Colchicum autumnale L., Colchicum brachyphyllum, Colchicum hierosolymitanum, Colchicum crmegaphylla, Colchicum speciosum Stev., Colchicum tunicatum, Colchicum turicum, Gloriosa superba, Merendera jolanta, Merendera kurdica, Merendera manissadjianii, Merendera robusta, and Merendera sobolifera. Demecolcine is used in scientific research on cell mitosis. It is a microtubule-depolymerizing drug that can bind to the microtubule plus end and suppress microtubule dynamics at very low concentrations59. It is also used as a cancer treatment59,60. C00027138 (5,6,2',6'-Tetrahydroxy-7,8-dimethoxyflavone) is found in the plant species Scutellaria prostrata. This plant is used as herbal medicine in China, Nepal, and India61,62.
Cluster 4
Metabolite C00015518 (SB-203207), found in bacteria of the genus Streptomyces, has the highest W(S, D, M) value among all our predicted metabolites, with I(S) = 0.6, T(D, M) = 1, and W(S, D, M) = 0.6. This metabolite can inhibit isoleucyl tRNA synthetase (IRS) from rat liver and Staphylococcus aureus with IC50 values < 2 nM and 1.7 respectively63. Another Streptomyces metabolite found in our experiment, C00018127 (Staurosporine), with W(S, D, M) = 0.4 and T(D, M) = 1, inhibits protein kinases by preventing ATP from binding to the kinase. Staurosporine is a prototypical ATP-competitive kinase inhibitor that binds to many kinases with high affinity and little selectivity64. Staurosporine is a precursor to the protein kinase inhibitor midostaurin (PKC412), K252a. It induces cell apoptosis and has been used in the treatment of cancer and inflammation65,66.
Some structurally similar metabolites with lower W(S, D, M) values might be important as a group. Therefore, to extract group-level information, we applied biclustering to the bipartite part of the network of Fig. 1(b) that consists of binding molecules and metabolites, and identified two biclusters (Fig. 5). The biological activities of the metabolites included in these two biclusters are described below.
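The idea of a bicluster in a bipartite network of binding molecules and metabolites can be illustrated with a deliberately simple sketch. It is not the biclustering algorithm used in this study, which is not specified in this section. Instead, it groups metabolites that are linked to exactly the same set of binding molecules, which yields complete bipartite subgraphs. The function name `exact_bicliques`, the size thresholds, and the labels `B1`, `M1`, and so on are hypothetical.

```python
from collections import defaultdict


def exact_bicliques(edges, min_molecules=2, min_metabolites=2):
    """Group metabolites that share an identical set of binding molecules.

    ASSUMPTION: this is a simplified stand-in for biclustering, not the
    method used in the study. `edges` is an iterable of
    (binding_molecule, metabolite) pairs.
    """
    # Collect the set of binding molecules attached to each metabolite.
    neighbors = defaultdict(set)
    for molecule, metabolite in edges:
        neighbors[metabolite].add(molecule)

    # Metabolites with the same neighbor set form one candidate bicluster.
    groups = defaultdict(set)
    for metabolite, molecules in neighbors.items():
        groups[frozenset(molecules)].add(metabolite)

    return [
        (set(molecules), metabolites)
        for molecules, metabolites in groups.items()
        if len(molecules) >= min_molecules and len(metabolites) >= min_metabolites
    ]


# Toy bipartite network with hypothetical labels.
toy_edges = [
    ("B1", "M1"), ("B2", "M1"),
    ("B1", "M2"), ("B2", "M2"),
    ("B1", "M3"), ("B2", "M3"),
    ("B3", "M4"),
]

for molecules, metabolites in exact_bicliques(toy_edges):
    print(sorted(molecules), sorted(metabolites))
```

Requiring identical neighbor sets is stricter than most biclustering methods, which usually tolerate some missing edges; the sketch only shows the kind of group-level structure being extracted.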
Bicluster 1
Bicluster 1, shown in Fig. 5(b), is formed by two binding molecules and six secondary metabolites. Both binding molecules have the attachment attributes I(S) = 0.3 and 0.93 ≤ T(D, M) ≤ 1, corresponding to the Spike protein S2', pan-CoVs inhibitor EK1. C00036799 (beta-Homonojirimycin), C00036798 (beta-Homomannojirimycin), C00042220 (alpha-Homonojirimycin), and C00036713 (alpha-Homomannojirimycin) are found in the plant species Aglaonema treubii and Hyacinthus orientalis. alpha-Homonojirimycin is a potent inhibitor of a range of alpha-glucosidases, with IC50 values of 1 to 0.01 µM67. beta-Homonojirimycin, alpha-homomannojirimycin, and beta-homomannojirimycin are moderately good inhibitors of some mammalian and rice alpha-glucosidases67. C00042067 (2,5-Dideoxy-2,5-imino-D-glucitol), found in the plant species Stemona tuberosa, is known as a powerful glucosidase inhibitor68. A similar isomeric compound, C00002037 ((+)-2,5-Dideoxy-2,5-imino-D-mannitol), found in Streptomyces bacteria and in the plant species Nephthytis poissoni, Adenophora spp., Campanula rotundifolia, Endospermum sp., Omphalea diandra, Derris elliptica, Lonchocarpus sericeus, Lonchocarpus spp., Hyacinthoides non-scripta, Hyacinthus orientalis, and Scilla campanulata, is likely to exhibit similar activities.
Bicluster 2
This bicluster contains six binding molecules and three secondary metabolites, with I(S) = 0.3 and 0.86 ≤ T(D, M) ≤ 1, as shown in Fig. 5(b). The metabolite C00029420 (1-deoxynojirimycin) is found in the bacterial species Bacillus polymyxa, Bacillus subtilis, and Streptomyces lavandulae and in the plant species Adenophora triphylla, Commelina communis L, Endospermum medullosum, Omphalea queenslandiae, Hyacinthus orientalis, and Morus bombycis. This compound is an alpha-glucosidase inhibitor. In an experiment with mice, this compound inhibited glucose absorption in the intestine69. It also down-regulated intestinal SGLT1, Na+/K+-ATP, and GLUT2 mRNA and protein expression69. Pretreatment with 1-deoxynojirimycin increased the activity, mRNA, and protein levels of hepatic glycolysis enzymes (GK, PFK, PK, PDE1) and decreased the expression of gluconeogenesis enzymes (PEPCK, G-6-Pase)69.
The metabolite C00002035 (1-Deoxymannojirimycin) is found in the bacterial species Agrobacterium sp. and Streptomyces lavandulae and in the plant species Adenophora triphylla var. japonica, Connarus ferrugineus, Endospermum medullosum, Omphalea diandra, Albizia myriophylla, Angylocalyx spp., Derris malaccensis, Lonchocarpus costaricensis, Lonchocarpus sericeus, and Hyacinthus orientalis. 1-deoxynojirimycin is an alpha-glucosidase inhibitor. It can block HIV envelope glycoprotein-mediated membrane fusion at the CXCR4 binding step70. 1-deoxynojirimycin, along with a specific inhibitor of α1,2-mannosidase that generates the ‘high mannose’ type of N-glycan, was used to treat human hepatocarcinoma 7721 cells71. C00036384 (1,4-Dideoxy-1,4-imino-D-arabinitol) is found in the plant species Arachniodes standishii, Angylocalyx spp., Hyacinthoides non-scripta, Morus bombycis, and Eugenia spp. It is an inhibitor of glycogen phosphorylase and thereby inhibits glycogenolysis in the liver and brain of various animal models72–76.
An advantage of using plant-derived compounds for therapeutic purposes is that they tend to exhibit fewer side effects than synthetic chemicals. The natural metabolites filtered by our experiment, which are extracted mostly from plants and bacteria, have shown similarity to different binding molecules and possess medicinal properties against different conditions, such as cancer, inflammation, cough, diabetes, and influenza. 1-Deoxymannojirimycin and its derivatives may have anti-HIV activity. The metabolites of Bicluster 2 are glucosidase inhibitors, which indicates an association between the clustering of similar metabolites and their biological activity. Some metabolites act as protein kinase inhibitors or enzyme inhibitors. Natural protein kinase inhibitors also have numerous biological activities, such as modulating signaling pathways, cell proliferation, and angiogenesis. Since natural metabolites serve as precursors in the development of many kinase inhibitors, the information provided here could be helpful for discovering kinase inhibitors relevant to spike protein attachment.
The clusters of spike proteins based on structural similarity can help us predict the affinity of secondary metabolites for proteins for which a binding metabolite could not be found. Figure 6(b) shows spike protein Cluster 2, in which three proteins are displayed slightly apart from the rest of the cluster. These three proteins, namely 7nxb_C, 7e8m_A, and 6zfo_A, have pairwise structural similarity of more than 90%. We predicted five secondary metabolites that may have binding affinity to the spike proteins 7nxb_C and 7e8m_A, with I(S) = 0.5 and T(D, M) = 1, and therefore W(S, D, M) = 0.5 (Fig. 6(a)). The protein 6zfo_A is therefore also very likely to bind these molecules, which remains to be established by further experimental verification.
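The inference for 6zfo_A can be sketched as a simple propagation rule: a protein without predicted metabolites inherits the predictions of any protein whose structural similarity to it exceeds a threshold. The protein identifiers and the 90% threshold come from the text. The function name `propagate_predictions`, the similarity scores of 0.95, and the metabolite labels are hypothetical placeholders, since this section gives neither the exact similarity values nor the names of the five metabolites.

```python
def propagate_predictions(predicted, similarity, threshold=0.9):
    """Transfer predicted metabolites to structurally similar proteins.

    `predicted` maps protein ID -> set of predicted metabolites.
    `similarity` maps (protein_a, protein_b) -> structural similarity in [0, 1].
    Only proteins with no predictions of their own receive inferred ones.
    """
    inferred = {}
    for (a, b), score in similarity.items():
        if score <= threshold:
            continue
        # Similarity is symmetric, so try the transfer in both directions.
        for source, target in ((a, b), (b, a)):
            if source in predicted and target not in predicted:
                inferred.setdefault(target, set()).update(predicted[source])
    return inferred


# Placeholder labels for the five predicted metabolites (names not given here).
metabolites = {f"metabolite_{k}" for k in range(1, 6)}
predicted = {"7nxb_C": set(metabolites), "7e8m_A": set(metabolites)}

# Hypothetical scores; the text states only that each pair exceeds 90%.
similarity = {
    ("7nxb_C", "7e8m_A"): 0.95,
    ("7nxb_C", "6zfo_A"): 0.95,
    ("7e8m_A", "6zfo_A"): 0.95,
}

inferred = propagate_predictions(predicted, similarity)
print(sorted(inferred["6zfo_A"]))
```

Such a rule only produces candidates; as noted above, binding of these molecules to 6zfo_A would still require experimental verification.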