Discovering Natural Products as Potential Inhibitors of SARS-CoV-2 Spike Proteins

doi:10.21203/rs.3.rs-5021821/v1

This work is licensed under a CC BY 4.0 License

The ongoing global pandemic caused by the SARS-CoV-2 virus has demanded the urgent search for effective therapeutic interventions. In response, our research aimed to identify natural products (NPs) with potential inhibitory effects on the entry of the SARS-CoV-2 spike (S) protein into host cells. Utilizing the Protein Data Bank Japan (PDBJ) and BindingDB databases, we isolated 204 S-glycoprotein sequences and conducted a clustering analysis to identify similarities and differences among them. We subsequently identified 33,722 binding molecules (BMs) by matching them with the sequences of 204 S-glycoproteins and compared them with 52,107 secondary metabolites (SMs) from the KNApSAcK database to identify potential inhibitors. We conducted docking and drug-likeness property analyses to identify several SMs with potential as drug candidates based on binding energy (BE), no Lipinski’s rule violation (LV), physicochemical properties within the pink area of the bioavailability radar, and a bioavailability score (BAS) not less than 0.55. Fourteen SMs were found to be effective against the three major types of spike proteins. Our study provides a foundation for further experimental validation of these compounds as potential therapeutic agents against SARS-CoV-2.

Biological sciences/Computational biology and bioinformatics

Biological sciences/Drug discovery

SARS-CoV-2

spike protein

virtual screening

natural products

drug discovery

The COVID-19 pandemic, triggered by the SARS-CoV-2 virus, has precipitated a worldwide health emergency, highlighting the imperative for efficient diagnostic methods, vaccines, and treatments. Diagnostic approaches include molecular tests (such as rapid antigen or antibody tests [1], [2], [3], immunoenzymatic serological tests [4], [5], and RT-PCR-based tests [6]) and imaging-based tests [7], [8]. Moreover, numerous preventative measures have been implemented through vaccination development [9]. In addition to these established diagnostic and preventive measures, the ongoing quest for effective treatments has led to significant studies of molecular interactions [10], [11]. This exploration has created opportunities to identify new therapeutic approaches, one of which is investigating the molecular interactions between potential therapeutic agents and viral components, such as the SARS-CoV-2 spike protein. The spike protein of SARS-CoV-2, particularly its receptor-binding domain (RBD), binds to the human angiotensin converting enzyme 2 (ACE2) receptor to initiate viral entry and is a critical molecular target for combating the virus [12], [13]. Therefore, identifying secondary metabolites that can bind to the spike protein is essential for inhibiting the entry and propagation of the virus into human cells. The ongoing search for effective treatments has also heightened interest in understanding and inhibiting the molecular pathways affected by the virus.

Natural products (NPs), which are secondary metabolites (SMs) derived from plants, fungi, and other organisms, are rich and diverse sources of bioactive molecules with antiviral properties. Owing to their structural diversity and complexity, they often display unique modes of action that can complement or enhance the efficacy of synthetic antiviral agents [14], [15], [16]. Recently, SMs have played a crucial role in drug discovery and have been studied extensively. Kaempferol [17], myricetin [18], and curcumin [19] are among the promising agents studied as SARS-CoV-2 S protein inhibitors. A range of plant secondary metabolites, including polyphenols, alkaloids, saponins, terpenes, and carbohydrates, have been shown to exert antiviral effects against SARS-CoV-2 and other viruses [20]. However, despite their potential, challenges, such as bioavailability and efficacy, still need to be addressed.

Bioinformatics involves the use of clustering techniques to group proteins based on their sequence or structural similarities [21]. This study aimed to categorize SARS-CoV-2 S proteins and determine their secondary metabolites using sequence similarity evaluations, molecular docking, and bioavailability assessments. In our study, bioavailability was evaluated not only to confirm that the proposed compounds can be absorbed and utilized by the body but also to ensure their drug-likeness. This comprehensive approach, which includes the assessment of drug-likeness, enhances the accuracy and clinical relevance of our predictions for potential COVID-19 treatment.

The flowchart shown in Fig. 1 illustrates the methods used in this research. The six major steps were as follows: (i) collection and selection of spike glycoproteins, (ii) clustering analysis of spike glycoproteins, (iii) identification of binding molecules from the BindingDB, (iv) identification of SMs from the KNApSAcK database, (v) docking of secondary metabolites to corresponding protein clusters, and (vi) bioavailability analysis of docked secondary metabolites.

i. Collection and Selection of Spike Glycoproteins

We searched the PDBJ database using the keyword ‘COVID-19’ and found 1277 journals (Table S1 in Supplementary Material I), referring to 1202 FASTA files that were curated and divided into different protein sequences of SARS-CoV-2 (Table S2 in Supplementary Material I). The structural proteins of SARS-CoV-2 include a range of different proteins, the most important of which are the membrane (M) glycoprotein, the envelope (E) protein, the nucleocapsid (N) protein, and the spike (S) glycoprotein [22], [23], [24]. Spike glycoproteins on the surface of a virus form peplomers that facilitate the entry of the virus into host cells [12], [25]. As the objective of this study was to discover natural compounds that could obstruct the entry of the SARS-CoV-2 S protein into host cells, only the 204 S glycoproteins among the 1202 FASTA files were considered (Table S3 in Supplementary Material I).

ii. Clustering Analysis of Spike Glycoproteins

This study employed a basic clustering algorithm from DPClusSBO [26], which is a unified software that implements the DPClusO [27], [28], [29], and BiClusO [30] algorithms. This tool provides a graphical user interface for performing simple and bipartite graph clustering, as well as filtering and merging clusters, hierarchical node analysis, and visualization of cluster sets. Using these clustering techniques, we identified potential subclasses of spike proteins that might exhibit common characteristics of varying exposure to different natural compounds. The results of cluster analysis could serve as a guide for downstream molecular docking studies and aid in the identification of potential natural compounds for targeted therapy.

iii. Binding Molecule Identification from the BindingDB

We retrieved protein sequences similar to spike glycoproteins and associated binding molecules from BindingDB, a publicly available database that stores experimental data on protein–binding molecule interactions [31]. We searched for binding molecules in the BindingDB database using 204 spike protein sequences. This search incorporates the BLASTp algorithm to identify matching protein sequences and corresponding binding molecules. Similarity between protein sequences was measured using the sequence identity (SI) value, which quantifies the exact matches between two different sequences [32]. The SI between two proteins A and B was calculated as follows:

$$SI=\frac{L(i)}{\min\left(\mathrm{length}_{S_A},\,\mathrm{length}_{S_B}\right)}$$

where $L(i)$ denotes the number of aligned residues with identical or equivalent properties, and $S_A$ and $S_B$ are the sequences of proteins A and B. While searching the BindingDB database, we utilized 0.3 as the minimum threshold SI value. Each identified binding molecule was assigned an identifier in the form of a DrugID.
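The SI formula can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned (gaps written as `-`) and, as a simplification, counts only identical residues rather than residues with equivalent properties. The function names and the toy sequences are hypothetical and are not part of the BindingDB search itself.

```python
SI_THRESHOLD = 0.3  # minimum SI used in this study


def sequence_identity(aligned_a: str, aligned_b: str) -> float:
    """SI = identical aligned residues / length of the shorter ungapped sequence."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Count alignment columns where both residues are identical (gaps never match).
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    # Sequence lengths are taken without gap characters.
    shortest = min(len(aligned_a.replace("-", "")), len(aligned_b.replace("-", "")))
    if shortest == 0:
        raise ValueError("sequences must contain at least one residue")
    return matches / shortest


def passes_threshold(si: float, threshold: float = SI_THRESHOLD) -> bool:
    """True when a target protein is similar enough to be kept."""
    return si >= threshold


# Toy example (hypothetical sequences): 5 identical residues out of 6.
si = sequence_identity("MFVFLV", "MFAFLV")
keep = passes_threshold(si)
```

Dividing by the shorter sequence length means that a short fragment fully contained in a longer protein still receives an SI of 1.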

iv. Identification of Secondary Metabolites from the KNApSAcK Database

We utilized SMs from the KNApSAcK database [33], [34] owned by our laboratory, which contains a broad collection of secondary metabolite data from plants. The database is used for plant metabolomics research, and each metabolite in the database is assigned a unique Metabolite ID (MetaID) associated with the corresponding data, such as its molecular structure, physicochemical properties, and species. The KNApSAcK database contains more than 53,000 SMs derived from diverse sources, such as plants, bacteria, protozoa, fungi, and animalia.

We determined SMs similar to binding molecules using a fingerprint-based molecular similarity calculation measure known as the Tanimoto coefficient (TC). The Tanimoto coefficient is calculated as follows:

$$TC(A,B)=\frac{|A\cap B|}{|A\cup B|}$$

where $A$ denotes the set of fingerprints of the BMs and $B$ is the set of fingerprints of the KNApSAcK SMs. The TC value ranges from 0 (indicating no similarity) to 1 (indicating identical fingerprints). We narrowed the selection of SMs by eliminating low-TC SMs, focusing on those with a TC value of at least 0.85.
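As an illustration of the TC calculation, the following Python sketch treats each fingerprint as a set of "on" bit positions. The fingerprint type actually used in this study is not restated here, so the bit sets and function names are hypothetical.

```python
from typing import Dict, List, Set

TC_THRESHOLD = 0.85  # minimum TC used in this study


def tanimoto(fp_a: Set[int], fp_b: Set[int]) -> float:
    """Tanimoto coefficient: shared on-bits divided by total distinct on-bits."""
    union = fp_a | fp_b
    if not union:
        return 0.0  # two empty fingerprints are treated as dissimilar
    return len(fp_a & fp_b) / len(union)


def similar_metabolites(bm_fp: Set[int], sm_fps: Dict[str, Set[int]],
                        threshold: float = TC_THRESHOLD) -> List[str]:
    """Return IDs of metabolites whose TC with the binding molecule meets the threshold."""
    return [sm_id for sm_id, fp in sm_fps.items() if tanimoto(bm_fp, fp) >= threshold]


# Toy example with hypothetical fingerprints.
bm = {1, 2, 3, 4, 5, 6, 7}
sms = {"SM_A": {1, 2, 3, 4, 5, 6, 7}, "SM_B": {1, 2, 3, 9}}
hits = similar_metabolites(bm, sms)
```

With a threshold of 0.85, only metabolites that share nearly all fingerprint bits with a binding molecule are retained.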

v. Docking the Secondary Metabolites to Corresponding Protein Cluster

Prior to the docking process, the previously identified SMs were mapped to the corresponding cluster according to the result of similarity matching between binding molecule (DrugID) and metabolite (MetaID). By docking analysis, we can verify whether the identified SMs are likely to bind with the spike proteins. Therefore, to identify potential NPs that could bind strongly to SARS-CoV-2 spike proteins, selected SMs were subjected to docking analysis. Docking analysis provides valuable insights into the interactions between secondary metabolites and their targets [35]. Specifically, it provides an indication of the binding affinity between molecules, which is crucial for assessing their therapeutic potential [36]. The protein structures of each cluster were prepared by removing water molecules, ions, and small ligands and by adding hydrogen using the LePro module of the LeDock program (www.lephar.com). After the protein was sanitized, the predicted binding pocket was determined using Fpocket [37]. The SMs found in each spike protein cluster were also prepared by converting the MOL files to MOL2 files. All prepared protein structures and SMs were docked using SMINA [38], [39].

vi. Bioavailability Analysis of the Docked Secondary Metabolites

Previously analyzed SMs were assessed for their drug-like properties and physicochemical characteristics using SwissADME (www.swissadme.ch). The SMILES strings of SMs docked to each protein cluster were subjected to SwissADME analysis. Lipinski’s rule of five was used to evaluate the drug-likeness of the secondary metabolites [40]. Moreover, the bioavailability radar was used to represent the drug-likeness properties of compounds in a rapid appraisal format, offering a quick visualization of a molecule's drug-likeness. The pink area in the bioavailability radar represents the optimal range for each property, such as lipophilicity (XLOGP3 between −0.7 and +5.0), size (MW between 150 and 500 g/mol), polarity (TPSA between 20 and 130 Å²), solubility (log S not higher than 6), saturation (fraction of carbons in sp3 hybridization not less than 0.25), and flexibility (no more than nine rotatable bonds) [41]. The effectiveness of a substance as a drug or therapeutic agent was assessed using the bioavailability score (BAS), a numerical measure that reflects its potential for easy absorption into biological systems [42], [43]. The BAS is determined by a semi-quantitative rule-based score that considers factors such as total charge, TPSA, and violation of the Lipinski rule, which divides compounds into four classes with probabilities of 11%, 17%, 56%, or 85%. A compound should have an oral bioavailability probability of at least 10% [42]. While Lipinski's rule can be utilized to choose secondary metabolites, we also employed bioavailability radar analysis as an additional screening step. This two-step screening approach increases the likelihood of identifying viable drug candidates and is intended to ensure that only secondary metabolites with favorable profiles across all important factors are chosen for further development.

vii. Final Drug Candidate Selection Criteria

Secondary metabolites that passed all the above analyses were assessed using three selection criteria: (a) no violation of Lipinski's rule, (b) physicochemical properties within the pink area of the bioavailability radar, and (c) a high-probability BAS value. The secondary metabolites that met these criteria were sorted by their binding affinity values, and the top five were selected for further consideration.
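The selection criteria can be summarized as a short filter-and-rank routine. The Python sketch below is illustrative only: the record layout, the identifiers, and the example values are hypothetical, and the BAS cut-off of 0.55 is taken from the abstract.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    meta_id: str              # metabolite identifier (hypothetical here)
    binding_energy: float     # kcal/mol; more negative means stronger binding
    lipinski_violations: int  # number of Lipinski rule violations
    radar_in_pink: bool       # all radar properties inside the pink area
    bas: float                # bioavailability score


def select_candidates(candidates: List[Candidate],
                      min_bas: float = 0.55, top_n: int = 5) -> List[Candidate]:
    """Keep metabolites meeting criteria (a)-(c), then rank by binding energy."""
    passing = [
        c for c in candidates
        if c.lipinski_violations == 0 and c.radar_in_pink and c.bas >= min_bas
    ]
    # Sort ascending so the most negative (strongest) binding energy comes first.
    return sorted(passing, key=lambda c: c.binding_energy)[:top_n]


# Hypothetical records, for illustration only.
pool = [
    Candidate("SM_1", -9.0, 1, False, 0.85),  # fails (a) and (b)
    Candidate("SM_2", -11.6, 0, True, 0.55),
    Candidate("SM_3", -15.5, 0, True, 0.55),
    Candidate("SM_4", -12.0, 0, True, 0.17),  # fails (c)
]
selected = select_candidates(pool)
```

Filtering before ranking means that a metabolite with a strong binding energy but poor drug-likeness never enters the final list.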

3.1 Cluster Analysis of Spike Glycoproteins

A method was developed to identify groups of structurally similar spike proteins using a sequence similarity measure. The Bio.pairwise2 module of Biopython was used to calculate the global alignment score without any gap penalty. Identical characters were given a score of 1 and all others were given a score of 0. The alignment score was then converted to a percentage. In this study, 204 distinct spike proteins were identified, corresponding to a maximum of 20,706 pairwise comparisons ($C_{204}^{2}=20{,}706$) (Table S4 in Supplementary Material I). However, after applying a strict filter that excluded weak interactions (alignment score $<$ 50%), the total number of interactions decreased to 9,536 (Table S5 in Supplementary Material I). This filtering process results in a 2.17-fold reduction (calculated as $\frac{20{,}706}{9{,}536}$). These 9,536 interactions were treated as the edges of a simple graph. Within these interactions, there is a strong subset of 8,791 relations with an alignment score of 80% or higher (Table S6 in Supplementary Material I).
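The scoring scheme and the counts above can be reproduced with a small, dependency-free Python sketch. With a match score of 1, a mismatch score of 0, and no gap penalty, the best global alignment score equals the length of the longest common subsequence, which is what the function below computes. The conversion to a percentage divides by the longer sequence length, which is an assumption because the denominator is not stated in the text. The function names are hypothetical.

```python
from math import comb


def identity_alignment_score(a: str, b: str) -> int:
    """Best global alignment score with match=1, mismatch=0, gap=0 (LCS length)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)            # extend alignment with a match
            else:
                curr.append(max(prev[j], curr[j - 1]))  # free gap in either sequence
        prev = curr
    return prev[-1]


def percent_score(a: str, b: str) -> float:
    """Alignment score as a percentage of the longer sequence (assumed denominator)."""
    return 100.0 * identity_alignment_score(a, b) / max(len(a), len(b))


def is_edge(a: str, b: str, min_percent: float = 50.0) -> bool:
    """A pair becomes a graph edge when its score is not below the cut-off."""
    return percent_score(a, b) >= min_percent


n_pairs = comb(204, 2)                     # all unordered pairs of 204 proteins
fold_reduction = round(n_pairs / 9536, 2)  # reduction after the 50% filter
```

This routine needs time proportional to the product of the two sequence lengths, which is manageable for 20,706 pairs of spike protein sequences but would need optimization for much larger sets.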

After constructing the graph/network of the spike proteins based on sequence similarity, clusters were identified using DPClusSBO1.2 [26]. Five clusters were found as illustrated in Fig. 2. The images of cluster 1 and cluster 2 in Fig. 2 are only partially shown for ease of visualization. The complete images of the clusters 1 and 2 are shown in Figure S1 in Supplementary Material III. These five clusters did not overlap (Fig. 2), indicating that each cluster had distinct characteristics. Among these five clusters, four formed a complete graph, indicating a high degree of similarity between the FASTA sequences of the corresponding spike proteins. The sizes of the five clusters were 124, 62, 7, 5, and 3, respectively (Table SI in Supplementary Material III). A total of 95%, 83%, and 76% of the edges in the three largest clusters had alignment scores greater than 80%.

Cluster 1 contains two types of spike proteins: spike glycoprotein and collagen alpha-1(I) chain-type spike proteins. The spike glycoprotein is responsible for binding to host cells and facilitating viral entry [25]. It has also been analyzed for its ability to bind to human proteins, including collagen alpha-1(I) chain-type spike proteins, suggesting a relationship with collagen, a structural protein present in various tissues [44]. We speculate that metabolites in this cluster could inhibit the binding of the spike protein to host receptors, interfere with interactions involving collagen, or prevent the virus from attaching to and entering host cells by binding to the active site of the spike glycoprotein or the collagen-related region.

Cluster 2 consists of the S1 type spike protein, which has the primary function of attaching the virion to the host cell membrane by interacting with specific host receptors. This interaction initiates the infection process. The primary receptor that interacts with the S1 subunit is host ACE2 [45]. Once binding occurs, several outcomes can follow based on the state of the S2/S2’ subunit. If S2/S2’ is cleaved, binding to the receptor triggers direct fusion at the cell membrane, allowing viral genetic material to enter the host cell [25], [46], [47]. If S2/S2’ is not cleaved, binding to the receptor results in internalization of the virus via endocytosis, subsequently leading to fusion of the virion membrane with the host endosomal membrane [25], [46], [48]. The SMs mapped to the S1 spike protein cluster may have several potential effects. They can block or modulate the receptor-binding domain (RBD) of the S1 subunit, preventing the virus from binding to the ACE2 receptor. Alternatively, these metabolites may alter the conformational changes required for S2/S2’ cleavage or subsequent fusion processes. This interference results in reduced infectivity and hinders infection progression.

Cluster 3 consists of the S2 type spike protein, which acts as the precursor of the fusion protein and undergoes necessary processing during the biosynthesis of the spike protein and formation of the virus particle. The S2 subunit plays a crucial role in mediating the fusion of virions and cellular membranes by functioning as a class I viral fusion protein. It contains two viral fusion peptides that are exposed after cleavage. S2 or S2’ cleavage can occur either at the cell membrane owing to the action of the host TMPRSS2 [49] or during endocytosis mediated by the host CTSL [50], [51]. Upon cleavage, there is a significant and irreversible conformational alteration in the protein that leads to the fusion of the viral envelope with the cellular cytoplasmic membrane. This fusion facilitates the release of viral genomic RNA into the cytoplasm of the host cells [51]. S2 passes through multiple conformational states during its function, including the pre-fusion native, pre-hairpin intermediate, and post-fusion hairpin states. These transitions are essential for viral fusion. As the protein transitions through these states, the coiled-coil regions, known as heptad repeats, form a specific “trimer-of-hairpins” structure. This structure brings the fusion peptide close to the C-terminal region of the ectodomain, facilitating merging of the viral and target cell membranes [52], [53]. For Cluster 3, SMs mapped to the S2 spike protein might influence the various steps of this fusion process.

Cluster 4 encompasses elements of the S2 subunit of the spike protein and the S2’ cleavage site. SMs mapped to this cluster could inhibit membrane fusion (due to S2 and S2’). Cluster 4 seems to be related to the viral entry mechanism of SARS-CoV-2, specifically the S2 subunit and its activation at the S2' site. This cluster may be of particular interest for understanding the mechanisms of viral entry and for the development of antiviral strategies targeting the spike protein.

Proteins of Cluster 5 contain the S2’ site, a cleavage site that is activated by host proteases and enables the virus to enter host cells. SMs that map to this cluster may inhibit the cleavage process, which is vital for the virus to infect host cells.

3.2 Identification of Binding Molecules from BindingDB

We identified 33,672 triplet relations (spike protein, target protein, and binding molecule) from the BindingDB database by entering 204 spike proteins and utilizing 0.3 as the minimum threshold sequence identity (SI) value (Table S1 in Supplementary Material II). In this study, all identified species were incorporated into our dataset to identify BMs. The search produced a dataset of binding information between the protein sequences of distinct species and the binding drugs or small molecules. These species included Mus musculus, Homo sapiens, Mycobacterium tuberculosis, Aspergillus fumigatiaffinis, Bos taurus, Dictyostelium discoideum, Rattus norvegicus, Sus scrofa, Oryctolagus cuniculus, Bacillus cereus, Acinetobacter baumannii, and Cryptosporidium parvum. As shown in Fig. 3a, most protein sequences were obtained from H. sapiens (81.33%). These 33,672 relations are associated with 6920 unique potential BMs. For each unique binding molecule, we kept the triplet relation with the highest SI for subsequent analysis. These molecules were distributed across 17 spike proteins (Fig. 3b).

3.3 Finding Secondary Metabolites from the KNApSAcK Database

The 6920 unique potential BMs identified across 17 spike proteins were compared with natural products contained in the KNApSAcK database, which encompasses 51,678 secondary metabolites (SMs) spanning various species. After this comparison process, 113 BMs–SMs interactions were found with Tanimoto coefficient (TC) scores above 0.85 (Table S3 in Supplementary Material II).

Upon identifying these interactions, we established an association (refer to Table S4 in Supplementary Material II) encompassing 9 spike proteins (pcode), 67 binding molecules or drugs (DrugID), and 111 secondary metabolites (MetaID). These 9 spike proteins can link to three of the five distinct clusters (see Fig. 2). We successfully projected 50, 13, and 48 unique SMs for the first, second, and fourth clusters, respectively (Table 1). Notably, no metabolites were detected in clusters 3 and 5. This study facilitated the identification of natural products (NPs) that might effectively counteract spike proteins.

Table 1

List of drugs and metabolites mapped to different spike protein clusters.
Cluster	Spike Type	Binding Molecule	Secondary Metabolite
1	Spike glycoprotein, Collagen alpha-1(I) chain	D10243, D10244, D10615, D10739, D10970, D11052, D11207, D11256, D11765, D11776, D11796, D13216, D13556, D13700, D13704, D13972, D13986, D13991, D13993, D14570, D15804, D15834, D15836, D15847, D15861, D16791	C00003684, C00003683, C00027845, C00031597, C00039897, C00035257, C00051813, C00031853, C00000347, C00001217, C00001240, C00001234, C00001232, C00001229, C00001235, C00001224, C00007424, C00035628, C00036669, C00050972, C00028778, C00027846, C00046254, C00001849, C00001835, C00027384, C00027245, C00038322, C00044486, C00051992, C00001737, C00000797, C00003218, C00021143, C00003211, C00003318, C00021120, C00021118, C00032703, C00011307, C00032469, C00010979, C00001736, C00001358, C00019577, C00000795, C00007287, C00001391, C00006925, C00035173
2	Spike protein S1	D10198, D11292, D11293, D12434, D12280, D12313, D12615, D12845, D14005, D14869, D14956, D15930, D16218	C00027898, C00018679, C00015601, C00027138, C00038270, C00001099, C00003923, C00015542, C00039732, C00001510, C00001492, C00042440, C00032054
3	Undefined	No data found	No data found
4	Spike protein S2, Spike protein S2’	D10253, D10399, D10444, D10494, D10626, D10627, D10631, D10747, D11193, D11194, D11208, D11213, D11782, D11783, D11800, D12311, D12428, D12740, D12837, D13227, D13707, D13982, D13983, D16291, D16690, D16743, D16771, D16790	C00043907, C00049768, C00023582, C00001661, C00039556, C00036672, C00015518, C00039557, C00034686, C00039673, C00034685, C00003375, C00020715, C00016540, C00000446, C00000218, C00017543, C00018127, C00027661, C00017790, C00017726, C00033297, C00037023, C00001884, C00001927, C00028699, C00030512, C00003633, C00003617, C00003618, C00026982, C00002035, C00029420, C00036384, C00029799, C00029792, C00029796, C00038582, C00000958, C00008866, C00008867, C00008883, C00008868, C00008882, C00008888, C00008865, C00008889, C00002766
5	Spike protein S2’	No data found	No data found

3.4 Docking Analysis of Secondary Metabolites

To assess the effectiveness and usefulness of the selected SMs as drugs, we conducted docking and bioavailability analyses using the spike proteins belonging to the corresponding clusters. The binding energy (BE) and bioavailability of each secondary metabolite for binding to the SARS-CoV-2 spike proteins of the corresponding cluster (clusters 1, 2, and 4) were further examined. The distributions of the BE of the 50, 13, and 48 SMs associated with clusters 1, 2, and 4, respectively, are shown in Fig. 4.

The BE distributions of the SMs for clusters 1, 2, and 4 are shown in Fig. 4(a), (b), and (c), respectively. In Fig. 4(a), the majority of SMs displayed moderate binding affinities ranging from −13 to −5 kilocalories per mole (kcal/mol), with 15 SMs falling within the range of −9 to −7 kcal/mol. These findings suggest that these metabolites might interact with their targets with moderate strength, likely reflecting a commonality in their interaction mechanisms or biological functions within this cluster. Fewer SMs showed extremely high binding affinities in the most negative energy bracket, from −19 to −13 kcal/mol. However, a subset of five SMs within the energy range of −19 to −17 kcal/mol showed exceptionally strong binding tendencies, highlighting them as potential candidates for advanced investigation owing to their probable effective interactions with the target protein(s) in Cluster 1. The two protein structures taken from the Protein Data Bank (PDB) for docking in this cluster have the PDB IDs 6ZP2 and 7E7B. Only chainA of each protein was used, in line with the spike protein data initially collected from the PDBJ database. The 7E7B chainA structure represents the receptor-binding domain (RBD) of the spike protein. Secondary metabolites binding to this chain could inhibit the spike protein's ability to bind to the ACE2 receptor on host cells. This inhibition can prevent the virus from attaching to and entering the host cells, thereby blocking infection at the initial stage. Furthermore, 6ZP2 chainA represents a part of the trimeric structure of the spike protein. Metabolites binding to this chain might stabilize the spike protein in its prefusion state, preventing the conformational changes required for membrane fusion and viral entry into host cells.

Turning to Cluster 2 (Fig. 4b), the histogram illustrates the distinctive distribution of secondary metabolite binding energies, where a pronounced peak is observed at the lower end of the binding energy spectrum, particularly within the range of −10.5 to −4.7 kcal/mol. This peak included most of the metabolites (10 SMs) corresponding to this cluster, suggesting a strong affinity for their biological targets and potentially greater biological activity. In contrast, the histogram also reveals a tail extending into positive binding energy values, with a small number of metabolites (3 SMs in total) exhibiting much weaker affinities; we therefore excluded these metabolites from further analysis. Four proteins of cluster 2 were utilized for docking analysis: the structure of the spike protein complexed with an ACE2 receptor mimic (PDB ID: 6ZFO chainA), the structure of the spike protein bound to a neutralizing antibody (PDB ID: 7NXA chainC), the structure of the spike protein in a partially open conformation (PDB ID: 6LZG chainB), and the structure of the spike protein in complex with another binding partner (PDB ID: 7E88 chainL). The metabolites interacting with 6ZFO chainA could block the interaction of the spike protein with ACE2 mimics, potentially inhibiting viral entry. Moreover, metabolites bound to 7NXA chainC could potentially prevent the spike protein from effectively binding to host receptors, as this structure is part of a complex with a neutralizing antibody. The three metabolites that bind to 6LZG and 7E88 showed positive binding energy values, indicating insufficient inhibitory potential.

In Cluster 4 (Fig. 4c), the histogram indicates a significant concentration of secondary metabolites (20 SMs) with binding energies of approximately −9.25 to −7.33 kcal/mol, suggesting a common binding affinity, which may imply similar modes of interaction with their biological targets. With fewer metabolites at both the higher and lower ends of the energy scale, the distribution resembles a normal distribution. The metabolites with binding energies in the range of −15.00 to −13.08 kcal/mol have the strongest binding affinity for the spike proteins belonging to cluster 4. Notably, the metabolites with binding energies closer to −3.50 kcal/mol might represent weaker interactions or a regulatory role within their respective pathways. This variability in binding affinities can guide further in silico analyses and experimental validations to elucidate the mechanisms of action and therapeutic potential of these metabolites. Three protein structures from SARS-CoV-2 were utilized for the docking analysis of spike protein cluster 4: the structure of the spike protein in its full-length form, including both the S1 and S2 subunits (PDB ID: 6M1V chainA), the structure of the spike protein in a closed conformation (PDB ID: 6LXT chainA), and the structure of the spike protein in an open conformation (PDB ID: 7C53 chainA). The SMs that bind to 6M1V and 6LXT may stabilize these structures in inactive conformations, potentially preventing the spike protein from undergoing the necessary changes for host cell entry. Meanwhile, the SMs binding to 7C53 could interfere with its open conformation, thus inhibiting receptor interaction and viral fusion.

3.5 Bioavailability Analysis of Secondary Metabolites

All SMs that showed negative binding energy (i.e., good binding affinity) during docking analysis were further analyzed for drug-like and physicochemical properties and bioavailability. For this purpose, we used SwissADME (www.swissadme.ch), a web tool for evaluating the pharmacokinetics, drug-likeness, and bioavailability of SMs. The SMILES strings of the SMs were submitted to SwissADME. From the SwissADME results, which included the bioavailability radar, we evaluated the SMs and selected those that did not violate Lipinski's rule and whose bioavailability radar properties were within the desirable pink area. All the results of the docking and SwissADME analyses are shown in Table S5 in Supplementary Material II. The bioavailability analysis is discussed separately for each cluster as follows:

3.5.1 Secondary Metabolites for Cluster 1

Of the 50 secondary metabolites (SMs) corresponding to cluster 1, one compound (C00039897) was excluded from SwissADME owing to its long SMILES string. A total of 49 SMs were evaluated using the selection criteria. Linoleic acid ((Z,Z)-9,12-octadecadienoic acid; C00001224) was one of the compounds with a BAS value of 0.85 and exhibited the best binding affinity (−8.99523 kcal/mol) in its BAS group. Fig. 5 visualizes the results of the overall analysis for this compound: the binding mechanism of C00001224 to the active site of the protein, its interactions, and its bioavailability radar.

Figure 5A provides an overview of the binding mechanism of C00001224 to the 7E7B chainA protein complex. Figure 5B provides a closer perspective on this interaction. From this binding mechanism, an interaction map (Fig. 5C) reveals hydrophobic interactions (alkyl and pi-alkyl) that enhance the specificity of binding, ensuring that the ligand fits well within the active site. Despite the high BAS, the bioavailability radar in Fig. 5D indicates that some radar properties, such as flexibility (FLEX) and lipophilicity (LIPO), fall outside the pink area. Additionally, this metabolite violates one of Lipinski's rules. Although C00001224 showed good docking results, its drug-like properties and physicochemical characteristics mean that it does not meet the selection criteria. We performed the same analysis on the other SMs in this cluster to identify those that meet the selection criteria.

None of the SMs in this cluster had positive binding energy values. Only the previously mentioned SM (C00039897) is excluded from Fig. 6. As indicated in Fig. 6, the BAS values for the SMs ranged from 0.17 to 0.85, with the majority falling in the 0.55 category. Among the 29 SMs in the BAS 0.55 category, 23 did not violate Lipinski's rule. However, only 16 of these SMs both complied with Lipinski's rule and had all radar properties within the pink area. In the BAS 0.85, 0.56, and 0.17 categories, the SMs met at most one of the two criteria (either Lipinski's rule or the radar properties, but not both). Therefore, from the 16 SMs in the BAS 0.55 category that met both criteria, we selected the five with the best binding affinity. These five SMs are presented in Table 2 and were further analyzed for their therapeutic effects.

Table 2

Top five selected SMs for Cluster 1.
Secondary Metabolite (SM)	Name	BE (kcal/mol)	MW (g/mol)	BAS	Radar
C00001835	Cephaeline	-15.5038	466.61	0.55	◯
C00001849	Emetine	-15.1207	480.64	0.55	◯
C00032469	Uzarigenin	-15.0216	374.51	0.55	◯
C00051992	9-O-Demethylcephaeline	-11.79079	452.59	0.55	◯
C00003318	Linifolin A	-11.58786	304.34	0.55	◯
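The ranking step behind Table 2 can be sketched as a sort on binding energy, where a lower (more negative) value means a more favorable predicted affinity. This is a minimal sketch: the binding energies of the five named compounds are copied from Table 2, while `SM_X` is a hypothetical extra candidate added only to show the cut-off.

```python
# (compound_id, name, binding energy in kcal/mol)
# All rows except SM_X are taken from Table 2; SM_X is hypothetical.
candidates = [
    ("C00003318", "Linifolin A", -11.58786),
    ("C00001849", "Emetine", -15.1207),
    ("SM_X", "hypothetical weaker binder", -10.0),
    ("C00001835", "Cephaeline", -15.5038),
    ("C00051992", "9-O-Demethylcephaeline", -11.79079),
    ("C00032469", "Uzarigenin", -15.0216),
]


def top_by_binding_energy(rows, n=5):
    """Return the n rows with the lowest (most favorable) binding energy."""
    return sorted(rows, key=lambda row: row[2])[:n]


top_five = top_by_binding_energy(candidates)
for compound_id, name, binding_energy in top_five:
    print(f"{compound_id}\t{name}\t{binding_energy}")
```

The input to this step is assumed to contain only SMs that already passed the Lipinski and radar criteria, so the sort alone decides the final order.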

Cephaeline (C00001835) is an inducer of histone H3 acetylation and has anticancer effects in mucoepidermoid carcinoma (MEC) cell lines, as reported by Silva et al. [54]. Cephaeline treatment increased H3K9ac levels in MEC cell lines, reduced the viability and migratory potential of MEC cells, and disrupted tumorsphere formation, suggesting a role in modulating cancer cell behavior.

Emetine (C00001849) exhibits a multifaceted therapeutic effect across various medical conditions. This is primarily because of its inhibitory action on protein biosynthesis in HeLa cells, which provides a biochemical basis for its toxic and therapeutic properties [55]. Additionally, emetine has been shown to have potential as an antiviral agent against HIV by blocking the reverse transcription process [56], and as an anti-malarial agent by inhibiting protein translation in Plasmodium falciparum [57]. Furthermore, emetine demonstrates anticancer properties, as evidenced by its ability to reduce proliferation and induce apoptosis in osteosarcoma cells [58] and to inhibit the proliferation of pulmonary artery smooth muscle cells in pulmonary arterial hypertension [59]. However, the irreversible nature of its effects on protein and DNA synthesis as well as its impact on behavior and cardiac function highlight the need for caution in its clinical application [55], [60].

Uzarigenin (C00032469) is a cardenolide that can be isolated from plants such as Pergularia tomentosa and the root of Calotropis gigantea. Extracts containing uzarigenin from Xysmalobium undulatum (uzara root) are used traditionally to treat diarrhea. A clinical study reported that these extracts do not significantly alter cardiovascular parameters, which supports their safety for managing gastrointestinal issues [61].

9-O-Demethylcephaeline (C00051992), a derivative of cephaeline, has been found to reduce the viability of MEC cells. It also halts tumor growth and cellular migration, induces histone H3 acetylation, and disrupts tumorsphere formation. This suggests its potential to inhibit cancer stem cells and reduce tumor proliferation [54].

Linifolin A (C00003318) is a naturally occurring flavonoid found in plants such as Helenium donianum and Helenium aromaticum. Linifolin A has been shown to induce apoptosis in various cancer cell lines by affecting key signaling pathways that control cell growth and death, which makes it a promising candidate for cancer therapy [62]. It has also demonstrated significant anti-inflammatory properties by reducing the levels of pro-inflammatory cytokines, suggesting its potential use in the treatment of inflammatory diseases [63].

The results of the analysis of the therapeutic effects for other SMs that fall into the selection category are presented in Table SII in Supplementary Material III.

3.4.2 Secondary Metabolites for Cluster 2

From the docking analysis, we selected 10 SMs for the proteins in cluster 2 based on their negative binding energy. The aim of this evaluation was to assess whether any of the ten SMs met the selection criteria. A metabolite named 5,6,2',6'-Tetrahydroxy-7,8-dimethoxyflavone (C00003923) had the most favorable binding energy (-10.189 kcal/mol) in the top BAS category (0.55). Fig. 7 visualizes the outcomes of the analysis for this compound: the binding of C00003923 at the protein's active site, its interactions, and its bioavailability radar.

As shown in Fig. 7A, the 3D structure reveals the spatial arrangement of the protein-ligand complex, showing how the ligand C00003923 binds within the active site of the 7NXA chain C protein. The binding conformation is depicted, highlighting the contacts between the ligand and the surrounding amino acid residues. Fig. 7B shows a detailed close-up of the protein-ligand binding site. The 2D interaction map in Fig. 7C describes the specific interactions between ligand C00003923 and key amino acid residues within the binding site of the 7NXA chain C protein. Notable interactions include conventional hydrogen bonds (highlighted in green) and π-alkyl interactions (highlighted in purple). Key amino acid residues such as VAL A:367, LEU A:368, CYS A:336, and GLY A:339 are involved in these interactions, stabilizing the ligand within the binding pocket. The bioavailability radar plot (Fig. 7D) shows the physicochemical profile of the ligand C00003923. Despite its favorable binding energy and otherwise balanced parameters, its unsaturation (INSATU) value was outside the optimal range, so C00003923 was not selected. The same analysis was conducted for the other SMs to identify those that met the selection criteria.

As depicted in Fig. 8, two distinct BAS values were found within this cluster: 0.17 and 0.55. Eight secondary metabolites (SMs) fell in the 0.55 category, and all eight satisfied Lipinski's rule. However, only four of them also had all bioavailability radar properties within the desired pink area. These four SMs were sorted by binding energy and analyzed for their therapeutic effects; they are detailed in Table 3.

Table 3

Top selected SMs for Cluster 2.
Secondary Metabolite (SM)	Name	BE (kcal/mol)	MW (g/mol)	BAS	Radar
C00001492	Caffeine; Caffein	-8.33392	194.19	0.55	◯
C00027138	Colchamine; Demecolcine	-7.7202	371.43	0.55	◯
C00042440	Cytidine	-6.95777	243.22	0.55	◯
C00001510	Theophylline	-5.89931	180.16	0.55	◯

Caffeine (also listed as Caffein; C00001492) has the best binding affinity among the four selected SMs. It is a widely consumed psychoactive substance with reported relevance to several medical conditions, including neurodegenerative diseases and cancer. The neuroprotective effects of caffeine in Parkinson's disease (PD) are mediated through antagonism of adenosine receptors, particularly A2A receptors, which helps reduce motor symptoms and possibly slows disease progression [64]. Caffeine also enhances the anti-tumor immune response by decreasing PD-1 expression on infiltrating cytotoxic T lymphocytes and increasing the levels of pro-inflammatory cytokines, such as TNF-α and IFN-γ [65].

Colchamine, also known as demecolcine (C00027138), is a derivative of colchicine that has been reported to improve cancer radiotherapy outcomes. Nomura and Plager found that it arrests cell division in metaphase, thereby increasing the vulnerability of cancer cells to radiation-induced damage [66]. We found no reports from the 21st century on the efficacy of this secondary metabolite.

Cytidine (C00042440) is a pyrimidine nucleoside in which cytosine is attached to ribofuranose via a beta-N(1)-glycosidic bond. In 2009, researchers reported the potential of cytidine as a mood stabilizer, specifically for treating bipolar depression: it lowered brain glutamate/glutamine levels, which correlated with improvements in depression symptoms [67]. In addition, this secondary metabolite was shown to alleviate dyslipidemia and improve hepatic steatosis in ob/ob mice by modulating the gut microbiota composition [68].

Theophylline (C00001510), a methylxanthine derivative, is primarily used for its bronchodilator properties in treating respiratory illnesses such as asthma and chronic obstructive pulmonary disease (COPD). Studies have shown that it can reduce inflammation in COPD patients by enhancing the effects of inhaled corticosteroids, preventing the reduction of histone deacetylase 2 (HDAC2) expression, and inhibiting the PI3K/Akt pathway, which is critical for corticosteroid sensitivity [69], [70]. By doing so, it decreases the frequency of exacerbations and hospital admissions for high-risk COPD patients. Furthermore, theophylline has been investigated for its potential renal-protective effects in neonates undergoing therapeutic hypothermia for hypoxic-ischemic encephalopathy (HIE). It has been shown to improve renal perfusion and has favorable pharmacokinetics in this vulnerable population [71].

The complete outcomes of the analysis of therapeutic effects for the other selected SMs are presented in Table SII of Supplementary Material III.

3.4.3 Secondary Metabolites for Cluster 4

In this step, the 48 SMs corresponding to cluster 4 were analyzed. Seven of them could not be evaluated with SwissADME owing to the length of their SMILES strings: C00003617, C00003618, C00016540, C00017726, C00029420, C00034685, and C00039557. The remaining 41 SMs were assessed using the selection criteria. The metabolite Indomethacin (C00030512) had the most favorable binding energy (-7.63338 kcal/mol) in the BAS 0.85 category. Fig. 9 visualizes the outcomes of the analysis for this compound: the binding of C00030512 at the protein's active site, its interactions, and its bioavailability radar.

The binding mechanism of C00030512 to the 7C53 chain A protein complex is shown in the 3D visualization (Fig. 9A), and Figure 9B provides a closer view of this interaction. The 2D interaction map (Fig. 9C) details the specific interactions between ligand C00030512 and key amino acid residues within the binding site of the 7C53 protein. Notable interactions include pi-stacked interactions (highlighted in purple) and conventional hydrogen bonds (green). Key amino acid residues such as GLN A:954, ASN A:953, GLN A:957, and PHE A:1009 are involved in these interactions, stabilizing the ligand within the binding pocket. The radar plot in Fig. 9D shows the physicochemical profile of the ligand C00030512. Although this secondary metabolite has a favorable binding energy and a high bioavailability score, its unsaturation (INSATU) value falls outside the preferred range, so its bioavailability profile is not entirely satisfactory. Consequently, C00030512 was not chosen for further analysis. The same analysis was performed for the other SMs to identify those that met the selection criteria.

As depicted in Fig. 10, the SMs are distributed across four BAS categories (0.17, 0.55, 0.56, and 0.85), with varying compliance with Lipinski's rule of five and the radar properties. The 0.55 category contains the most SMs. Of the 24 SMs in the 0.55 category, only 5 satisfy both Lipinski's rule and the desired radar properties. The 0.85 category contains 3 SMs, 2 of which fulfill both criteria. We therefore identified two and five eligible SMs in the 0.85 and 0.55 categories, respectively. From these seven SMs, we chose the top five, ordered by descending BAS and then by docking results. These top five SMs are shown in Table 4.

Table 4

Top selected SMs for Cluster 4.
Secondary Metabolite (SM)	Name	BE (kcal/mol)	MW (g/mol)	BAS	Radar
C00000446	(+)-Epijasmonic acid; 7-iso-Jasmonic acid	-5.02368	210.27	0.85	◯
C00000218	(-)-Jasmonic acid; Jasmonic acid; trans-Jasmonic acid; 3-Oxo-2-(2Z-pentenyl)cyclopentylacetic acid	-5.02192	210.27	0.85	◯
C00027661	11-Hydroxyvittatine	-9.11643	482.53	0.55	◯
C00018127	(+)-Staurosporine; Staurosporine; AM-2282; Antibiotic 230; Antibiotic AM 2282; CGP 39360; Staurosporin	-7.91595	466.53	0.55	◯
C00023582	Paxilline	-8.04875	435.56	0.55	◯
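The per-category counts reported for each cluster (how many SMs fall in each BAS category, how many comply with Lipinski's rule, and how many also satisfy the radar criterion) can be tallied as sketched below. The `records` list is hypothetical and much smaller than the real data; only the counting logic is the point.

```python
from collections import defaultdict

# (BAS, complies with Lipinski's rule, radar in pink area)
# Hypothetical records for illustration only.
records = [
    (0.55, True, True),
    (0.55, True, False),
    (0.55, False, False),
    (0.85, True, True),
    (0.85, True, False),
    (0.17, False, False),
]


def tally_by_bas(rows):
    """Count total, Lipinski-compliant, and fully eligible SMs per BAS value."""
    counts = defaultdict(lambda: {"total": 0, "lipinski": 0, "both": 0})
    for bas, lipinski_ok, radar_ok in rows:
        entry = counts[bas]
        entry["total"] += 1
        entry["lipinski"] += int(lipinski_ok)
        entry["both"] += int(lipinski_ok and radar_ok)
    return dict(counts)


summary = tally_by_bas(records)
for bas in sorted(summary, reverse=True):
    print(bas, summary[bas])
```

Only the SMs counted under `both` are eligible for the final ranking by BAS and binding energy.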

(+)-Epijasmonic acid, also known as 7-iso-jasmonic acid (C00000446), is a derivative of jasmonic acid. Jasmonic acid and its derivatives exhibit broad-spectrum activities, including anti-cancer, anti-inflammatory, and cosmetic effects. Their structural similarity to prostaglandins suggests that they could be used as natural therapeutics for inflammation. Jasmonates have shown potential as anticancer agents, either alone or in combination with other chemotherapeutic agents [72].

(-)-Jasmonic acid, also listed as jasmonic acid, trans-jasmonic acid, or 3-oxo-2-(2Z-pentenyl)cyclopentylacetic acid (C00000218), is discussed in the same study [72] and exhibits significant therapeutic potential in cancer therapy. Jasmonic acid and its derivatives, including methyl jasmonate, act against cancer cells by altering cellular ATP levels, inducing re-differentiation through MAPK pathways, and promoting apoptosis via reactive oxygen species (ROS).

11-Hydroxyvittatine (C00027661) is a compound found in certain plants, particularly those belonging to the Amaryllidaceae family [73]. Researchers isolated this compound and evaluated its in vitro anti-inflammatory activity by monitoring the inhibition of lipopolysaccharide (LPS)-induced nitric oxide (NO) production in RAW264.7 mouse macrophages. The results revealed that 11-Hydroxyvittatine exhibited potent anti-inflammatory activity, with an IC50 value of 5.6 µM. Additionally, the compound did not show any general cytotoxicity against the RAW264.7 cells [74].

Staurosporine (C00018127), a potent protein kinase inhibitor and organic heterooctacyclic compound, has been studied for its therapeutic effects, particularly in cancer treatment, antifungal, and antiangiogenic applications. Recent research has focused on its mechanisms and applications in various therapeutic contexts. Staurosporine has been found to induce apoptosis in pancreatic carcinoma cells, such as PaTu 8988t and Panc-1, by activating caspase-9 and significantly increasing apoptosis rates through the intrinsic signaling pathway [75]. Additionally, it affects single and collective cell migration in breast carcinoma cells, such as MDA-MB-231, MCF-7, and SK-BR-3, by inducing distinct migration patterns and reducing the cell migration velocity [76]. Furthermore, Staurosporine isolated from Streptomyces sp. BV410 has demonstrated antifungal and antiangiogenic properties, with significant production optimization for biotechnological applications [77].

Paxilline (C00023582) is a mycotoxin from the fungus Penicillium paxilli. Recent research has focused on its potential therapeutic effects, particularly in neuroprotection and cancer treatment. Paxilline protects neuronal HT22 cells from glutamate-induced cell death independent of BK(Ca) channel activity and oxidative stress, suggesting its potential therapeutic value in neuroprotection [78]. This compound also effectively sensitizes glioma cells to tumor necrosis factor-related apoptosis-inducing ligand (TRAIL)-mediated apoptosis, offering a potentially safe treatment strategy for resistant gliomas [79]. Moreover, paxilline has shown protective effects against cadmium-induced cytotoxicity in rat pheochromocytoma PC12 and ascites hepatoma AS-30D cells, where it reduces cell necrosis and intracellular reactive oxygen species (ROS) production, indicating its potential for mitigating heavy metal toxicity [80].

The results of the analysis of the therapeutic effects of other secondary metabolites that fall into the selection category are presented in Table SII in Supplementary Material III.

In this study, we identified five distinct clusters of SARS-CoV-2 spike proteins. We matched these proteins against the BindingDB database to identify potential binding molecules and then used the KNApSAcK database to identify structurally similar secondary metabolites that are likely to bind spike proteins. We found secondary metabolites in three of these clusters that may serve as potential inhibitors of the proteins within the corresponding cluster. Subsequently, we conducted docking, drug-likeness, physicochemical, and bioavailability analyses on the SMs of each cluster. To determine the suitability of these SMs, we applied Lipinski's rule and the bioavailability radar (with all properties in the pink area indicating optimal results) and ranked the eligible SMs by binding energy. On this basis, we selected the top 5, 4, and 5 SMs from Clusters 1, 2, and 4, respectively. We summarized the therapeutic effects reported in the literature for all top-selected SMs. Overall, the selected SMs have reported therapeutic effects such as modulating cancer cell behavior, antiviral activity against HIV, treating diarrhea, inhibiting cancer stem cells, anti-inflammatory activity, neuroprotection in neurodegenerative diseases, antifungal activity, mitigating heavy metal toxicity, and supporting therapy for several other diseases.

The research conducted thus far is limited to computational models and analysis, and the results must be confirmed with empirical data from laboratory experiments. Collaboration with biologists is essential for carrying out the necessary wet-lab studies, both in vitro and in vivo, to provide physical evidence supporting or contradicting our computational findings. Only by combining in silico and in vitro/in vivo approaches can we fully validate our findings and gain confidence in their relevance and applicability in combating COVID-19.

Conflict of interest.

The authors declare that they have no competing interests.

Funding.

Author MA was supported by the 2023 KDDI Foundation International Students Scholarship (April to September).

Author Contribution

MA wrote the paper, implemented the methods, and conducted the analysis with assistance from MBK, AKN, and RMI; RS, NO, MA-U-A, and SK contributed to the paper and provided guidance. MHS performed data preparation and curation. All authors read and approved the final manuscript.

Data Availability

All data supporting the findings of this study are available within the paper and its Supplementary Materials. All protein-related data used in the research and the generated protein-protein network are provided in Supplementary Material I. The secondary metabolites discovered through the protein-binding molecule-secondary metabolite analysis, together with their bioavailability analysis, are provided in Supplementary Material II. The complete graph of the protein-protein network, the protein cluster information, the bioavailability radars of all discovered secondary metabolites in their corresponding clusters, and the bioactivity properties of the investigated compounds (secondary metabolites) are provided in Supplementary Material III, along with the original references describing the therapeutic effects of the investigated compounds.

Jacofsky, D., Jacofsky, E. M. & Jacofsky, M. Understanding Antibody Testing for COVID-19. J. Arthroplast. 35 (7), S74–S81. 10.1016/j.arth.2020.04.055 (Jul. 2020).
Kopel, J., Goyal, H. & Perisetti, A. Antibody tests for COVID-19. Baylor University Medical Center Proceedings. 34 (1), 63–72. 10.1080/08998280.2020.1829261 (Jan. 2021).
Adams, E. R. et al. Antibody testing for COVID-19: A report from the National COVID Scientific Advisory Panel. Wellcome Open. Res. 5, 139. 10.12688/wellcomeopenres.15927.1 (Jun. 2020).
Kubina, R. & Dziedzic, A. Molecular and Serological Tests for COVID-19. A Comparative Review of SARS-CoV-2 Coronavirus Laboratory and Point-of-Care Diagnostics. Diagnostics. 10 (6), 434. 10.3390/diagnostics10060434 (Jun. 2020).
Rastawicki, W. & Rokosz-Chudziak, N. Characteristics and assessment of the usefulness of serological tests in the diagnostic of infections caused by coronavirus SARS-CoV-2 on the basis of available manufacturer’s data and literature review. Przegl Epidemiol. 74 (1), 49–68. 10.32394/pe.74.11 (May 2020).
Van Kasteren, P. B. et al. Comparison of seven commercial RT-PCR diagnostic kits for COVID-19. J. Clin. Virol. 128, 104412. 10.1016/j.jcv.2020.104412 (Jul. 2020).
Supriyanti, R., Alqaaf, M., Ramadhani, Y. & Widodo, H. B. Morphological characteristics of X-ray thorax images of COVID-19 patients using the Bradley thresholding segmentation. IJEECS. 24 (2), 1074. 10.11591/ijeecs.v24.i2.pp1074-1083 (Nov. 2021).
Islam, N. et al. Thoracic imaging tests for the diagnosis of COVID-19. Cochrane Database of Systematic Reviews, no. 3. 10.1002/14651858.CD013639.pub4 (Mar. 2021).
Li, Y. D. et al. Coronavirus vaccine development: from SARS and MERS to COVID-19. J. Biomed. Sci. 27 (1), 104. 10.1186/s12929-020-00695-2 (Dec. 2020).
Akaji, K. & Konno, H. Design and Evaluation of Anti-SARS-Coronavirus Agents Based on Molecular Interactions with the Viral Protease. Molecules. 25, 3920. 10.3390/molecules25173920 (Aug. 2020).
Onawole, A. T., Sulaiman, K. O., Kolapo, T. U., Akinde, F. O. & Adegoke, R. O. COVID-19: CADD to the rescue. Virus Res. 285, 198022. 10.1016/j.virusres.2020.198022 (Aug. 2020).
Lan, J. et al. Structure of the SARS-CoV-2 spike receptor-binding domain bound to the ACE2 receptor. Nature. 581 (7807), 215–220. 10.1038/s41586-020-2180-5 (May 2020).
Krammer, F. SARS-CoV-2 vaccines in development. Nature. 586 (7830), 516–527. 10.1038/s41586-020-2798-3 (Oct. 2020).
Christy, M. P., Uekusa, Y., Gerwick, L. & Gerwick, W. H. Natural Products with Potential to Treat RNA Virus Pathogens Including SARS-CoV-2. J. Nat. Prod. 84 (1), 161–182. 10.1021/acs.jnatprod.0c00968 (Jan. 2021).
Thomas, E. et al. Plant-Based Natural Products and Extracts: Potential Source to Develop New Antiviral Drug Candidates. Molecules. 26 (20), 6197. 10.3390/molecules26206197 (Oct. 2021).
Raimundo, J. P. et al. Natural Products as Potential Agents against SARS-CoV and SARS-CoV-2. CMC. 28 (27), 5498–5526. 10.2174/0929867328666210125113938 (Sep. 2021).
Gao, J. et al. Kaempferol inhibits SARS-CoV-2 invasion by impairing heptad repeats-mediated viral fusion. Phytomedicine. 118, 154942. 10.1016/j.phymed.2023.154942 (Sep. 2023).
Pan, H. et al. Myricetin possesses the potency against SARS-CoV-2 infection through blocking viral-entry facilitators and suppressing inflammation in rats and mice. Phytomedicine. 116, 154858. 10.1016/j.phymed.2023.154858 (Jul. 2023).
Nag, A., Banerjee, R., Paul, S. & Kundu, R. Curcumin inhibits spike protein of new SARS-CoV-2 variant of concern (VOC) Omicron, an in silico study. Comput. Biol. Med. 146, 105552. 10.1016/j.compbiomed.2022.105552 (Jul. 2022).
Bhuiyan, F. R., Howlader, S., Raihan, T. & Hasan, M. Plants Metabolites: Possibility of Natural Therapeutics Against the COVID-19 Pandemic. Front. Med. 7, 444. 10.3389/fmed.2020.00444 (Aug. 2020).
Bhowmick, S. S. & Seah, B. S. Clustering and Summarizing Protein-Protein Interaction Networks: A Survey. IEEE Trans. Knowl. Data Eng. 28 (3), 638–658. 10.1109/TKDE.2015.2492559 (Mar. 2016).
Al-Qaaneh, A. M. et al. Genome composition and genetic characterization of SARS-CoV-2. Saudi Journal of Biological Sciences. 28 (3), 1978–1989. 10.1016/j.sjbs.2020.12.053 (Mar. 2021).
Chan, J. F. W. et al. Genomic characterization of the 2019 novel human-pathogenic coronavirus isolated from a patient with atypical pneumonia after visiting Wuhan. Emerging Microbes & Infections. 9 (1), 221–236. 10.1080/22221751.2020.1719902 (Jan. 2020).
Pillay, T. S. Gene of the month: the 2019-nCoV/SARS-CoV-2 novel coronavirus spike protein. J. Clin. Pathol. 73 (7), 366–369. 10.1136/jclinpath-2020-206658 (Jul. 2020).
Jackson, C. B., Farzan, M., Chen, B. & Choe, H. Mechanisms of SARS-CoV-2 entry into cells. Nat. Rev. Mol. Cell. Biol. 23 (1), 3–20. 10.1038/s41580-021-00418-x (Jan. 2022).
Karim, M. B., Kanaya, S. & Altaf-Ul-Amin, Md. DPClusSBO: An integrated software for clustering of simple and bipartite graphs. SoftwareX. 16, 100821. 10.1016/j.softx.2021.100821 (Dec. 2021).
Altaf-Ul-Amin, M., Shinbo, Y., Mihara, K., Kurokawa, K. & Kanaya, S. Development and implementation of an algorithm for detection of protein complexes in large interaction networks. BMC Bioinform. 7 (1), 207. 10.1186/1471-2105-7-207 (Dec. 2006).
Altaf-Ul-Amin, Md. et al. A density-periphery based graph clustering software developed for detection of protein complexes in interaction networks, in International Conference on Information and Communication Technology, Dhaka, Bangladesh: IEEE, pp. 37–42. 10.1109/ICICT.2007.375338 (Mar. 2007).
Altaf-Ul-Amin, Md., Wada, M. & Kanaya, S. Partitioning a PPI Network into Overlapping Modules Constrained by High-Density and Periphery Tracking. ISRN Biomathematics. 1–11. 10.5402/2012/726429 (May 2012).
Karim, M. B., Kanaya, S. & Amin, M. A. U. Comparison of BiClusO with Five Different Biclustering Algorithms Using Biological and Synthetic Data, in Complex Networks and Their Applications VII, L. M. Aiello, C. Cherifi, H. Cherifi, R. Lambiotte, P. Lió, and L. M. Rocha, Eds., Studies in Computational Intelligence, vol. 813, Cham: Springer International Publishing, pp. 575–585. 10.1007/978-3-030-05414-4_46 (2019).
Gilson, M. K. et al. BindingDB in 2015: A public database for medicinal chemistry, computational chemistry and systems pharmacology. Nucleic Acids Res. 44 (D1), D1045–D1053. 10.1093/nar/gkv1072 (Jan. 2016).
Hendrix, D. A. Sequence Alignments, in Applied Bioinformatics, 1st ed., Oregon State University, pp. 34–43. [Online]. Available: https://open.oregonstate.education/appliedbioinformatics/
Afendi, F. M. et al. KNApSAcK Family Databases: Integrated Metabolite–Plant Species Databases for Multifaceted Plant Research. Plant Cell Physiol. 53 (2), e1–e1. 10.1093/pcp/pcr165 (Feb. 2012).
Nakamura, Y. et al. KNApSAcK Metabolite Activity Database for Retrieving the Relationships Between Metabolites and Biological Activities. Plant Cell Physiol. 55 (1), e7–e7. 10.1093/pcp/pct176 (Jan. 2014).
Xu, T., Zhao, H., Wang, M., Chow, A. & Fang, M. Metabolomics and In Silico Docking-Directed Discovery of Small-Molecule Enzyme Targets. Anal. Chem. 93 (6), 3072–3081. 10.1021/acs.analchem.0c03684 (Feb. 2021).
Pantsar, T. & Poso, A. Binding Affinity via Docking: Fact and Fiction. Molecules. 23 (8), 1899. 10.3390/molecules23081899 (Jul. 2018).
Le Guilloux, V., Schmidtke, P. & Tuffery, P. Fpocket: An open source platform for ligand pocket detection. BMC Bioinform. 10 (1), 168. 10.1186/1471-2105-10-168 (Dec. 2009).
Koes, D. R., Baumgartner, M. P. & Camacho, C. J. Lessons Learned in Empirical Scoring with smina from the CSAR 2011 Benchmarking Exercise. J. Chem. Inf. Model. 53 (8), 1893–1904. 10.1021/ci300604z (Aug. 2013).
Masters, L., Eagon, S. & Heying, M. Evaluation of consensus scoring methods for AutoDock Vina, smina and idock. J. Mol. Graph. Model. 96, 107532. 10.1016/j.jmgm.2020.107532 (May 2020).
Lipinski, C. A., Lombardo, F., Dominy, B. W. & Feeney, P. J. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Advanced Drug Delivery Reviews. 46 (1–3), 3–26. 10.1016/S0169-409X(00)00129-0 (Mar. 2001). Originally published in Advanced Drug Delivery Reviews 23, 3–25 (1997); PII of original article: S0169-409X(96)00423-1.
Daina, A., Michielin, O. & Zoete, V. SwissADME: a free web tool to evaluate pharmacokinetics, drug-likeness and medicinal chemistry friendliness of small molecules. Sci. Rep. 7 (1), 42717. 10.1038/srep42717 (Mar. 2017).
Martin, Y. C. A Bioavailability Score. J. Med. Chem. 48 (9), 3164–3170. 10.1021/jm0492002 (May 2005).
Lenin, S., Sujatha, R. & Palanisamy, S. Pharmacological Properties And Bioavailability Studies Of 3-Methyl Quinoline. Int. J. Pharma Bio Sci. 12 (1), 100–104. 10.22376/ijpbs/lpr.2022.12.1.L100-104 (Jan. 2022).
Kanduc, D. Thromboses and Hemostasis Disorders Associated with COVID-19: The Possible Causal Role of Cross-Reactivity and Immunological Imprinting. Glob. Med. Genet. 08 (04), 162–170. 10.1055/s-0041-1731068 (Dec. 2021).
Yang, J. Y., Ma, Y. X., Liu, Y., Peng, X. J. & Chen, X. Z. A Comprehensive Review of Natural Flavonoids with Anti-SARS-CoV-2 Activity. Molecules. 28 (6), 2735. 10.3390/molecules28062735 (Mar. 2023).
Yu, S. et al. SARS-CoV-2 spike engagement of ACE2 primes S2′ site cleavage and fusion initiation. Proc. Natl. Acad. Sci. U.S.A. 119 (1), e2111199119. 10.1073/pnas.2111199119 (Jan. 2022).
Tang, T. et al. Proteolytic Activation of SARS-CoV-2 Spike at the S1/S2 Boundary: Potential Role of Proteases beyond Furin. ACS Infect. Dis. 7 (2), 264–272. 10.1021/acsinfecdis.0c00701 (Feb. 2021).
Lavie, M., Dubuisson, J. & Belouzard, S. SARS-CoV-2 Spike Furin Cleavage Site and S2′ Basic Residues Modulate the Entry Process in a Host Cell-Dependent Manner. J. Virol. 96 (13), e00474-22. 10.1128/jvi.00474-22 (Jul. 2022).
Hoffmann, M. et al. SARS-CoV-2 Cell Entry Depends on ACE2 and TMPRSS2 and Is Blocked by a Clinically Proven Protease Inhibitor. Cell. 181 (2), 271–280.e8. 10.1016/j.cell.2020.02.052 (Apr. 2020).
Koch, J. et al. TMPRSS2 expression dictates the entry route used by SARS-CoV‐2 to infect host cells. EMBO J. 40 (16), e107821. 10.15252/embj.2021107821 (Aug. 2021).
Bestle, D. et al. TMPRSS2 and furin are both essential for proteolytic activation of SARS-CoV-2 in human airway cells. Life Sci. Alliance. 3 (9), e202000786. 10.26508/lsa.202000786 (Sep. 2020).
Li, X., Yuan, H., Li, X. & Wang, H. Spike protein mediated membrane fusion during SARS-CoV‐2 infection. J. Med. Virol. 95 (1), e28212. 10.1002/jmv.28212 (Jan. 2023).
Wang, L. & Xiang, Y. Spike Glycoprotein-Mediated Entry of SARS Coronaviruses. Viruses. 12 (11), 1289. 10.3390/v12111289 (Nov. 2020).
Silva, L. C. et al. Cephaeline is an inductor of histone H3 acetylation and inhibitor of mucoepidermoid carcinoma cancer stem cells. J. Oral Pathol. Med. 51 (6), 553–562. 10.1111/jop.13252 (Jul. 2022).
Grollman, A. P. Inhibitors of Protein Biosynthesis. J. Biol. Chem. 243 (15), 4089–4094. 10.1016/S0021-9258(18)93283-7 (Aug. 1968).
Valadão, A. et al. Natural Plant Alkaloid (Emetine) Inhibits HIV-1 Replication by Interfering with Reverse Transcriptase Activity. Molecules. 20 (6), 11474–11489. 10.3390/molecules200611474 (Jun. 2015).
Wong, W. et al. Cryo-EM structure of the Plasmodium falciparum 80S ribosome bound to the anti-protozoan drug emetine. eLife. 3, e03080. 10.7554/eLife.03080 (Jun. 2014).
Son, J. & Lee, S. Y. Emetine exerts anticancer effects in U2OS human osteosarcoma cells via activation of p38 and inhibition of ERK, JNK, and β-catenin signaling pathways. J. Biochem. Mol. Tox. 35 (10), e22868. 10.1002/jbt.22868 (Oct. 2021).
Siddique, M. A. H. et al. Identification of Emetine as a Therapeutic Agent for Pulmonary Arterial Hypertension: Novel Effects of an Old Drug. ATVB. 39 (11), 2367–2385. 10.1161/ATVBAHA.119.313309 (Nov. 2019).
Marino, A. Electrocardiographic and Behavioral Effects of Emetine. Science. 133 (3450), 385–386. 10.1126/science.133.3450.385 (Feb. 1961).
Schmiedl, S. et al. Cardiovascular effects, pharmacokinetics and cross-reactivity in digitalis glycoside immunoassays of an antidiarrheal uzara root extract. CP. 50 (10), 729–740. 10.5414/CP201712 (Oct. 2012).
Yan, H. et al. Grifolin induces apoptosis and promotes cell cycle arrest in the A2780 human ovarian cancer cell line via inactivation of the ERK1/2 and Akt pathways. Oncology Letters. 13 (6), 4806–4812. 10.3892/ol.2017.6092 (Jun. 2017).
Pan, H. et al. Repeated systemic administration of the nutraceutical alpha-linolenic acid exerts neuroprotective efficacy, an antidepressant effect and improves cognitive performance when given after soman exposure. NeuroToxicology. 51, 38–50. 10.1016/j.neuro.2015.09.006 (Dec. 2015).
Rivera-Oliver, M. & Díaz-Ríos, M. Using caffeine and other adenosine receptor antagonists and agonists as therapeutic tools against neurodegenerative diseases: A review. Life Sci. 101, 1–2. 10.1016/j.lfs.2014.01.083 (Apr. 2014).
Venkata Charan Tej, G. N., Neogi, K., Verma, S. S., Chandra Gupta, S. & Nayak, P. K. Caffeine-enhanced anti-tumor immune response through decreased expression of PD1 on infiltrated cytotoxic T lymphocytes. European Journal of Pharmacology. 859, 172538. 10.1016/j.ejphar.2019.172538 (Sep. 2019).
Nomura, T. & Plager, J. E. Action of demecolcine (colcemid) in the murine sarcoma 180 tumor. Cancer Treat. Rep. 65, 3–4 (1981).
Yoon, S. J. et al. Decreased Glutamate/Glutamine Levels May Mediate Cytidine’s Efficacy in Treating Bipolar Depression: A Longitudinal Proton Magnetic Resonance Spectroscopy Study. Neuropsychopharmacol. 34 (7), 1810–1818. 10.1038/npp.2009.2 (Jun. 2009).
Niu, K., Bai, P., Zhang, J., Feng, X. & Qiu, F. Cytidine Alleviates Dyslipidemia and Modulates the Gut Microbiota Composition in ob/ob Mice. Nutrients. 15 (5), 1147. 10.3390/nu15051147 (Feb. 2023).
Devereux, G. et al. Low-dose oral theophylline combined with inhaled corticosteroids for people with chronic obstructive pulmonary disease and high risk of exacerbations: a RCT. Health Technol Assess. 23 (37), 1–146. 10.3310/hta23370 (Jul. 2019).
Sun, X. et al. Theophylline and dexamethasone in combination reduce inflammation and prevent the decrease in HDAC2 expression seen in monocytes exposed to cigarette smoke extract. Exp Ther Med. 10.3892/etm.2020.8584 (Mar. 2020).
Frymoyer, A. et al. Theophylline dosing and pharmacokinetics for renal protection in neonates with hypoxic–ischemic encephalopathy undergoing therapeutic hypothermia. Pediatr Res. 88 (6), 871–877. 10.1038/s41390-020-01140-8 (Dec. 2020).
Jarocka-Karpowicz, I. & Markowska, A. Therapeutic Potential of Jasmonic Acid and Its Derivatives. IJMS. 22 (16), 8437. 10.3390/ijms22168437 (Aug. 2021).
Koutová, D. et al. Chemical and Biological Aspects of Montanine-Type Alkaloids Isolated from Plants of the Amaryllidaceae Family. Molecules. 25 (10), 2337. 10.3390/molecules25102337 (May 2020).
Zhan, G. et al. Structurally diverse alkaloids with nine frameworks from Zephyranthes candida and their acetylcholinesterase inhibitory and anti-inflammatory activities. Phytochemistry. 207, 113564. 10.1016/j.phytochem.2022.113564 (Mar. 2023).
Malsy, M., Bitzinger, D., Graf, B. & Bundscherer, A. Staurosporine induces apoptosis in pancreatic carcinoma cells PaTu 8988t and Panc-1 via the intrinsic signaling pathway. Eur. J. Med. Res. 24 (1), 5. 10.1186/s40001-019-0365-x (Dec. 2019).
Meyer, F. A. H. et al. The Presence of Yin-Yang Effects in the Migration Pattern of Staurosporine-Treated Single versus Collective Breast Carcinoma Cells. IJMS. 22 (21), 11961. 10.3390/ijms222111961 (Nov. 2021).
Mojicevic, M. et al. Streptomyces sp. BV410 isolate from chamomile rhizosphere soil efficiently produces staurosporine with antifungal and antiangiogenic properties. MicrobiologyOpen. 9 (3), e986. 10.1002/mbo3.986 (Mar. 2020).
Kulawiak, B. & Szewczyk, A. Glutamate-induced cell death in HT22 mouse hippocampal cells is attenuated by paxilline, a BK channel inhibitor. Mitochondrion. 12 (1), 169–172. 10.1016/j.mito.2011.12.001 (Jan. 2012).
Kang, Y. J. et al. Paxilline enhances TRAIL-mediated apoptosis of glioma cells via modulation of c-FLIP, survivin and DR5. Exp. Mol. Med. 43 (1), 24. 10.3858/emm.2011.43.1.003 (2011).
Belyaeva, E. A. & Sokolova, T. V. Mitigating effect of paxilline against injury produced by Cd2+ in rat pheochromocytoma PC12 and ascites hepatoma AS-30D cells. Ecotoxicol. Environ. Saf. 196, 110519. 10.1016/j.ecoenv.2020.110519 (Jun. 2020).

No competing interests reported.

Editorial decision: Revision requested (07 Oct, 2024)
Reviews received at journal (04 Oct, 2024)
Reviewers agreed at journal (17 Sep, 2024)
Reviews received at journal (16 Sep, 2024)
Reviewers agreed at journal (12 Sep, 2024)
Reviewers invited by journal (12 Sep, 2024)
Editor assigned by journal (12 Sep, 2024)
Editor invited by journal (05 Sep, 2024)
Submission checks completed at journal (03 Sep, 2024)
First submitted to journal (03 Sep, 2024)
