Eleven classes of metabolites were identified in FBSJ, including flavonoids, phenolic acids, organic acids, amino acids and their derivatives, terpenoids, and tannins.27,28 Apart from flavonoids such as rutin, quercetin and kaempferol, and some saponins, the key active ingredients responsible for the health-promoting effects of FBSJ have not been specified. As a result, the function of the non-flavonoid metabolites, and whether they contribute to promoting human health, remained unclear.
The identified metabolites were queried in the TCMSP database to identify the key active components of FBSJ with health-promoting functions for the human body. Of the 1559 identified metabolites, 294 were recorded as active components of TCM. Screening with OB ≥ 5% and DL ≥ 0.14 as criteria identified 135 key active components (Table 2). Rutin (OB = 3.2%, DL = 0.68) fell below the OB criterion; however, as a recognized health-promoting component, it was still considered a key active component of FBSJ.
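The screening rule described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the thresholds and the rutin values come from the text, while the function name `is_key_active`, the `ALWAYS_KEEP` set, and the other records are hypothetical placeholders.

```python
# Thresholds are taken from the text; all records except rutin are made up.
OB_MIN = 5.0   # oral bioavailability, percent
DL_MIN = 0.14  # drug-likeness

# Compounds retained even if they fail a threshold (rutin, per the text).
ALWAYS_KEEP = {"rutin"}


def is_key_active(name, ob, dl):
    """Return True if a compound passes the OB/DL screen or is whitelisted."""
    if name.lower() in ALWAYS_KEEP:
        return True
    return ob >= OB_MIN and dl >= DL_MIN


records = [
    {"name": "Rutin", "ob": 3.2, "dl": 0.68},        # values quoted in the text
    {"name": "compound_A", "ob": 41.0, "dl": 0.28},  # hypothetical, passes both
    {"name": "compound_B", "ob": 12.5, "dl": 0.05},  # hypothetical, fails DL
    {"name": "compound_C", "ob": 2.0, "dl": 0.40},   # hypothetical, fails OB
]

key_actives = [r["name"] for r in records
               if is_key_active(r["name"], r["ob"], r["dl"])]
print(key_actives)
```

Keeping the exception in an explicit set makes it visible that rutin is retained on the basis of prior knowledge rather than on its OB value.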
Among the 135 key active ingredients of TCM, flavonoids were the majority, with 87 compounds accounting for 64.4% of the total. In addition, there were 14 lipids; 8 phenolic acids; 7 terpenoids; 6 nucleotides and their derivatives; 1 tannin; 1 quinone; and 5 vitamins, flavonoids and other substances. Notably, 12 metabolites had no relevant protein target information but very high DL values (DL > 0.65), including 6'-O-malonyl bebeoside, conazole H, rescinin A-7-O-glucoside (Indian santalin), pyrin, terpenoid tritol, kaempferol-7-O-rhamnoside, 3,24-dihydroxypier-12-ene-22 ketol (soy alcohol E), dehydrogated soy saponin I*, and apigenin. This indicates that these metabolites may have significant health-promoting value and could be developed further into new foods and medicines.
3.3 Identification of active pharmaceutical ingredients for 11 human diseases
From the CancerHSP and TCMSP databases, the top six diseases (cancer/tumor, diabetes, hypertension, cardiovascular disease, atherosclerosis and thrombotic diseases) and five other related diseases (osteoporosis, liver ischemic injury, inflammation, infectious diseases, and hemorrhage) were selected to represent the pharmacological role of the metabolites in FBSJ extract. The active ingredients in FBSJ against these eleven diseases were identified to provide more information for evaluating its preventive and therapeutic effects. Of the 294 traditional Chinese medicine ingredients identified, 193 metabolites were related to these diseases. Flavonoids were the largest class: the 193 metabolites comprised 90 flavonoids, 37 phenolic acids, 15 organic acids, 12 lipids, 9 coumarins and lignans, 6 terpenoids, 4 alkaloids, 4 amino acids and their derivatives, 3 nucleotides and their derivatives, 3 chromones, 3 vitamins, 2 proanthocyanidins, 1 quinone, 1 ketone, 1 alcohol, and 2 sugars (SI. 3). These metabolites may therefore constitute the material basis of the possible preventive and therapeutic effects of FBSJ on the 11 diseases. The 193 metabolites were associated with 306 target proteins corresponding to 335 diseases, suggesting that FBSJ may have pharmacological potential beyond the 11 selected diseases.
3.4 Differential metabolites in FBSJ from different origins
3.4.1 PCA and HCA cluster analysis
To identify and better understand the differences in FBSJ metabolites among origins, principal component analysis (PCA) and hierarchical clustering analysis (HCA) were performed first. As shown in Fig. 2A, PCA clearly distinguished FBSJ from the four origins, with PC1 and PC2 contributing 49.78% and 14.17% of the variance, respectively. The overall PCA results reflect the differences in metabolites among Shandong, Anhui, Henan and Hebei Provinces. SJsd from Shandong differed most from the other three origins along PC1. Although SJhn from Henan and SJah from Anhui appeared relatively close, the 3D plot in Fig. 2B shows that the two groups separate clearly along PC3. Biological replicates from the same origin clustered well together in the 3D PCA plot. However, unsupervised PCA can neither ignore within-group errors nor eliminate random errors irrelevant to the research purpose,29 so it is not well suited to finding group differences; supervised methods were therefore used in the subsequent analysis. To eliminate the influence of quantity on pattern recognition, the peak area of each metabolite in each sample was log-transformed before HCA was performed. In Fig. 2C, the samples are distinguished by color. SJsd (Shandong) was clearly separated from the others. SJhn (Henan) and SJah (Anhui) clustered together first, then joined SJhb (Hebei), and finally merged with SJsd (Shandong). This indicates differences in metabolic phenotype among the four groups, consistent with the PCA results.
The metabolic profiles of SJah, SJhn, and SJhb differed from that of SJsd. PCA and HCA showed that the Styphnolobium japonicum samples from the four origins formed distinct groups with different metabolic characteristics, and that SJsd differed most from the other three.
Cluster analysis of the differential metabolites of FBSJ from the different origins was used to produce a cluster heat map of the four sample groups (Fig. 2D). The Z-scores of the differential metabolites differed significantly among the geographic origins. Interestingly, in flower buds from Shandong (SJsd), more than 100 ingredients had significantly lower Z-scores than in flower buds from the other three origins, while more than 100 other ingredients had significantly higher Z-scores.
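The log transformation and Z-score standardization mentioned above can be illustrated with a short sketch. It assumes a base-10 logarithm and the sample standard deviation, neither of which is stated in the text, and the peak areas are hypothetical values chosen only to make the arithmetic easy to follow.

```python
import math
import statistics


def log_zscore(peak_areas):
    """Log10-transform one metabolite's peak areas, then z-score across samples.

    Assumes all peak areas are positive; uses the sample standard deviation.
    """
    logged = [math.log10(a) for a in peak_areas]
    mean = statistics.fmean(logged)
    sd = statistics.stdev(logged)
    return [(v - mean) / sd for v in logged]


# Hypothetical peak areas for one metabolite, one sample per origin
# (not taken from the study's data).
areas = {"SJsd": 1.0e4, "SJah": 1.0e6, "SJhn": 1.0e6, "SJhb": 1.0e6}
z = dict(zip(areas, log_zscore(list(areas.values()))))
print(z)
```

Because each metabolite is standardized on its own, a metabolite that is 100 times less abundant in one origin shows up as a clearly negative Z-score in the heat map regardless of its absolute peak area.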
3.4.2 OPLS-DA analysis
OPLS-DA filters out orthogonal variables in the metabolite data that are unrelated to the categorical variable and analyzes the non-orthogonal and orthogonal variables separately, so as to obtain more reliable information about between-group differences in metabolites and their correlation with the experimental groups.30 To identify the specific differential metabolites that separate FBSJ from the four origins in China, comparative OPLS-DA models were established for the four groups (Fig. 2E). R2X and R2Y indicate the fraction of the X and Y matrices explained by the model, respectively, and Q2 indicates its predictive ability, with Q2 > 0.9 denoting an excellent model and Q2 > 0.5 an effective one. In this study, Q2 was 0.727 with a P value of 0.005, meaning that only one randomly grouped model in the permutation test had better predictive ability than the present OPLS-DA model; the model was therefore well constructed, reliable and meaningful. R2X and R2Y were 0.549 and 0.996, respectively, showing that the explanation rate of X is good and that of Y is excellent. The P value of R2Y was less than 0.005, showing that no randomly grouped model in the permutation test explained the Y matrix better than the present OPLS-DA model.
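The permutation-test P values reported here can be read as the fraction of randomly grouped models that perform at least as well as the real one. The sketch below assumes 200 permutations, which is an inference from "one better model" corresponding to P = 0.005 and is not stated in the text; the permuted Q2 values are hypothetical, and some software uses the (b + 1)/(n + 1) estimator instead.

```python
def permutation_p_value(observed, permuted):
    """Fraction of permuted models whose statistic is at least the observed one."""
    n_better = sum(1 for value in permuted if value >= observed)
    return n_better / len(permuted)


Q2_OBSERVED = 0.727  # from the text

# Hypothetical permuted Q2 values: 199 worse models and 1 better model.
permuted_q2 = [0.10] * 199 + [0.80]
p_q2 = permutation_p_value(Q2_OBSERVED, permuted_q2)

# Hypothetical case with no better model at all, as described for R2Y.
p_none = permutation_p_value(0.996, [0.50] * 200)
print(p_q2, p_none)
```

With no better model among 200 permutations the empirical fraction is 0, which is why such a result is reported as P < 0.005 rather than as an exact value.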
To evaluate differences in the relative content of metabolites among the groups, the OPLS-DA score plot was examined (Fig. 2F), on which the four kinds of FBSJ were clearly distinguished. The four sample groups were well separated along the first predictive component (33.8%), indicating significant differences associated with Y among the groups. In the second component, which is not associated with Y (21.2%), SJsd and SJah showed no significant difference from each other, unlike the other origins. This was consistent with the results of PCA and HCA.
3.4.3 Screening of differential metabolites of FBSJ from different origins
Based on the OPLS-DA results, pairwise comparisons of the producing areas were conducted with VIP > 1 and p < 0.05 as criteria, and the numbers of differential metabolites were obtained (Table 3). The number of differential metabolites between SJsd and each of the other origins was much higher than that among the other origins. In total, 708 differential metabolites in 12 classes were screened out across the four groups, including 183 flavonoids (51 flavonoids, 48 isoflavones, 43 flavonols, 10 other flavonoids, etc.), 108 phenolic acids, 88 amino acids and their derivatives, 81 lipids (46 free fatty acids, 11 lysophosphatidylethanolamines, 12 lysophosphatidylcholines, etc.), 45 alkaloids (8 indole alkaloids, 3 piperidine alkaloids, 2 pyridine alkaloids, etc.), 45 organic acids, 36 terpenoids (30 triterpenoid saponins, 6 triterpenoids), 26 nucleotides and their derivatives, 21 lignans and coumarins (11 lignans, 10 coumarins), 6 tannins, 5 quinones and 64 other metabolites. Among these differential metabolites, flavonoids, phenolic acids, amino acids and their derivatives, and lipids accounted for 25.85%, 15.25%, 12.43%, and 11.44%, respectively. The largest number of up-regulated metabolites (143) was found in SJsd vs SJhb, and the largest number of down-regulated metabolites (160) in SJsd vs SJah. The smallest number of up-regulated metabolites (2) was found in SJah vs SJhn, and the smallest number of down-regulated metabolites (3) in SJhn vs SJhb.
Table 3
Count of significantly different metabolites in pairwise comparisons of FBSJ from the four groups
Group name | All differential metabolites (VIP > 1 and p < 0.05) | | | Key active ingredients among the 135 (FC ≥ 2 or FC ≤ 0.5) | |
 | All sig. diff. | Down-regulated | Up-regulated | All sig. diff. | Down-regulated | Up-regulated
SJah vs SJhb | 41 | 13 | 28 | 7 | 2 | 5
SJah vs SJhn | 6 | 4 | 2 | 0 | 0 | 0
SJhn vs SJhb | 25 | 3 | 22 | 3 | 2 | 1
SJsd vs SJah | 260 | 160 | 100 | 35 | 35 | 0
SJsd vs SJhb | 253 | 110 | 143 | 21 | 17 | 4
SJsd vs SJhn | 236 | 147 | 89 | 30 | 30 | 0
SJsd vs SJah vs SJhn vs SJhb | 708 | 0 | 0 | | |
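The pairwise screening behind Table 3 (VIP > 1 and p < 0.05, followed by a count of up- and down-regulated metabolites) can be sketched as below. The function `screen_pair` and the demo records are hypothetical, and treating a fold change above 1 as "up-regulated" is an assumption about the direction convention; only the `TABLE3` counts are copied from the table.

```python
# Counts copied from Table 3: (all significant, down-regulated, up-regulated).
TABLE3 = {
    "SJah vs SJhb": (41, 13, 28),
    "SJah vs SJhn": (6, 4, 2),
    "SJhn vs SJhb": (25, 3, 22),
    "SJsd vs SJah": (260, 160, 100),
    "SJsd vs SJhb": (253, 110, 143),
    "SJsd vs SJhn": (236, 147, 89),
}


def screen_pair(metabolites, vip_min=1.0, p_max=0.05):
    """Keep metabolites with VIP > vip_min and p < p_max, then tally direction.

    Assumption: a fold change above 1 counts as up-regulated, below 1 as down.
    Returns (all significant, down-regulated, up-regulated).
    """
    kept = [m for m in metabolites if m["vip"] > vip_min and m["p"] < p_max]
    up = sum(1 for m in kept if m["fc"] > 1)
    down = sum(1 for m in kept if m["fc"] < 1)
    return len(kept), down, up


# Hypothetical records for one comparison (not the study's data).
demo = [
    {"name": "m1", "vip": 1.8, "p": 0.001, "fc": 3.1},  # kept, up
    {"name": "m2", "vip": 1.2, "p": 0.030, "fc": 0.4},  # kept, down
    {"name": "m3", "vip": 0.7, "p": 0.010, "fc": 2.5},  # fails VIP
    {"name": "m4", "vip": 1.5, "p": 0.200, "fc": 0.3},  # fails p
]
demo_counts = screen_pair(demo)

# Each pairwise row of Table 3 should satisfy: all = down + up.
consistent = all(total == down + up for total, down, up in TABLE3.values())
print(demo_counts, consistent)
```

The consistency check confirms that, for every pairwise comparison in Table 3, the down- and up-regulated counts add up to the total number of significant differential metabolites.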
Of the 708 differential metabolites in FBSJ, 135 belonged to the key active ingredients of TCM. Among these key active ingredients, 102 were anti-disease active ingredients for the 11 diseases, including 73 flavonoids, 8 lipids, 5 lignans and coumarins, 4 phenolic acids, 3 nucleotides and their derivatives, 3 terpenoids, 2 chromones, 1 quinone, 1 vitamin, 1 ketone, and 1 tannin. These metabolites were potential key and signature health-promoting compounds that distinguished FBSJ from different origins in this study. The 11 diseases comprise the top six (cancer/tumor, diabetes, hypertension, cardiovascular diseases, atherosclerosis and thrombotic diseases) and five others (osteoporosis, liver ischemic injury, inflammation, infectious diseases and hemorrhage). To further understand how the content of these potential marker compounds differs among FBSJ from the four origins, the dominant differential metabolites were screened by fold change (FC). Metabolites with FC ≥ 2 or FC ≤ 0.5 in a comparison group were retained, giving a total of 46 dominant metabolites: 35 flavonoids, 4 coumarins and lignans, 3 nucleotides and their derivatives, 2 phenolic acids, 1 terpenoid, and 1 tannin. As shown in Table 3, the comparisons involving SJsd contained more dominant metabolites, with 35, 17, and 30 down-regulated dominant metabolites in SJsd vs SJah, SJsd vs SJhb, and SJsd vs SJhn, respectively.
Moreover, the expression of 13 differential metabolites that could serve as anti-disease active ingredients for the 11 diseases was significantly higher in SJsd than in the other three groups (FC ≥ 2) (Fig. 3). These comprised 12 flavonoids and 1 terpenoid: diosmetin, 3'-methoxydaidzein, ononin, pectolinarigenin, 7,4'-di-O-methyldaidzein, apigenin-7,4'-dimethyl ether, luteolin, maackiain, isoliquiritin, medicarpin, liquiritigenin, 3,5,6,7,8,3',4'-heptamethoxyflavone, and corosolic acid methyl ester. According to their FC values, these metabolites did not differ significantly among SJah, SJhb and SJhn.
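The fold-change criterion used above (FC ≥ 2 or FC ≤ 0.5) is symmetric on a log2 scale, which a small sketch makes explicit. The function name `fc_class` and the example FC values are hypothetical; which group serves as the numerator of the ratio is not stated in the text and is left to the caller.

```python
import math


def fc_class(fc, threshold=2.0):
    """Classify a fold change as 'up', 'down' or 'unchanged'.

    FC >= threshold or FC <= 1/threshold is equivalent to
    |log2(FC)| >= log2(threshold).
    """
    log2_fc = math.log2(fc)
    cutoff = math.log2(threshold)
    if log2_fc >= cutoff:
        return "up"
    if log2_fc <= -cutoff:
        return "down"
    return "unchanged"


# Hypothetical fold changes (not the study's data).
examples = {"m1": 4.0, "m2": 0.5, "m3": 1.3, "m4": 2.0}
labels = {name: fc_class(fc) for name, fc in examples.items()}
print(labels)
```

Working on the log2 scale treats a doubling and a halving as changes of equal size, which is why the two cut-offs 2 and 0.5 are used as a pair.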
3.5 Differential metabolite KEGG metabolic pathway analysis
Pathway enrichment analysis of the 708 differential metabolites in the four groups of FBSJ was conducted with the KEGG database. Overall, 382 differential metabolites were annotated by KEGG, and the number of significantly different metabolites among them was 176; these were distributed across 87 metabolic pathways. The most significantly enriched pathway was isoflavonoid biosynthesis (p < 0.01), followed by neomycin, kanamycin and gentamicin biosynthesis (p < 0.05), and then starch and sucrose metabolism, aminoacyl-tRNA biosynthesis, linoleic acid metabolism, and indole alkaloid biosynthesis (p > 0.05).
Based on the KEGG database, metabolic pathway enrichment analysis was performed for the three comparisons with large numbers of differential metabolites: SJsd vs SJah, SJsd vs SJhb, and SJsd vs SJhn. This helps to explain how the differential metabolites change within metabolic pathways. The Differential Abundance (DA) Score is a pathway-based method for analyzing metabolic change that captures the overall change of all metabolites in a pathway. Unlike the KEGG enrichment bubble plot, the DA Score plot adds line segments: the length of each segment represents the absolute value of the DA Score, and the size of the dot at its end represents the number of differential metabolites in the pathway. When the dot lies to the left of the central axis, the longer the line segment, the more the overall expression of the pathway tends to be down-regulated.
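The DA Score is not defined by a formula in the text. A commonly used definition, assumed here, is the number of up-regulated metabolites minus the number of down-regulated metabolites, divided by the number of metabolites counted in the pathway. The function name `da_score`, the choice of denominator, and the 2/19 split in the example are hypothetical; only the total of 21 isoflavonoid-pathway metabolites comes from the text.

```python
def da_score(n_up, n_down, n_in_pathway):
    """Differential abundance score of one pathway, in the range [-1, 1].

    Assumed definition: (up - down) / metabolites counted in the pathway.
    """
    if n_in_pathway <= 0:
        raise ValueError("pathway must contain at least one metabolite")
    if n_up < 0 or n_down < 0 or n_up + n_down > n_in_pathway:
        raise ValueError("up + down cannot exceed the pathway size")
    return (n_up - n_down) / n_in_pathway


# 21 differential metabolites in isoflavonoid biosynthesis (from the text);
# the split into 2 up and 19 down is a hypothetical illustration.
score = da_score(n_up=2, n_down=19, n_in_pathway=21)
print(round(score, 3))
```

A score of -1 would mean every counted metabolite in the pathway is down-regulated and +1 that every one is up-regulated, which matches the left/right reading of the line segments described above.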
A total of 260 differential metabolites in SJsd vs SJah were annotated to 51 metabolic pathways, and the pathway with a high number of enriched metabolites and a significant difference (p < 0.05) was isoflavonoid biosynthesis (Fig. 4A). Notably, 21 metabolites were involved in this pathway: 7,4'-dihydroxyflavone, isoformononetin, 6''-O-malonyldaidzin, pseudobaptigenin, daidzein, 6''-O-malonylglycitin, formononetin-7-O-glucoside (ononin), formononetin-7-O-(6''-malonyl)glucoside, glycitein, genistein, daidzein-7-O-glucoside (daidzin), 2'-hydroxygenistein, calycosin, apigenin (4',5,7-trihydroxyflavone), biochanin A, liquiritigenin, maackiain, prunetin (5,4'-dihydroxy-7-methoxyisoflavone), formononetin (7-hydroxy-4'-methoxyisoflavone), 3,9-dihydroxypterocarpan, and medicarpin. The overall expression of this pathway in SJah showed a greater down-regulation trend than in SJsd.
For SJsd vs SJhb, 253 differential metabolites were annotated in the KEGG database and assigned to 57 metabolic pathways (Fig. 4B). The pathways with a high number of enriched metabolites and significant differences (p < 0.05) were pyrimidine metabolism, isoflavonoid biosynthesis, and purine metabolism. The overall expression of isoflavonoid biosynthesis in SJah showed a greater down-regulation trend than in SJsd. The differential metabolites in isoflavonoid biosynthesis were flavonoids, including 7,4'-dihydroxyflavone, medicarpin, formononetin-7-O-glucoside (ononin), 3,9-dihydroxypterocarpan, daidzein, 6''-O-malonylglycitin, maackiain, and liquiritigenin. In addition, the overall expression of pyrimidine metabolism, purine metabolism, and nucleotide metabolism in SJah tended to be up-regulated compared with SJsd. These three pathways are all related to nucleotide metabolism. The differential metabolites involved in pyrimidine metabolism were organic acids and nucleotides and their derivatives. The organic acids were malonic acid and 3-hydroxypropanoic acid; the nucleotides and derivatives were 2'-deoxycytidine, uridine 5'-monophosphate, barbituric acid (malonylurea; 2,4,6-pyrimidinetrione), and cytidine 5'-monophosphate (cytidylic acid). The differential metabolites involved in purine metabolism were nucleotides and derivatives, including guanosine 5'-monophosphate, cyclic 3',5'-adenylic acid, 2'-deoxyadenosine, 2'-deoxyinosine, guanosine, and guanosine 3',5'-cyclic monophosphate. The differential metabolites involved in nucleotide metabolism were likewise guanosine 5'-monophosphate, cyclic 3',5'-adenylic acid, 2'-deoxyadenosine, 2'-deoxyinosine, guanosine, and guanosine 3',5'-cyclic monophosphate.
For SJsd vs SJhn, a total of 236 differential metabolites were assigned to 54 metabolic pathways (Fig. 4C). Among these, isoflavonoid biosynthesis was the pathway with a high number of enriched metabolites and a significant difference (p < 0.05). The 19 differential metabolites in this pathway were 7,4'-dihydroxyflavone, genistein, formononetin-7-O-(6''-malonyl)glucoside, 3,9-dihydroxypterocarpan, formononetin (7-hydroxy-4'-methoxyisoflavone), prunetin (5,4'-dihydroxy-7-methoxyisoflavone), formononetin-7-O-glucoside (ononin), maackiain, apigenin (4',5,7-trihydroxyflavone), medicarpin, daidzein-7-O-glucoside (daidzin), 2'-hydroxygenistein, calycosin, 6''-O-malonylglycitin, pseudobaptigenin, daidzein, liquiritigenin, biochanin A, and 6''-O-malonyldaidzin. The overall expression of this pathway in SJhn showed a greater down-regulation trend than in SJsd. These KEGG annotation results suggest that the temperate continental monsoon climate of the SJsd origin might favor the accumulation of isoflavones and fatty acids, whereas the origins of SJhn and SJhb might favor the accumulation of ether lipids. Differences in temperature, precipitation, and soil among the origins of the Sophora japonica samples might underlie these differences in metabolic pathways and metabolic mechanisms.
The differential metabolites annotated as active pharmaceutical ingredients could provide comprehensive information on the health-promoting functions of FBSJ for humans. However, because many factors affect the growth of FBSJ in different origins, comparative analysis of samples from different seasons and harvesting processes is still needed to provide more comprehensive information for the planting, product development and processing of FBSJ.