Overview of DCABM-TCM
DCABM-TCM is freely accessible at http://bionet.ncpsb.org.cn/dcabm-tcm/. Its core data are the constituents absorbed into blood and the metabolites of TCM prescriptions and herbs, together with the corresponding detailed detection conditions. Around the core data, the annotation data include the physicochemical properties and ADMET properties of blood constituents and the associated targets, GO functional terms, pathways, and diseases. In total, DCABM-TCM therefore contains six kinds of entities: prescriptions, herbs, blood constituents, targets, pathways, and diseases (Figure 2). For every entity of these six types, DCABM-TCM provides a detailed annotation page presenting the related core and annotation information, and supports data browsing and search. DCABM-TCM also supports two data analysis functions: 1) network pharmacology analysis for a prescription/herb/blood constituent to reveal its potential molecular mechanism, and 2) screening of candidate drugs (including blood constituents, herbs, and prescriptions) potentially targeting a target/pathway/disease to help TCM-derived drug discovery. The two analysis functions and their results are presented on the detailed annotation page of the corresponding type of entity. In addition, DCABM-TCM supports data download and submission.
Data statistics and analyses
Currently, DCABM-TCM records 4206 constituents detected in blood (of which 1306 can be mapped to PubChem CIDs) for 192 prescriptions and 194 herbs, including 1487 prototypes (703 mapped to PubChem CIDs) and 1783 metabolites (184 mapped to PubChem CIDs) (Table 1; Additional files 1~3, which can also be downloaded from the “Download” page of DCABM-TCM). In our data, only 1/4 of the source publications state which of the detected blood constituents of a prescription or herb are prototypes and which are metabolites. DCABM-TCM involves 7585 prescription/herb-blood constituent associations. In addition, 3838 target genes, and further 333 KEGG pathways and 3987 CTD diseases, are associated with the constituents detected in blood (Table 1).
Table 1 Statistics of DCABM-TCM data
Data type | Number
Prescriptions | 192
Herbs | 194
Constituents detected in blood (All) (including prototypes and metabolites) | 4206
Constituents detected in blood with PubChem CIDs (including prototypes and metabolites) | 1306
Prototypes (All) | 1487
Prototypes with PubChem CIDs | 703
Metabolites (All) | 1783
Metabolites with PubChem CIDs | 184
Target genes (including the known and predicted ones with scores >= 10) | 3838
Constituent detected in blood - target gene associations | 47374
KEGG pathways involving the target genes | 333
Target gene - KEGG pathway associations | 17136
CTD diseases associated with the target genes | 3987
Target gene - CTD disease associations | 72794
In our data, the median (/average) number of constituents detected in blood is 13 (/22.5) for a prescription and 8 (/16.8) for a herb (Figure 3A and 3B). The median (/average) number of blood constituents with mapped PubChem CIDs is 9 (/11.7) for a prescription and 5 (/7.7) for a herb. We observed that the number of blood constituents detected for a prescription was often smaller than the sum of those detected for its component herbs. Most blood constituents detected for a prescription were also blood constituents of its component herbs, although occasionally a few new blood constituents appeared, which might be produced by interactions between the component herbs, for example during the decocting process of the prescription. These data on constituents detected in blood were mined from 443 papers. For the majority of prescriptions and herbs, the blood constituents were studied in only one paper, while for some they were studied in multiple papers. On average, the blood constituents of a prescription were studied in 1.5 papers and those of a herb in 2.8 papers (Figure 3C and 3D).
The recorded detection conditions mainly included the extraction method, the experimental animal and animal model, the administration method and dose, and the blood collection time and location. In our data, the extraction methods included water extraction, ethanol extraction, methanol extraction etc. Experimental animals included rats, mice, and rabbits. For the animal model, the majority of studies used normal animals and a minority used disease models, such as models of rheumatoid arthritis, acute heart failure, and cerebral ischemia-reperfusion injury. For the administration method, the vast majority of studies adopted intragastric administration (i.e., ig); the other methods included intravenous injection (i.e., iv), intraperitoneal injection (i.e., ip), intravenous drip, intraduodenal administration, intestinal circulatory perfusion etc. The administration dose was typically recorded in one of three forms: 1) “XX g/kg”, representing a single administration; 2) “XX g/kg, XX times”, representing multiple dosing in a short period of time; 3) “XX g/kg, XX times/day, XX days”, representing consecutive administration for XX days. Blood was generally collected 0.5~3 hours after the last administration, at either a single time point or multiple time points. Finally, blood collection locations mainly included the postorbital venous plexus, the abdominal aorta, the hepatic portal vein, the eyeball, and the fosse orbital vein etc.
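The three dose forms above are regular enough to be parsed mechanically. The following is a minimal sketch, not the database's own code: the `Dose` record, the `parse_dose` function, and the regimen labels are hypothetical names introduced here, and the pattern assumes that doses are written exactly in the "g/kg" notation quoted above.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Dose:
    """Hypothetical record for one administration dose string."""
    amount_g_per_kg: float
    times: Optional[int] = None          # form 2: number of doses in a short period
    times_per_day: Optional[int] = None  # form 3: doses per day
    days: Optional[int] = None           # form 3: number of consecutive days
    regimen: str = "single"              # "single", "multiple" or "consecutive"


# Assumed notation: "<amount> g/kg[, <n> times[/day, <d> days]]"
_DOSE_RE = re.compile(
    r"^\s*(?P<amount>\d+(?:\.\d+)?)\s*g/kg"
    r"(?:\s*,\s*(?P<times>\d+)\s*times"
    r"(?:/day\s*,\s*(?P<days>\d+)\s*days)?)?\s*$"
)


def parse_dose(text: str) -> Dose:
    """Classify a dose string into one of the three recorded forms."""
    match = _DOSE_RE.match(text)
    if match is None:
        raise ValueError(f"unrecognized dose format: {text!r}")
    amount = float(match.group("amount"))
    times = match.group("times")
    days = match.group("days")
    if times is None:
        return Dose(amount)  # form 1: single administration
    if days is None:
        return Dose(amount, times=int(times), regimen="multiple")  # form 2
    return Dose(amount, times_per_day=int(times), days=int(days),
                regimen="consecutive")  # form 3


if __name__ == "__main__":
    for example in ["10 g/kg", "2.5 g/kg, 3 times", "2 g/kg, 2 times/day, 7 days"]:
        print(example, "->", parse_dose(example))
```

Strings that fit none of the three forms raise a `ValueError` rather than being guessed at, which seems the safer choice for manually curated data.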
Finally, we examined the physicochemical property distributions of the 701 absorbed prototype constituents with structures that were detected after oral administration (i.e., intragastric administration) of the prescriptions and herbs in our data (Figure 4). We observed that most constituents absorbed after oral administration satisfied the traditional rule for drug-like molecule screening (molecular weight <= 500, logP <= 5, hydrogen bond donor count <= 5, hydrogen bond acceptor count <= 10, rotatable bond count <= 10, TPSA <= 140 [34]). However, many constituents did not satisfy the rule, which suggests that the traditional rule is imperfect for estimating the absorption and permeation of molecules.
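The drug-likeness rule quoted above is a set of six upper bounds, so checking a compound against it is a simple comparison. The sketch below is illustrative only: the `Properties` record, the function names, and the two example compounds (with made-up property values) are assumptions introduced here, while the thresholds are those listed in the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Properties:
    """Hypothetical container for the six properties named in the rule."""
    molecular_weight: float
    logp: float
    hbd: int              # hydrogen bond donor count
    hba: int              # hydrogen bond acceptor count
    rotatable_bonds: int
    tpsa: float           # topological polar surface area


# Upper bounds taken from the rule quoted in the text.
LIMITS = {
    "molecular_weight": 500,
    "logp": 5,
    "hbd": 5,
    "hba": 10,
    "rotatable_bonds": 10,
    "tpsa": 140,
}


def violations(props: Properties) -> List[str]:
    """Return the names of all properties that exceed their upper bound."""
    return [name for name, limit in LIMITS.items() if getattr(props, name) > limit]


def is_drug_like(props: Properties) -> bool:
    """A compound satisfies the rule only if no bound is exceeded."""
    return not violations(props)


if __name__ == "__main__":
    # Illustrative values only; they do not describe real database entries.
    small_molecule = Properties(300.0, 1.5, 3, 6, 2, 100.0)
    large_glycoside = Properties(780.0, -1.2, 9, 16, 12, 250.0)
    print(is_drug_like(small_molecule), violations(small_molecule))
    print(is_drug_like(large_glycoside), violations(large_glycoside))
```

Reporting the violated properties, rather than only a pass/fail flag, makes it easier to see which bounds the absorbed constituents that fail the rule tend to exceed.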
Usage of DCABM-TCM
DCABM-TCM supports data browsing, search, download, and submission. In addition, it supports two data analysis functions, as described in the next section.
For each of the six kinds of annotation pages (prescriptions, herbs, constituents detected in blood, targets, pathways, and diseases), DCABM-TCM supports browsing and search. For blood constituents specifically, only those with mapped PubChem CIDs have their own annotation pages and can be browsed and searched. All blood constituents can be viewed on the corresponding detailed annotation pages of prescriptions/herbs, as described in the next section.
All six kinds of entities are searchable (Figure 5A). Users can search for a prescription or herb by English name or Pinyin name; for a target, by Entrez Gene ID, Gene symbol, or Gene full name; for a disease, by Disease name or CTD disease ID; and for a pathway, by KEGG pathway name or ID. For a constituent detected in blood, DCABM-TCM supports search by 1) Compound name or PubChem CID; 2) Structural similarity; 3) Physicochemical property range; or 4) Compound classification. When searching by structural similarity, users input a compound in InChI format or draw its structure with the help of JSDraw [35], and set the structural similarity cutoff (Figure 5B); the structurally similar blood constituents with similarity scores >= cutoff are then returned in order of decreasing similarity score. When searching by property range, users can specify the ranges of the physicochemical properties of the returned blood constituents, including the molecular weight, logP, TPSA, hydrogen-bond donor count, hydrogen-bond acceptor count, and rotatable bond count (Figure 5C). All blood constituents with structures in DCABM-TCM are divided into 12 superclasses, which are shown as a histogram on the “Compound classification” Search page of the blood constituent. Selecting a bar returns the blood constituents belonging to that superclass (Figure 5D).
On the submission page, users can submit to DCABM-TCM the constituents detected in blood of a prescription or herb, derived from their own research or from other publications, by filling in the required and optional fields. Periodically, after manual verification, we will integrate the submissions into DCABM-TCM.
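The structural similarity search described above amounts to scoring every blood constituent against the query, discarding scores below the cutoff, and sorting the rest in decreasing order. The sketch below illustrates that flow under stated assumptions: this section does not say which fingerprint or similarity measure DCABM-TCM uses, so the Tanimoto coefficient over sets of fingerprint bit indices is an assumption, and the function names and toy fingerprints are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity_search(
    query: FrozenSet[int],
    library: Dict[str, FrozenSet[int]],
    cutoff: float,
) -> List[Tuple[str, float]]:
    """Return (name, score) pairs with score >= cutoff, highest score first."""
    scored = ((name, tanimoto(query, bits)) for name, bits in library.items())
    hits = [(name, score) for name, score in scored if score >= cutoff]
    # Sort by decreasing score; ties are broken by name for a stable order.
    hits.sort(key=lambda hit: (-hit[1], hit[0]))
    return hits


if __name__ == "__main__":
    # Toy fingerprints; real ones would be derived from the compound structures.
    library = {
        "constituent_A": frozenset({1, 2, 3, 4}),
        "constituent_B": frozenset({1, 2, 3, 5}),
        "constituent_C": frozenset({7, 8}),
    }
    print(similarity_search(frozenset({1, 2, 3, 4}), library, cutoff=0.5))
```

In practice the fingerprints would be computed from the InChI or drawn structure with a cheminformatics toolkit; that step is omitted here.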
Annotation page
For every entity of the six types, including prescriptions, herbs, blood constituents, targets, pathways, and diseases, DCABM-TCM provides a detailed annotation page presenting the related core and annotation information as well as the corresponding interactive analysis functions and analysis results.
For a prescription/herb, the detailed annotation page first gives its basic information, including the Chinese/English/Pinyin/Latin names and cross-references to TCMID [7] and CMAUP [17] (Figure 6A). Then, as a reference, its ordinary constituents generally detected in vitro, integrated from TCMID [7], are given; for a prescription, the ordinary constituents of each of its component herbs and their sum are given (Figure 6B). Next, the constituents detected in blood are provided (Figure 6C), together with the prototypes and metabolites among them (if stated by the source publication), the corresponding detailed detection conditions, and the source publications. Because the blood constituents detected in different studies may differ as a result of different detection conditions, the blood constituent-related information from each source publication is given separately and can be browsed by switching between the labels “Source1”, “Source2”, …. The label “Sum” shows the sum of the blood constituents from the different sources. The next section lists the other prescriptions and herbs in DCABM-TCM that share blood constituents with the prescription/herb of interest, together with the shared blood constituents (Figure 6D). Finally, at the end of the annotation page, the network pharmacology analysis function and its results are presented (Figure 6E). This analysis is implemented based on BATMAN-TCM, a bioinformatics analysis tool for the molecular mechanism of TCM that we developed previously [26]. In this function, the known and predicted targets of the blood constituents of the prescription/herb are analyzed, followed by the GO functional terms, KEGG biological pathways, and CTD/OMIM diseases enriched among the targets, and the blood constituent-target-pathway/disease association network is visualized.
Here, the target prediction score cutoff and the P-value cutoff of the enrichment analysis after multiple testing correction can be changed interactively, after which the results are re-analyzed. All analysis results, as well as the network graph, can be downloaded. This function aims to reveal the potential molecular mechanism of the prescription/herb based on its blood constituents. We believe that, compared with analyses based on all constituents of a prescription/herb, network pharmacology analyses based on its blood constituents are more reliable for revealing its potential molecular mechanism.
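The “Sum” label and the list of prescriptions/herbs sharing blood constituents can both be thought of as set operations. The sketch below is a hedged illustration: it assumes that “Sum” means the de-duplicated union across sources, and the function names and toy constituent names are hypothetical rather than taken from DCABM-TCM.

```python
from typing import Dict, Set


def sum_of_sources(sources: Dict[str, Set[str]]) -> Set[str]:
    """Merge the blood constituents reported by each source publication.

    Assumption: "Sum" is the union, so a constituent reported by several
    sources is counted once.
    """
    merged: Set[str] = set()
    for constituents in sources.values():
        merged |= constituents
    return merged


def shared_constituents(
    query: Set[str], others: Dict[str, Set[str]]
) -> Dict[str, Set[str]]:
    """Map each other prescription/herb to the constituents it shares with the query."""
    return {
        name: query & constituents
        for name, constituents in others.items()
        if query & constituents
    }


if __name__ == "__main__":
    # Toy data; names are placeholders, not database entries.
    sources = {
        "Source1": {"compound_1", "compound_2"},
        "Source2": {"compound_2", "compound_3"},
    }
    total = sum_of_sources(sources)
    print(sorted(total))
    others = {"herb_X": {"compound_3", "compound_9"}, "herb_Y": {"compound_8"}}
    print(shared_constituents(total, others))
```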
For a constituent detected in blood, its annotation page gives basic information (including name, molecular formula, PubChem CID, CAS number, structure, cross-references), physicochemical properties, compound structural classification, ADMET properties, the lists of prescriptions and herbs the blood constituent belongs to, and the network pharmacology analysis results for the blood constituent.
For a target/pathway/disease, the annotation page is divided into only two sections (Figure 7). The first gives the basic information of the entry, including the name, ID, cross-references, the GO functional terms a target entry belongs to, the pathways a target entry participates in, the diseases a target entry is associated with, the member genes of a pathway entry, the genes related to a disease entry etc. (Figure 7A). The second is the analysis function for candidate prescriptions, herbs, and blood constituents targeting the target/pathway/disease (Figure 7B). The candidate blood constituents of a target are given based on the known and predicted drug-target associations provided by BATMAN-TCM. The candidate blood constituents potentially targeting a pathway/disease are the blood constituents significantly enriched among the pathway member genes/the disease-related genes. Further, the candidate prescriptions/herbs potentially targeting a target/pathway/disease are those significantly enriched among its candidate blood constituents (see Materials and Methods). Here users can specify the target prediction score cutoff and the P-value cutoff of the enrichment analyses after multiple testing correction. This function aims to support target/pathway/disease-based screening of candidate blood constituents, herbs, and prescriptions.
In addition, for user-friendliness, each annotation page provides, in appropriate positions, hyperlinks to external databases such as TCMID [7], CMAUP [17], DrugBank [24], ChEBI [36], KEGG [28], DGIdb [37], OMIM [29], Ensembl [38], and CTD [30], to internal pages, and to the Help document.
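The enrichment-based screening described above can be sketched in a few lines. The actual statistical procedure is specified in Materials and Methods, not in this section, so the sketch rests on assumptions: a one-sided hypergeometric test for the overlap between a constituent's targets and the pathway member genes, and Benjamini-Hochberg correction for multiple testing. All function names, gene labels, and the background size are hypothetical.

```python
from math import comb
from typing import Dict, List, Set


def hypergeom_sf(k: int, population: int, successes: int, draws: int) -> float:
    """P(X >= k) when `draws` genes are sampled from `population` genes,
    of which `successes` belong to the pathway."""
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return tail / total


def benjamini_hochberg(pvalues: Dict[str, float]) -> Dict[str, float]:
    """Benjamini-Hochberg adjusted P-values (assumed correction method)."""
    ranked = sorted(pvalues.items(), key=lambda item: item[1])
    m = len(ranked)
    adjusted: Dict[str, float] = {}
    running_min = 1.0
    # Walk from the largest P-value down to enforce monotonicity.
    for rank in range(m, 0, -1):
        name, p = ranked[rank - 1]
        running_min = min(running_min, p * m / rank)
        adjusted[name] = running_min
    return adjusted


def screen_constituents(
    pathway_genes: Set[str],
    constituent_targets: Dict[str, Set[str]],
    background_size: int,
    alpha: float = 0.05,
) -> List[str]:
    """Return constituents whose targets are enriched among the pathway genes."""
    pvalues = {
        name: hypergeom_sf(
            len(targets & pathway_genes),
            background_size,
            len(pathway_genes),
            len(targets),
        )
        for name, targets in constituent_targets.items()
    }
    adjusted = benjamini_hochberg(pvalues)
    return sorted(name for name, q in adjusted.items() if q <= alpha)


if __name__ == "__main__":
    pathway = {f"g{i}" for i in range(10)}
    targets = {
        "constituent_X": {f"g{i}" for i in range(8)} | {"h1", "h2"},
        "constituent_Y": {f"h{i}" for i in range(10)},
    }
    print(screen_constituents(pathway, targets, background_size=1000))
```

The same pattern would apply one level up, testing whether a prescription's or herb's blood constituents are over-represented among the candidate blood constituents, although the exact definition used by the database should be taken from Materials and Methods.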