Microarray data
We obtained the gene expression profiling dataset GSE66525 from the Gene Expression Omnibus (GEO) database at the National Center for Biotechnology Information. This AML-associated dataset, submitted by Wieser R and based on the GPL11532 platform, includes 11 primary AML samples and 11 recurrent AML samples.
Identification of differentially expressed genes (DEGs)
The limma package is a core component of Bioconductor, an R-based open-source software development project in statistical genomics[27]. It is designed so that, after initial pre-processing and normalization, the same analysis pipeline can be used for data from all technologies. We applied limma to the GEO data to identify the DEGs. Heat maps and a volcano plot were constructed in R, and the regions in which the DEGs were mainly concentrated were highlighted.
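The way a volcano plot partitions genes can be illustrated with a minimal sketch. It assumes each gene already has a log2 fold change and an adjusted P-value (limma reports these as logFC and adj.P.Val). The cutoffs below are hypothetical placeholders chosen for illustration, not the thresholds used in this study, and Python is used here only for readability.

```python
# Hypothetical cutoffs for illustration only; the study's thresholds are not stated here.
LFC_CUTOFF = 1.0   # absolute log2 fold change
P_CUTOFF = 0.05    # adjusted P-value


def classify_gene(log_fc, adj_p, lfc_cutoff=LFC_CUTOFF, p_cutoff=P_CUTOFF):
    """Label a gene as 'up', 'down', or 'not significant' for a volcano plot."""
    if adj_p < p_cutoff and log_fc >= lfc_cutoff:
        return "up"
    if adj_p < p_cutoff and log_fc <= -lfc_cutoff:
        return "down"
    return "not significant"


# Made-up rows: (gene label, log2 fold change, adjusted P-value)
rows = [
    ("gene_1", 2.3, 0.001),
    ("gene_2", -1.7, 0.020),
    ("gene_3", 0.4, 0.001),
    ("gene_4", 3.0, 0.300),
]
labels = {gene: classify_gene(lfc, p) for gene, lfc, p in rows}
print(labels)
```

Genes labelled "up" or "down" correspond to the highlighted regions of the plot; all other genes fall in the central, non-significant region.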
Identification of biologically active ingredients in jiedu huayu recipe
All components of the eight Chinese medicinal herbs in jiedu huayu recipe (Indigo Naturalis, Pseudobulbus Cremastrae Seu Pleiones, Paris polyphylla, Polygonum cuspidatum, Curcuma zedoaria, Ligusticum chuanxiong Hort, Salvia miltiorrhiza and Psoraleae Fructus) were retrieved from the Traditional Chinese Medicine Systems Pharmacology (TCMSP) database (http://tcmspw.com/)[28]. In drug absorption, distribution, metabolism, and excretion (ADME) processes, oral bioavailability (OB) is one of the most significant pharmacokinetic parameters[29]. As a qualitative concept applied in drug design to estimate the druggability of a molecule[30], the drug-likeness (DL) index is useful for rapid screening of active substances. The active components were then filtered by requiring both OB ≥30% and DL index ≥0.18, as suggested by the TCMSP database.
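The screening rule (OB ≥30% and DL ≥0.18) amounts to a simple conjunctive filter. The sketch below assumes the compound records have already been exported as (name, OB, DL) tuples. The `Compound` type, the function name, and the example values are hypothetical and are not TCMSP records.

```python
from typing import NamedTuple


class Compound(NamedTuple):
    name: str
    ob: float  # oral bioavailability, in percent
    dl: float  # drug-likeness index


OB_MIN = 30.0  # OB >= 30 %
DL_MIN = 0.18  # DL >= 0.18


def filter_active(compounds):
    """Keep compounds that satisfy both the OB and the DL criterion."""
    return [c for c in compounds if c.ob >= OB_MIN and c.dl >= DL_MIN]


# Made-up records for illustration.
example = [
    Compound("compound_A", 45.2, 0.25),  # passes both criteria
    Compound("compound_B", 28.0, 0.40),  # fails OB
    Compound("compound_C", 30.0, 0.18),  # passes (thresholds are inclusive)
    Compound("compound_D", 60.1, 0.05),  # fails DL
]
active = filter_active(example)
print([c.name for c in active])  # ['compound_A', 'compound_C']
```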
Prediction of drug targets for biologically active ingredients in jiedu huayu recipe
The protein targets of the active substances in jiedu huayu recipe were retrieved from the TCMSP database and the Traditional Chinese Medicine Integrated Database (TCMID, http://www.megabionet.org/tcmid/).
Screening of overlapping genes between DEGs and drug targets of biologically active ingredients
The target genes of the active ingredients and the DEGs from the GEO dataset were compared using Perl. Genes present in both sets were defined as common differential (overlapping) genes, and their expression levels in the GEO dataset were stored for subsequent analysis.
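The overlap step is a set intersection that keeps the expression values of the shared genes. The study used Perl; the sketch below uses Python purely for illustration. The function name, the input layout (a dictionary from gene symbol to expression values), and the gene labels are assumptions.

```python
def overlapping_genes(deg_expression, target_genes):
    """Return the expression rows of DEGs that are also drug targets.

    deg_expression: dict mapping gene symbol -> list of expression values
    target_genes:   iterable of gene symbols predicted as drug targets
    Symbols are compared case-insensitively after trimming whitespace.
    """
    targets = {g.strip().upper() for g in target_genes}
    return {
        gene: values
        for gene, values in deg_expression.items()
        if gene.strip().upper() in targets
    }


# Made-up inputs for illustration.
degs = {
    "GENE_A": [7.1, 6.8, 9.2],
    "GENE_B": [5.0, 5.3, 4.9],
    "GENE_C": [8.8, 8.1, 6.0],
}
targets = ["gene_a", "GENE_C ", "GENE_X"]

common = overlapping_genes(degs, targets)
print(sorted(common))  # ['GENE_A', 'GENE_C']
```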
Functional enrichment analyses for overlapping genes
The Kyoto Encyclopedia of Genes and Genomes (KEGG) is a database resource for understanding high-level functions and utilities of biological systems from molecular-level information, and the Gene Ontology (GO) provides annotations that can be used for enrichment analysis. We used DAVID (https://david.ncifcrf.gov/) to perform KEGG pathway and GO enrichment analyses of the overlapping genes.
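The idea behind over-representation analysis can be shown with the hypergeometric tail probability: given N background genes of which K are annotated to a term, how likely is it to see at least k annotated genes in a list of n? The sketch below is a plain hypergeometric test, not DAVID's own calculation. DAVID reports a modified, more conservative Fisher's exact P-value (the EASE score), so its values will differ. The function name and the numbers are illustrative assumptions.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """P(X >= k) for X ~ Hypergeometric(N, K, n).

    k: genes in the list annotated to the term
    n: size of the gene list
    K: background genes annotated to the term
    N: size of the background
    """
    if not (0 <= k <= n <= N and 0 <= K <= N):
        raise ValueError("require 0 <= k <= n <= N and 0 <= K <= N")
    upper = min(n, K)  # cannot observe more hits than the list or the term holds
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


# Toy numbers: 10 background genes, 4 annotated, list of 3, 2 hits.
p = hypergeom_enrichment_p(k=2, n=3, K=4, N=10)
print(round(p, 4))  # 0.3333
```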
Protein–protein interaction (PPI) network construction
PPI analysis is used to identify core genes and gene modules related to carcinogenesis. In this study, PPI network analysis of the overlapping genes was performed using the Search Tool for the Retrieval of Interacting Genes (STRING) database.
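One common way to rank candidate core genes in a PPI network is by node degree, that is, the number of distinct interaction partners. The text does not state which criterion was used here, so the sketch below is only an assumption about how such a ranking could be done on an exported edge list. The gene labels are made up.

```python
from collections import Counter


def node_degrees(edges):
    """Count distinct interaction partners per node in an undirected network.

    Self-loops and duplicate edges (in either orientation) are ignored.
    """
    degree = Counter()
    seen = set()
    for a, b in edges:
        if a == b:
            continue  # skip self-interactions
        key = frozenset((a, b))
        if key in seen:
            continue  # skip repeated edges such as (b, a)
        seen.add(key)
        degree[a] += 1
        degree[b] += 1
    return degree


# Made-up edge list for illustration.
edges = [
    ("G1", "G2"),
    ("G1", "G3"),
    ("G1", "G4"),
    ("G2", "G3"),
    ("G3", "G2"),  # duplicate of the previous edge
    ("G4", "G4"),  # self-loop
]
degrees = node_degrees(edges)
print(degrees.most_common(1))  # [('G1', 3)]
```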
Prognostic analysis and gene correlation analysis in GEPIA
Gene Expression Profiling Interactive Analysis (GEPIA) is a web-based tool that delivers fast and customizable functionalities based on TCGA and GTEx data, including differential expression analysis, profiling plotting, correlation analysis, patient survival analysis, similar gene detection and dimensionality reduction analysis[31]. We used GEPIA to analyze the prognostic value of the 17 overlapping genes. P-values <0.05 were considered statistically significant. We then performed gene correlation analysis between the prognosis-related genes and the 17 overlapping genes. The correlation of gene expression was evaluated using Spearman's correlation coefficient and its statistical significance, with R>0.5 taken as the threshold for correlation strength.
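Spearman's coefficient is the Pearson correlation of the ranks, with tied values sharing their average rank. GEPIA computes this internally; the pure-Python sketch below only illustrates the calculation and the R>0.5 cutoff. The function names and the expression values are illustrative assumptions.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation coefficient of two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / (vx * vy) ** 0.5


# Made-up expression values for two genes across five samples.
gene_1 = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_2 = [5.0, 6.0, 7.0, 8.0, 7.0]
r = spearman(gene_1, gene_2)
print(round(r, 3), r > 0.5)  # 0.821 True
```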
Molecular docking
Molecular docking is a key tool in structural molecular biology and computer-assisted drug design[32]. We used molecular docking to predict the binding of the drug's active ingredients to the differential gene targets. First, we prepared the 3D structures of the targets and compounds, which were retrieved from the Protein Data Bank (PDB) (https://www.rcsb.org/) and PubChem (https://pubchem.ncbi.nlm.nih.gov/) databases, respectively[33, 34]. Then, the AutoDock tool was used to perform molecular docking[35]. Finally, PyMOL 2.3.2 was used for visualization to inspect the binding of the ligands at the receptor binding sites[36].