QSAR study
Quantitative structure-activity relationship (QSAR) analysis is a classical technique in computer-aided drug design (CADD) and an active area of international research. It uses molecular descriptors and algorithms to establish quantitative relationships between the structures of compounds and their biological activities or physicochemical properties. In this paper, a 2D-QSAR study was performed using a support vector regression (SVR) model with a radial basis function kernel and a purpose-designed artificial neural network (ANN) model, and a 3D-QSAR study was performed using models based on comparative molecular field analysis (CoMFA) and comparative molecular similarity index analysis (CoMSIA). The structures of the 64 sophoridine derivatives and their activity data against HepG-2 cells were collected from Yiming Xu's articles[20, 21] and PhD thesis[22].
2D-QSAR study
The two-dimensional quantitative structure-activity relationship (2D-QSAR) method uses structural properties of the whole molecule as parameters, without regard to its three-dimensional structure, and performs regression analysis against a biological endpoint (activity, toxicity, pharmacokinetic properties, etc.) to quantify the correlation between chemical structure and that endpoint. 2D-QSAR models are built from 2D descriptors but can also characterize molecules in 3D to some extent[23].
To determine the effect of structure on the anticancer activities of the compounds, a 2D-QSAR study was performed using an SVR model with a radial basis function kernel and a designed ANN model. First, the 64 collected compounds were used as the training set, and their anticancer activity data (IC50 values) were converted to pIC50 (i.e., -logIC50) values for use as the dependent variable. Optimization was then performed, and a total of 300 descriptors were calculated for the 64 training-set compounds using MOE 2008.10 software, followed by dimensionality reduction using the voting method[24]. Finally, the 16 descriptors with the highest correlation (Table 4) were selected as independent variables. The SMILES notation of the training-set compounds and the corresponding descriptor values are given in Supplementary Table S2.
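As a minimal sketch of the pIC50 conversion described above, the following Python snippet assumes that IC50 values are reported in micromolar and are converted to mol/L before the negative base-10 logarithm is taken. The unit convention and the function name `pic50_from_ic50_um` are illustrative assumptions; the text states only that pIC50 = -logIC50.

```python
import math


def pic50_from_ic50_um(ic50_um: float) -> float:
    """Return pIC50 = -log10(IC50 in mol/L) for an IC50 given in micromolar.

    Assumption: the micromolar-to-molar conversion is not spelled out in the
    text and is used here only as a common convention.
    """
    if ic50_um <= 0:
        raise ValueError("IC50 must be a positive concentration")
    ic50_molar = ic50_um * 1e-6  # micromolar -> mol/L
    return -math.log10(ic50_molar)


# Hypothetical IC50 values (micromolar), for illustration only.
for ic50 in (1.0, 10.0, 50.0):
    print(f"IC50 = {ic50:5.1f} uM -> pIC50 = {pic50_from_ic50_um(ic50):.2f}")
```

Because the transform is logarithmic, a tenfold decrease in IC50 raises pIC50 by one unit, so more potent compounds receive larger values of the dependent variable.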
Table 4 QSAR descriptors used in the study
QSAR descriptor | Meaning of the descriptor a
weinerPol | Wiener polarity number (2D)
logP(o/w) | Log octanol/water partition coefficient (2D)
vsurf_G | Surface globularity (i3D)
SlogP | Log octanol/water partition coefficient (2D)
a_hyd | Number of hydrophobic atoms (2D)
E_nb | Nonbonded energy (i3D)
pmi | Principal moment of inertia (i3D)
E_vdw | van der Waals energy (i3D)
zagreb | Zagreb index (2D)
vsurf_D2 | Hydrophobic volume at -0.4 (i3D)
ASA | Water accessible surface area (i3D)
ASA_H | Total hydrophobic surface area (i3D)
vsurf_D3 | Hydrophobic volume at -0.
PM3_LUMO | LUMO energy (eV) (i3D)
PM3_HOMO | HOMO energy (eV) (i3D)
PM3_LUMO - HOMO | LUMO - HOMO
a The value and meaning of the descriptor were obtained from MOE 2008.10
An SVR model with a radial basis function kernel was developed. The parameters were tuned by cross-validation to maximize q2; a higher q2 indicates better predictive ability, and in general q2 ≥ 0.4 is desirable[25]. The final parameters and results were as follows: Epsilon = 0.08, C = 15, q2 = 0.77326, RMSE = 0.24145, and r = 0.88045, where RMSE denotes the root mean square error and r denotes the correlation coefficient between the predicted and actual values. These results indicate that the model has good predictive ability.
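The three statistics reported for the model can be computed from the observed pIC50 values and the cross-validated predictions. The sketch below uses only the Python standard library and assumes the common definition q2 = 1 - PRESS/SS, with SS taken about the mean of the observed values; the text does not state which variant was used, and the function names and toy data are hypothetical.

```python
import math


def q2_score(observed, predicted):
    """Cross-validated q2 = 1 - PRESS / SS (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    press = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_total = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - press / ss_total


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


# Toy pIC50 values and hypothetical cross-validated predictions.
obs = [4.0, 5.0, 6.0]
pred = [4.1, 4.9, 6.2]
print(f"q2 = {q2_score(obs, pred):.3f}")
print(f"RMSE = {rmse(obs, pred):.3f}")
print(f"r = {pearson_r(obs, pred):.3f}")
```

Unlike r, which only measures how well predictions track the observations, q2 penalizes systematic offsets as well, which is why the two values need not agree.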
An artificial neural network with nine hidden-layer neurons was established, which gave a high q2 value after cross-validation. The parameters and results of the model were as follows: learning rate = 0.01, momentum = 0.4, q2 = 0.64948, RMSE = 0.2677, and r = 0.84943. The learning rate controls the step size of each weight update, and the momentum term carries part of the previous update into the current one. The results indicate that the model has favorable predictive ability.
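To illustrate how the learning rate and momentum enter a weight update, the following sketch applies classical momentum gradient descent to a single weight on a toy quadratic loss, using the learning rate (0.01) and momentum (0.4) reported for the network. The loss function, its minimum at 3.0 and the function names are assumptions for illustration only; the update rule actually implemented by the authors' software is not stated in the text.

```python
def momentum_descent(grad, w0, learning_rate=0.01, momentum=0.4, steps=1000):
    """Minimize a one-dimensional loss with classical momentum updates.

    velocity <- momentum * velocity - learning_rate * gradient
    weight   <- weight + velocity
    """
    w = w0
    velocity = 0.0
    for _ in range(steps):
        velocity = momentum * velocity - learning_rate * grad(w)
        w += velocity
    return w


def toy_gradient(w):
    """Gradient of the hypothetical loss (w - 3)^2, which is minimal at w = 3."""
    return 2.0 * (w - 3.0)


w_final = momentum_descent(toy_gradient, w0=0.0)
print(f"final weight: {w_final:.4f}")
```

A small learning rate keeps each step stable, while the momentum term reuses a fraction of the previous step and so speeds up progress along directions where successive gradients agree.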
3D-QSAR study
Currently, the most popular methods for three-dimensional quantitative structure-activity relationship (3D-QSAR) studies are comparative molecular field analysis (CoMFA) and comparative molecular similarity index analysis (CoMSIA). In CoMFA, molecules sharing the same parent ring structure are stacked (superimposed) in space, and a probe particle is then added to calculate the molecule-probe interaction[26]. Different probe particles detect different molecular fields: for example, a hydrogen ion probe detects electrostatic fields, and a water molecule probe detects hydrophobic fields. CoMSIA is an improvement of CoMFA that defines the characteristics of multiple molecular fields, including steric, hydrophobic, hydrogen bonding and electrostatic fields[27].
The training-set compounds were optimized using SYBYL-X 2.0 software, and a suitable backbone structure was selected for molecular stacking. Compound 63 (SMILES notation shown in Supplementary Table S2), which had the best activity in the training set, was used as the template for selecting the common backbone. The selected skeleton and the stacking results are shown in Fig. 4. The parameters of the CoMFA and CoMSIA models, obtained by the leave-one-out method to give favorable q2 and components values, are shown in Table 5. Here, components is the number of principal components returned from the principal component analysis, q2 verifies the predictive ability of the model, SEE is the standard error of estimate, the F value verifies the significance of the model, and a larger r2 means better model fitting. The results showed that the established CoMFA and CoMSIA models have good predictive ability.
Table 5 Parameters and results from the CoMFA and CoMSIA models
Model | components | q2 | SEE | F (p<0.05) | r2
CoMFA | 5 | 0.626 | 0.234 | 69.184 | 0.801
CoMSIA | 6 | 0.649 | 0.215 | 71.444 | 0.835
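The leave-one-out procedure used to obtain q2 for the 3D-QSAR models can be sketched as follows. This is an illustration only: an ordinary least-squares line on a single hypothetical descriptor stands in for the CoMFA/CoMSIA regression model, and the data and function names are invented for the example. Each compound is held out in turn, the model is refitted on the remaining compounds, and the held-out activity is predicted.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def leave_one_out_q2(xs, ys):
    """q2 = 1 - PRESS / SS, with predictions from leave-one-out refits."""
    predictions = []
    for i in range(len(xs)):
        # Refit the stand-in model without compound i, then predict compound i.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        predictions.append(slope * xs[i] + intercept)
    mean_y = sum(ys) / len(ys)
    press = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_total


# Hypothetical descriptor values and pIC50 values, for illustration only.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
activity = [2.1, 3.9, 6.2, 7.8, 10.1]
print(f"leave-one-out q2 = {leave_one_out_q2(descriptor, activity):.3f}")
```

Because every prediction comes from a model that never saw the compound being predicted, q2 is a stricter measure than the fitted r2, which is consistent with the q2 values in Table 5 being lower than the corresponding r2 values.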
Instrumentation
All compounds were characterized by 1H-NMR, 13C-NMR and MS. 1H-NMR and 13C-NMR data were recorded on a Bruker Avance 600 (600 MHz) spectrometer (Bruker, Inc., Germany) with chloroform-d (CDCl3) as the solvent and tetramethylsilane (TMS) as the internal standard, while MS data were acquired on a Thermo Fisher LCQ Fleet (ESI). The coupling constants (J) are reported in Hertz (Hz), and the signals are specified as follows: s, singlet; d, doublet; t, triplet; m, multiplet; br, broad singlet. The optical density at 490 nm was measured with an enzyme-linked immunosorbent assay microplate reader (Fisher Scientific International, Inc., USA). Thin-layer chromatography (TLC) spots were detected by UV light irradiation (254 nm) and/or treatment with potassium bismuth iodide solution.
Materials
All chemicals and reagents used during the experiments were of analytical grade. Sophoridine (98.6%) was purchased from Shaanxi Undersun Biomedtech Co., Ltd. (China). cis-Dichlorodiammineplatinum(II) and camptothecin were purchased from Shanghai Aladdin Biochemical Technology Co., Ltd. (China). Methanol, ethyl alcohol, ethyl acetate, dichloromethane, trichloromethane, tetrahydrofuran, petroleum ether, methylbenzene, etc., were purchased from Xilong Scientific Co., Ltd. (China). Acetone, sodium chloride, etc., were purchased from Chengdu Chron Chemicals Co., Ltd. (China).
Synthesis of the sophoridine derivatives
Synthesis of Compound 1a
Compound 1a was synthesized as shown in Scheme 1. Sodium hydride (100 mmol, 2.4 g) and tetrahydrofuran (60 ml) were added to a round-bottom flask (150 ml), and sophoridine (5 mmol, 1.24 g) was added after thorough stirring. After the temperature was slowly increased to 80 °C, 4-ethoxy-benzaldehyde was added, and the reaction was carried out until the end point. After cooling, the reaction solution was adjusted to neutral pH with hydrochloric acid (4 N) and then extracted with dichloromethane (30 ml × 3). The organic layer was dried over anhydrous Na2SO4, and the filtrate was concentrated to yield a yellow oily substance. Purification by silica gel column chromatography (dichloromethane:methanol = 50:1, v/v) gave Compound 1a as a white solid.
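The reagent quantities quoted for sodium hydride and sophoridine can be cross-checked with a short mass-from-moles calculation. In the sketch below, the molar masses are computed from standard atomic weights (NaH, and C15H24N2O for sophoridine) rather than taken from the text, and the function and constant names are hypothetical.

```python
def reagent_mass_g(amount_mmol: float, molar_mass_g_per_mol: float) -> float:
    """Mass in grams for a given amount in millimoles."""
    return amount_mmol * molar_mass_g_per_mol / 1000.0


# Molar masses from standard atomic weights (assumed, not given in the text).
NAH_MOLAR_MASS = 23.998          # NaH, g/mol
SOPHORIDINE_MOLAR_MASS = 248.37  # C15H24N2O, g/mol

print(f"NaH, 100 mmol: {reagent_mass_g(100, NAH_MOLAR_MASS):.2f} g")
print(f"Sophoridine, 5 mmol: {reagent_mass_g(5, SOPHORIDINE_MOLAR_MASS):.2f} g")
```

Both results agree with the stated amounts (2.4 g and 1.24 g), corresponding to a 20-fold molar excess of sodium hydride over sophoridine.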
Synthesis of Compounds 4-30
As shown in Scheme 1, sophoridinic acid potassium salt (Intermediate 2), generated by the action of the strong base KOH, was used as the raw material to prepare the sophoridinic acid ester (Intermediate 3) with acyl chloride as the catalyst, according to the method proposed by Chongwen Bi et al.[4].
Intermediate 3 (1 mmol, 0.2804 g) in chloroform (50 ml) was added to a round-bottom flask (150 ml) and stirred at ambient temperature. Then, K2CO3 (5 mmol, 0.690 g) and 3-chlorobenzoyl chloride (2 mmol, 0.281 g) were added, and the mixture was slowly heated to 60 °C and refluxed for 4 h. The reaction was followed to the end point by TLC, and the filtrate was concentrated to give a yellow oily liquid, which was purified by silica gel column chromatography and dried to yield Compound 4 (12-N-(3-chlorobenzoyl) sophoridinic acid methyl ester) as a yellow oily substance in 65% yield.
Compounds 5-30 were prepared according to the synthesis method of Compound 4. The physical characteristics and spectral data of the 28 new compounds are shown in the Supplementary Information.
Evaluation of anticancer activity
Human non-small cell lung cancer cells (A549), human nasopharyngeal carcinoma cells (CNE2), human hepatocellular carcinoma cells (HepG-2), and human endometrial cancer cells (HEC-1-B) were purchased from the American Type Culture Collection (ATCC).
Each drug was weighed and dissolved in DMSO to prepare a 1 M solution. For primary screening, three concentrations (100 μM, 50 μM and 10 μM) were prepared with complete medium. After collation of the relevant data, drugs with IC50 < 50 μM were selected for further screening at five concentrations (40 μM, 30 μM, 20 μM, 10 μM and 5 μM). A third screening at five concentrations (100 μM, 90 μM, 80 μM, 70 μM and 60 μM) was performed for drugs with 50 μM < IC50 < 100 μM.
Cells were cultured in Dulbecco's modified Eagle's medium (DMEM) containing 1% penicillin-streptomycin and 10% fetal bovine serum (FBS) and placed in a carbon dioxide incubator for 24 h. The cells were then resuspended uniformly by pipetting and seeded in 96-well plates at approximately 5,000 cells per well (the outermost 36 wells were filled with PBS and contained no cells). The plates were incubated overnight (12 h) in a carbon dioxide incubator, and the medium was then replaced with medium containing the drug at different concentrations. Finally, MTT solution was added, and after 2-3 h of treatment with DMSO reagent, the OD value of each well was measured at 490 nm using a microplate reader.
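The tiered screening scheme can be expressed as a small decision function. The sketch below is a hypothetical helper, not part of the published protocol: it returns the follow-up concentrations for a drug based on the IC50 estimated in the primary screen. The text does not say what is done when the IC50 is exactly 50 μM or is 100 μM or higher, so the function returns an empty list in those cases as an explicit assumption.

```python
PRIMARY_SCREEN_UM = [100, 50, 10]
SECOND_SCREEN_UM = [40, 30, 20, 10, 5]     # for drugs with IC50 < 50 uM
THIRD_SCREEN_UM = [100, 90, 80, 70, 60]    # for drugs with 50 uM < IC50 < 100 uM


def follow_up_concentrations(ic50_um: float) -> list:
    """Return the follow-up concentrations (uM) for a primary-screen IC50.

    Assumption: cases not covered by the text (IC50 exactly 50 uM, or
    IC50 >= 100 uM) receive no follow-up screen here.
    """
    if ic50_um < 50:
        return list(SECOND_SCREEN_UM)
    if 50 < ic50_um < 100:
        return list(THIRD_SCREEN_UM)
    return []


# Hypothetical primary-screen IC50 estimates (uM), for illustration only.
for estimate in (25.0, 75.0, 150.0):
    print(estimate, follow_up_concentrations(estimate))
```

Each follow-up series brackets the expected IC50 more tightly than the primary screen does, which is what allows a more precise estimate from the second or third round.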
Cell cycle analysis
Human hepatocellular carcinoma cells (HepG-2) were treated with 10 μM, 20 μM and 30 μM Compound 26. A total of 5 × 10⁵ cells were centrifuged at 1200 rpm for 5 min, and the supernatant was removed. Then, 1 ml of precooled 75% ethanol was added to fix the cells overnight at 4 °C, protected from light. After a second centrifugation and removal of the supernatant, the cells were washed with PBS, followed by a third centrifugation and removal of the supernatant. Then, 500 µl of PI/RNase dye was added to the samples, which were incubated for 15 min at ambient temperature, protected from light, and analyzed by flow cytometry after 0.5 h. All experiments were repeated twelve times.