3.1. Construction of training and test sets using Cluster Analysis
We selected a series of 53 compound antagonists of V2R to construct the training and test sets. All the selected molecules have the same core substructure Fig. 1, but show structure variability due to substituent structural diversity. All compounds have a common substructure (4-formamido-benzamide) remarked in a rectangle in the top panel of Fig. 1. The nitrogen of the benzamide in the common substructure (represented the partial charge q1) is part of a benzazepine, benzene-piperidine, or benzoxazine condensed ring. The R1 substituent is generally a ring except for compounds A03 and A07. We display in Table 1 the compounds we used in this study with PubChem ID and the experimental biological activities of V2R antagonists’ activity expressed as IC50 and pIC50, and the Supplementary Information Table S1 shows the values of all the calculated molecular descriptors.
Table 1: The experimental biological activities of V2R antagonists
ID
|
PUBCHEM_CID
|
IC50 (nM)
|
pIC50
|
A01
|
119369
|
12
|
7.92
|
A02
|
151171
|
11
|
7.96
|
A03
|
977621
|
7100
|
5.15
|
A04
|
2981363
|
420
|
6.38
|
A05
|
2981862
|
6500
|
5.19
|
A06
|
2984025
|
6400
|
5.19
|
A07
|
5099582
|
8100
|
5.09
|
A08
|
10499401
|
1000
|
6.00
|
A09
|
10501216
|
760
|
6.12
|
A10
|
10524202
|
1800
|
5.74
|
A11
|
10527129
|
150
|
6.82
|
A12
|
10527137
|
1000
|
6.00
|
A13
|
10548204
|
400
|
6.40
|
A14
|
10548205
|
790
|
6.10
|
A15
|
10548464
|
200
|
6.70
|
A16
|
10550481
|
190
|
6.72
|
A17
|
10574499
|
14
|
7.85
|
A18
|
10595449
|
4100
|
5.39
|
A19
|
10598596
|
29
|
7.54
|
A20
|
10599369
|
77
|
7.11
|
A21
|
10599903
|
96
|
7.02
|
A22
|
10619160
|
680
|
6.17
|
A23
|
10620180
|
110
|
6.96
|
A24
|
10621059
|
530
|
6.28
|
A25
|
10622282
|
250
|
6.60
|
A26
|
10642000
|
980
|
6.01
|
A27
|
10647295
|
25
|
7.60
|
A28
|
10666852
|
200
|
6.70
|
A29
|
10667727
|
210
|
6.68
|
A30
|
10668163
|
1900
|
5.72
|
A31
|
10690528
|
1400
|
5.85
|
A32
|
10692266
|
28
|
7.55
|
A33
|
10693776
|
44
|
7.36
|
A34
|
10713341
|
7600
|
5.12
|
A35
|
10716675
|
27
|
7.57
|
A36
|
10716676
|
24
|
7.62
|
A37
|
10741034
|
82
|
7.09
|
A38
|
10742994
|
13
|
7.89
|
A39
|
10743970
|
29
|
7.54
|
A40
|
10762667
|
400
|
6.40
|
A41
|
10762739
|
300
|
6.52
|
A42
|
10765617
|
50
|
7.30
|
A43
|
10766187
|
22
|
7.66
|
A44
|
10789935
|
58
|
7.24
|
A45
|
10810069
|
1200
|
5.92
|
A46
|
10810133
|
180
|
6.74
|
A47
|
10832492
|
1700
|
5.77
|
A48
|
10834036
|
2100
|
5.68
|
A49
|
10834761
|
170
|
6.77
|
A50
|
10837161
|
27
|
7.57
|
A51
|
10838492
|
20
|
7.70
|
A52
|
10838493
|
13
|
7.89
|
A53
|
11798122
|
71
|
7.15
|
To classify the molecules of the datasets, depending on their structural variability we performed a hierarchical cluster analysis, the resulting dendrogram was constructed using the Euclidean distance (x-axis) and the complete linkage (y-axis), illustrating the results of the k-NNCA developed in this dataset. The dendrogram shows 6 different subsets demonstrating the molecular variability among the compounds of this dataset (Fig. 2). To evaluate the output dendrogram and to split the whole dataset into training and test sets, we performed a k-mean cluster analysis (k-MCA) [55].
The selection of the training and test sets was carried out by randomly taking molecules belonging to each cluster. From the initial 53 compounds, 42 (80 % of the dataset) were chosen to form the training set and the remaining 11 compounds, (20 % of the dataset) were used as a test set for the external cross-validation of the model.
3.2. Development and validation of the QSAR model
GA combined with MLR is widely used for QSAR and QSPR studies [31–34]. In this method, a GA is performed to search the feature space and select the major descriptors relevant to the activities or properties of the compounds. This method can deal efficiently with a large search space, and it has fewer chances to only find a local optimal solution than other algorithms. GA is a well-estimated method for parameter selection and to overcome the shortages of MLR in variable selection. After a GA, the MLR is employed to correlate the selected descriptors with the activity values using a classic regression method to yield the explicit equations.
The variables selected by the genetic algorithm as the best model of V2R antagonist activity are shown in Eq. 5. To further validate the variables thus obtained, we performed an MLR analysis of the 43 compounds on the initial training set, with the 11 compound test set for the external cross-validation.
pIC50 = − 7.968 (± 3.584) q6 + 0.095 (± 0.059) EH2O + 0.161 (± 0.027) P − 5.842 (± 2.894) (5)
n = 43; R = 0.89; R2 = 0.80; s = 0.40; F = 53,61; p < 0.0001; q2 = 0.75
Test set:
n = 11; R = 0.86; R2 = 0.74; s = 0.41; F = 25.04; p < 0.0007; q2 = 0.56
The R2 (R-square statistic or coefficient of determination) indicate that the model could explain 80 % of the variance for the experimental values of pIC50. The model shows a q2 of 0.75. This value of more than 0.5 could be considered as proof of the high predictive ability of the model, along with the good prediction of the test set (R2 = 0.74). The good R2 and q2 values obtained in Eq. 5 for both training and test set can be explained with the experimental values for all the compounds of the series. The calculated values for pIC50, are highly similar to the experimental, sustaining the reliability of the QSAR model (Fig. 3, Table 2).
Table 2
Experimental and calculated values for the pIC50 of the dataset
IDa
|
Y(obs)b
|
Y(calc)c
|
Residuald
|
A01
|
7.92
|
7.41
|
0.51
|
A02
|
7.96
|
8.15
|
-0.19
|
A03
|
5.15
|
5.37
|
-0.22
|
A04
|
6.38
|
6.07
|
0.31
|
A05
|
5.19
|
5.60
|
-0.41
|
A06
|
5.19
|
5.33
|
-0.13
|
A07*
|
5.09
|
5.39
|
-0.29
|
A08
|
6.00
|
6.11
|
-0.11
|
A09
|
6.12
|
5.66
|
0.46
|
A10*
|
5.74
|
6.19
|
-0.44
|
A11
|
6.82
|
7.17
|
-0.35
|
A12
|
6.00
|
6.25
|
-0.25
|
A13
|
6.40
|
6.16
|
0.24
|
A14
|
6.10
|
6.13
|
-0.03
|
A15*
|
6.70
|
6.24
|
0.46
|
A16
|
6.72
|
6.96
|
-0.24
|
A16*
|
6.72
|
6.96
|
-0.24
|
A17
|
7.85
|
7.51
|
0.34
|
A18
|
5.39
|
5.79
|
-0.41
|
A19
|
7.54
|
7.29
|
0.25
|
A20*
|
7.11
|
7.34
|
-0.23
|
A21
|
7.02
|
7.15
|
-0.14
|
A22
|
6.17
|
6.31
|
-0.15
|
A23
|
6.96
|
6.18
|
0.78
|
A24*
|
6.28
|
5.52
|
0.76
|
A25
|
6.60
|
6.41
|
0.19
|
A26
|
6.01
|
5.90
|
0.11
|
A27
|
7.60
|
7.23
|
0.38
|
A28*
|
6.70
|
6.21
|
0.49
|
A29
|
6.68
|
6.63
|
0.05
|
A30
|
5.72
|
6.25
|
-0.53
|
A31
|
5.85
|
6.32
|
-0.47
|
A32
|
7.55
|
6.87
|
0.68
|
A33
|
7.36
|
7.52
|
-0.17
|
A34
|
5.12
|
5.63
|
-0.52
|
A35
|
7.57
|
7.10
|
0.47
|
A36
|
7.62
|
7.30
|
0.33
|
A37
|
7.09
|
6.42
|
0.67
|
A38
|
7.89
|
8.19
|
-0.30
|
A39
|
7.54
|
8.20
|
-0.66
|
A40
|
6.40
|
6.41
|
-0.01
|
A41*
|
6.52
|
6.18
|
0.35
|
A42*
|
7.30
|
7.64
|
-0.33
|
A43
|
7.66
|
8.08
|
-0.42
|
A44
|
7.24
|
7.06
|
0.18
|
A45
|
5.92
|
6.72
|
-0.79
|
A46
|
6.75
|
6.40
|
0.35
|
A47
|
5.77
|
5.75
|
0.02
|
A48
|
5.68
|
6.13
|
-0.45
|
A49*
|
6.77
|
6.78
|
-0.01
|
A50
|
7.57
|
7.27
|
0.30
|
A51
|
7.70
|
7.79
|
-0.09
|
A52*
|
7.89
|
7.62
|
0.27
|
A53
|
7.15
|
6.73
|
0.42
|
a ID of compounds in the study. Chemicals marked with an asterisk in the test set |
b Experimental values of the effective dose |
c Values calculated by Eq. 5 |
d Observed minus calculated values |
In the correlation study with the calculated descriptors, a low correlation was observed between the variables, indicating the reliable information content on each term in the equation (Table 3). The selected variables by the genetic algorithm were P (polarizability) EH2O (hydration energy), and q6 (partial charge of nitrogen in the common substructure 4-formamidobenzamide (Fig. 1)) (Table 4). For each one of the variables the coefficients were significant (Table 5), indicating their relative contribution to the combined prediction of the biological activity as the dependent variable. Calculating the value of the coefficients on the regression analysis we ensure a good prediction starting from the group of the independent variables (q6, EH2O and P), facilitating the interpretation of the independent influence of each variable on the final equation.
Table 3: Correlation Matrix of model variables
|
q6a
|
EH2Ob
|
Pc
|
q6
|
1
|
0.44
|
0.13
|
EH2O
|
0.44
|
1
|
0.05
|
P
|
0.13
|
0.05
|
1
|
a Partial charge on atom N7 of the common substructure of the studied compound
b Hydration energy
c Polarizability
Table 4
Quantum and physicochemical parameters values included in the QSAR model of 53 V2R antagonist in the dataset
ID
|
q6a
|
EH2Ob
|
Pc
|
A01
|
-0.73
|
-5.80
|
49.72
|
A02
|
-0.72
|
-10.78
|
57.51
|
A03
|
-0.73
|
-5.38
|
36.87
|
A04
|
-0.72
|
-7.59
|
42.96
|
A05
|
-0.74
|
-14.07
|
42.74
|
A06
|
-0.75
|
-10.51
|
38.19
|
A07
|
-0.73
|
-5.30
|
36.87
|
A08
|
-0.72
|
-7.72
|
42.86
|
A09
|
-0.74
|
-13.88
|
42.74
|
A10
|
-0.75
|
-9.74
|
43.50
|
A11
|
-0.73
|
-8.64
|
49.81
|
A12
|
-0.52
|
-3.44
|
51.56
|
A13
|
-0.74
|
-9.63
|
43.50
|
A14
|
-0.74
|
-10.11
|
43.50
|
A15
|
-0.74
|
-7.74
|
42.96
|
A16
|
-0.74
|
-10.63
|
49.24
|
A17
|
-0.74
|
-5.86
|
49.72
|
A18
|
-0.73
|
-9.32
|
41.67
|
A19
|
-0.73
|
-7.15
|
49.72
|
A20
|
-0.72
|
-7.00
|
50.36
|
A21
|
-0.73
|
-11.05
|
51.16
|
A22
|
-0.74
|
-6.94
|
42.86
|
A23
|
-0.73
|
-8.29
|
43.50
|
A24
|
-0.71
|
-12.19
|
42.74
|
A25
|
-0.72
|
-7.25
|
44.89
|
A26
|
-0.74
|
-8.15
|
41.03
|
A27
|
-0.73
|
-11.01
|
51.57
|
A28
|
-0.73
|
-6.94
|
42.86
|
A29
|
-0.73
|
-5.77
|
44.70
|
A30
|
-0.74
|
-7.79
|
42.96
|
A31
|
-0.74
|
-6.94
|
42.86
|
A32
|
-0.73
|
-6.15
|
46.53
|
A33
|
-0.74
|
-5.82
|
49.72
|
A34
|
-0.74
|
-9.86
|
40.32
|
A35
|
-0.74
|
-7.08
|
47.89
|
A36
|
-0.77
|
-7.44
|
47.89
|
A37
|
-0.72
|
-7.40
|
44.89
|
A38
|
-0.84
|
-13.07
|
52.99
|
A39
|
-0.73
|
-8.18
|
55.95
|
A40
|
-0.73
|
-7.10
|
44.22
|
A41
|
-0.73
|
-8.29
|
43.50
|
A42
|
-0.73
|
-6.61
|
51.56
|
A43
|
-0.73
|
-5.05
|
53.39
|
A44
|
-0.73
|
-10.05
|
50.03
|
A45
|
-0.74
|
-5.95
|
44.70
|
A46
|
-0.77
|
-10.96
|
44.22
|
A47
|
-0.74
|
-9.13
|
40.55
|
A48
|
-0.72
|
-8.19
|
43.50
|
A49
|
-0.73
|
-6.21
|
46.05
|
A50
|
-0.72
|
-6.53
|
49.81
|
A51
|
-0.74
|
-6.31
|
51.74
|
A52
|
-0.72
|
-6.18
|
51.74
|
A53
|
-0.71
|
-11.09
|
49.60
|
a Partial charge on atom N7 of the common substructure of the studied compounds |
b Hydration energy |
c Polarizability |
Table 5: Coefficient Analysis
Predictor
|
Coef. a
|
Stdev b
|
95% Conf. c
|
t-ratio d
|
p e
|
Constant
|
-5.84
|
1.42
|
2.89
|
-4.12
|
0.0002
|
q6
|
-7.97
|
1.76
|
3.58
|
-4.54
|
0.0001
|
EH2O
|
0.10
|
0.03
|
0.06
|
3.28
|
0.0022
|
P
|
0.16
|
0.01
|
0.03
|
12.15
|
0.0000
|
a Constant and coefficients of the model variables
b Standard deviation
c Confidence Interval
d Estimate divided by standard error
e Level of significance
The variable q6 represents the partial charge of the N7 in the 4-formamidobenzamide common substructure, involved on an amide bond associated with the variable zone of the compounds. The partial charge q6 is the most negative, with the highest module value for all the studied charges, having a fully negative value range and a negative coefficient on Eq. 5, could indicate the favorable tendency of an increased antagonist activity with more negative values of q6. It could be explained by the fact that N7 is involved in a hydrogen bond, or because it just reflects the variation of partial charge depending on the nature of substituent R1. The partial charge q6, calculated only for the common substructure is -0.712 and the substituent on R1 is making it more negative, except for compounds A53, A24, and A12. In the case of A12, the N7 is substituted by a methyl, which has a strong inductive effect (+ i) over the nitrogen. Compounds A53 and A24 both have a benzene ring with a nitro substituting in the ortho position as R1. The common substructure in most compounds is formed entirely by a conjugated system with a substituted benzene ring as R1, which might contribute to a whole conjugated system. If we compare the compounds by the position of the substituent on the benzene ring at R1, we observe a lower partial charge on q6 associated with an ortho substituent, calling for a combination of steric and electronic factors, where the ortho substituent could break the conjugation planarity.
Hydration energy (EH2O) is the amount of energy released when one mole of a compound is hydrated, and represents the measure of the water molecules affinity for the compound. More negative hydration energy values could be associated with more polar groups in the compound, and less negative hydration energy could be attributed to the presence of a higher number of nonpolar groups [56]. The range of values for this variable in the data set is negative and it has a positive coefficient on Eq. 5, thus indicating that more negative values of hydration energy are unfavorable for the antagonist activity, suggesting a binding site with possible hydrophobic interactions, as more hydrophilic compounds are shown unfavorable for the activity.
Polarizability refers to the tendency of any compound to acquire an electric dipole moment in proportion to an applied electric field, on our model the Polarizability component is having a correlation coefficient of 0.7 with the pIC50, making it the descriptor with the highest correlation to the activity, being the other variables the fine adjustments necessary to improve the model in general. The range of polarizability values in the data set is positive and has a positive coefficient in Eq. 5, so it has a favorable contribution to antagonist activity.
As mentioned above, the compounds in the studied dataset have conjugated systems in their structure, and systems with delocalized π electrons exhibit high polarizabilities. The aromatic systems’ planarity with their high polarizability and multipole moment, are all factors of key importance for the 3D architecture of aromatic complexes [57]. Soft interactions like dispersion, are predominant in stacking and can be estimated from the polarizability [58]. Another possible interaction for the polarizable π-electron cloud of aromatic rings is with cations, and polarizability is also relevant for this π-cation interaction [59, 60]. In general, the studied compounds have three aromatic rings in their structure that could be involved in π-interactions with the residues in the receptor-binding site.
In general, the obtained QSAR model provides indications that the binding mode of V2R antagonists might fundamentally be involving hydrophobic and electron density interactions.
3.3. The applicability domain (AD) of the QSAR model
A QSAR model needs to show not only a good accuracy, but also some reliability for predictions of new compounds, in general these models cannot be universal and should be constrained to a defined chemical space, commonly known as the applicability domain (AD). The AD can be described as the physicochemical, structural or biological spatial information based on which the model training set is developed. The QSAR model applies to make predictions for new compounds within the specific domain [57]; in summary, the AD is the degree to which a QSAR model tolerates (reliably) new compounds.
A crucial problem in chemometrics and QSAR studies is the definition of the AD with a regression model. We will define it here as a squared area within ± 2 bands for standardized residuals and a leverage threshold of h = 0.23 for inhibitory activity (Eq. 5). Thus, compounds with standardized residuals greater than 2 standard deviations will be considered unreliable. For the graphical visualization of outliers for the response (standardized residuals > 2) or for the structure (leverage > 0.23) in the regression model, the Williams plot for Eq. 5 is shown in Fig. 4. Of the 53 compounds in the dataset, only two compounds (A02 and A12) have a leverage higher than the critical value.
A02 (conivaptan) has the highest value of polarizability (57.51) of the dataset, while the other compounds are between 36.87 to 55.95. A02 shows a diphenyl moiety as substituent of the amide in the common substructure, while the other compounds exhibit only a single aromatic ring or an aliphatic substituent, A02 also have a condensed 3 ring system of 3,4,5,6-tetrahydroimidazol[4,5-d][1]benzazepine, while the other compounds have only a 2 ring system. The presence of extra rings on A02, might account for the increase on the polarizability for this compound with a different electronic structure than the rest.
A12 shows the highest value of q6 (-0.52) of the dataset, while the other compounds are between of -0.71 to -0.84, and it also has the highest value of hydration energy (EH2O), with − 3.44 kcal/mol, while the other compounds are between − 5.05 to -14.07 kcal/mol. Compound A12 show a minor difference in the common substructure with all the antagonists of this family having a methyl group as substituent for the N amide, directly altering the partial charge (q6) of N7 on the 4-formamidobenzamide and increasing the hydration energy value. The structure of A12 and A01 differ only in the aforementioned methyl group, A12 has a difference of 0.20 on q6, and of 2.36 kcal/mol in the value of hydration energy compared to A01, evidencing the influence that this single methyl group can have.
3.4. V2R modeling
We selected the human OX2 orexin receptor (PDB ID 4S0V) as the template for V2R modeling, as suggested by the GPCRdb template tool. The selected template has 27 % of identity and 46 % of similarity with V2R. In the corresponding alignment, the fragments corresponding to transmembrane helices and the conserved motifs are preserved (Fig. 5). In both the OX2 receptor and V2R, the natural ligand is a peptide, and the selected structure has an antagonist bound being on an inactive conformation, suitable to study the binding modes of antagonists to the V2 receptor.
To relax the obtained model in a more natural environment, it was minimized and equilibrated for 10ns in a POPC membrane, solvated and with NaCl added at physiological concentration (0.15 M). At the end of the simulation the membrane parameters, like thicknesses and per lipid area, were calculated to check the correct packing. The membrane thickness (distances between phosphates of each monolayer) was 38.91 Å, and the per lipid area was 64.54 Å2. These parameters are reasonable for a POPC membrane at 310 K, according to experimental parameters obtained at different temperatures [62].
During the current work, another X-ray structure from the class A of GPCR was released in the Protein Data Bank (PDB ID 6TPK): the oxytocin receptor (OXTR) which is also a member of the vasopressin receptor family. Although OXTR exhibits slightly better sequence identity and similarity: 41 % and 56 % respectively, its lower resolution (3.20 Å versus 2.50 Å), a missing region (loop and helix 8), and a shorter loop (ICL3) make it a less relevant template. However, in an effort for further validation, the V2R model was compared to OXTR; the superimposition between the V2R model and OXTR is shown in Fig. 6, where we considered the two conformations of V2R before and after membrane relaxation.
The RMSD values, using OXTR as reference and comparing the model before and after the membrane relaxation were 1.02 Å and 1.15 Å respectively. The main differences between OXTR and the models were that OXTR lacks the ICL3, a long intracellular loop involved in the interaction with the G-protein usually missing in GPCR solved structures and the helix 8, parallel to the membrane and useful for orienting the receptor in the membrane. Comparing the bundled helices, the main difference relative to the binding site is that the TM2 of relaxed model in the membrane is in the same position as that of the OXTR, while the model before relaxing has the TM2 slightly tilted towards the interior of the cavity decreasing its volume. The difference in the orientation of TM2 in the model before relaxation could be caused by the difference in the proline position in this transmembrane section between V2R and the template (Fig. 5). Proline in the middle of alpha helices cause a kink, by being unable to complete the H-bonding chain of the helix and because of steric and/or rotameric effects keeping it out from the preferred helical geometry [63]. Proline one position earlier on the sequence of the template TM2 with respect to that of the model, can affect the orientation of the kink and result on a different orientation in the model. This odd orientation could later be corrected during the relaxation in the membrane showing this process as very favorable.
The comparison between the V2R model after relaxation and OXTR brings more confidence in the model’s quality and the protocol used for relaxation.
3.5. V2R-antagonist complexes
Visual inspection of the binding site revealed that the side chains of residues W99 and F307 are occluding the entrance of the binding site, therefore these two residues were considered as flexible for the molecular docking. To improve docking results other residues of the binding site (Q96, F105 and K116) were also considered as flexible.
The antagonists selected as ligands for the molecular docking were: mozavaptan (A01), conivaptan (A02), and tolvaptan. Tolvaptan also shares the common substructure of the studied compound series for the QSAR model and it is the only drug approved to treat Polycystic Kidney Disease. All the rotatable bonds of the ligands were flexible.
The 50 complexes for each antagonist obtained by molecular docking were clustered for analysis. Three orientations of the antagonists in the binding site were identified and shown in Fig. 7. The first with the condensed ring of the antagonist toward TM2 and TM7 (CR-27), the second with the condensed ring towards TM5 and TM6 (CR-56) and the last one with the condensed ring towards the entrance of the cavity (CR-UP). The binding energies of the complexes obtained for the three identified conformations differ by less than 1 kcal/mol, in each of the antagonists studied. This difference in the energy value is lower than the standard error of the Autodock Vina scoring function [45], so we might need more studies to select the best conformation.
We expect a good antagonist to bind with high affinity to the receptor binding site but failing to activate it, blocking the access of any agonist to the binding site. A study with meta- dynamics enhanced sampling revealed the existence of three binding sub-sites for V2R, proposed to respond to the vasopressin entry pathway [64]. The compounds that bound in both the vestibule and the intermediate sites block the access to the orthosteric site so that, an agonist will never be able to bind, if there is an antagonist already bound to any of the non-activating sites. Two of the antagonists studied by Saleh et al [64], with high structural similarity to those in this study, were predicted to bind to vestibule site for V2R and intermediate site for V1aR, so it is to be expected that the antagonists in our study are located in one of these sites. Therefore, we eliminated from our subsequent analysis the CR-UP conformation, showing the antagonist penetrating deeper into the cavity with part of it located in the orthosteric site.
To study the stability of the compounds in the binding site and make a better estimation of the binding energy, a molecular dynamic simulation of the best complex of the two remaining orientation was performed. Figure 8 show the RMSD for the two different conformations of the three studied antagonists. The conformations of the studied antagonists tend to stabilize along the molecular dynamics simulations, being CR-27 the conformation with lower RMSD for each of the antagonist. For mozavaptan, the conformation CR-56 is the conformation with more fluctuations along the trajectory and it has the highest RMSD value among all antagonist conformations. The change in the antagonist’s conformation for the six complexes on the molecular dynamics simulation is show in Fig. 9. The two-representative conformation for each antagonist are represented from left to right (CR-27 and CR-56 respectively). Tolvaptan is represented in green, conivaptan in blue and mozavaptan in pink. The starting conformation at 0 ns is represented by the light colored ligand and the final conformation at 50 ns by the dark colored ligand. Mozavaptan CR-56 conformation showed a significant change in the orientation of the antagonist with respect to the starting conformation, also reflected in the high RMSD value observed for this conformation. For this reason, this conformation was eliminated from the subsequent analysis.
In order to predict which binding conformation could be the best for each antagonist, we estimated the binding free energy variation using LIE-D method. This method is flexible enough to consider different interaction patterns even though the ligands share some common chemical scaffolds and are bound to the same protein receptor [53].
For tolvaptan, the conformation with the best binding free energy is tolvaptan CR-27 (-14.34 kcal/mol) with over 2 kcal/mol of difference with CR-56, that was thus discarded from further analysis. The conformations of conivaptan have similar energies between them, so it is not possible to select which of the two conformations is the best using this criterion. The inhibition constant (Ki) was calculated from the estimated binding free energy for these conformations and compared with experimental values (Table 6). The values of the calculated Ki follow the same trend than the experimental inhibition constants [65, 66], except for the tolvaptan, with the Ki for conformation CR-27 one order lower than the experimental value and the Ki conformation CR-56 one order higher, suggesting as expected that CR-27 was the best conformation for tolvaptan.
Table 6: Antagonist-V2R estimated binding free energy (ΔG(kcal/mol)) by LIE-D method and estimated inhibition constant (Ki)
Antagonist
|
ΔG(kcal/mol)
|
Ki (nM)
|
CR_27
|
CR_56
|
CR_27
|
CR_56
|
experimental
|
conivaptan
|
-12.94
|
-12.98
|
0.70
|
0.66
|
0.36 [65]
|
mozavaptan
|
-11.60
|
-
|
6.17
|
-
|
9.42 [66]
|
tolvaptan
|
-14.23
|
-11.56
|
0.09
|
6.65
|
0.43 [66]
|
The binding free energy and Ki values obtained for the different conformations of the antagonists bound to V2R allowed us to select the CR-27 conformations (or the CR-56 conformation for conivaptan only) as the possible binding modes of these antagonists to V2R. From these conformations, the analysis of the main antagonist-receptor complex interactions can be carried out.
3.6. Interactions analysis of the best antagonist-V2R complexes
The antagonists must block the access or interact with those residues favoring the union of any agonist for the receptor activation, and/or also sterically blocking the residues involved in triggering the activation mechanism. The most relevant contacts between the antagonist and the receptor are summarized in Table 7. The interactions observed for the mozavaptan and tolvaptan complexes are very similar, while conivaptan interacts with a greater number of residues, since it is a compound with a greater volume than those mentioned above.
Table 7: Most relevant contacts for the Antagonist-Receptor complexes. The numbers represent the percent of frames with the contact present in the trajectory. Only those contacts that were observed in more than 50% of the trajectory are represented with numbers.
V2R Residue
|
|
Conformations
|
|
MVP-CR -27
|
CVP-CR-27
|
CVP-CR-56
|
TVP-CR-27
|
Q92
|
-
|
97.6
|
-
|
-
|
Q96
|
95.0
|
91.3
|
97.4
|
99.9
|
W99
|
99.1
|
76.8
|
-
|
97.8
|
F105
|
95.0
|
54.4
|
70.6
|
98.0
|
K116
|
98.1
|
64.9
|
96.9
|
99.5
|
Q119
|
-
|
59.2
|
96.8
|
-
|
F178
|
87.2
|
-
|
-
|
92.2
|
C192
|
99.3
|
-
|
-
|
100
|
A194
|
93.2
|
95.9
|
59.1
|
94.6
|
Y205
|
-
|
93.1
|
-
|
-
|
V206
|
-
|
93.6
|
86.1
|
-
|
Q291
|
-
|
63.3
|
99.0
|
-
|
F307
|
97.9
|
90.9
|
98.3
|
96.5
|
L310
|
-
|
68.0
|
82.1
|
-
|
M311
|
-
|
61.4
|
87.1
|
87.5
|
In the analysis of the obtained complexes, there are some common interactions involving the hydrophobic (C192, A194, L310, M311), aromatic (W99, F105, F178, F307) and polar (Q96, K116, Q119, Q291) residues. The residues Q96, Q119, Q291, K116 are highly conserved in all the AVP and OXT receptors’ family, and are known to have a key role in agonist binding [67]. Previous studies with V1aR suggest that the residues Q96, Q119, Q291 are K116 are specifically involved with the ligand binding process but do not intrinsically modulate the efficacy of the functional response [67]. An analysis of the presence of H-bonds along the molecular dynamic simulation was performed to study the nature of the interactions between the antagonists and these polar residues of the receptor. The percentage of frames of the trajectory with a determined number of H-bonds is shown in the Table 8. In all the studied conformations the occurrence of H-bonds is low, approximately between 13 and 38%, although interactions with some polar residues are observed in greater percentages of the trajectory (Table 7), which suggests other kind of interactions. In the case of K116, the positively charged ε-amino group can also interact with aromatic rings. This π-cation interaction was detected between K116 and a ring of the studied antagonists (Fig. 10 and Supplementary Information Figures S1-S3).
Table 8: Analysis of H-bond between antagonist-V2R H-bonds of the studied conformations. The numbers represent the occurrence percentage of H-bonds along the molecular dynamic simulations
No. H-bonds
|
Conformations
|
MVP-CR27
|
CVP-CR27
|
CVP-CR56
|
TVP-CR27
|
TVP-CR56
|
0
|
76.12
|
63.68
|
61.72
|
85.50
|
59.86
|
1
|
22.46
|
31.28
|
35.60
|
13.64
|
33.30
|
2
|
1.34
|
4.74
|
2.66
|
0.84
|
6.40
|
3
|
0.08
|
0.30
|
0.02
|
0.02
|
0.44
|
The antagonists interact with the aromatic residues W99, F105, F178 and F307. These residues are not directly involved in the receptor activation, and they are near or at the entrance of the binding site, suggesting that the antagonist binding site might not be located deep into the cavity. The fact that the antagonists interact with residues near the entrance of the cavity agrees with the binding sites predicted by Saleh et al [64].
Some of these residues have been shown to be involved in vasopressin binding. W99 plays a fundamental role in stabilizing the vasopressin/receptor interactions responsible for the high-affinity binding of agonists to the V2 receptor and receptor selectivity. A mutation of W99 (W99R) greatly impaired the binding properties of the receptor and had a minor effect on its intracellular routing [68]. Other important residue for AVP binding is F105, the mutation F105V was reported to show cell surface expression and a maximal AVP-induced cAMP formation (Vmax) comparable to the wild type, but with a reduced ligand binding ability [69, 70].
An interesting interaction for the V2R antagonists is with F307, a non-conserved residue in vasopressin/oxytoxin family since V1aR has a threonine in this position. The relevance of this interaction is because some antagonists could bind to both V2R and V1aR due to the similarity of its binding site, but the interaction with F307 would be unique for V2R making it attractive for the design of antagonists having less selectivity for V1aR.
Other antagonist-receptor interactions found were with residues C192, A194 and M311. While C192 and A194 are conserved among the entire family, M311 is not, seeming to cooperate in the selective binding of some antagonists [71]. The M311V mutation in the TM7 of V2R has impaired the ligand capacity and binding [72] suggesting that in V2R the residue M311 could take part in the binding of peptide agonists [70, 73]
Taking into account the interactions and the estimated binding free energy, we considered CR-27 conformation (and CR-56 for conivaptan) as the best for the three antagonists studied by molecular dynamics. Figure 10 shows the CR-27 conformation of tolvaptan in complex with V2R.
In general, the main interactions observed here are involved in the binding of ligands to V2R, but are not involved on the receptor activation, which suggest that the studied conformations of the antagonists can block the binding of agonists and unable to activate the receptor. The presence of few H-bonds with polar residues and the other interactions observed are with aromatic residues and non-polar residues suggest that the main antagonist-receptor interactions are mostly hydrophobic in nature and could involve π-clouds.