3.1 Genomes of Brassicaceae retain divergent SVP homeologous copies
To characterise copy number and sequence-level variation among Brassicaceae SVP homologs, and to delineate divergence between B. juncea SVP homeologs, 50 sequences from 25 Brassicaceae species were analysed. These included 6 unannotated sequences corresponding to B. juncea var. tumida retrieved from BRAD viz. Bju_A04_SVP (A04: 12160962..12163530), Bju_A09_1_SVP (A09: 49119658..49122837), Bju_A09_2_SVP (A09: 49449411..49452590), Bju_B01_SVP (B01: 19438290..19440691) and Bju_B02_SVP (B02: 6832951..6835995), and 5 cDNA sequences isolated from B. juncea cv. Varuna (BjuVAR1_SVP, BjuVAR2_SVP, BjuVAR3_SVP, BjuVAR4_SVP and BjuVAR5_SVP). A list of these homologs, with details on genomic coordinates, progenitor and sub-genome of origin, and proposed nomenclature, is provided in Supplementary Table 3. To understand the extent of variation among SVP homologs from Brassicaceae, CDS and protein sequences were surveyed for size and pairwise percentage identity. The size of the SVP sequences ranged from 1582 bp (Bra_BraA04g016520.3C) to 5870 bp (Bju_BjuB024255), while the proteins varied from 72 (Bju_B02_SVP) to 406 amino acids (Bju_BjuB024255) (Supplementary Table 3). Pairwise percentage identity among the 50 Brassicaceae SVP homologs ranged from 63 to 100% at the nucleotide level and from 65 to 100% at the protein level (Supplementary Table 4a and 4b). Extremely low values for pairwise percentage identity (2.6% and 1.2% at the CDS and protein level, respectively) were attributable to size variation in SVP homologs. Among the SVP homologs from B. juncea, the pairwise percentage identities ranged from 15.9 to 100% and 15.5 to 100% at CDS and protein level, respectively. The 2 tandemly organised SVP copies on chromosome A09, viz. Bju_A09_1_SVP and Bju_A09_2_SVP, were identical at the nucleotide level. In general, size and sequence variation were observed in SVP proteins across Brassicaceae. In B. juncea, as many as 9 SVP homologs were identified which differed in both size and sequence, suggesting variation in predicted protein models.
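Pairwise percentage identity depends on how alignment columns are counted, which is one reason large size differences between homologs can yield very low values. The following is a minimal Python sketch, assuming the sequences are already aligned to equal length. The function names and the toy sequences are illustrative assumptions and do not represent the software or data used in this study.

```python
from itertools import combinations


def percent_identity(a, b, gap="-"):
    """Percent identity between two rows of the same alignment.

    Columns where both rows carry a gap are ignored; a column with a gap
    in only one row counts as a mismatch (an assumption of this sketch).
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(a.upper(), b.upper())
               if not (x == gap and y == gap)]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * matches / len(columns)


def identity_matrix(aligned):
    """Percent identity for every unordered pair in a name -> sequence dict."""
    return {(p, q): percent_identity(aligned[p], aligned[q])
            for p, q in combinations(aligned, 2)}


# Toy alignment (hypothetical sequences, not taken from the study).
toy = {
    "full_1": "ATGGCGAGAGAA",
    "full_2": "ATGGCGAGGGAA",
    "short":  "ATGG--------",
}
matrix = identity_matrix(toy)
for pair, value in matrix.items():
    print(pair, round(value, 1))
```

In this toy case the two full-length rows differ at a single column, whereas the truncated row scores low against both simply because most of its columns are gaps.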
3.2 Phylogenetic reconstruction establishes genome-of-origin identities of SVP variants
To determine sub-genome of origin and homeolog-based divergence, phylogenetic reconstruction was carried out using 50 SVP protein sequences (Fig. 2). Overall, SVP homologs formed clusters based on species of origin followed by sub-genome (LF, MF1 and MF2) with phylogeny recapitulating established species relationships.
Within Brassica-specific clades, the homeologs clustered according to their genome and sub-genome of origin, consistent with the triplicated nature of meso- and allopolyploid species. Within the homeolog-specific clades, sequences diverged according to progenitor genomes (AA, BB and CC). Two prominent clades (I and II) can be discerned in the phylogram. Clade I comprises SVP sequences derived from non-Brassica species viz. Arabidopsis thaliana (Ath), Arabidopsis halleri (Aha), Arabidopsis lyrata (Aly), Descurainia sophia (Dso), Boechera retrofracta (BOERET), Boechera stricta (BOESTR), Capsella grandiflora (Cgr), Capsella rubella (Cru) and Camelina sativa (Csa). In contrast, Clade II primarily represents SVP homologs from Brassica species within sub-clade IIB. Assignment of sub-genome and homeolog affiliation of Bju_A04_SVP, Bju_A09_1_SVP, Bju_A09_2_SVP, Bju_B01_SVP, Bju_B02_SVP, BjuVAR1_SVP, BjuVAR2_SVP, BjuVAR3_SVP, BjuVAR4_SVP and BjuVAR5_SVP from B. juncea, based on B. rapa, B. oleracea and B. napus, is provided in Supplementary Table 3. Since Bju_B02_SVP (marked as O) was exceptionally small (72 aa), its sub-genome identity could not be interpreted. Interestingly, in the majority of species with well-characterised sub-genome structures, only LF- and MF1-specific homeologs were detected. The MF2-specific homeolog was identified exclusively in C. sativa. In summary, the phylogenetic reconstruction of Brassicaceae SVP sequences led to unambiguous assignment of sub-genome of origin of nine B. juncea SVP sequences.
Detailed sequence analysis of 50 SVP homologs from Brassicaceae with respect to natural variation in gene structure, gene polymorphism and presence or absence of domains showed considerable structural variation in SVP proteins. Analysis of nucleotide polymorphisms as Pi values (DnaSP) across 50 SVP homologs (Supplementary Fig. 1) revealed lower Pi values in regions corresponding to exons. Analysis of exon-intron splicing patterns, using the nine exons annotated in A. thaliana SVP as reference, revealed marked variation across 50 SVP homologs (Fig. 3). The exon-intron structures of Brassicaceae SVP homologs varied mainly in the differential presence of exons and introns (Fig. 3). For instance, the 1st intron is absent in Aar_AA_scaffold2954_20, Bna_ZS11A04G016220, Bra_BraA04g016520.3C and Tha_Thhalv10000342m, while the 7th and 8th exons are absent from Lal_LA_scaffold3554_2.mRNA1 and Aar_AA_scaffold2954_20, respectively. The 8th and 9th exons appear particularly variable across variants.
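The Pi statistic referred to above is the average number of pairwise nucleotide differences per site. The Python sketch below illustrates that definition and a sliding-window scan. It assumes an alignment of equal-length sequences and simply skips any column containing a gap; DnaSP's own options for gaps and windows may differ, and the function names and toy sequences are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(aligned):
    """Average pairwise differences per site over gap-free columns."""
    n = len(aligned)
    if n < 2:
        raise ValueError("need at least two sequences")
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns without gaps (a simplification made in this sketch).
    sites = [col for col in zip(*aligned) if "-" not in col]
    if not sites:
        return 0.0
    n_pairs = n * (n - 1) / 2
    differences = sum(
        sum(1 for x, y in combinations(col, 2) if x != y) for col in sites
    )
    return differences / (n_pairs * len(sites))


def sliding_pi(aligned, window, step):
    """Pi in consecutive windows, returned as (start index, pi) tuples."""
    length = len(aligned[0])
    return [
        (start, nucleotide_diversity([s[start:start + window] for s in aligned]))
        for start in range(0, length - window + 1, step)
    ]


# Toy alignment (hypothetical, not from the study).
toy = ["ATGCATGC", "ATGCATGA", "ATGAATGA"]
overall = nucleotide_diversity(toy)
windows = sliding_pi(toy, window=4, step=4)
print(overall, windows)
```

Plotting the window values against exon and intron coordinates is one way to see whether diversity dips over exons, as reported for the SVP homologs.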
Analysis of a total of 9 SVP cDNA sequences isolated from B. juncea also led to detection of splice variants. Instances of exon skipping, alternative donor splicing, and partial intron retention were detected in SVP sequences isolated from B. juncea cv. Varuna (Fig. 4); exon skipping was observed in BjuVAR1_SVP and BjuVAR2_SVP, while alternative donor (AD) splicing along with exon skipping was observed in BjuVAR3_SVP and BjuVAR5_SVP. Interestingly, Bju_B02_SVP represented a much smaller variant (219 bp) comprising the 1st exon along with a partially retained intron. These data point to complex interaction patterns between SOC1 promoters and SVP proteins in B. juncea, with splice forms adding to the existing homolog complexity.
Analysis of conserved domains (Supplementary Table 5) for SVP protein homologs derived from Brassicaceae (Supplementary Figs. 2 and 3) revealed differential presence of MIKC domains. Pairwise percentage identity in the DNA-binding MADS domain of SVP homologs ranged from 89.4 to 100% (Supplementary Table 6). Inclusion of SVP sequences encoding truncated or unusually large proteins widened the range to 73.6–100%. Based on the differences in MIKC domains, 3 size classes of SVP proteins were identified. These included full-length forms with intact or partial MIKC domains, and other forms with skipped or duplicated I, K and C domains. The SVP proteins encoded by Aar_AA_scaffold2954_20, Bna_ZS11A04G016220, Bra_BraA04g016520.3C and Tha_Thhalv10000342m lack the MADS domain, which correlates with a skipped first exon. Among the 6 SVP copies from B. juncea, Bju_B02_SVP (ChrB02: 6832951..6835995) lacks a K-domain. On the other hand, Bju_BjuB024255 contains 2 K-domains. In summary, B. juncea harbours highly diverse variants of SVP proteins with potentially altered binding potential with SOC1 promoter homeologs.
3.3. SVP proteins reveal substantial sequence-based structural variation
Models were generated for a total of 15 SVP proteins, including 9 natural and 6 hypothetical proteins. First, refined models were generated for 5 SVP homologs from B. juncea var. tumida (Bju_A04_SVP, Bju_A09_SVP, Bju_B01_SVP, Bju_B02_SVP and Bju_BjuB024255) using ab initio and threading-based approaches. These were then employed as templates to generate models for the B. juncea cv. Varuna derived SVP proteins (BjuVAR1_SVP, BjuVAR2_SVP, BjuVAR3_SVP and BjuVAR4_SVP) and the hypothetical SVP proteins (Hyp1_SVP, Hyp2_SVP, Hyp3_SVP, Hyp4_SVP, Hyp5_SVP and Hyp6_SVP) based on percent sequence identity. The Hyp1_SVP sequence was generated from the natural protein Bju_B02_SVP, which harbours a MADS domain of 59 amino acids. Hyp1_SVP (60 amino acids) possesses a MADS domain identical to that of Bju_B02_SVP, except for the deleted RLGLGT (position 61 to 66). Hyp2_SVP (93 amino acids) represents an intact M domain (72 amino acids) but is devoid of K- and C-domains. Similarly, Hyp3_SVP (77 amino acids) represents an intact M domain (72 amino acids) but lacks the I-, K- and C-domains. Hyp4_SVP (96 amino acids) represents an intact M domain (72 amino acids) and I-domain with only 3 bases from the K-domain, and is thus devoid of K- and C-domains. Hyp5_SVP (171 amino acids) possesses intact M-, I- and K-domains, but lacks a C-domain. Finally, Hyp6_SVP represents an intact M domain (72 amino acids) but is devoid of an I-domain. BjuVAR1_SVP, BjuVAR2_SVP and BjuVAR3_SVP exemplify splice variants from B. juncea cv. Varuna.
Protein models generated for 5 SVP sequences retrieved from BRAD are shown in Fig. 5. The sequence details are provided in Supplementary Table 3 (highlighted in yellow). The protein models were iteratively refined, energy minimised and cross-validated for stable secondary structure conformations and stereochemical parameters (Supplementary Table 7). The refined models of these SVP proteins depicted characteristic structural features including a loop region, followed by an α-helix (α1), 2 β-strands (β1 and β2) and α-helices (α2, α3 and α4), as shown in the magnified illustration of the model generated for Bju_A09_SVP (Fig. 5a). These prominent structural features are clearly visible in Bju_A04_SVP, Bju_A09_SVP, Bju_B01_SVP and Bju_BjuB024255 (Fig. 5b), with minor variations. Furthermore, the significantly distinct model of Bju_B02_SVP may be attributed to absence of α3 and α4 helices, suggesting a plausible role in DNA binding. The detailed structural features of the MADS domain of the 5 proteins are depicted in Fig. 5c. As expected, the SVP proteins from B. juncea var. tumida (BRAD) represent structural features characteristic of the MEF-2A class of the MADS-box superfamily.
Using refined models of SVP proteins (Fig. 5) as templates, homology based modeling was undertaken to generate models for 4 B. juncea cv. Varuna specific SVP proteins isolated in the present study (BjuVAR1_SVP, BjuVAR2_SVP, BjuVAR3_SVP and BjuVAR4_SVP) (Supplementary Fig. 4). Identification of highest pair-wise identity values between B. juncea var. tumida in-silico translated SVP sequences and B. juncea cv. Varuna sequences formed the basis of template selection (Supplementary Table 4b). The refined model of Bju_B01_SVP was used as template for generating the models of BjuVAR3_SVP, while Bju_A09_SVP was employed as template for modelling BjuVAR1_SVP, BjuVAR2_SVP and BjuVAR4_SVP. The models generated for B. juncea cv. Varuna SVP proteins are represented in Supplementary Fig. 4a and the quality parameters of all generated protein models are provided in Supplementary Table 7. The prominent structural features represented in SVP proteins derived from B. juncea var. tumida (Fig. 5) were also apparent in BjuVAR3_SVP, BjuVAR4_SVP and BjuVAR5_SVP. Absence of α3 helix from SVP isoform BjuVAR1_SVP corresponds to a missing K-domain. The magnified view of secondary structural aspects of MADS domain of proteins is provided in Supplementary Fig. 4b.
To delineate the influence of individual domains (M-, I-, K-, and C-) on overall DNA–protein interactions, 6 hypothetical proteins (Hyp1_SVP, Hyp2_SVP, Hyp3_SVP, Hyp4_SVP, Hyp5_SVP, and Hyp6_SVP), representing differential presence of M-, I-, K-, and C-domains, were additionally modelled using SWISS-MODEL employing Bju_A09_SVP as a template. The sequences of hypothetical proteins along with their sizes and proposed nomenclature are provided in Supplementary Table 8, and the models generated along with the assessment scores are given in Supplementary Fig. 5a and Supplementary Table 7, respectively. The MADS domains of these modelled hypothetical SVP proteins are depicted in Supplementary Fig. 5b. The criteria employed to design the hypothetical proteins are shown in Supplementary Fig. 5c. The newly modelled proteins were markedly similar to their respective templates. Models of BjuVAR1_SVP, BjuVAR2_SVP and BjuVAR4_SVP and the 6 hypothetical SVP proteins were nearly identical to the Bju_A09_SVP template. Similarly, the model of BjuVAR3_SVP was similar to the template Bju_B01_SVP. Overall, a considerable degree of sequence-dependent natural structural variation was observed for B. juncea SVP proteins.
3.4. Nucleic acid models of homeologs of B. juncea SOC1 promoter fragments are structurally diverse
DNA models were generated using the 3D-DART server for B. juncea SOC1 promoter homeologs harbouring the SVP binding site. The fragment considered for BjupSOC1_AALF and BjupSOC1_AAMF1 was 30 bp long and harboured a 10 bp SVP binding site. For BjupSOC1_AAMF2, a 31 bp promoter fragment containing an 11 bp SVP binding site was considered for generating models. The models, visualised using Chimera, revealed major (Ma) and minor (Mi) grooves on the B-form double-stranded DNA models (Fig. 6a). A 2D linear representation of BjupSOC1_AALF, BjupSOC1_AAMF1 and BjupSOC1_AAMF2 generated using DNAproDB is provided in Fig. 6b, to simultaneously depict the sequence and arrangement of nucleotides. The nucleotide positions are marked with arrows in the anti-parallel strands of DNA. For instance, the nucleotides are numbered as 1–30 and 31–60 on the plus and minus strand, respectively, of the BjupSOC1_AALF promoter fragment. Since the TFBS for SVP are present on the minus strands of promoter fragments, the nucleotides at position 11–20 and 11–21 correspond to SVP binding motifs in BjupSOC1_AALF/BjupSOC1_AAMF1 and BjupSOC1_AAMF2, respectively. The TFBS in each promoter homeolog is highlighted with a blue background (Fig. 6b) and the corresponding regions are coloured in cyan in the 3D illustration (Fig. 6a). Analysis of 3D DNA models revealed considerable natural variation in several molecular parameters. These mainly included nucleotide position-wise information on H-bond length (Å) among atom pairs, local base-pair parameters viz. shift, slide, rise, tilt, roll and twist, and major and minor groove width, among others. Supplementary Tables 9–11 provide detailed information on structural variability observed across DNA models of BjupSOC1_AALF, BjupSOC1_AAMF1 and BjupSOC1_AAMF2. The variation across base-pair and other atomic pair parameters was found interspersed across nucleotide positions 1 to 30 in case of BjupSOC1_AALF/BjupSOC1_AAMF1 and 1 to 31 in case of BjupSOC1_AAMF2. Detailed investigation revealed considerable variability at nucleotide positions 15, 16, 17 and 18.
3.5. Comparable binding affinity of docked complexes despite natural structural variation in SVP proteins and SOC1 promoter homeologs
To examine whether the structural variation exhibited in models of SVP proteins and SOC1 promoter fragments harbouring variable SVP binding sites influenced binding affinity between regulatory pairs, docking studies were undertaken. In total, 45 bimolecular interactions among 3 promoters (BjupSOC1_AALF, BjupSOC1_AAMF1 and BjupSOC1_AAMF2) and 15 SVP proteins were analysed. These included 9 natural proteins (Bju_A04_SVP, Bju_A09_SVP, Bju_B01_SVP, Bju_B02_SVP, Bju_BjuB024255, BjuVAR1_SVP, BjuVAR2_SVP, BjuVAR3_SVP and BjuVAR4_SVP) and 6 hypothetical sequences (Hyp1_SVP, Hyp2_SVP, Hyp3_SVP, Hyp4_SVP, Hyp5_SVP and Hyp6_SVP). DNA and protein binding residues from SVP proteins and SOC1 promoter homeologs, respectively, were specified as active residues while performing docking. Despite the identical sequence of the MADS domain, the DNA binding residues predicted by I-TASSER for Bju_A09_SVP were 134G, 138V, 139I, 141T, 142K, 143S and 145K. In contrast, 2A, 3R, 4E, 6I, 20T, 23K, 24R, 27G, 30K and 31K were predicted for Bju_A04_SVP, Bju_B01_SVP, Bju_B02_SVP and Bju_BjuB024255. To resolve this, COACH analysis was performed exclusively for Bju_A09_SVP, which led to the identification of 2A, 3R, 4E, 6I, 20T, 23K, 24R, 27G, 30K and 31K as DNA binding residues. Therefore, docking studies for Bju_A09_SVP were repeated using both sets of predicted DNA binding residues, increasing the total number of B. juncea SVP: pSOC1 bimolecular interactions to 48.
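The interaction counts quoted in this section follow from simple enumeration. The short Python sketch below reproduces the arithmetic using the protein and promoter names given in the text; the label for the second Bju_A09_SVP residue set is an illustrative assumption, not a name used in the study.

```python
from itertools import product

promoters = ["BjupSOC1_AALF", "BjupSOC1_AAMF1", "BjupSOC1_AAMF2"]
natural = [
    "Bju_A04_SVP", "Bju_A09_SVP", "Bju_B01_SVP", "Bju_B02_SVP",
    "Bju_BjuB024255", "BjuVAR1_SVP", "BjuVAR2_SVP", "BjuVAR3_SVP",
    "BjuVAR4_SVP",
]
hypothetical = [f"Hyp{i}_SVP" for i in range(1, 7)]

# Every protein docked against every promoter fragment.
base_pairs = list(product(natural + hypothetical, promoters))

# Bju_A09_SVP docked a second time with the alternative set of predicted
# DNA binding residues (label below is hypothetical).
repeat_pairs = [("Bju_A09_SVP [second residue set]", p) for p in promoters]

natural_pairs = [pair for pair in base_pairs if pair[0] in natural]

print(len(natural_pairs), len(base_pairs), len(base_pairs) + len(repeat_pairs))
```

The three printed numbers correspond to the 27 natural complexes, the 45 initial interactions and the final total of 48.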
The HADDOCK outputs are depicted as multiple clusters of similar models representing diverse structural conformations of the docked complexes. The most stable representative model for each B. juncea SVP: pSOC1 docked complex is provided (Supplementary Fig. 6–8). As expected, all SVP proteins were found to interact with SOC1 promoter homeologs, with either strand of the double-stranded promoter region, via the DNA-binding MADS domain. Moreover, the promoter region flanking the SVP TFBS (shown as green strands; the TFBS itself is shown as blue strands) was also observed to interact with SVP proteins, indicating its significance in DNA-protein interaction.
To analyse the binding affinity of the 48 SVP: pSOC1 docked complexes, Gibbs Free Energy of Dissociation (ΔG) values were calculated using PreDBA. The ΔG values (kcal/mol), representative of the binding affinities of the respective SVP: pSOC1 complexes, are provided as a heat map in Fig. 7. For the 27 docked complexes generated from natural promoters and proteins, significant conservation of pairwise binding affinities was uncovered despite the structural diversity in individual promoters and SVP proteins. Specifically, the 3 promoter homeologs viz. BjupSOC1_AALF, BjupSOC1_AAMF1 and BjupSOC1_AAMF2 exhibited comparable binding affinity to all but one of the natural SVP proteins (Bju_A04_SVP, Bju_A09_SVP, Bju_B01_SVP, Bju_BjuB024255, BjuVAR1_SVP, BjuVAR2_SVP, BjuVAR3_SVP and BjuVAR4_SVP). Unexpectedly, the ΔG values of the 3 promoters with the Bju_BjuB024255 SVP protein, which harbours an additional K-domain, also indicated similar binding affinities. Bju_B02_SVP was an exception, which depicted a considerable increase in the binding affinity for BjupSOC1_AAMF1 (-8.93 kcal/mol) relative to BjupSOC1_AALF (-11.99 kcal/mol) and BjupSOC1_AAMF2 (-11.82 kcal/mol).
Exceptionally high binding affinities were reported for the Bju_B02_SVP protein, a truncated protein encoding only the MADS domain, with BjupSOC1_AALF (-11.99 kcal/mol), BjupSOC1_AAMF1 (-8.93 kcal/mol) and BjupSOC1_AAMF2 (-11.82 kcal/mol). This pointed to a stabilising effect conferred by the I-, K- and C-domains. Analysis of binding strengths of hypothetical SVP proteins with differential presence of M-, I-, K- and C-domains revealed interesting patterns. Hyp1_SVP, Hyp2_SVP, Hyp3_SVP and Hyp4_SVP, lacking both K- and C-domains, showed significantly higher binding affinities to BjupSOC1_AALF (-11.99 kcal/mol), BjupSOC1_AAMF1 (-8.93 kcal/mol) and BjupSOC1_AAMF2 (-11.82 kcal/mol) relative to SVP proteins which retained either or both of these domains (Fig. 7). This observation corroborated the energy patterns observed with the truncated natural protein Bju_B02_SVP, which lacks both K- and C-domains. These data highlight the impact of the joint absence of K- and C-domains on the binding potential of SVP. Interestingly, the I-domain was found to have an insignificant effect on the binding affinities of the SVP proteins to the SOC1 promoter homeologs. Although Bju_B02_SVP, Hyp1_SVP and Hyp3_SVP lacked an I-domain, their binding strengths were comparable to those of Hyp2_SVP and Hyp4_SVP, which possessed the I-domain. Broadly, the data suggest that despite natural structural variation in SVP proteins and SOC1 promoters, the binding potential has remained preserved.
3.6. Unique and shared binding patterns underpin molecular interactions among SOC1 promoters and SVP proteins
To fine-map bimolecular interaction patterns, types of molecular contacts and contact residues were identified on 27 SVP: pSOC1 docked complexes between 9 SVP proteins and 3 SOC1 promoter homeologs from B. juncea. The molecular contacts were screened and observed for variability with respect to non-covalent interactions such as hydrogen bonds (2.2–3.6 Å), π-π stacking (3.6–5 Å), Van der Waals (0.3–0.6 Å) and other hydrophobic interactions. To identify π-π stacking, the cut-off distance was extended to 5 Å. A list of crucial non-covalent interactions is provided in Supplementary Table 12. The amino acid residues involved in these bonds are marked as ‘interacting amino acid residues’. As the Bju_B02_SVP protein was predicted to have the strongest binding affinity with the 3 B. juncea SOC1 promoter homeologs, the molecular interactions in Bju_B02_SVP: BjupSOC1_AALF, Bju_B02_SVP: BjupSOC1_AAMF1 and Bju_B02_SVP: BjupSOC1_AAMF2 were largely compared with those between a representative protein, Bju_A04_SVP, and the 3 B. juncea SOC1 promoter homeologs. The stabilising bonds formed between each SOC1 promoter homeolog and Bju_B02_SVP and Bju_A04_SVP are shown in Fig. 8a-c and 8d-e, respectively.
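The distance windows quoted above can be expressed as a simple lookup. The following Python sketch only illustrates that screening step: it uses the ranges stated in the text, treats both bounds as inclusive (an assumption), and ignores atom types and geometry, which dedicated contact-analysis tools would also take into account. The function and variable names are hypothetical.

```python
# Distance windows in angstroms, as quoted in the text.
CONTACT_WINDOWS = {
    "hydrogen bond": (2.2, 3.6),
    "pi-pi stacking": (3.6, 5.0),
    "van der Waals": (0.3, 0.6),
}


def contact_types(distance):
    """Return every contact category whose window contains the distance."""
    return [
        name
        for name, (lower, upper) in CONTACT_WINDOWS.items()
        if lower <= distance <= upper
    ]


for d in (3.0, 3.6, 4.2, 1.0):
    print(d, contact_types(d))
```

Because the hydrogen bond and π-π stacking windows share the 3.6 Å boundary, a contact at exactly that distance matches both categories here and would need additional criteria to be resolved.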
To delineate a plausible conservation pattern of the residues involved in establishing contacts with nucleotides from the 3 B. juncea SOC1 promoter homeologs, hydrogen bonds, hydrophobic interactions, π-π stacking and Van der Waals forces were compared (Supplementary Table 12). SVP protein-specific amino acid residues mediating hydrogen bond interactions were identified for all B. juncea SVP proteins. For instance, the residues GLN7, LYS10, ARG24 and LYS31 from Bju_B02_SVP, and ARG3, GLU4, LYS5 and GLN7 from Bju_A04_SVP, were involved in formation of hydrogen bonds with the 3 B. juncea SOC1 promoter homeologs.
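Residues that contact all three promoter homeologs can be found by intersecting the per-complex contact lists. The Python sketch below shows the idea for Bju_B02_SVP hydrogen bonds. The four shared residues are those named in the text, but the per-complex lists themselves and the extra entries (XXX101, XXX102) are made-up placeholders for illustration, not values from Supplementary Table 12.

```python
def conserved_residues(contacts_by_complex):
    """Residues present in the contact list of every complex."""
    residue_sets = [set(residues) for residues in contacts_by_complex.values()]
    if not residue_sets:
        return set()
    return set.intersection(*residue_sets)


# Placeholder contact lists (hypothetical; see the note above).
b02_hbonds = {
    "Bju_B02_SVP: BjupSOC1_AALF": ["GLN7", "LYS10", "ARG24", "LYS31", "XXX101"],
    "Bju_B02_SVP: BjupSOC1_AAMF1": ["GLN7", "LYS10", "ARG24", "LYS31", "XXX102"],
    "Bju_B02_SVP: BjupSOC1_AAMF2": ["GLN7", "LYS10", "ARG24", "LYS31"],
}

shared = conserved_residues(b02_hbonds)
print(sorted(shared))
```

The same helper applies to any contact category (hydrophobic, π-π stacking) once the per-complex residue lists are available.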
Hydrophobic interactions were also detected in all docked complexes between SVP proteins and SOC1 promoter homeologs from B. juncea, except for Bju_BjuB024255: BjupSOC1_AAMF1 and Bju_BjuB024255: BjupSOC1_AAMF2. However, conservation of SVP-specific amino acid residues forming hydrophobic interactions was observed only for Bju_B02_SVP, Bju_A04_SVP and BjuVAR2_SVP. Amino acid residues LYS5 and ARG3 from Bju_A04_SVP and BjuVAR2_SVP, respectively, were found to be involved in forming hydrophobic interactions with all 3 B. juncea SOC1 promoter homeologs. Likewise, the Bju_B02_SVP-specific residues GLN7 and ARG9 are involved in hydrophobic interactions with all 3 B. juncea SOC1 promoter homeologs. Furthermore, π-π stacking was also observed, albeit not for all 27 docked complexes. Conservation of SVP protein-specific π-π stacking was found only for Bju_A04_SVP, Bju_A09_SVP, Bju_BjuB024255 and BjuVAR1_SVP proteins. Interestingly, PHE was mainly responsible for π-π stacking, except for Bju_A09_SVP, where TYR was the key residue. PHE at the 21st and 48th positions in Bju_A04_SVP, at the 21st and 29th positions in Bju_BjuB024255 and at the 29th position in BjuVAR1_SVP was found to be involved in π-π stacking with all 3 B. juncea SOC1 promoter homeologs. In case of Bju_A09_SVP, TYR at position 152 was identified as the conserved residue involved in π-π stacking. Overall, the interacting amino acid residues from AA sub-genome specific SVP homeologs were found to be involved in all 4 categories of non-covalent interactions. However, a few BB sub-genome specific SVP homeologs displayed a distinct pattern, as these did not establish π-π stacking.
The set of interacting nucleotides in each docked complex was also compared to examine conservation in pattern. Notably, base pairs at the 15th, 16th, 17th and 18th positions of the SVP-specific TFBS within the double-stranded B. juncea SOC1 promoter homeologs were found to be involved in interactions in all 48 docked complexes (Supplementary Fig. 9a and b). It was interesting to note that nucleotides at the 15th, 17th and 18th positions were not conserved across the 3 promoter homeologs. Nevertheless, these were still predicted to interact with the corresponding proteins. A representation of the SVP-specific binding site along with 10 bp flanking sequence on BjupSOC1_AALF, BjupSOC1_AAMF1 and BjupSOC1_AAMF2 highlights nucleotide variation at the 15th, 16th, 17th and 18th positions of the unaligned SVP TFBSs (Supplementary Fig. 9c). The figure also depicts the frequency of occurrence of nucleotides at specific positions in the TFBSs corresponding to SVP from the 3 B. juncea SOC1 promoter homeologs, as generated using the WebLogo server (https://weblogo.berkeley.edu/). Evidently, the nucleotides at positions 17 and 18 were found to tolerate naturally occurring transversions, since no steric hindrance-based impact was apparent in the interaction patterns. Since nucleotide positions 15, 16, 17 and 18 on promoters BjupSOC1_AALF, BjupSOC1_AAMF1 and BjupSOC1_AAMF2 refer to positions on double-stranded DNA, the regions of interaction imply complementary nucleotides.
The conserved pattern of interacting residues was investigated by superposition of the docked complexes of a specific protein with the three individual SOC1 promoter homeologs. The superposed models of the docked complexes with the highest docking affinity, viz. Bju_B02_SVP: BjupSOC1_AALF, Bju_B02_SVP: BjupSOC1_AAMF1 and Bju_B02_SVP: BjupSOC1_AAMF2, are given in Fig. 9a. This superposition demonstrates the interactions (Table 3) made by the Bju_B02_SVP-specific conserved residues GLN7, LYS10 and ARG24 with all 3 SOC1 promoter homeologs. Likewise, superposed models of Bju_A04_SVP: BjupSOC1_AALF, Bju_A04_SVP: BjupSOC1_AAMF1 and Bju_A04_SVP: BjupSOC1_AAMF2 are given in Fig. 9b, which demonstrates the interactions made by the Bju_A04_SVP-specific conserved residues ARG3, GLU4 and LYS5 with all 3 SOC1 promoter homeologs. These data highlight the preservation of interacting residues despite the structural differences in B. juncea SOC1 promoter homeologs.
To analyse the overall binding pattern at the B. juncea SVP: pSOC1 interaction interface, 2D illustrations of a representative group comprising 27 docked complexes between 9 natural SVP proteins from B. juncea var. tumida and B. juncea cv. Varuna (Supplementary Table 3) and 3 B. juncea SOC1 promoter homeologs were generated using DNAproDB. The 2D illustrations facilitate clear depiction of the interface features. The DNAproDB representation of the Bju_A04_SVP: BjupSOC1_AALF complex is shown in Fig. 10, while the remaining ones are provided in Supplementary Fig. 10. The DNA-protein interface was manually selected to display DNA moieties viz. major and minor groove, nucleoside, pentose and phosphate moieties. The secondary structure elements (SSE) such as loops, strands and helices are marked. In the bimolecular interaction between Bju_A04_SVP and BjupSOC1_AALF (Fig. 10), the nucleotides corresponding to the SVP TFBS were involved in establishing contact with the SVP protein, as expected. However, the interacting nucleotides at positions 14, 15, 16, 17 and 18 and at positions 41, 42, 43, 48, 49 and 50 are present on opposite DNA strands (Fig. 10a). Furthermore, the nucleotides at the 51st and 52nd positions, flanking the TFBS, were also involved in interaction with Bju_A04_SVP. Additionally, most amino acid residues were observed to bind in the major groove of DNA and make contacts with sugar moieties and phosphate groups of the nucleotides (Fig. 10b). A detailed graph depicting explicit interactions of specific amino acid residues with different nucleotides along the length of the major groove of DNA is also provided in Fig. 10c. The secondary structure to which the interacting amino acids belong is denoted by symbols (circle for helix, triangle for strand, square for loop). The relative size of these symbols denotes the number of interactions made by the residues. Similar trends were observed for the other 14 SVP: pSOC1 interactions (Supplementary Fig. 10).
The type and number of interactions in each of the 27 docked complexes are depicted as 2D representations in the DIMPLOT (Fig. 11; Supplementary Fig. 11). Overall, the results from DNAproDB and DIMPLOT analysis corroborate the Chimera visualisation. Clearly, nucleotides from both DNA strands mediate the interactions with the SVP proteins. Involvement of nucleotide residues adjoining the TFBS in DNA-protein interaction is also confirmed.
To validate critical amino acid residues stabilising each of the 48 SVP: pSOC1 complexes in B. juncea, hotspots were identified using the PremPDI server. These hotspots are italicised in Table 3. As expected, most of the predicted hotspots for each docked complex comprised residues mapping to the MADS domain. Analysis of the entire list of hotspots led to the detection of at least one conserved amino acid hotspot for each SVP protein, irrespective of the SOC1 promoter homeolog with which it interacted. BjuVAR3_SVP was found to be an exception. A diagrammatic representation of homeolog-wise conserved hotspots is also provided (Fig. 12).
3.7. Validation of binding affinity preservation by in vivo yeast one-hybrid analyses
A total of 8 interactions between 2 B. juncea SOC1 promoter homeologs and 4 SVP proteins were analysed using yeast one-hybrid assays. The minimum inhibitory concentration of Aureobasidin A was found to be 250 ng/ml for BjupSOC1_AALF_Frag, BjupSOC1_AAMF1_Frag and the yeast strain harbouring empty pAbAi (Fig. 13a).
Yeast one-hybrid assays confirmed the binding of BjuVAR1_SVP, BjuVAR3_SVP, BjuVAR4_SVP and BjuVAR5_SVP proteins to BjupSOC1_AALF_Frag and BjupSOC1_AAMF1_Frag. Spotting cultures of yeast harbouring specific bait and prey plasmids, with OD ranging from 1 to 0.00001, on SD/-Leu/+AbA plates facilitated quantification of the binding strength of SVP proteins with SOC1 promoter fragments. The binding strength of BjuVAR1_SVP (10⁻⁵ dilution) and BjuVAR3_SVP (10⁻⁵ dilution) proteins with BjupSOC1_AALF_Frag was found to be greater than that of BjuVAR4_SVP (10⁻³ dilution) and BjuVAR5_SVP (10⁻² dilution) with BjupSOC1_AALF_Frag (Fig. 13b). However, the binding strength of BjuVAR1_SVP, BjuVAR3_SVP, BjuVAR4_SVP and BjuVAR5_SVP proteins to BjupSOC1_AAMF1_Frag was found to be comparable (Fig. 13c). The negative controls for each interaction are depicted in Fig. 13d. Since the minimum inhibitory concentration for BjupSOC1_AAMF2_Frag could not be achieved, interaction analyses of SVP proteins with BjupSOC1_AAMF2 could not be performed. Overall, the yeast one-hybrid results validated preservation of binding potential as indicated by docking analyses.
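The spotting assay gives a semi-quantitative readout: the further down the dilution series a strain still grows, the stronger the inferred interaction. The Python sketch below ranks the four proteins on BjupSOC1_AALF_Frag using the dilutions reported in the text. It assumes six tenfold steps between OD 1 and 0.00001, which is inferred from the stated range rather than given explicitly, and the variable names are illustrative.

```python
# Tenfold series from OD 1 (10^0) down to OD 0.00001 (10^-5); the number of
# steps is an assumption inferred from the stated OD range.
exponents = list(range(0, 6))
dilution_series = [10.0 ** -k for k in exponents]

# Lowest dilution (as k in 10^-k) reported for each protein with
# BjupSOC1_AALF_Frag (values quoted in the text for Fig. 13b).
lowest_growth_exponent = {
    "BjuVAR1_SVP": 5,
    "BjuVAR3_SVP": 5,
    "BjuVAR4_SVP": 3,
    "BjuVAR5_SVP": 2,
}

# Rank from strongest to weakest apparent binding; ties keep input order.
ranked = sorted(
    lowest_growth_exponent,
    key=lambda name: lowest_growth_exponent[name],
    reverse=True,
)

for name in ranked:
    print(f"{name}: growth down to 10^-{lowest_growth_exponent[name]}")
```

This ordering is only a ranking of growth endpoints; it does not convert the spotting results into an affinity value.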