- Gene duplication profiles in abiotic stress responsive genes in A. thaliana and B. rapa
Gene duplication, or paleopolyploidization, is thought to contribute to the evolution of morphological and ecological diversity. It allows genes to gain novel features through neofunctionalization and subfunctionalization [3,4]. It has been reported that the number of abiotic and biotic stress responsive genes increased after whole genome duplication (WGD) [28]. To reinvestigate the role of gene duplication in the greater abiotic stress tolerance of Brassica rapa relative to A. thaliana, we compared the duplication status of abiotic stress resistance (ASR) genes of A. thaliana with that of their corresponding orthologs in B. rapa. The proportion of duplicated ASR genes is significantly higher in B. rapa (73.47%) than in A. thaliana (52.09%) (P = 0.00001, two-proportion Z-test). Moreover, the average number of paralogs per ASR gene is also significantly higher in B. rapa (3.3) than in A. thaliana (1.8) (P = 0.0001, Mann-Whitney U test). These observations are consistent with the fact that Brassica species underwent five rounds of WGD whereas A. thaliana experienced two [6, 29]. Thus, this result suggests that duplication may play a role in maintaining stress resilience.
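The comparison of duplicate proportions above can be illustrated with a minimal sketch of a pooled two-proportion Z-test. The function name and the gene counts are hypothetical (the counts were chosen only so that the proportions fall close to the reported percentages), and this is not necessarily the exact procedure used in the study.

```python
import math


def two_proportion_z_test(hits_a, n_a, hits_b, n_b):
    """Two-sided pooled two-proportion Z-test. Returns (z, p_value)."""
    if n_a <= 0 or n_b <= 0:
        raise ValueError("sample sizes must be positive")
    p_a = hits_a / n_a
    p_b = hits_b / n_b
    # Pooled proportion under the null hypothesis of equal proportions.
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # All or none of the genes are duplicated in both samples.
        return 0.0, 1.0
    z = (p_a - p_b) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 735 of 1000 duplicated ASR genes in one species
# versus 521 of 1000 in the other (illustrative numbers, not study data).
z, p = two_proportion_z_test(735, 1000, 521, 1000)
print(f"z = {z:.2f}, p = {p:.3g}")
```

With the real counts of duplicated and total ASR genes per species substituted for the placeholder values, the same function would return the test statistic and two-sided P value.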
It has previously been reported that different modes of duplication bias the functional roles of the resulting genes [30]. Therefore, we determined the mode of duplication of each stress gene and classified it into one of four categories: whole genome duplication (WGD), tandem duplication (TD), transposed duplication (TRD) and dispersed duplication (DSD). The number of duplicate gene pairs in each category was determined for each taxon. The percentage of genes that underwent each mode of duplication in A. thaliana and B. rapa is shown in Figure 1. Figure 1 shows that the percentage of stress genes derived from WGD is significantly higher in B. rapa (23.5%) than in A. thaliana (12.9%) (P = 0.0001, Mann-Whitney U test). Thus, WGD could be considered a driving force in the evolution of stress adaptive genes from A. thaliana to B. rapa, as also described by Rizzon [28].
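Several comparisons in this section rely on the Mann-Whitney U test. The following is a minimal sketch of that test using the normal approximation with a tie correction and no continuity correction, so P values may differ slightly from those of statistical packages that apply a continuity correction or an exact method. The function name and the example values are hypothetical.

```python
import math


def mann_whitney_u(xs, ys):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic for xs, p_value).
    """
    n1, n2 = len(xs), len(ys)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    n = n1 + n2
    # Pool the samples, remembering which sample each value came from.
    pooled = sorted([(v, 0) for v in xs] + [(v, 1) for v in ys])
    rank_sum_x = 0.0
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values occupy ranks i+1..j and all receive the mean rank.
        mid_rank = (i + 1 + j) / 2
        in_x = sum(1 for k in range(i, j) if pooled[k][1] == 0)
        rank_sum_x += mid_rank * in_x
        t = j - i
        tie_term += t ** 3 - t
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u <= 0:
        # Every observation is identical, so there is no evidence of a shift.
        return u, 1.0
    z = (u - mean_u) / math.sqrt(var_u)
    return u, math.erfc(abs(z) / math.sqrt(2))


# Hypothetical paralog counts per gene (illustrative, not study data).
species_a = [1, 2, 1, 3, 2, 1, 2]
species_b = [3, 4, 2, 5, 3, 4, 6]
u, p = mann_whitney_u(species_a, species_b)
print(f"U = {u}, p = {p:.3g}")
```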
- Enrichment of intrinsically disordered regions in proteins encoded by duplicated pairs of ASR genes in A. thaliana and B. rapa
Ohnologs, the duplicates derived from whole genome duplication, were reported to contain more intrinsically disordered residues than small scale duplicates [31]. It was also reported in plants that proteins of the dehydrin family, which are almost completely disordered, may be involved in the response to drought and other environmental stresses [32]. We therefore analysed the intrinsic disorder profile of the proteins encoded by ASR genes in the two species. Consistent with the previous study [31], we found that intrinsically disordered residues (IDRs) are significantly more enriched in WGD derived duplicates than in tandem, dispersed and transposed duplicates in both species (Figure 2). In both plants, proteins encoded by duplicated ASR genes also contain IDRs more frequently than those encoded by singleton ASR genes (Figure 3). Accordingly, the average disordered residue content is significantly higher in proteins encoded by duplicated ASR genes than in those encoded by singleton ASR genes in A. thaliana (average = 0.20) and B. rapa (average = 0.27). These results indicate that WGD is associated with an enrichment of IDRs in proteins. If so, the ohnologs of B. rapa would be expected to encode more intrinsically disordered proteins than those of A. thaliana, since the former underwent three extra rounds of WGD. Consistent with this conjecture, 40.5% of ohnolog pairs in B. rapa contain at least one intrinsically disordered region, compared with only 29.9% in A. thaliana (P < 0.0001, two-proportion Z-test). Moreover, the average percentage of intrinsically disordered residues is significantly higher in the ohnologs of B. rapa than in those of A. thaliana (P = 0.001, Mann-Whitney U test; Figure 4). We also found a significant positive correlation between the number of WGD derived paralogs and the content of intrinsically disordered residues in B. rapa (Spearman's ρ = 0.1547, P = 2e-06) but not in A. thaliana. Hence, it would be interesting to investigate how the IDRs acquired through WGD during evolution enhance the stress adaptation potential of B. rapa relative to A. thaliana.
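The rank correlation between paralog number and disordered residue content can be sketched as follows. This is a minimal, assumption-laden illustration: the function names and the input values are hypothetical, tied values receive average ranks, and the sketch returns only the coefficient, not the P value.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, assigning tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j < len(order) and values[order[j]] == values[order[i]]:
            j += 1
        for k in range(i, j):
            ranks[order[k]] = (i + 1 + j) / 2
        i = j
    return ranks


def spearman_rho(xs, ys):
    """Spearman's rho: the Pearson correlation of the rank vectors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one input is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-gene values (illustrative, not study data):
# number of WGD derived paralogs and fraction of disordered residues.
paralog_counts = [1, 2, 3, 4, 5]
disorder_fraction = [0.12, 0.08, 0.31, 0.25, 0.40]
print(round(spearman_rho(paralog_counts, disorder_fraction), 3))
```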
- Functional divergence in duplicated pairs of ASR genes originated through whole genome duplication in A. thaliana and B. rapa
Genome duplication (polyploidy) is a common phenomenon in plant evolution. It was proposed earlier that functional diversification of the surviving duplicated genes is one of the prime attributes of the long-term evolution of polyploids [33]. It is widely accepted that enrichment of intrinsically disordered residues can give proteins diverse functional potential because of their high flexibility, and IDRs have also been reported to create functional divergence between ohnologs after WGD [24]. We measured functional divergence between the WGD derived duplicates of the two species on the basis of Gene Ontology terms. Paralogs of 67.30% of genes in B. rapa have functionally diverged after whole genome duplication, compared with only 50% in A. thaliana. This difference in proportions is significant at the 99% confidence level. The average functional divergence between paralogous pairs is also significantly higher in B. rapa than in A. thaliana (B. rapa = 0.90; A. thaliana = 0.88; Mann-Whitney U test, P = 0.009). Next, we checked the correlation between IDR content and functional divergence between duplicated pairs in our datasets and found a significant positive correlation in B. rapa (Spearman's ρ = 0.1458, P = 6.539e-05) but not in A. thaliana. These results suggest that the increased disordered residue content in duplicated pairs of B. rapa may play an important role in creating new functions in ohnologs, which in turn may confer greater stress resistance potential on B. rapa compared to A. thaliana.
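The text does not specify here how GO-based functional divergence between a paralogous pair was scored. As one possible illustration only, the sketch below assumes a Jaccard distance between the GO term sets of the two paralogs (0 = identical annotation, 1 = no shared terms). The function name, the choice of measure, and the GO identifiers are all assumptions.

```python
def go_divergence(terms_a, terms_b):
    """Jaccard distance between two GO term collections.

    Returns None when neither gene has any annotation, because the
    divergence of two unannotated genes cannot be assessed.
    """
    set_a, set_b = set(terms_a), set(terms_b)
    union = set_a | set_b
    if not union:
        return None
    shared = set_a & set_b
    return 1.0 - len(shared) / len(union)


# Hypothetical annotation of one paralogous pair (illustrative identifiers).
paralog_1 = {"GO:0006950", "GO:0009408"}
paralog_2 = {"GO:0006950"}
print(go_divergence(paralog_1, paralog_2))  # 1 shared of 2 terms in total
```

Under this assumed measure, a pair could be called functionally diverged when its score exceeds a chosen cutoff, and the per-pair scores of the two species could then be compared with a rank-based test.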
To gain more detailed insight into the functions gained after WGD, we performed functional enrichment analysis of ASR genes and their corresponding paralogs in B. rapa and A. thaliana. Three additional stress related functions (response to heat, response to abiotic stimulus, response to temperature stimulus) are enriched in the paralogs of B. rapa relative to their ancestral genes (Figure 5), whereas no such significant stress related functional enrichment is observed in the paralogs of A. thaliana (Figure 5). These results indicate that IDRs in duplicated proteins, because of their flexible nature, can create functional divergence by changing protein interaction patterns, which in turn could enhance the stress adaptation potential of B. rapa.
- Enrichment of protein domains encoded by duplicated pairs of ASR genes in A. thaliana and B. rapa
Protein domains entail functional modularity, and it is likely that functional divergence between paralogous pairs exploits this modularity. Because experimental evidence indicates that a few domains are associated with stress tolerance in plants [34], we assessed the protein domains in duplicated pairs by scanning their amino acid sequences. Duplicated pairs of ASR genes in B. rapa encode a significantly higher number of domains than those in A. thaliana (average number of domains: B. rapa = 3.9, A. thaliana = 1.20; Mann-Whitney U test, P = 0.024). Interestingly, domain enrichment analysis by Fisher's exact test showed that, during divergence from A. thaliana, the ohnolog pairs of B. rapa lost some domains and gained several new ones (Figure 6). Next, we analysed domain ontology with dcGO to compare the functional enrichment of the unique domains present in A. thaliana and B. rapa duplicated pairs. Figure 7 shows that stress related functions are more enriched in the domains present solely in B. rapa than in the domains of A. thaliana. The B. rapa domains annotated with stress related functions are listed in Table 1. Previous research showed that the distribution of disordered regions in the GATA-type binding domain is responsible for functional specificity [35]. Therefore, we also analysed the domain content of the disordered regions of the proteins. Our results are consistent with the previous findings: the percentage of stress specific domains located in disordered regions (9/14 = 64.3%) is significantly higher than that of domains located in ordered regions (5/14 = 35.7%) (P = 0.001, two-proportion Z-test; Table 1). The enrichment of stress related domains in the disordered regions of duplicated pairs provides an important cue that genome triplication in B. rapa helped to encode disordered residues, which might contribute to its stress resistance.
Table 1. Domains in B. rapa annotated with stress related functions. Annotated domains with disorder are marked in bold.

| GO term | Annotated domain |
|---|---|
| Cellular response to environmental stimulus | PF00005, PF00010, PF00013, PF00027 |
| Cellular response to light stimulus | PF00010, PF00013, PF00027 |
| Defense response | PF00031, PF00240, PF00314 |
| Positive regulation of response to stimulus | PF00004, PF00005, PF00010, PF00013, PF00023, PF00026, PF00240 |
| Regulation of DNA-templated transcription in response to stress | PF00004, PF00012, PF00023, PF00240 |
| Regulation of response to external stimulus | PF00010, PF00013, PF00031 |
| Regulation of response to stimulus | PF00004, PF00005, PF00010, PF00012, PF00013, PF00023, PF00026, PF00027, PF00031, PF00240 |
| Regulation of response to stress | PF00004, PF00010, PF00012, PF00013, PF00026, PF00031, PF00240 |
| Regulation of transcription from RNA polymerase II promoter in response to stress | PF00004, PF00012, PF00023, PF00240 |
| Response to abiotic stimulus | PF00004, PF00005, PF00006, PF00010, PF00011, PF00012, PF00013, PF00023, PF00027, PF00155 |
| Response to biotic stimulus | PF00005, PF00012, PF00026, PF00240, PF00314 |
| Response to external stimulus | PF00005, PF00010, PF00012, PF00026, PF00027, PF00240, PF00314 |
| Response to heat | PF00004, PF00011, PF00012, PF00023 |
| Response to hypoxia | PF00004, PF00010, PF00155, PF00240 |
| Response to stress | PF00004, PF00005, PF00010, PF00011, PF00012, PF00023, PF00031, PF00155, PF00240, PF00314 |
| Response to temperature stimulus | PF00004, PF00010, PF00011, PF00012, PF00013, PF00023 |
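The domain enrichment analysis by Fisher's exact test described in this subsection can be sketched for a single domain as follows. This is an illustrative one-sided (enrichment) version built on the hypergeometric distribution; the function name, the layout of the 2x2 table, and the counts are assumptions, and the study may have used a two-sided test or a multiple-testing correction.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for enrichment in a 2x2 table.

    Assumed layout:
                        with domain   without domain
        species 1 pairs      a              b
        species 2 pairs      c              d

    Returns P(X >= a) under the hypergeometric null with fixed margins.
    """
    if min(a, b, c, d) < 0:
        raise ValueError("counts must be non-negative")
    row1 = a + b          # pairs in species 1
    col1 = a + c          # pairs carrying the domain
    total = a + b + c + d
    denom = comb(total, col1)
    upper = min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as observed.
    numer = sum(
        comb(row1, k) * comb(total - row1, col1 - k)
        for k in range(a, upper + 1)
    )
    return numer / denom


# Hypothetical counts for one domain (illustrative, not study data):
# 30 of 200 pairs carry it in one species versus 8 of 180 in the other.
print(f"p = {fisher_exact_greater(30, 170, 8, 172):.3g}")
```

Running the function once per domain, in each direction, would separate domains over-represented in one species from those over-represented in the other.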
- Statistical analysis of IDRs and stress resistance potentiality of B. rapa
Here, we have observed that whole genome duplication triggers the enrichment of intrinsically disordered residues in proteins. These IDRs mediate functional divergence in the ohnologs and provide sites for the enrichment of new domains. The functional divergence in the duplicated pairs and the new domains that originated during evolution from A. thaliana to B. rapa contribute substantially to the increased stress tolerance potential of B. rapa. We therefore explored whether these four factors (number of paralogs, functional divergence, IDR content, domain content) act in a mutually dependent way in the ohnologs of B. rapa or independently. For this, we categorized stress related genes into two classes: mono stress (associated with one stress condition) and poly stress (associated with more than one stress condition). We then performed linear regression analysis of stress class against the four stress regulatory factors. We found that only IDRs could independently control the stress resistance potential of Brassica rapa (Spearman's ρ = 0.185, P = 5.943e-03).
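The mono stress / poly stress categorization described above can be sketched as a simple labelling step. The function name, the gene identifiers, and the stress labels are hypothetical, and the 0/1 encoding at the end is only one possible way to prepare the classes for a downstream regression or rank correlation.

```python
def classify_stress_genes(gene_to_stresses):
    """Label each gene "mono" (one stress condition) or "poly" (more than one).

    Genes without any associated stress condition are skipped.
    """
    classes = {}
    for gene, stresses in gene_to_stresses.items():
        n_conditions = len(set(stresses))
        if n_conditions == 0:
            continue
        classes[gene] = "mono" if n_conditions == 1 else "poly"
    return classes


# Hypothetical annotation (illustrative gene names and conditions).
annotation = {
    "gene_A": ["heat"],
    "gene_B": ["heat", "drought", "salt"],
    "gene_C": ["cold", "cold"],  # duplicate labels count once
    "gene_D": [],
}
classes = classify_stress_genes(annotation)
print(classes)

# Numeric encoding for use as a response variable: mono = 0, poly = 1.
encoded = {gene: int(label == "poly") for gene, label in classes.items()}
print(encoded)
```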