Chromosome location of WRKY genes
To further analyze the distribution of WRKY genes, the chromosome location was mapped (Fig. 2). In G. arboretum and G. raimondii, 109 and 119 WRKYs were located in At or Dt sub-genome, respectively. The Ga14G1656, Ga14G1714, and Ga14G1560 were positioned in contigs (Fig. 2A, B). The number of WRKY genes in A07 chromosome was the highest (13), while the A03 chromosome had the lowest number (4) in G. arboreum (Fig. 2A). In G. raimondii, the D01 and D09 chromosomes contained the highest number of GrWRKYs (13), while the D02 and D05 chromosomes had the lowest number of GrWRKYs (4) (Fig. 2B). In G. hirsutum, the chrA05 had 16 WRKY members, which was the highest number among all 26 chromosomes, while the chrA03 contained 3 GhWRKYs. The Gh_Contig00579G000600, Gh_Contig00383G000300, and Gh_Contig01109G001300 were located in contigs that were not observed in chromosomes (Fig. 2C). In G. barbadense, 4, 5, 3, 3, 17, 10, 12, 11, 6, 7, 14, 9, and 5 WRKYs were mapped on chrA01-chrA13, and 3, 4, 5, 11, 12, 10, 13, 11, 5, 7, 13, 12, and 5 WRKY genes were located on chrD01 to chrD13 (Fig. 2D). Therefore, the members of the WRKY family were unevenly distributed across the cotton chromosomes.
Conserved motifs and domains, cis-acting elements, and gene structure of WRKYs
To further clarify the detailed characteristics of WRKYs, gene structure, conserved motifs and domains, and cis-acting element were analyzed. 10 conserved motifs of WRKY proteins were identified in 4 cotton strains. We noted that most WRKY proteins contained more than two motifs, except for 5 GaWRKY, 15 GrWRKY, and 30 GbWRKY proteins. Moreover, motifs 1 and 2 were present in each WRKY member (Supplementary Table 3). Subsequently, we observed that WRKY proteins contain at least one WRKYGQK domain, and 107 WRKY proteins belonging to Group I have two WRKYGQK domains (Fig. 3). Additionally, the basic region-leucine zipper (bZIP) domains (PF00170) were found in 20 WRKY proteins, and the plant_zn_clust (PF10533) structures were located at the N-terminus of 80 WRKY members (Fig. 3). The results suggest that the WRKY proteins are evolutionarily stable and diverse in cotton.
To further investigate the biological function of WRKYs, we identified cis-acting elements in the 5'-upstream regions of 2000 bp (Supplementary Table 4). 12 different functions of cis-elements were identified, and these cis-acting elements related to stress response were found abundantly in the promoter of WRKYs (Fig. 3). The cis-acting elements can be divided into 3 categories: hormones responsive sites (auxin responsive element, gibberellin responsive element and MeJA responsive element), transcription factor binding sites (MYB binding site, MYBHv1 binding site, and WRKY binding site), and growth and development sites (MYB binding site involved in drought-inducibility, light responsiveness, flavonoid biosynthetic, low-temperature responsive, and defense and stress responsive) (Fig. 3).
Duplication and collinearity of GaWRKYs, GrWRKYs, GhWRKYs, and GbWRKYs
To gain a better understanding of the expansion pattern of WRKYs, the duplication circos was plotted. In the diploid genomes of G. raimondii and G. arboretum, 88 and 102 WRKYs, respectively, occurred whole genome duplication (WGD) or segmental duplication events (Fig. 4A, B). The Ga05G0631, Ga08G2219, Gorai.001G037800, Gorai.004G219400, and Gorai.009G062400 were tandem duplications and were distributed on chromosomes A05, A08, D01, D04, and D09, respectively (Fig. 4A, B). Furthermore, 17 and 14 WRKY genes were dispersed in At or Dt sub-genome (Supplementary Table 5). In G. hirsutum, 97.66% of WRKYs occurred through WGD or segmental duplication events. The 3 WRKY genes (Gh_A08G214800, Gh_D05G062100, Gh_D08G210400) were tandem duplications, and 2 WRKYs (Gh_D04G011700 and Gh_D07G055500) were dispersed. These 5 GhWRKYs were located on chromosomes A08, D04, D05, D07, and D08, respectively (Fig. 4C) (Supplementary Table 5). A total of 213 WRKYs had undergone whole-genome duplication (WGD) or segmental duplication events in G. barbadense. The Gbar_A08G020300 (ChrA08), Gbar_D08G021260 (ChrD08), Gbar_A03G013230 (ChrA03), and Gbar_A11G020420 (ChrA11) appeared as tandem duplications or dispersion (Fig. 4D) (Supplementary Table 5).
It is widely known that G. hirsutum and G. barbadense evolved through hybridization between G. arboreum (A-genome species) and G. raimondii (D-genome species). Herein, a relative syntenic map was drawn to analyze the evolutionary relationships of WRKYs between G. hirsutum and 3 other species (Supplementary Table 6). According to MCScan analysis, 571, 621, and 1044 duplicated gene pairs were found between G. hirsutum and G. arboreum, G. hirsutum and G. raimondii, and G. hirsutum and G. barbadense (Fig. 5). In G. arboreum and G. raimondii, the highest number of collinear relationships was found on ChrA11 (85) and ChrD07 (98). The minimum collinear relationships, which were mapped to ChrA09 and ChrD05, were 17 and 21, respectively (Fig. 5A, B). Meanwhile, there are 87, 75, 73, 65, and 78 collinear relationships located on A05, A11, D05, D07, and D11 of G. barbadense among the 1044 gene pairs (Fig. 5C). In short, the aforementioned results confirmed that the collinear relationships were unevenly distributed across each chromosome, and that deletion and duplication events occurred among WRKY family members.
Expression profiling and qRT-PCR verification of WRKYs response to salt and osmotic stress in G. hirsutum
The transcriptome data of G. hirsutum was used to detect the expression profile of GhWRKYs under salt and osmotic stress. We found that 3 h of salt treatment, and 3 and 6 h of PEG treatment were grouped into one clade. The WRKY genes, including Gh_A05G156700.1, Gh_D03G026500.1, Gh_A05G368400.1, and Gh_A06G109400.1, exhibited higher expression levels in this clade. Furthermore, the expression levels of Gh_A05G156700.1, Gh_D02G067800.1, Gh_A08G149000.1, Gh_D08G210300.1, and Gh_D08G191400.1 reached peak values after 1 h and 6 h salt treatment or 1 h PEG treatment. Additionally, the Gh_A08G031700.1, Gh_A09G013200.1, and Gh_D03G050200.1 were expressed at the highest levels after 12 h salt treatment or 12 h PEG treatment. The results showed that partial GhWRKYs were regulated under salt and osmotic stress (Fig. 6).
To further verify the above results, qRT-PCR was employed to examine the changes in GhWRKYs expression levels. The Gh_A08G031700.1, Gh_D02G067800.1, and Gh_A05G156700.1 were selected, and their expression levels were differentially induced by PEG and NaCl solution (Fig. 7). The expression values of Gh_A08G031700.1 and Gh_D02G067800.1 were variously under PEG and NaCl treatment. Concretely, the Gh_A08G031700.1 expression was sensitive to both PEG and NaCl treatment, and its expression level was up-regulated at 3 and 12 h under PEG treatment, and at 1, 12, and 24 h under NaCl treatment (P < 0.05) (Fig. 7A). The Gh_D02G067800.1 was regulated by salt stress, and its expression value significantly increased at 6 and 12 h under NaCl treatment (P < 0.05) (Fig. 7B). Additionally, the expression level of Gh_A05G156700.1 was promoted at 1, 3, and 6 h (P < 0.05) after PEG treatment, and it was also upregulated after NaCl treatment at 3, 6, and 12 h (Fig. 7C). Therefore, the Gh_A05G156700.1 was selected for the next functional verification under salt and osmotic stress.
GhWRKY31 improved the tolerance of transgenic Arabidopsis to osmotic and salt stress
The qRT-PCR result verified that Gh_A05G156700.1 was positively induced by both osmotic and salt stress. To further investigate the function of GhWRKY31 (Gh_A05G156700.1), the tolerance to osmotic and salt stress of GhWRKY31 transgenic Arabidopsis was carried out after homozygote molecular identification (Supplementary Figure S1). The seed germination and root length both were inhibited by mannitol and NaCl treatment in WT Arabidopsis. The germination rates of WT were severely suppressed under 100 mM (83%), 200 mM (76%), and 300 mM (58%) mannitol treatment (Fig. 8A, B), and were suppressed to 56% and 36% under 100 mM and 150 mM NaCl treatment, respectively (Fig. 8E, F). Meanwhile, the root length of WT was also inhibited under 100 mM (2.84 cm), 200 mM (2.23 cm), and 300 mM (1.62 cm) mannitol (Fig. 8C, D), and was suppressed to 2.24, and 1.84 cm under 50 mM, and 100 mM salt conditions, respectively (Fig. 8G, H). As expected, the germination rates and root length of GhWRKY31 OE lines were significantly higher than those of WT. The germination rates were nearly 100%, 100%, and 90% under 100 mM, 200 mM, and 300 mM mannitol. Under 50 mM, 100 mM, and 150 mM NaCl solution, the germination rates of GhWRKY31 OE lines were almost up to 100% (Fig. 8B, F). In addition, the root length of OE lines was 3.36 cm, 3.14 cm, 2.33 cm, 3.24 cm, and 2.37 cm under 100 mM, 200 mM, 300 mM mannitol, 50 mM, and 100 mM NaCl treatments, respectively. These measurements were significantly longer than those of WT (P < 0.05) (Fig. 8D, H). Hence, the heterologous expression of GhWRKY31 in Arabidopsis significantly improved osmotic and salt tolerance.
VIGS of GhWRKY31 reduced drought and salt tolerance in G. hirsutum
To further elucidate the function of GhWRKY31 in G. hirsutum Tm-1, VIGS was employed to decrease the transcription level of GhWRKY31. The qRT-PCR was used to evaluate the silencing efficiency of GhWRKY31. The expression level of GhWRKY31 was reduced by approximately 75% in pYL156: GhWRKY31 plants (Supplementary Figure S2). As expected, no stress-related phenotype was observed in the seedlings of ‘TM1’, ‘TM1 + pYL156: 00’, and ‘TM1 + pYL156: GhWRKY31’ under water conditions. Nevertheless, the leaves of ‘TM1 + pYL156: GhWRKY31’ seedlings exhibited shrinkage and yellowing characteristics compared with ‘TM1’ (WT seedlings) and ‘TM1 + pYL156: 00’ (empty vector seedlings) under 200 mM NaCl treatment (Fig. 9A). Meanwhile, after a 14-day water-deficit treatment, the leaves of 'TM1' and 'TM1 + pYL156: 00' showed a healthier phenotype compared to 'TM1 + pYL156: GhWRKY31' seedlings. The latter exhibited symptoms such as shrinkage, rolling, wilting, and death (Fig. 9A). Additionally, the ABA and proline contents accumulated less in 'TM1 + pYL156: GhWRKY31' seedlings than in the control group seedlings under drought and salt stress. Meanwhile, the MDA content was higher in 'TM1 + pYL156: GhWRKY31' seedlings compared to the seedlings in the control group. Moreover, the activities of POD and SOD were higher in plants of the control group than in 'TM1 + pYL156: GhWRKY31' plants under drought and salt stress (Fig. 9B-F).
GhWRKY31 regulates the expression of salt- and dehydration-induced genes
The GhWRKY31-VIGS cotton seedlings showed greater sensitivity to drought and salt stress. To elucidate the target genes of GhWRKY31 in the cotton drought and salt response, we conducted qRT-PCR analysis to determine whether GhWRKY31 is essential for the expression of ABA-, dehydration-, and salt-induced genes. The expression levels of GhRD29, GhNAC4, GhABF1, GhABF2, GhDREB2, GhP5CS, and GhSOS1 were induced in the control group under drought and NaCl stress. However, silencing GhWRKY31 resulted in a decrease in the induction of the 7 genes under dehydration and salt stress. Specifically, the expression levels of GhABF1, GhABF2, GhP5CS, and GhSOS1 were suppressed to levels lower than those observed in the control group (Fig. 10A).
The W box (TTGACC/T), which is the minimal sequence required for specific DNA binding of WRKY family members, was explored in the above 7 genes. We found that 1, 2, 2, 1, and 3 W boxes (TTGACC) were located in the promoter regions of GhP5CS, GhABF1, GhRD29, GhABF2, and GhDREB2, respectively (Fig. 10B). Additionally, to explore the potential interaction sites between the GhWRKY31 protein and the W box of these 5 genes, molecular docking was performed using HDOCK and PyMOL software. The confidence scores for the combination of GhWRKY31 and GhP5CS, GhABF1, GhRD29, GhABF2, and GhDREB2 were 0.8611, 0.9525/0.9050, 0.7619/0.8815, 0.8930, and 0.8576/0.8654/0.9492, respectively. Stable complexes could be formed between the WRKY domain of GhWRKY31 and the adjacent W box sequence. These complexes were maintained by strong hydrogen bonds (Fig. 10C).
GhWRKY31 binds to the promoter regions of GhABF1, GhDREB2, and GhRD29
The Yeast one-hybrid (Y1H) assay was employed to further investigate the binding affinity of GhWRKY31 protein to GhP5CS, GhABF1, GhABF2, GhDREB2, and GhRD29. Firstly, we confirmed that 100 ng/ml of AbA could inhibit the self-activation of pAbAi-bait. The results showed that the transformation yeast containing the combination of GhWRKY31 with the W box (TTGACC/T) of GhABF1, GhDREB2, GhRD29, GhP5CS, and GhABF2 grew on SD/-Leu medium. The GhWRKY31 protein specifically bound to the fragment that contained the core TTGACC/T motif of GhABF1, GhDREB2, and GhRD29 on the SD/-Leu + AbA (100 ng/ml) medium. These findings support the conclusion that GhWRKY31 directly binds to the promoter regions of GhABF1, GhDREB2, and GhRD29 (Fig. 11).