Identification and subcellular localization of SaWRKYs
A total of 64 SaWRKY genes, named SaWRKY1 to SaWRKY64, were identified (Table 1) by searching the S. album genome and transcriptome datasets using all A. thaliana AtWRKY genes as queries. All of the identified SaWRKY proteins contained at least one highly conserved heptapeptide WRKYGQK domain, while 13 out of 64 SaWRKY proteins contained two WRKYGQK domains. The length of the 64 SaWRKY proteins (Table 1, Additional file 1: Table S1) ranged from 121 (SaWRKY45) to 764 (SaWRKY12) amino acids, with an average of 384 amino acids. A C-C-H-H type zinc-finger motif was found in 56 SaWRKY proteins, whereas SaWRKY7, 9, 28, 47 and 57 had a C-C-H-C type zinc-finger motif. Other variants of the zinc-finger motif, namely C-C-H-T (SaWRKY12, 53), C-C-H-L (SaWRKY25, 51), C-C-H-V (SaWRKY42), C-C-H-Y (SaWRKY45), and C-C-H-S (SaWRKY58), were also found (Table 1).
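The one-domain versus two-domain distinction described above can be illustrated with a minimal sketch that counts exact occurrences of the WRKYGQK heptapeptide in a protein sequence. The sequences below are hypothetical toy strings, not real SaWRKY proteins, and an exact string match is an assumption made for illustration; it is not presented as the identification procedure used in this study.

```python
import re

# Canonical WRKY heptapeptide named in the text.
WRKY_HEPTAPEPTIDE = "WRKYGQK"


def count_wrky_heptapeptides(protein_seq: str) -> int:
    """Count non-overlapping exact occurrences of WRKYGQK in a protein sequence."""
    return len(re.findall(WRKY_HEPTAPEPTIDE, protein_seq.upper()))


# Hypothetical toy sequences for illustration only (not real SaWRKY proteins).
toy_proteins = {
    "toy_one_domain": "MSDEDGYNWRKYGQKQVKGSENPRSYYKCT",
    "toy_two_domains": "MDGYNWRKYGQKQVKAAAADDGYRWRKYGQKVVKGNPNPRSYYKCT",
    "toy_no_domain": "MSSEDWSLQAVVRG",
}

# Map each toy protein name to its number of WRKYGQK occurrences.
domain_counts = {
    name: count_wrky_heptapeptides(seq) for name, seq in toy_proteins.items()
}
```

An exact match would miss heptapeptide variants, so a profile-based domain search would be more appropriate for a real survey.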
According to predictions by the PSORT program, most of the 64 SaWRKY proteins localized to the nucleus, except for SaWRKY8, which was predicted to localize to the Golgi apparatus, SaWRKY45 to the cytoplasm, SaWRKY46 to the peroxisome, and SaWRKY49 to the chloroplast (Table 1).
Phylogenetic analysis of SaWRKY proteins
To understand the evolutionary relationships among SaWRKY proteins, the 64 identified SaWRKY proteins were examined together with AtWRKY proteins from the three groups, and an unrooted tree was built with MEGA6.0 software using the neighbor-joining (NJ) method (Fig. 1). All 64 SaWRKY proteins were classified into three major groups (I, II and III) (Table 1, Fig. 1). Among the 15 SaWRKY proteins in group I, 13 contained two WRKY domains and the remaining two (SaWRKY8 and 25) contained only one WRKY domain (Table 1). Group II comprised 43 SaWRKY proteins, each with only one WRKY domain, and could be further divided into five subgroups, i.e., IIa, IIb, IIc, IId, and IIe. SaWRKY7, 9, 28, 47, 53 and 57, which contained only one WRKY domain, formed group III (Table 1).
Exon-intron organization of SaWRKY genes
To further understand the role that exon-intron structural features play in the evolution of S. album gene families, the exon-intron organization of the SaWRKY genes was analyzed. Among the 64 SaWRKY genes, three had one intron and two exons, 35 had two introns and three exons, six had three introns and four exons, nine had four introns and five exons, two had five introns and six exons, three had five introns and five exons, SaWRKY49 had six introns and seven exons, and SaWRKY54 had 13 introns and 14 exons (Fig. 2b). It is noteworthy that SaWRKY genes in the same subgroup had a similar intron and exon composition, such as two introns and three exons in group II, subgroups IId, IIe and group III, or four introns and five exons in subgroup IIb. At the extremes, SaWRKY54 had 14 exons and SaWRKY49 had seven exons, whereas SaWRKY2 and SaWRKY16 had only two exons. This suggests that both exon gain and exon loss occurred during evolution of the SaWRKY gene family, which may have contributed to functional diversity among SaWRKY genes.
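Exon and intron counts of the kind reported above follow directly from exon coordinates: each intron is the gap between two consecutive exons. The sketch below illustrates this under stated assumptions. The coordinates are hypothetical, the interval convention (1-based, inclusive) is an assumption, and the helper name `intron_intervals` is introduced here for illustration only.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # assumed 1-based, inclusive (start, end) coordinates


def intron_intervals(exons: List[Interval]) -> List[Interval]:
    """Return the gaps between consecutive exons of one gene model.

    Assumes the exons do not overlap and are not directly adjacent.
    """
    ordered = sorted(exons)
    return [
        (prev_end + 1, next_start - 1)
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
    ]


# Hypothetical gene model with three exons, hence two introns.
toy_exons = [(101, 250), (400, 520), (700, 900)]
toy_introns = intron_intervals(toy_exons)
```

Applied to a full annotation, the same function could be run per gene and the resulting intron counts tallied by subgroup.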
Motif composition of SaWRKY proteins
To gain insight into the functional regions of SaWRKY proteins, the MEME program was used to predict the motif composition of the 64 SaWRKY proteins. A total of 20 conserved motifs were detected (Fig. 2c). Among them, motifs 1 and 3 contained the heptapeptide stretch WRKYGQK, and all 64 SaWRKY proteins contained one or two WRKYGQK motifs. Motif 2, the conserved zinc-finger structure at the C-terminal end, was found in 61 SaWRKY proteins but not in SaWRKY8, 11 and 51 (Fig. 2). Motif 9 was found in all members of subgroup IId and in no other SaWRKY proteins, motif 16 was unique to SaWRKY42, 43 and 52, and motif 17 was unique to SaWRKY30 and 36. Motif compositions were similar within the same group and especially within the same subgroup, such as in subgroups IIa or IIb.
Prediction and functional enrichment analysis of potential SaWRKY target genes
A total of 13,306 genes that contained at least one W-box in their putative promoters were identified in the assembled S. album genome. The number of genes decreased as the number of W-boxes per promoter increased. Among all 13,306 genes, 2563 genes contained one W-box, 3007 genes contained two W-boxes, and 1328 genes contained at least five W-boxes; the latter were used for further pathway enrichment analysis using the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. As shown in Fig. 3, the top enriched KEGG pathways included plant-pathogen interactions, environmental adaptation, metabolism and organismal systems. These results suggest that SaWRKY genes may be involved in biotic and abiotic stress responses, as well as in other biological pathways.
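The W-box screen described above can be sketched as a motif count over promoter sequences followed by a threshold filter. The text does not state the exact motif pattern, promoter length, or strand handling that were used, so the sketch assumes the commonly cited W-box core TTGAC(C/T) and counts matches on both strands. The promoter sequences and gene names are hypothetical; only the threshold of five W-boxes comes from the text.

```python
import re
from typing import Dict, List

# Assumed W-box pattern TTGAC(C/T); the lookahead lets overlapping sites be counted.
W_BOX = re.compile(r"(?=TTGAC[CT])")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def count_w_boxes(promoter: str) -> int:
    """Count assumed W-box sites on both strands of a promoter sequence."""
    seq = promoter.upper()
    return len(W_BOX.findall(seq)) + len(W_BOX.findall(reverse_complement(seq)))


def genes_with_min_w_boxes(counts: Dict[str, int], minimum: int = 5) -> List[str]:
    """Select genes whose promoters carry at least `minimum` W-boxes."""
    return sorted(gene for gene, n in counts.items() if n >= minimum)


# Hypothetical toy promoters for illustration only.
toy_promoters = {
    "geneA": "AAATTGACCGGG",
    "geneB": "AGTCAAACGTTGACT",
    "geneC": "TTGACC" * 5,
    "geneD": "ACGTACGT",
}
w_box_counts = {gene: count_w_boxes(seq) for gene, seq in toy_promoters.items()}
enrichment_candidates = genes_with_min_w_boxes(w_box_counts, minimum=5)
```

In this toy set only geneC reaches the threshold and would be passed on to pathway enrichment.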
Expression patterns of SaWRKY genes in different tissues
The expression pattern of a gene may reflect its biological function. To explore the possible functions of SaWRKY genes in S. album development, the expression patterns of 41 SaWRKY genes in various tissues (leaves, roots, heartwood, sapwood, and transition zone) were obtained from the transcriptome data (Fig. 4). Five SaWRKY genes (SaWRKY16, 20, 22, 3 and 44) showed a higher expression level in wood tissue (heartwood, sapwood, and transition zone) than in leaves and roots. SaWRKY2, 15 and 27 were expressed preferentially in heartwood. Higher levels of mRNA were observed in roots for SaWRKY4, 10, 26 and 34, while 12 SaWRKY genes (SaWRKY3, 5, 8, 9, 17, 21, 28, 29, 33, 36, 38, and 41) showed a higher level of expression in leaves than in other tissues. SaWRKY6, 25, 35 and 37 had a low level of expression in all the tissues examined, while SaWRKY12, 30, 31, 39 and 42 had consistently high expression levels in all tissues (Fig. 4). The expression levels of the remaining 23 SaWRKY genes could not be assessed from the transcriptome data.
Expression profiles of SaWRKY genes in response to SA and MeJA
To determine whether the SaWRKY genes were induced by different hormones, RT-qPCR was performed to measure the expression levels of the 42 SaWRKY genes in callus treated with SA or MeJA. The data show that nine out of 42 SaWRKY genes were up-regulated by SA, namely SaWRKY1, 3, 9, 11, 25, 28, 37, 38 and 40 (Fig. 5A, Additional file 2: Table S2). In contrast, 25 SaWRKY genes were down-regulated by SA; these were mainly found in group I and subgroup IIc, with a few members in subgroups IIb, IId and IIe. Several SaWRKY genes were regulated by MeJA. Among them, SaWRKY1, 3, 4, 7, 8, 11, 16, 26, 29, 36, 38 and 42 were up-regulated whereas SaWRKY10, 13, 24, 25, 27, 28 and 38 were down-regulated (Fig. 5B, Additional file 3: Table S3). Interestingly, four SaWRKY genes (SaWRKY1, SaWRKY3, SaWRKY11 and SaWRKY38) were up-regulated by both SA and MeJA.
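The overlap between the two hormone responses can be checked with a simple set intersection. The sketch below uses only the gene numbers listed in the text for SA and MeJA up-regulation; the variable names are introduced here for illustration.

```python
# SaWRKY gene numbers reported in the text as up-regulated by each hormone.
sa_up = {1, 3, 9, 11, 25, 28, 37, 38, 40}
meja_up = {1, 3, 4, 7, 8, 11, 16, 26, 29, 36, 38, 42}

# Genes up-regulated by both treatments, formatted with the SaWRKY prefix.
up_by_both = sorted(sa_up & meja_up)
up_by_both_names = [f"SaWRKY{n}" for n in up_by_both]
```

The intersection reproduces the four genes named in the text as up-regulated by both SA and MeJA.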
Characterization, tissue expression patterns and subcellular localization of SaWRKY1
As indicated above, SaWRKY1 expression was not detected in the five tissues covered by the transcriptome data. Because SaWRKY1 was one of the genes up-regulated by both SA and MeJA, it was selected for further analysis. The full-length coding sequence of SaWRKY1 was 999 bp, encoding a protein of 332 amino acid residues with a predicted theoretical isoelectric point of 7.55 and a predicted molecular weight of 37.01 kDa.
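The relationship between the reported coding sequence length and protein length can be verified with a short calculation. The sketch assumes that the 999 bp figure includes a single terminal stop codon, which the text does not state explicitly; the function name is hypothetical.

```python
CDS_LENGTH_BP = 999  # coding sequence length reported in the text


def protein_length_from_cds(cds_length_bp: int, stop_codons: int = 1) -> int:
    """Convert a CDS length in bp to a residue count.

    Assumes the CDS length includes `stop_codons` terminal stop codons.
    """
    if cds_length_bp % 3 != 0:
        raise ValueError("CDS length must be a multiple of three")
    return cds_length_bp // 3 - stop_codons


# 999 bp corresponds to 333 codons; removing the stop codon leaves the residues.
sawrky1_residues = protein_length_from_cds(CDS_LENGTH_BP)
```

Under that assumption the result agrees with the 332 residues reported for SaWRKY1.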
Tissue expression patterns, examined by RT-qPCR, showed that SaWRKY1 was transcribed mainly in callus, leaves and roots, with mRNA accumulating preferentially in callus (Fig. 6).
The SaWRKY1 protein was predicted to localize to the nucleus. To verify its subcellular location, a C-terminal YFP fusion construct for the SaWRKY1 protein and a nuclear-localized mCherry marker were co-transformed into Arabidopsis mesophyll protoplasts. The SaWRKY1-YFP fusion protein was localized in the nucleus and co-localized with mCherry (Fig. 7), demonstrating that the SaWRKY1 protein is located in the nucleus, in accordance with its putative function as a TF.
Overexpression of SaWRKY1 enhances salinity tolerance in transgenic Arabidopsis plants
SaWRKY1 was significantly induced by both SA and MeJA, suggesting that it might be a node of convergence in the signal transduction pathways mediated by SA and MeJA. To further explore the roles of SaWRKY1 in abiotic stress responses, the 35S: SaWRKY1: pCAMBIA1302 construct was transformed into A. thaliana, and two independent T3 homozygous progeny lines were used for further analysis.
To examine the role of SaWRKY1 in abiotic stress, the two A. thaliana transgenic lines, which showed the same level of germination, growth properties, and chlorophyll content (Fig. 8A, 9A), were randomly selected and irrigated with 300 mM NaCl. After three days, both transgenic lines grew well, whereas the wild type control showed obviously inhibited growth (Fig. 8A, B; Fig. 9A, B). The content of chlorophyll a, b, and a + b in the wild type was 34.6%, 36.1% and 35.5% lower, respectively, than the content of each component averaged over the two transgenic lines (Fig. 8B, 9B). After a second irrigation with 300 mM NaCl, the wild type plants withered and nearly died, whereas the 35S:SaWRKY1 transgenic lines showed little withering, had greener leaves and showed only slightly inhibited growth after 5 days (Fig. 8A, B, C). Accordingly, the contents of chlorophyll a, chlorophyll b and total chlorophyll were each more than 80% lower in wild type A. thaliana than in the 35S:SaWRKY1 transgenic lines (Fig. 8C, 9C).
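The percentage differences reported above compare the wild type value against the mean of the two transgenic lines. A minimal sketch of that calculation follows; the chlorophyll values are hypothetical placeholders, since the raw measurements are given only in the figures, and the function name is introduced here for illustration.

```python
from statistics import mean
from typing import Sequence


def percent_lower(wild_type: float, transgenic_lines: Sequence[float]) -> float:
    """Percentage by which the wild type value falls below the transgenic mean."""
    reference = mean(transgenic_lines)
    return (reference - wild_type) / reference * 100.0


# Hypothetical chlorophyll contents (arbitrary units), not measured data.
toy_wild_type = 0.5
toy_transgenic = [0.9, 1.1]
toy_decline = percent_lower(toy_wild_type, toy_transgenic)
```

With these placeholder values the wild type is roughly half the transgenic mean, so the function returns a decline of about 50%.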