Identification of HAT and HDAC proteins in wheat
To identify HATs and HDACs in the wheat genome, a systematic BLAST search was performed using Arabidopsis and rice sequences as queries. The Pfam and InterProScan databases were then used to verify the candidate HATs and HDACs based on their structural domains. In total, 30 HATs and 53 HDACs were identified in the wheat genome (Table 1, Table S1). The polypeptide lengths of HATs and HDACs were 438–1796 and 309–693 amino acids, respectively; the predicted molecular weights were 50.14–201.47 and 33.16–74.54 kDa; and the theoretical isoelectric point (pI) values were 4.71–8.9 and 4.6–9.42. In addition, the intron–exon organization of these HATs and HDACs was analyzed. The number of conserved coding regions ranged from 9 to 21 in HATs and from 1 to 17 in HDACs. With respect to subcellular localization, most HATs were located in the nucleus, whereas HDACs were located in the nucleus as well as in the chloroplasts, cytoplasm, mitochondria, cytoskeleton, and other compartments.
Table 1
Histone acetyltransferases (HATs) and histone deacetylases (HDACs) identified in wheat
Subfamily | Gene name | Gene ID | Protein length (aa) | Localization
HATs family
GNAT | TaHAG1A | TraesCS1A02G138200.2 | 507 | nucl, chlo
GNAT | TaHAG1D | TraesCS1D02G134200.1 | 507 | nucl, chlo
GNAT | TaHAG1U | TraesCSU02G003200.1 | 507 | nucl, chlo
GNAT | TaHAG2A | TraesCS5A02G197700.1 | 463 | cyto
GNAT | TaHAG2B | TraesCS5B02G186000.1 | 463 | cyto
GNAT | TaHAG2D | TraesCS5D02G193200.1 | 463 | cyto
GNAT | TaHAG3A | TraesCS2A02G320900.1 | 569 | cyto
GNAT | TaHAG3B | TraesCS2B02G361800.1 | 569 | cyto
GNAT | TaHAG3D | TraesCS2D02G341600.1 | 569 | cyto
MYST | TaHAG4A | TraesCS2A02G159700.1 | 438 | nucl, cyto
MYST | TaHAG4B | TraesCS2B02G185300.1 | 482 | nucl, mito
MYST | TaHAG4D | TraesCS2D02G166900.1 | 438 | nucl, cyto
CBP | TaHAC1A | TraesCS3A02G524800.1 | 1286 | nucl
CBP | TaHAC1B | TraesCS3B02G592100.1 | 1288 | nucl
CBP | TaHAC1D | TraesCS3D02G530000.2 | 1286 | nucl
CBP | TaHAC2A | TraesCS2A02G039500.3 | 1186 | nucl
CBP | TaHAC2B | TraesCS2B02G052300.4 | 1186 | nucl
CBP | TaHAC2D | TraesCS2D02G038100.1 | 1185 | nucl
CBP | TaHAC4A | TraesCS7A02G414500.1 | 1518 | nucl
CBP | TaHAC4B | TraesCS7B02G314400.2 | 1512 | nucl, pero
CBP | TaHAC4D | TraesCS7D02G407600.2 | 1518 | nucl, pero
CBP | TaHAC5A | TraesCS6A02G107300.1 | 1726 | nucl
CBP | TaHAC5B | TraesCS6B02G135800.1 | 1726 | nucl
CBP | TaHAC5D | TraesCS6D02G095400.1 | 1728 | nucl
TAFII250 | TaHAF1A | TraesCS7A02G515000.1 | 1782 | chlo
TAFII250 | TaHAF1B | TraesCS7B02G431700.2 | 1762 | nucl
TAFII250 | TaHAF1D | TraesCS7D02G505400.1 | 1762 | nucl
TAFII250 | TaHAF2A | TraesCS7A02G514800.1 | 1796 | nucl
TAFII250 | TaHAF2B | TraesCS7B02G431500.1 | 1796 | nucl
TAFII250 | TaHAF2D | TraesCS7D02G505200.1 | 1796 | nucl
HDACs family
RPD3/HDA1 | TaHDA2A | TraesCS7A02G362600.1 | 355 | chlo, cyto
RPD3/HDA1 | TaHDA2B | TraesCS7B02G266000.1 | 353 | chlo
RPD3/HDA1 | TaHDA2D | TraesCS7D02G360500.1 | 353 | chlo
RPD3/HDA1 | TaHDA5A | TraesCS1A02G317100.1 | 397 | chlo
RPD3/HDA1 | TaHDA5B | TraesCS1B02G329500.1 | 390 | cyto, chlo
RPD3/HDA1 | TaHDA5D | TraesCS1D02G317100.1 | 394 | chlo
RPD3/HDA1 | TaHDA6A | TraesCS6A02G181100.1 | 458 | cysk
RPD3/HDA1 | TaHDA6B | TraesCS6B02G210200.1 | 458 | cysk, cyto
RPD3/HDA1 | TaHDA6D | TraesCS6D02G168400.1 | 458 | cysk
RPD3/HDA1 | TaHDA7A | TraesCS6A02G184100.2 | 519 | nucl
RPD3/HDA1 | TaHDA7B | TraesCS6B02G212600.3 | 520 | nucl
RPD3/HDA1 | TaHDA7D | TraesCS6D02G171000.1 | 574 | chlo, nucl
RPD3/HDA1 | TaHDA8A | TraesCS1A02G275300.1 | 391 | chlo
RPD3/HDA1 | TaHDA8B | TraesCS1B02G284500.1 | 393 | chlo
RPD3/HDA1 | TaHDA8D | TraesCS1D02G274900.1 | 366 | chlo
RPD3/HDA1 | TaHDA9A | TraesCS2A02G293200.1 | 430 | cyto
RPD3/HDA1 | TaHDA9B | TraesCS2B02G309700.1 | 430 | cyto
RPD3/HDA1 | TaHDA9D | TraesCS2D02G291000.1 | 430 | cyto
RPD3/HDA1 | TaHDA14A | TraesCS5A02G119300.2 | 444 | chlo
RPD3/HDA1 | TaHDA14B | TraesCS5B02G121300.1 | 453 | chlo
RPD3/HDA1 | TaHDA14D | TraesCS5D02G126600.1 | 444 | chlo
RPD3/HDA1 | TaHDA15A | TraesCS5A02G065300.1 | 614 | nucl
RPD3/HDA1 | TaHDA15B | TraesCS5B02G072100.1 | 614 | nucl
RPD3/HDA1 | TaHDA15D | TraesCS5D02G076100.1 | 612 | nucl
RPD3/HDA1 | TaHDA18A | TraesCS2A02G177100.1 | 693 | chlo, nucl
RPD3/HDA1 | TaHDA18B | TraesCS2B02G204100.1 | 693 | chlo, nucl
RPD3/HDA1 | TaHDA18D | TraesCS2D02G185200.1 | 693 | nucl, chlo
RPD3/HDA1 | TaHDA19A | TraesCS7A02G365600.3 | 523 | nucl
RPD3/HDA1 | TaHDA19B | TraesCS7B02G261800.1 | 519 | nucl
RPD3/HDA1 | TaHDA19D | TraesCS7D02G356800.1 | 519 | nucl
RPD3/HDA1 | TaHDA20A | TraesCS4A02G213200.1 | 471 | mito, nucl
RPD3/HDA1 | TaHDA20B | TraesCS4B02G102600.1 | 471 | mito, nucl
RPD3/HDA1 | TaHDA20D | TraesCS4D02G100000.1 | 471 | cyto, nucl
RPD3/HDA1 | TaHDA21A | TraesCS5A02G295000.1 | 484 | cyto
RPD3/HDA1 | TaHDA21D | TraesCS5D02G302400.1 | 495 | cyto
RPD3/HDA1 | TaHDA22B | TraesCS3B02G318000.1 | 380 | cysk
RPD3/HDA1 | TaHDA22D | TraesCS3D02G422300.1 | 327 | cyto
HD2 | TaHDT1A | TraesCS1A02G445700.4 | 309 | nucl
HD2 | TaHDT1D | TraesCS1D02G454400.2 | 311 | nucl
HD2 | TaHDT2A | TraesCS3A02G415200.1 | 403 | nucl
HD2 | TaHDT2B | TraesCS3B02G450300.1 | 383 | nucl
HD2 | TaHDT2D | TraesCS3D02G410300.2 | 364 | nucl
HD2 | TaHDT3B | TraesCS3B02G450400.1 | 378 | nucl
HD2 | TaHDT3D | TraesCS3D02G410400.1 | 432 | nucl
HD2 | TaHDT4A | TraesCS5A02G158900.1 | 432 | nucl
HD2 | TaHDT4B | TraesCS5B02G156700.1 | 433 | nucl
HD2 | TaHDT4D | TraesCS5D02G164000.1 | 433 | nucl
SIR2 | TaSRT1A | TraesCS2A02G077800.1 | 440 | cyto, nucl
SIR2 | TaSRT1B | TraesCS2B02G092700.1 | 465 | cyto, nucl
SIR2 | TaSRT1D | TraesCS2D02G075800.1 | 678 | nucl, E.R
SIR2 | TaSRT2A | TraesCS5A02G114700.3 | 414 | cyto
SIR2 | TaSRT2D | TraesCS5D02G124700.1 | 396 | chlo
SIR2 | TaSRT2U | TraesCSU02G136000.1 | 396 | chlo
Phylogenetic and conserved domain analyses of HATs and HDACs in wheat
To reveal the evolutionary relationships among HATs and HDACs in wheat, a phylogenetic tree was constructed using MEGA 6.0 based on the amino acid sequences (Table S1) of the newly identified wheat HAT and HDAC proteins and previously identified HATs from Arabidopsis thaliana and rice. As in Arabidopsis and rice, wheat HATs could be grouped into four distinct subfamilies: 12 belonged to the CBP subfamily, 9 to the GNAT subfamily, 3 to the MYST subfamily, and 6 to the TAFII250 subfamily (Fig. 1, Table 1). The 53 wheat HDACs could be classified into three subfamilies, RPD3/HDA1, HD2, and SIR2, with 37, 10, and 6 loci, respectively (Fig. 2, Table 1). Wheat HATs and HDACs were named following the nomenclature suggested for Arabidopsis: each gene was assigned the two-letter code for T. aestivum (Ta), followed by the family designation and number, followed by A, B, or D according to the wheat subgenome.
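The naming scheme described above can be illustrated with a short parser. This is a minimal sketch, not part of the original analysis: the regular expression and the function name `parse_gene_name` are assumptions inferred from the gene names listed in Table 1, and the extra suffix `U` is accepted only because Table 1 contains TaHAG1U and TaSRT2U.

```python
import re

# Pattern inferred from the names in Table 1 (an assumption, not an official grammar):
# "Ta" + family code + number + subgenome letter.
# "U" is accepted because Table 1 lists TaHAG1U and TaSRT2U.
GENE_NAME = re.compile(r"^(Ta)(HAG|HAC|HAF|HDA|HDT|SRT)(\d+)([ABDU])$")


def parse_gene_name(name):
    """Split a wheat HAT/HDAC gene name into its naming components."""
    match = GENE_NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized gene name: {name!r}")
    species, family, number, subgenome = match.groups()
    return {
        "species": species,
        "family": family,
        "number": int(number),
        "subgenome": subgenome,
    }


PARSED = parse_gene_name("TaHDA14B")
```

Here `PARSED` holds the species code `Ta`, the family code `HDA`, the number 14, and the subgenome `B`. Note that the family code alone does not determine the subfamily: in Table 1, TaHAG1 to TaHAG3 belong to GNAT, whereas TaHAG4 belongs to MYST.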
In an analysis of domain architectures, all TaHAT subfamilies had conserved domains: the CBP subfamily contained the HAT-KAT11 domain, the GNAT subfamily contained the Hat1_N or Acetyltransferase domain, the MYST subfamily contained the MOZ_SAS, zf-MYST, and Tudor-knot domains, and the TAFII250 subfamily contained the DUF3591 domain and Bromodomain. In addition to these highly conserved domains, the CBP subfamily also contained the PHD, ZZ, and zf-TAZ domains, the GNAT subfamily had the Radical_SAM domain and Bromodomain, and TBP-binding and ubiquitin domains were found in the TAFII250 subfamily (Fig. 1). Among wheat HDACs, the RPD3/HDA1, HD2, and SIR2 subfamilies had the conserved domains Hist-deacetyl, NPL, and SIR2, respectively. In addition, TaHDT2 and TaHDT4 in the HD2 subfamily contained the zf-C2H2_6 and FKBP_C domains, respectively, and TaSRT2A in the SIR2 subfamily contained the Fibrillarin domain (Fig. 2). In general, wheat HATs and HDACs had domain organizations similar to those of their counterparts in Arabidopsis and rice.
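The link between conserved domains and subfamilies described above can be sketched as a lookup. This is an illustrative sketch only: the domain names are taken from the text, but the `HALLMARKS` table and the function `classify_subfamily` are hypothetical and do not reproduce the Pfam/InterProScan verification itself. Bromodomain is deliberately left out of the hallmark sets because the text reports it in both the GNAT and TAFII250 subfamilies.

```python
# Hallmark domains per subfamily, using the domain names given in the text.
# The choice of which domains count as "hallmarks" is an assumption of this sketch.
HALLMARKS = {
    "CBP": {"HAT-KAT11"},
    "GNAT": {"Hat1_N", "Acetyltransferase"},
    "MYST": {"MOZ_SAS", "zf-MYST"},
    "TAFII250": {"DUF3591"},
    "RPD3/HDA1": {"Hist-deacetyl"},
    "HD2": {"NPL"},
    "SIR2": {"SIR2"},
}


def classify_subfamily(domains):
    """Return the sorted list of subfamilies whose hallmark domains occur in `domains`."""
    found = set(domains)
    return sorted(name for name, marks in HALLMARKS.items() if found & marks)


# Hypothetical domain sets for illustration, not annotations of specific wheat genes.
CBP_LIKE = classify_subfamily({"HAT-KAT11", "PHD", "ZZ", "zf-TAZ"})
AMBIGUOUS = classify_subfamily({"Bromodomain"})
```

A domain set containing only a shared accessory domain, such as Bromodomain, returns an empty list, which signals that a hallmark domain is needed before a candidate can be assigned to a subfamily.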
Genomic localization of TaHATs and TaHDACs
The newly identified wheat HATs and HDACs were mapped to chromosomes. Both TaHATs and TaHDACs were unevenly distributed along the chromosomes (Fig. 3, Figure S1). In particular, there were no TaHAT genes on chromosomes 4A/B/D or 1B, three TaHATs were located on each of chromosomes 2A/B/D and 7A/B/D, and each of the remaining chromosomes had only a single TaHAT gene. In contrast, TaHDACs were distributed across all chromosomes, with the greatest number on chromosomes 5A/D (five HDACs each) and only one TaHDAC gene on each of chromosomes 3A, 4A/B/D, and Un. Considering TaHATs and TaHDACs together, most genes were found on chromosomes 2A/B/D and 5A/D, and the fewest were found on chromosomes 4A/B/D.
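The TaHAT counts per chromosome can be re-derived from the gene IDs in Table 1. The sketch below assumes that each ID has the layout `TraesCS` + chromosome + `02G` + number, with `U` standing for the unassigned chromosome (Un); this layout is inferred from the IDs in Table 1, and the helper `chromosome_of` is a hypothetical name introduced for illustration.

```python
import re
from collections import Counter

# The 30 TaHAT gene IDs copied from Table 1.
HAT_IDS = [
    "TraesCS1A02G138200.2", "TraesCS1D02G134200.1", "TraesCSU02G003200.1",
    "TraesCS5A02G197700.1", "TraesCS5B02G186000.1", "TraesCS5D02G193200.1",
    "TraesCS2A02G320900.1", "TraesCS2B02G361800.1", "TraesCS2D02G341600.1",
    "TraesCS2A02G159700.1", "TraesCS2B02G185300.1", "TraesCS2D02G166900.1",
    "TraesCS3A02G524800.1", "TraesCS3B02G592100.1", "TraesCS3D02G530000.2",
    "TraesCS2A02G039500.3", "TraesCS2B02G052300.4", "TraesCS2D02G038100.1",
    "TraesCS7A02G414500.1", "TraesCS7B02G314400.2", "TraesCS7D02G407600.2",
    "TraesCS6A02G107300.1", "TraesCS6B02G135800.1", "TraesCS6D02G095400.1",
    "TraesCS7A02G515000.1", "TraesCS7B02G431700.2", "TraesCS7D02G505400.1",
    "TraesCS7A02G514800.1", "TraesCS7B02G431500.1", "TraesCS7D02G505200.1",
]

# Assumed ID layout: "TraesCS" + chromosome (e.g. 5A, or U for unassigned) + two digits + "G" + number.
CHROMOSOME = re.compile(r"^TraesCS(\d[ABD]|U)\d{2}G\d+")


def chromosome_of(gene_id):
    """Extract the chromosome label (e.g. '2A' or 'U') from a gene ID."""
    match = CHROMOSOME.match(gene_id)
    if match is None:
        raise ValueError(f"unexpected gene ID format: {gene_id!r}")
    return match.group(1)


# Number of TaHAT genes per chromosome.
HAT_COUNTS = Counter(chromosome_of(gene_id) for gene_id in HAT_IDS)
```

With these IDs, `HAT_COUNTS` gives three genes for each of 2A/B/D and 7A/B/D, one gene for every other chromosome that appears, and no entry for 4A/B/D or 1B, in agreement with the distribution described above. The same function could be applied to the TaHDAC IDs.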
Putative cis-regulatory elements in the promoter regions of TaHATs and TaHDACs
To gain more insight into the putative functions of TaHATs and TaHDACs, the promoter region (1500 bp upstream of the transcription start site) of each gene was scanned using the PlantCARE database. Many putative cis-regulatory elements were detected in the promoters of both TaHATs and TaHDACs (Fig. 4; Table S2; Table S3), such as ABRE (abscisic acid-responsive element), STRE (stress-responsive element), ARE (essential for anaerobic induction), CCGTCC-box (meristem-specific activation), G-Box (light responsiveness), MYB (MYB-related binding sites), and TGA-element (auxin-responsive element). Most TaHATs and TaHDACs had ABRE (25 HAT genes and 43 HDAC genes) or STRE (27 HAT genes and 46 HDAC genes) elements. However, only TaHDACs had the GARE-motif (gibberellin-responsive element), suggesting that gibberellin (GA) signaling and regulation may be more closely related to histone deacetylation. Among TaHATs, ABRE elements were more numerous in all genes of the GNAT subfamily and in TaHAC4A/B/D and TaHAC5A/B/D of the CBP subfamily than in the TAFII250 subfamily, which contained few or no ABRE elements. Among TaHDACs, genes with a large number of ABRE elements were TaHDA5A/B/D, TaHDA8A/B/D, and TaHDA9B/D in the RPD3/HDA1 subfamily, TaHDT2A/B in the HD2 subfamily, and TaSRT1D and TaSRT2U in the SIR2 subfamily. These genes may mediate the ABA signaling pathway. The TGA-element was detected almost exclusively in GNAT subfamily genes, such as TaHAG1A/D/U and TaHAG2A/B/D, and was not observed in genes of the CBP and MYST subfamilies except TaHAC1D. In general, the distribution of cis-acting elements was more similar among TaHAT subfamilies than among TaHDAC subfamilies.
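The idea of scanning an upstream region for cis-regulatory elements can be sketched with plain string matching. This is a simplified illustration and not the PlantCARE procedure: the three core motifs below are commonly cited consensus strings chosen as assumptions for this sketch, PlantCARE annotates many more sequence variants per element, and the function names and toy sequence are hypothetical.

```python
# Illustrative core motifs (simplifying assumptions, not PlantCARE's full definitions).
MOTIFS = {"ABRE": "ACGTG", "STRE": "AGGGG", "G-Box": "CACGTG"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_motif(seq, motif):
    """Count occurrences of `motif` in `seq`, allowing overlaps."""
    count = 0
    start = seq.find(motif)
    while start != -1:
        count += 1
        start = seq.find(motif, start + 1)
    return count


def count_both_strands(seq, motif):
    """Count a motif on both strands; palindromic motifs are counted once per site."""
    forward = count_motif(seq, motif)
    if motif == reverse_complement(motif):
        return forward
    return forward + count_motif(reverse_complement(seq), motif)


def scan_promoter(seq):
    """Return {element name: number of hits} for one upstream sequence."""
    seq = seq.upper()
    return {name: count_both_strands(seq, motif) for name, motif in MOTIFS.items()}


# Hypothetical toy sequence, not a real wheat promoter.
TOY_PROMOTER = "TTACGTGAAAGGGGTTCCCCTAA"
HITS = scan_promoter(TOY_PROMOTER)
```

For the toy sequence, `HITS` reports one ABRE core, two STRE cores (one on each strand), and no G-Box. In a real analysis the input would be the 1500 bp upstream sequence of each gene, and per-gene hit counts like these are what allow comparisons such as the ABRE and STRE gene tallies given above.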
Expression analysis of TaHATs and TaHDACs in different tissues
RNA-seq data for different tissues were obtained from expVIP [36] (Fig. 5). All TaHATs and TaHDACs were differentially expressed in the leaf, root, spike, and grain. TaHAG4A/B/D were expressed in all four tissues and showed the highest expression levels among TaHATs. TaHAC1A/B/D and TaHAF1B/D had very low or no expression in these four tissues. The expression levels of TaHAG2A/B/D in the leaf and of TaHAC4B/D, TaHAC5A/B, and TaHAF2A/B in the grain were nearly undetectable (Fig. 5A). TaHDT1D was expressed in all four tissues and showed higher expression levels than other TaHDACs, while TaHDA20A/B/D, TaHDA21A/D, TaHDA22D, TaHDT1A, TaHDT2A, and TaHDT3B/D were almost undetectable in the four tissues (Fig. 5B). Additionally, expression levels differed among homoeologs from the A, B, and D genomes; for example, TaHDA19A was not expressed in the leaf, root, or spike, whereas TaHDA19B/D were highly expressed, and TaHDA22B was highly expressed in all four tissues, unlike TaHDA22D (Fig. 5B). These results suggest that the A, B, and D genomes may jointly contribute to the functional roles of these genes.
Expression analysis of TaHATs and TaHDACs in response to drought stress
The identification of putative cis-regulatory elements in the promoter regions suggested that TaHATs and TaHDACs contribute to the response to abiotic stresses. In the main wheat-producing areas, plants often encounter drought, leading to reduced yields. Therefore, we focused on the expression of TaHATs and TaHDACs under drought stress. Drought resistance is significantly higher in the wheat variety BN207 than in its parents BN64 and ZM16 (Table S4). We therefore used these three varieties in comparative expression analyses to identify TaHATs and TaHDACs that may contribute to the drought stress response.
Consistent with the tissue expression results, TaHDA5, TaHDA20, TaHDA21, and TaHDT3 were not detected in the leaf by qRT-PCR. Among the remaining genes, a number of TaHATs and TaHDACs were significantly differentially expressed among the three varieties under both normal and drought conditions. However, the expression levels of TaHAG4 and TaHAF1 in the TaHAT family and of TaHDA7, TaHDA15, and TaSRT2 in the TaHDAC family were not affected or were only slightly affected by drought stress in all three varieties (Fig. 6). This indicated that these genes may not be related to the regulation of drought stress. All other genes were up-regulated or down-regulated in at least one variety under drought stress. Notably, TaHAG2, TaHAG3, and TaHAC2 in the TaHAT family (Fig. 6A) and TaHDA2, TaHDA18, TaHDT1, and TaHDT2 in the TaHDAC family (Fig. 6B) showed a significant response to drought stress only in BN207; in particular, TaHAG2, TaHAG3, TaHAC2, and TaHDT1 were up-regulated under drought stress, while TaHDA2, TaHDA18, and TaHDT2 were down-regulated. Further, the expression of these genes (except TaHDA2) in BN207 under drought stress was significantly different from that in its parents BN64 and ZM16. Therefore, combined with the observation that BN207 had higher drought resistance than its parents BN64 and ZM16, our results suggested that these six genes (TaHAG2, TaHAG3, TaHAC2, TaHDA18, TaHDT1, and TaHDT2) were likely to mediate the drought stress response in wheat.