Genome-wide identification
A total of 22 TaADH genes were identified in the wheat genome by BLAST searches using 26 ADH genes reported in muskmelon as queries, together with HMMER searches of the wheat genome database using the ADH domains (PF00107.26, PF08240.12, and PF13602.6). The genes were named TaADH1-TaADH22 according to their chromosomal locations (Table 1, Table S1). TaADH genes were distributed on all homoeologous chromosome groups except groups 2 and 3, with four genes on group 1, nine genes on group 4 (three each on 4A, 4B, and 4D), and one gene on each of the A, B, and D chromosomes of groups 5, 6, and 7. The 22 TaADH genes contained 7 to 9 introns and 8 to 10 exons. The encoded proteins ranged from 347 to 415 amino acids in length, with pI values from 5.68 to 8.2 and molecular weights from 34.4 to 44.2 kDa. Subcellular localization prediction indicated that all of the proteins were located in the cytoplasm.
Table 1
Properties and locations of the predicted TaADH proteins in T. aestivum.
Gene ID | NCBI | Gene name | Ta_Chr | Start | End | Exons | Introns | CDS length (bp) | Number of amino acids | pI | Molecular weight (kDa) | Subcellular localization |
TraesCS1A02G370100.1 | | TaADH1 | 1A | 547391832 | 547395833 | 9 | 8 | 1143 | 380 | 6.08 | 41.6 | Cytoplasm. |
TraesCS1A02G370200.1 | | TaADH2 | 1A | 547410788 | 547417410 | 9 | 8 | 1137 | 378 | 5.87 | 40.8 | Cytoplasm. |
TraesCS1B02G389200.1 | | TaADH3 | 1B | 622706402 | 622708971 | 9 | 8 | 1469 | 379 | 6.28 | 41.1 | Cytoplasm. |
TraesCS1D02G376300.1 | ADH3D | TaADH4 | 1D | 452625185 | 452627671 | 9 | 8 | 1484 | 379 | 6.03 | 41.1 | Cytoplasm. |
TraesCS4A02G202100.2 | ADH1A | TaADH5 | 4A | 491715851 | 491719316 | 10 | 9 | 1140 | 379 | 6.15 | 41.0 | Cytoplasm. |
TraesCS4A02G202200.1 | ADH2A | TaADH6 | 4A | 491914927 | 491917719 | 9 | 8 | 1430 | 379 | 5.81 | 40.9 | Cytoplasm. |
TraesCS4A02G202300.1 | ADH2D | TaADH7 | 4A | 492029965 | 492032871 | 9 | 8 | 1718 | 379 | 5.97 | 41.0 | Cytoplasm. |
TraesCS4B02G106300.1 | ADH1D | TaADH8 | 4B | 115556136 | 115560148 | 10 | 9 | 1962 | 379 | 6.03 | 41.0 | Cytoplasm. |
TraesCS4B02G106400.1 | | TaADH9 | 4B | 115845355 | 115848000 | 9 | 8 | 1348 | 376 | 5.91 | 40.5 | Cytoplasm. |
TraesCS4B02G106500.1 | | TaADH10 | 4B | 115879177 | 115881956 | 9 | 8 | 1611 | 379 | 5.9 | 34.4 | Cytoplasm. |
TraesCS4D02G103000.1 | ADH1A | TaADH11 | 4D | 81918232 | 81921969 | 10 | 9 | 1839 | 379 | 6.15 | 41.0 | Cytoplasm. |
TraesCS4D02G103100.1 | | TaADH12 | 4D | 81971499 | 81974375 | 9 | 8 | 1467 | 379 | 5.92 | 40.9 | Cytoplasm. |
TraesCS4D02G103300.1 | | TaADH13 | 4D | 81984987 | 81987448 | 8 | 7 | 1044 | 347 | 6.56 | 37.6 | Cytoplasm. |
TraesCS5A02G193900.1 | | TaADH14 | 5A | 397249660 | 397251898 | 9 | 8 | 1359 | 365 | 5.68 | 39.7 | Cytoplasm. |
TraesCS5B02G189200.1 | | TaADH15 | 5B | 341062699 | 341068539 | 9 | 8 | 1140 | 379 | 5.83 | 40.9 | Cytoplasm. |
TraesCS5D02G196300.2 | | TaADH16 | 5D | 299832208 | 299835185 | 8 | 7 | 1751 | 379 | 5.68 | 41.0 | Cytoplasm. |
TraesCS6A02G386600.1 | | TaADH17 | 6A | 603279456 | 603282956 | 9 | 8 | 1367 | 381 | 6.55 | 40.6 | Cytoplasm. |
TraesCS6B02G425700.1 | | TaADH18 | 6B | 694401891 | 694405637 | 9 | 8 | 1529 | 381 | 6.37 | 40.7 | Cytoplasm. |
TraesCS6D02G371200.1 | | TaADH19 | 6D | 456554723 | 456558968 | 9 | 8 | 1629 | 381 | 6.37 | 40.7 | Cytoplasm. |
TraesCS7A02G322200.1 | | TaADH20 | 7A | 466247186 | 466249779 | 8 | 7 | 1248 | 415 | 8.2 | 44.0 | Cytoplasm. |
TraesCS7B02G223100.1 | | TaADH21 | 7B | 419035483 | 419038096 | 8 | 7 | 1248 | 415 | 8.18 | 44.2 | Cytoplasm. |
TraesCS7D02G319100.1 | | TaADH22 | 7D | 407849007 | 407852341 | 8 | 7 | 2011 | 415 | 8.2 | 44.1 | Cytoplasm. |
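The chromosomal distribution summarized from Table 1 can be cross-checked with a short tally. The following is a minimal sketch, not part of the original analysis: the list is copied from the Ta_Chr column of Table 1, and the variable names are hypothetical.

```python
from collections import Counter

# Chromosome assignments copied from the Ta_Chr column of Table 1,
# in gene order TaADH1 ... TaADH22.
TA_CHR = [
    "1A", "1A", "1B", "1D",
    "4A", "4A", "4A", "4B", "4B", "4B", "4D", "4D", "4D",
    "5A", "5B", "5D",
    "6A", "6B", "6D",
    "7A", "7B", "7D",
]

# Genes per individual chromosome (e.g. "4A") and per homoeologous group (e.g. "4").
per_chromosome = Counter(TA_CHR)
per_group = Counter(chrom[0] for chrom in TA_CHR)

print(len(TA_CHR), "genes on", len(per_chromosome), "chromosomes")
for group in sorted(per_group):
    print(f"group {group}: {per_group[group]} genes")
```

Groups 2 and 3 do not appear in the tally because no TaADH gene in Table 1 is located on them.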
Alignment and evolutionary analysis
Comparison of the protein sequences of the 22 TaADH genes showed that most residues were identical among them. Pfam scanning showed that all of the sequences contained the characteristic ADH domains (the GroES-like domain and the zinc-binding domain) (Fig. 1A), with the GroES-like domain spanning residues 35–164 and the zinc-binding domain spanning residues 206–340. This indicates that these genes belong to the ADH family. However, the position of the zinc-binding domain in TaADH20-TaADH22 differed from that in the other genes (marked with a blue box). To examine the evolutionary relationships between the ADHs of wheat and those of other plant species, namely Arabidopsis thaliana (7), Cucumis melo (13), Cucumis sativus (12), Glycine max (3), Hordeum vulgare (1), Lycopersicon esculentum (7), Oryza sativa (1) and Vitis vinifera (8), a phylogenetic tree was constructed from a multiple sequence alignment that included the 22 TaADH proteins, using the neighbor-joining (NJ) method (Fig. 1B). The predicted ADH genes were classified into three groups, namely short-chain, medium-chain, and long-chain ADH, and all 22 TaADH genes belonged to the medium-chain type. According to their evolutionary relationships, the TaADH genes can be divided into three subfamilies: Class I contained the largest number of TaADHs (15 genes, TaADH1-9 and TaADH11-16), followed by Class II (4 genes, TaADH10 and TaADH17-19) and Class III (3 genes, TaADH20-22).
Conserved domain analysis
Analysis of the gene structure of the TaADH genes showed that Class I genes typically had 9 exons with similar intron distributions, Class II genes had 9 exons with similar intron positions, and Class III genes had 8 exons (Fig. 2A). To further clarify the protein structure of the TaADH family members in wheat, we identified conserved motifs using MEME software (Fig. 2B and Fig. S1). Class I TaADH proteins contained 12 motifs (for example, Motif 1-7-4-9-2-8-5-11-6-10-3-14), Class II proteins contained 11 motifs (for example, Motif 1-7-4-9-2-8-5-11-6-10-3), and Class III proteins showed a motif composition different from the other two classes (for example, Motif 1-4-7-12-5-13-3). We then analyzed the functional domains of these proteins (Fig. 2C). The members of the wheat TaADH family had highly conserved functional domains: Class I proteins mainly contained the alcohol_DH_plant domain, Class II proteins mainly contained the alcohol_DH_class_III domain, and Class III proteins contained the Zn_ADH10 domain. Overall, the TaADH family members contained the typical alcohol_DH structural domain.
Chromosome distribution and synteny analysis
The 22 TaADH family members were distributed on 15 wheat chromosomes (Fig. 3A and 3B), with three genes on each of chromosomes 4A, 4B, and 4D, where tandem duplication events had occurred. To explore the collinearity of the TaADHs within the wheat genome and between the wheat and rice genomes, collinearity analysis was carried out with MCScanX (Fig. 3A and 3B). We found 17 segmental duplication events among members of the TaADH family in the wheat genome: three homologous gene pairs on each of chromosome groups 1, 6, and 7; one pair between TaADH15 and TaADH16 on chromosomes 5B and 5D; and seven pairs (TaADH5-TaADH8, TaADH6-TaADH9, TaADH7-TaADH10, TaADH5-TaADH11, TaADH6-TaADH12, TaADH8-TaADH11, TaADH9-TaADH12) on chromosomes 4A, 4B, and 4D, which were related to the tandem duplication events on these chromosomes. A total of 9 syntenic orthologous gene pairs were found between the wheat and rice genomes (Fig. 3B): TaADH6 corresponds to LOC_Os11g10510.1; TaADH8 and TaADH11 correspond to LOC_Os11g10480.1; TaADH17, TaADH18, and TaADH19 correspond to LOC_Os02g57040.1; and TaADH20, TaADH21, and TaADH22 correspond to LOC_Os08g01760.1.
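The pair counts reported for group 4 and for the wheat-rice comparison can be tallied directly from the gene lists in the text. The sketch below is illustrative only: the pairs and rice locus IDs are copied from the paragraph above and the chromosome assignments from Table 1, while the variable names are hypothetical and this is not the MCScanX workflow itself.

```python
from collections import Counter

# Chromosome of each group-4 gene involved, taken from Table 1.
CHROM = {
    "TaADH5": "4A", "TaADH6": "4A", "TaADH7": "4A",
    "TaADH8": "4B", "TaADH9": "4B", "TaADH10": "4B",
    "TaADH11": "4D", "TaADH12": "4D",
}

# The seven duplicated pairs on chromosomes 4A, 4B and 4D listed in the text.
GROUP4_PAIRS = [
    ("TaADH5", "TaADH8"), ("TaADH6", "TaADH9"), ("TaADH7", "TaADH10"),
    ("TaADH5", "TaADH11"), ("TaADH6", "TaADH12"),
    ("TaADH8", "TaADH11"), ("TaADH9", "TaADH12"),
]

# Count pairs by the two chromosomes they connect (e.g. "4A-4B").
pairs_by_chromosomes = Counter(f"{CHROM[a]}-{CHROM[b]}" for a, b in GROUP4_PAIRS)

# Rice loci and their syntenic wheat genes, as listed in the text.
RICE_SYNTENY = {
    "LOC_Os11g10510.1": ["TaADH6"],
    "LOC_Os11g10480.1": ["TaADH8", "TaADH11"],
    "LOC_Os02g57040.1": ["TaADH17", "TaADH18", "TaADH19"],
    "LOC_Os08g01760.1": ["TaADH20", "TaADH21", "TaADH22"],
}
n_wheat_rice_pairs = sum(len(genes) for genes in RICE_SYNTENY.values())

print(dict(pairs_by_chromosomes))
print("wheat-rice syntenic pairs:", n_wheat_rice_pairs)
```

All seven group-4 pairs connect genes on different subgenomes, and the four rice loci together account for the nine wheat-rice pairs.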
Evolutionary analysis
The non-synonymous substitution rate (Ka), synonymous substitution rate (Ks), and Ka/Ks ratio were calculated for 64 duplicated pairs to assess the selection pressure on wheat TaADH family genes during evolution (Fig. 4A, Table S2). The Ka/Ks ratios of these duplicated pairs were less than 1, indicating purifying selection and suggesting that the TaADH gene sequences were highly similar and relatively conserved during evolution. The duplication events of the TaADH genes could be divided into three periods (Fig. 4B, Table S2): 30 duplicated gene pairs arose about 11.19 to 16.42 million years ago (Mya), 12 pairs arose about 7.73 to 9.56 Mya, and the other 22 pairs arose less than 6 Mya, mostly before the wheat polyploidization event. Thus, although the sequences of these genes were conserved, their duplication times differed.
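For readers unfamiliar with how Ka/Ks values are interpreted and how Ks is converted to a duplication time, a minimal sketch follows. It uses the standard relation T = Ks / (2λ). The substitution rate λ = 6.5 × 10^-9 substitutions per synonymous site per year is an assumption (a value commonly used for grasses); this section does not state which rate was applied. The example Ka and Ks values are hypothetical and are not taken from Table S2.

```python
# Assumed synonymous substitution rate (per site per year). This value is
# commonly used for grasses, but the source does not state the rate it used.
ASSUMED_RATE = 6.5e-9


def ka_ks_ratio(ka: float, ks: float) -> float:
    """Return Ka/Ks for one duplicated gene pair."""
    if ks <= 0:
        raise ValueError("Ks must be positive to compute Ka/Ks")
    return ka / ks


def selection_mode(ratio: float) -> str:
    """Classify selection pressure from a Ka/Ks ratio."""
    if ratio < 1:
        return "purifying"
    if ratio > 1:
        return "positive"
    return "neutral"


def divergence_time_mya(ks: float, rate: float = ASSUMED_RATE) -> float:
    """Estimate duplication time in million years ago: T = Ks / (2 * rate)."""
    return ks / (2 * rate) / 1e6


# Hypothetical example pair (not from Table S2).
ka, ks = 0.02, 0.2
ratio = ka_ks_ratio(ka, ks)
print(f"Ka/Ks = {ratio:.2f} ({selection_mode(ratio)} selection)")
print(f"Estimated duplication time = {divergence_time_mya(ks):.2f} Mya")
```

Under this assumed rate, the three periods reported above would correspond to progressively smaller Ks values, since the estimated time scales linearly with Ks.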
Cis-regulatory element analysis of TaADH genes in wheat
To identify the cis-regulatory elements upstream of the TaADH genes, we selected the 2-kb promoter region upstream of the CDS of each TaADH gene and used TBtools software to predict and visualize the cis-acting elements (Fig. 5). The promoters of these genes contained a variety of cis-acting elements responsive to 11 kinds of signals and stresses (hormone response, anaerobic response, defense and stress response, drought induction, light response, low-temperature response, etc.). Except for TaADH13, the promoters of all genes contained anaerobic-responsive elements (ARE), and the promoters of TaADH6 and TaADH9 contained as many as six AREs. The promoter of TaADH4 contained 8 abscisic acid-responsive elements (ABRE), and the promoter of TaADH3 contained as many as 14 methyl jasmonate-responsive elements (TGACG-motif and CGTCA-motif).
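To illustrate what counting cis-regulatory elements in a promoter involves, the sketch below scans a sequence for the two methyl jasmonate-responsive core motifs named above. It is a simplified illustration and not the procedure implemented in TBtools: it scans only the strand supplied, uses exact string matching, and the promoter sequence and function name are hypothetical.

```python
import re

# Core sequences of the two motifs named in the text. The two cores are
# reverse complements of each other.
MOTIFS = {"CGTCA-motif": "CGTCA", "TGACG-motif": "TGACG"}


def count_motifs(promoter: str, motifs: dict = MOTIFS) -> dict:
    """Count occurrences of each motif core on the given strand only."""
    seq = promoter.upper()
    counts = {}
    for name, core in motifs.items():
        # A zero-width lookahead lets overlapping occurrences be counted.
        counts[name] = len(re.findall(f"(?={re.escape(core)})", seq))
    return counts


# Hypothetical toy promoter fragment (not a real TaADH promoter).
toy_promoter = "aaCGTCAttTGACGccCGTCAgg"
print(count_motifs(toy_promoter))
```

In a real analysis the same scan would be applied to each 2-kb upstream region, and the per-gene counts would be compared across the 22 promoters.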
Expression patterns of TaADH genes in different tissues and organs
FPKM values for the transcript accumulation of the 22 TaADH genes were obtained from publicly available RNA-seq expression data sets for different tissues and organs of T. aestivum, and heatmaps of the relative expression levels were generated using the Heatmap tool. The transcription levels were examined in various T. aestivum tissues, including root, leaf, stem, spike, grain, and seedling (Fig. 6). We found that, apart from TaADH2, TaADH6 and TaADH7 were not expressed in leaves, whereas the expression of TaADH6 was highest in grains and that of TaADH7 was highest in stems. The expression pattern of TaADH4 was similar to that of TaADH6, with the highest expression in grains. TaADH1 and TaADH9 expression was detected only in grains and roots, while the expression pattern of TaADH15 was opposite to that of TaADH1 and TaADH9. The other genes (TaADH3, TaADH5, TaADH8, TaADH10-11, TaADH14, TaADH16-22) were expressed in all tissues examined. Among them, the expression of TaADH5, TaADH8, TaADH11, and TaADH17-22 was highest in grains.
The expression of TaADH genes in wheat seed under waterlogging treatment
To analyze the response of seeds of two wheat varieties with different waterlogging tolerance to waterlogging stress during the germination stage, we analyzed the relative expression levels of the 22 members of the wheat TaADH family (Fig. 7). In seeds of the intolerant variety Zhoumai 22, the expression levels of seven TaADH genes (TaADH1/2, TaADH13, TaADH17, TaADH18, TaADH19, TaADH20) were significantly up-regulated at 24 hours after waterlogging treatment compared with the control. At 72 hours after germination, however, the expression levels of TaADH1/2, TaADH17, TaADH18, TaADH19, and TaADH20 did not differ significantly from the control, and only TaADH13 showed an upward trend. In seeds of Bainong 607, the expression levels of 14 genes (TaADH1/2, TaADH3-6, TaADH8-13, TaADH19, and TaADH20) were significantly up-regulated at 24 hours after waterlogging treatment compared with the control. At 72 hours after germination, TaADH1/2, TaADH3, and TaADH9 were significantly up-regulated compared with the control, whereas the expression levels of TaADH5, TaADH6, TaADH14, and TaADH16 decreased. These results suggest that the difference between waterlogging-tolerant and waterlogging-intolerant varieties after waterlogging treatment is closely related to the early and rapid expression of TaADH genes.