The foxtail millet genome contains 94 AAT genes, all of which have transmembrane regions
First, 104 putative AAT transcripts were identified in foxtail millet by local BLASTP. After the conserved domains were screened with HMMER, CDD and InterPro, and alternative transcripts of the same gene and sequences with incomplete conserved domains were removed, 94 high-confidence, non-redundant AAT genes were confirmed. This number was broadly comparable to that in other gramineous species except hexaploid wheat (Table 1; Additional file 5: Table S1). The AAT gene family in foxtail millet was larger than that in Arabidopsis and potato, although smaller than that in tetraploid soybean. Overall, the number of AAT genes was higher in monocots than in dicots. The AAT genes identified in foxtail millet were renamed according to their chromosomal locations and their phylogenetic relationships with genes from other species. The length of the AAT proteins in foxtail millet ranged from 311 to 984 aa, the molecular weight (Mw) from 4.97 to 107.5 kD, and the isoelectric point (pI) from 4.94 to 9.99 (Additional file 5: Table S1). The AAT subfamilies differed markedly in subcellular localization: members of the AAP, LHT, GAT, AUX and ProT subfamilies were all located on the plasma membrane, whereas those of the TTP, ACT, ANT and ATLb subfamilies were all located on the vacuolar membrane. Members of some subfamilies differed in localization. For example, genes of the ATLa subfamily were located on either the plasma membrane or the vacuolar membrane, while those of the CAT and PHS subfamilies were located on the plasma, vacuolar or chloroplast membranes.
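The redundancy-removal step described above can be illustrated with a minimal sketch. It assumes each candidate transcript has already been annotated with a flag for whether its conserved domain is complete; the `Candidate` record, the identifiers, and the rule of keeping the longest transcript per gene are illustrative assumptions, since the text does not state which transcript was retained.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    gene_id: str           # hypothetical gene locus identifier
    transcript_id: str     # hypothetical transcript identifier
    length_aa: int         # protein length in amino acids
    domain_complete: bool  # assumed summary of the domain searches


def select_high_confidence(candidates):
    """Drop candidates with incomplete domains, then keep one transcript per gene.

    Keeping the longest transcript is an assumption made for this sketch.
    """
    best = {}
    for cand in candidates:
        if not cand.domain_complete:
            continue
        current = best.get(cand.gene_id)
        if current is None or cand.length_aa > current.length_aa:
            best[cand.gene_id] = cand
    return sorted(best.values(), key=lambda c: c.gene_id)


# Made-up demonstration records, not data from the study.
demo = [
    Candidate("geneA", "geneA.1", 480, True),
    Candidate("geneA", "geneA.2", 455, True),
    Candidate("geneB", "geneB.1", 310, False),
    Candidate("geneC", "geneC.1", 512, True),
]
kept = select_high_confidence(demo)
print([c.transcript_id for c in kept])
```

In this toy input, the second transcript of `geneA` is discarded as redundant and `geneB` is discarded for its incomplete domain.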
Table 1
Comparison of the gene abundance of the twelve AAT subfamilies in 6 monocots and 3 eudicots.
| | Monocots | | | | | | | Eudicots | | |
Family | Subfamily | Millet a | Sorghum | Maize | Brachypodium | Rice | Wheat | | Soybean | Arabidopsis | Potato |
AAAP | AAP | 20(21.27%) | 17(22.07%) | 24(22.43%) | 19(23.75%) | 19(22.35%) | 66(22.30%) | | 35(18.52%) | 8(12.70%) | 8(11.11%) |
| LHT | 12(12.77%) | 8(10.39%) | 15(14.42%) | 8(10.00%) | 6(7.05%) | 24(8.11%) | | 24(12.70%) | 10(15.87%) | 11(15.28%) |
| GAT | 6(6.38%) | 4(5.19%) | 2(1.92%) | 3(3.75%) | 4(4.71%) | 14(4.73%) | | 19(10.05%) | 2(3.17%) | 3(4.17%) |
| ProT | 1(1.06%) | 1(1.30%) | 2(1.92%) | 2(2.50%) | 3(3.53%) | 9(3.04%) | | 7(3.70%) | 3(4.76%) | 4(6.35%) |
| AUX | 4(4.25%) | 5(6.49%) | 6(5.60%) | 3(3.75%) | 5(5.88%) | 15(5.07%) | | 16(8.47%) | 4(6.35%) | 5(7.94%) |
| ATLa | 6(6.38%) | 6(7.79%) | 7(6.54%) | 6(7.50%) | 7(8.24%) | 18(6.08%) | | 16(8.47%) | 5(7.94%) | 8(11.11%) |
| ANT | 2(2.13%) | 4(5.19%) | 3(2.8%) | 5(6.25%) | 4(4.71%) | 18(6.08%) | | 6(3.17%) | 4(6.34%) | 5(7.94%) |
| ATLb | 14(14.89%) | 8(10.39%) | 17(15.89%) | 7(8.75%) | 10(11.76%) | 40(13.51%) | | 30(15.87%) | 10(15.87%) | 8(11.11%) |
APC | ACT | 8(8.51%) | 5(6.49%) | 7(6.54%) | 6(7.50%) | 7(8.24%) | 21(7.09%) | | 7(3.70%) | 1(1.59%) | 1(1.39%) |
| CAT | 12(12.77%) | 11(14.29%) | 14(13.08%) | 11(13.75%) | 11(12.94%) | 31(10.47%) | | 19(10.05%) | 9(14.29%) | 9(12.5%) |
| PHS | 8(8.51%) | 7(9.09%) | 7(6.54%) | 7(8.75%) | 9(10.59%) | 31(10.47%) | | 9(4.76%) | 5(7.94%) | 8(11.11%) |
| TTP | 1(1.06%) | 1(1.30%) | 3(2.80%) | 3(3.75%) | 0(0.00%) | 9(3.04%) | | 1(0.53%) | 2(3.17%) | 2(2.78%) |
Total | | 94 | 77 | 107 | 80 | 85 | 296 | | 189 | 63 | 72 |
a The numbers represent the number of identified AAT subfamily members, and the percentages in brackets represent their proportion of all AAT genes in that species.
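As a sketch of how the bracketed proportions in Table 1 can be derived, the snippet below divides each subfamily count by the species total, using the foxtail millet column. The variable and function names are illustrative; computed values may differ from the printed ones in the last digit, depending on rounding.

```python
# Foxtail millet subfamily counts, copied from the millet column of Table 1.
millet_counts = {
    "AAP": 20, "LHT": 12, "GAT": 6, "ProT": 1, "AUX": 4, "ATLa": 6,
    "ANT": 2, "ATLb": 14, "ACT": 8, "CAT": 12, "PHS": 8, "TTP": 1,
}


def subfamily_proportions(counts):
    """Return the species total and each subfamily's share in percent."""
    total = sum(counts.values())
    shares = {name: 100.0 * n / total for name, n in counts.items()}
    return total, shares


total, shares = subfamily_proportions(millet_counts)
for name, pct in shares.items():
    print(f"{name}: {millet_counts[name]} ({pct:.2f}%)")
print("Total:", total)
```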
The number of predicted TM regions in SiAATs ranged from 7 to 14 and varied among subfamilies (Fig. 1a; Additional file 5: Table S1). For example, all AUXs contained 10 TM regions and ATLas contained 10 or 11, whereas members of the CAT and PHS subfamilies contained 9 to 14 and 7 to 12 TM regions, respectively.
Duplication events occurred in 62% of SiAAT genes and promoted their expansion in foxtail millet
Of the 94 SiAATs, 93 were mapped to the 9 chromosomes and the remaining one was mapped to an unassembled scaffold (Additional file 1: Figure S1). The genes were unevenly distributed, with the most on chromosome 7 (18) and the fewest on chromosome 2 (2). Clear regional enrichment of SiAATs was observed on some chromosomes. For instance, a large number of SiAATs were located at the ends of chromosomes 1, 5 and 7, and some SiAATs were concentrated toward the front of chromosomes 6 and 8 (Additional file 1: Figure S1).
Of the 94 AAT genes in foxtail millet, 58 (62%) were involved in gene duplication events, including 36 in tandem and 25 in segmental duplications; SiATLb12, SiLHT2 and SiLHT9 were involved in both types (Additional file 1: Figure S1; Additional file 5: Table S1). The 36 tandemly duplicated genes were classified into 13 groups: 2 groups with 5 genes (TD10, TD11), 1 group with 4 genes (TD3), 2 groups with 3 genes (TD1, TD13), and the remaining 8 groups with 2 genes each (Additional file 5: Table S1). The 25 segmentally duplicated genes were classified into 12 groups; SD3 contained 3 genes and each of the others contained 2. Gene duplication events produced a large number of paralogous AAT genes and thus promoted the substantial expansion of the AAT gene family in foxtail millet.
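The duplication counts above can be cross-checked with a short tally. This sketch assumes that the figures 36 and 25 count genes rather than events, and that the three genes found in both duplication types are counted once in the total; group labels follow the TD/SD names used in the text.

```python
TOTAL_SIAAT_GENES = 94

# Tandem groups: two with 5 genes, one with 4, two with 3, the other eight with 2.
tandem_sizes = {"TD10": 5, "TD11": 5, "TD3": 4, "TD1": 3, "TD13": 3}
for n in range(1, 14):
    tandem_sizes.setdefault(f"TD{n}", 2)

# Segmental groups: SD3 has 3 genes, the other eleven have 2.
segmental_sizes = {f"SD{n}": (3 if n == 3 else 2) for n in range(1, 13)}

# SiATLb12, SiLHT2 and SiLHT9 appear in both a tandem and a segmental group.
genes_in_both = 3

tandem_genes = sum(tandem_sizes.values())
segmental_genes = sum(segmental_sizes.values())
duplicated_genes = tandem_genes + segmental_genes - genes_in_both
duplicated_share = 100.0 * duplicated_genes / TOTAL_SIAAT_GENES

print(tandem_genes, segmental_genes, duplicated_genes, round(duplicated_share))
```

The tally reproduces the reported 36 tandem and 25 segmental genes, and 58 duplicated genes in total (about 62% of 94).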
The phylogenetic tree and structure of AAT genes in foxtail millet
Phylogenetic analysis of the AAT protein sequences from foxtail millet (94), rice (85), Arabidopsis (63) and potato (72) showed that they could be divided into twelve groups with high confidence, and that proteins from monocots and eudicots were distributed within the same groups (Fig. 2). The 65 SiAAT proteins of the AAAP family were divided into 8 subfamilies: amino acid permeases (AAPs, 20), lysine/histidine transporters (LHTs, 12), GABA transporters (GATs, 6), proline transporters (ProTs, 1), auxin transporters (AUXs, 4), amino acid transporter-like a (ATLa, 6), aromatic and neutral amino acid transporters (ANTs, 2) and amino acid transporter-like b (ATLb, 14). The 29 SiAAT proteins of the APC family were clearly separated into 4 subfamilies: cationic amino acid transporters (CATs, 12), amino acid/choline transporters (ACTs, 8), tyrosine-specific transporters (TTPs, 1) and polyamine H+-symporters (PHSs, 8) (Fig. 2).
According to their annotation, most SiAATs in the same subfamily had similar exon/intron structures, such as SiATLa1 and SiATLa5, SiCAT3 and SiCAT10, and SiATLb3 and SiATLb2. Structural differences were also found between some SiAATs of the same subfamily, such as SiAUX2 and SiAUX3, SiAAP11 and SiAAP18, and SiANT1 and SiANT2 (Fig. 3). The conserved motifs of SiAAT proteins predicted with MEME were highly consistent with the phylogenetic relationships and classification of SiAATs (Fig. 3; Additional file 2: Figure S2). As with the gene structures, the conserved motifs present varied among the SiAAT subfamilies. For instance, Motifs 1, 4 and 7 were widespread in the AAAP family, while Motif 9 was present in the APC family. Some motifs were unique to certain subfamilies: Motif 19 was found only in the AUX subfamily, and Motifs 16 and 18 only in the ACT subfamily.
Variations in gene structure, conserved domains and sequences of paralogous AAT genes promoted their functional differentiation
Diversification of paralogous genes is an important source of functional differentiation in gene families [46–48]. Comparison of the TM regions of SiAATs showed that the number of TM regions often varied within duplicated gene groups: 60% of the groups (15/25) varied in the number of TM regions, comprising 77% (10/13) of the tandemly duplicated and 42% (5/12) of the segmentally duplicated gene groups (Fig. 1b; Table 2).
Table 2
The extensive variations in gene structure, protein structure and sequence of paralogous AAT genes produced by gene duplication in foxtail millet.
Type of duplication | Subfamily | No. of duplicated gene groups | Gene and protein structure variation | | | Range of Ka/Ks values | Duplicated gene groups |
| | | No. of TM variation | No. of gene structure variation | No. of conserved domain variation | | |
Tandem duplication | AAP | 4 | 3 | 3 | 2 | 0.12–0.62 | TD1-TD4 |
| ATLb | 3 | 3 | 1 | 2 | 0.27–0.47 | TD5-TD7 |
| GAT | 1 | 0 | 1 | 1 | 0.43 | TD8 |
| LHT | 2 | 2 | 2 | 0 | 0.10–0.39 | TD9,TD10 |
| ACT | 1 | 0 | 1 | 1 | 0.18–0.27 | TD11 |
| CAT | 1 | 1 | 1 | 0 | 0.97 | TD12 |
| PHS | 1 | 1 | 0 | 1 | 0.41–0.67 | TD13 |
| Sum a | 13 | 10 (77%) | 9 (69%) | 7 (54%) | 0.10–0.97 (0.28) | |
Segmental duplication | ATLa | 2 | 0 | 0 | 2 | 0.10–0.12 | SD1,SD2 |
| ATLb | 3 | 2 | 2 | 2 | 0.12–0.40 | SD3-SD5 |
| LHT | 2 | 2 | 1 | 0 | 0.12–0.37 | SD6,SD7 |
| ACT | 1 | 0 | 1 | 0 | 0.10 | SD8 |
| CAT | 4 | 1 | 1 | 3 | 0.14–0.50 | SD9-SD12 |
| Sum a | 12 | 5 (42%) | 5 (42%) | 7 (58%) | 0.10–0.50 (0.21) | |
a The data in columns 4, 5 and 6 represent the number and proportion (in brackets) of duplicated gene groups showing variation, while those in column 7 represent the range and average (in brackets) of the Ka/Ks values.
Multiple sequence alignment of the SiAAP proteins revealed an overall similarity of 59.65% and 11 conserved motifs, and the TM regions corresponded closely to the conserved motifs in both length and amino acid composition (Fig. 4). Motifs 1 and 14 together formed the TM 1 and TM 2 regions, and Motif 4 covered TM 4 and part of TM 5. TM regions 3, 6, 7, 8, 9 and 10 corresponded to Motifs 8, 6, 2, 7, 5 and 13, respectively (Fig. 4). In addition, some conserved motifs were located outside the TM regions, such as Motif 11 in the extra-membrane region and Motif 3 in the intra-membrane region (Fig. 4). Notably, some SiAAPs, such as SiAAP6 and SiAAP9, lacked TM regions because the corresponding conserved motifs were incomplete. The variation in the number of TM regions was therefore mainly determined by which conserved domains were present. Secondary structure prediction of SiAAT proteins identified 14 α-helices and 4 η-helices, the vast majority of which were located in the TM regions, which may help ensure efficient and stable transmembrane transport of amino acids (Fig. 4).
In terms of gene structure, 56% (14/25) of the duplicated gene groups varied in the number of introns, comprising 69% (9/13) of the tandemly duplicated and 42% (5/12) of the segmentally duplicated groups (Table 2; Additional file 3: Figure S3). In addition, variation in the conserved motifs was observed in 14 duplicated gene groups, 7 for each of the two types of duplication (Table 2).
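The proportions quoted from Table 2 follow directly from the "Sum" rows. The sketch below recomputes them per duplication type and for all 25 groups combined; the dictionary layout and key names are illustrative, while the counts are taken from Table 2.

```python
# "Sum" rows of Table 2: number of groups, and groups showing each kind of variation.
table2_sums = {
    "tandem": {"groups": 13, "tm": 10, "gene_structure": 9, "domain": 7},
    "segmental": {"groups": 12, "tm": 5, "gene_structure": 5, "domain": 7},
}


def percent(part, whole):
    """Proportion in percent, rounded to the nearest integer as in the table."""
    return round(100.0 * part / whole)


per_type = {
    dup: {k: percent(v, row["groups"]) for k, v in row.items() if k != "groups"}
    for dup, row in table2_sums.items()
}

all_groups = sum(row["groups"] for row in table2_sums.values())
overall = {}
for key in ("tm", "gene_structure", "domain"):
    count = sum(row[key] for row in table2_sums.values())
    overall[key] = (count, percent(count, all_groups))

print(per_type)
print(all_groups, overall)
```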
The Ka/Ks values of all paralogous SiAAT gene pairs were less than 1, ranging from 0.10 to 0.97, which suggested that these genes were subject to different levels of purifying selection (Fig. 5). The tandemly duplicated gene pairs had higher Ka/Ks values than the segmentally duplicated pairs, with average Ka/Ks values of 0.28 and 0.21, respectively (Table 2). The Ka/Ks values of the tandemly duplicated genes of the PHS subfamily ranged from 0.41 to 0.67, while those of the ACT subfamily ranged only from 0.18 to 0.27. The Ka/Ks values also differed in dispersion among subfamilies. For instance, those of SiAAPs ranged from 0.12 to 0.62, while those of SiBATs (ACT subfamily) ranged from 0.18 to 0.27. A similar pattern was observed for the Ka/Ks values of segmentally duplicated genes (Fig. 5). In general, the substantial variation in gene structure, conserved domains and sequences among the paralogous SiAAT gene groups might greatly promote the functional diversity of the AAT gene family in foxtail millet.
Variations in expression levels of paralogous and orthologous SiAAT genes together drove the functional differentiation of the AAT family in foxtail millet
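The standard reading of Ka/Ks used here (below 1 indicates purifying selection, about 1 neutral evolution, above 1 positive selection) can be written as a small helper. This is a minimal sketch: the function name and tolerance are illustrative, and the example values are the range endpoints from Table 2 rather than per-pair data.

```python
def selection_regime(ka_ks, tol=1e-9):
    """Classify a Ka/Ks ratio using the conventional thresholds around 1."""
    if ka_ks < 0:
        raise ValueError("Ka/Ks cannot be negative")
    if ka_ks < 1 - tol:
        return "purifying"
    if ka_ks > 1 + tol:
        return "positive"
    return "neutral"


# Range endpoints reported in Table 2 for each duplication type.
reported_ranges = {"tandem": (0.10, 0.97), "segmental": (0.10, 0.50)}
regimes = {
    dup: {selection_regime(v) for v in bounds}
    for dup, bounds in reported_ranges.items()
}
print(regimes)
```

Because both endpoints of each reported range lie below 1, every value in between is also classified as purifying, in line with the interpretation above.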
The expression patterns of all paralogous AAT genes were investigated using public transcriptome data. Relative to the ancestral genes, the newly duplicated genes could be classified by expression pattern into three main types: neofunctionalization, subfunctionalization and nonfunctionalization. All three types were observed in the tandemly duplicated gene groups. SiAAP8/9 (TD2) had acquired new functions: SiAAP9 was highly expressed during grain development and SiAAP8 in leaves at the filling stage. SiAAP14/15/16/17 (TD3) and SiCAT1/2 (TD12) showed subfunctionalization, as the expression of SiAAP16/17 and SiCAT2 was significantly down-regulated. In addition, SiATLb12 and SiLHT3 had lost their functions, as they were not expressed in any tissue (Fig. 6a). Although no neofunctionalization was observed, similar functional differentiation was also found in the segmentally duplicated gene groups, such as subfunctionalization of SiATLb1/6/10 (SD3), SiBAT1/SiBAT8 (SD8) and SiCAT3/10 (SD9), and nonfunctionalization of SiATLa1/5 (SD1), SiLHT1/2 (SD6) and SiCAT4/11 (SD10) (Fig. 6b).
There were 62, 51, 52 and 35 orthologous AAT genes found in sorghum, rice, wheat and Arabidopsis, respectively, and their correspondence with the foxtail millet genes was investigated (Fig. 7; Additional file 6: Table S2). Because the wheat genome is much larger than those of the other species, wheat was not displayed in Fig. 7. The collinear relationships between species were clearly distinguishable, with good collinearity, for example, between chromosome 9 of foxtail millet and chromosome 1 of sorghum, and between chromosomes 3, 4 and 5 of foxtail millet and chromosomes 8, 9 and 10 of sorghum, respectively. Moreover, the collinearity of the orthologous AAT genes among species was consistent with that of the whole genomes. The numbers of orthologous LHT and LAT genes in millet and sorghum were significantly higher than those in wheat and rice, which suggested that the expansion of these subfamilies in millet and sorghum was completed later than that in rice and wheat (Fig. 8; Additional file 6: Table S2). Compared with Arabidopsis, the AAP subfamily in grass species was significantly expanded. For instance, orthologs of SiAAP2, SiAAP12 and SiAAP14 could be found only in grass species, which suggested that these genes might have arisen after the divergence of monocotyledons and dicotyledons.
The expression characteristics of these orthologous AAT genes in root, stem, leaf, inflorescence/spike and grain of foxtail millet, sorghum, wheat, rice and Arabidopsis were compared using their integrated transcriptome datasets. Because wheat is a hexaploid with almost three homologous copies of each gene, the average expression value of these copies was used. The expression patterns of SiAATs in foxtail millet and of their orthologs in other species were generally conserved, especially within the grass species (Fig. 8). The correlation coefficients of the AAT expression patterns between foxtail millet and sorghum, wheat and rice were 0.483, 0.470 and 0.481, respectively, while the coefficient between foxtail millet and Arabidopsis was only 0.21. Several highly expressed AAT genes in foxtail millet, such as SiCAT3 and SiATLa5, were also highly conserved in other species. Although the expression patterns of some important orthologous AAT genes were relatively conserved among species, those of other genes had diverged, such as SiATLa6 and SiATLb6 (Fig. 8). Compared with their orthologs in Arabidopsis, SiAATs showed greater variation in expression patterns, and several genes, such as SiATLb1 and SiAUX3, had changed their tissue specificity.
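The comparison described above can be sketched in two steps: averaging the homologous wheat copies into a single profile, then correlating tissue profiles between species. The text does not state which correlation coefficient was used, so Pearson's r is assumed here; the function names and the toy expression values are illustrative and are not data from the study.

```python
import math


def mean_profile(profiles):
    """Average several equal-length expression profiles position by position."""
    n = len(profiles)
    return [sum(values) / n for values in zip(*profiles)]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / (sx * sy)


# Toy values for root, stem, leaf, spike and grain (made up for illustration).
millet_gene = [1.0, 2.0, 3.0, 4.0, 5.0]
wheat_copies = [
    [2.0, 4.0, 6.0, 8.0, 10.0],   # hypothetical A-subgenome copy
    [1.0, 2.0, 3.0, 4.0, 5.0],    # hypothetical B-subgenome copy
    [3.0, 6.0, 9.0, 12.0, 15.0],  # hypothetical D-subgenome copy
]
wheat_gene = mean_profile(wheat_copies)
r = pearson(millet_gene, wheat_gene)
print(wheat_gene, r)
```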
SiAATs showed abundant spatiotemporal expression patterns, and some of them were specifically expressed in developing grains
The expression patterns of SiAATs were analyzed using RNA-seq data from different tissues at multiple growth stages, collected from a public online database. The heat map was drawn from the normalized log2 (FPKM + 1) values (Fig. 9a). According to their expression patterns, the SiAATs clustered into three groups. The first group contained 8 SiAAT genes (7 from the AAAP family and 1 from the APC family) that had relatively high expression levels and were stably expressed in almost all tissues at different developmental stages. The 37 SiAAT genes in the second group were expressed at low abundance in most tissues but specifically in certain tissues; for example, SiAAP9 and SiAAP7 were highly expressed in germinating seeds, while SiATLb3, SiAUX3 and SiAUX4 were relatively highly expressed in the panicle. The 49 SiAAT genes in the third group showed diverse spatiotemporal expression characteristics. For example, SiATLa6 and SiATLa3 were highly expressed in leaves at the seedling stage but less so in leaves at the filling stage; SiAAP20 was specifically highly expressed in the stem; and SiAAP3, SiBAT7, SiCAT11, SiBAT2, SiLAT7 and SiLAT8 were abundantly expressed in leaf tissues, including leaf sheath and mesophyll. In addition, some genes were highly expressed in two or more tissues, such as SiAAP1, SiAAP13 and SiAUX1 in stem and root, while SiANT1, SiCAT3 and SiATLb1 were abundantly expressed in organs spanning the entire source-sink circulation, including root, stem, leaf and grain. The expression levels of the 20 selected SiAATs measured by qRT-PCR were not completely consistent with those from the transcriptome data, but their expression characteristics and trends were similar, which supported the reliability of the transcriptome data (Fig. 9a; Additional file 4: Figure S4).
Analysis of the spike RNA-seq data showed that about half of the SiAATs were expressed at low levels during seed development, while the remaining ones were expressed at relatively high levels at different stages (Fig. 9b). SiAAP2, SiAAP9, SiATLa5, SiLAT5, SiATLb2, SiBAT1 and others were highly expressed in spikelets and grains throughout grain development (from S1 to S5). SiATLa6, SiAAP8, SiAAP20, SiAAP1, SiLAT6, SiCAT2, SiCAT3 and others were highly expressed at the early stage of grain development, and SiGAT6, SiProT1 and others were highly expressed at specific stages. In addition, the SiAATs highly expressed during grain development were counted by subfamily: 14 SiAAPs, 8 SiCATs, 5 SiLATs, 5 SiATLas and 3 SiAUXs, which accounted for 70% of all SiAAPs (20), 67% of SiCATs (12), 63% of SiLATs (PHS, 8), 83% of SiATLas (6) and 75% of SiAUXs (4), respectively (Fig. 10).
There were significant differences in multiple grain quality traits between the two foxtail millet genotypes “JG21” and “YG1”. Except for crude fat content, the contents of glutenin and of the amino acids cysteine, alanine, methionine, leucine, tryptophan and serine in “JG21” grain were significantly lower than those in “YG1” (Fig. 11b). Developing grains (including glumes) of “JG21” and “YG1” at the filling stage in 2019 were collected for RNA extraction and RNA-seq. In total, 348 differentially expressed genes (DEGs) were identified in the developing grains of the two genotypes, of which 164 were down-regulated and 184 were up-regulated in “JG21”. GO analysis showed that the DEGs were mainly enriched in terms such as response to fungi or viruses, ribonuclease activity, stress response, and transmembrane transport of various substances (Fig. 11a). Interestingly, 9 AAT genes were enriched in four terms related to amino acid transport, including 5 SiAAPs, 1 SiANT, 1 SiATLb, 1 SiAUX and 1 SiBAT (Fig. 11c). The differential expression of these SiAATs suggested that they might directly affect the formation of grain quality.
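The log2 (FPKM + 1) transformation used for the heat map can be sketched as follows. Adding 1 before taking the logarithm keeps genes with zero FPKM defined (they map to 0) and compresses the range of highly expressed genes. The function name and the example values are illustrative only.

```python
import math


def log2_fpkm_plus_one(fpkm):
    """Transform a non-negative FPKM value to log2(FPKM + 1)."""
    if fpkm < 0:
        raise ValueError("FPKM cannot be negative")
    return math.log2(fpkm + 1.0)


# Made-up FPKM values for one gene across four tissues.
example_fpkm = [0.0, 1.0, 3.0, 1023.0]
transformed = [log2_fpkm_plus_one(v) for v in example_fpkm]
print(transformed)
```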
Some SiAATs actively responded to various abiotic stresses
The qRT-PCR results for JG21 and the transcriptome data for the foxtail millet cultivars Yugu1 and Yugu2 showed relatively consistent responses of SiAAT genes to abiotic stress (Fig. 12a). qRT-PCR analysis of 12 SiAATs in foxtail millet seedlings (15 days after sowing) subjected to simulated drought (20% PEG 6000, 1 h and 5 h) and salt (200 mM NaCl, 1 h and 5 h) revealed that the SiAAT subfamilies responded differently to abiotic stresses (Fig. 12b). Seven SiAAT genes were up-regulated more than 5-fold after drought or salt stress, of which SiANT1 was up-regulated about 60-fold at 5 h after drought stress. SiANT1, SiCAT10 and SiATLa1 were mainly induced by drought stress, and their expression reached the highest levels at 5 h after drought treatment. Except for SiAUX2 and SiATLb5, the other 7 SiAAT genes reached their highest expression levels at the late stage of salt stress (5 h). SiLHT12, SiATLb1 and SiLAT3 were specifically and strongly induced by prolonged salt stress, while SiANT1, SiATLa1 and SiATLb5 had similar expression patterns under both drought and salt stresses.
The responses of SiAATs to abiotic stress also showed temporal specificity. SiAAP3 was up-regulated at 1 h under drought stress but at 5 h under salt stress. SiLHT12 was down-regulated at 1 h under both drought and salt stresses, but was then strongly up-regulated at 5 h (Fig. 12b). SiANT1 was continually up-regulated as drought stress progressed, and was down-regulated at 5 h under salt stress. SiATLb5 and SiATLa1 responded at the early and late stages of drought and salt stress, respectively. Further, some SiAATs showed a sustained response to abiotic stresses; for example, both SiATLb2 and SiATLa1 were continually up-regulated under both salt and drought stresses. These genes might enhance the adaptability of foxtail millet to abiotic stresses through their active responsiveness.
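Fold changes such as "more than 5-fold" are commonly derived from qRT-PCR Ct values with the 2^-ΔΔCt method. The text does not state which quantification method or reference gene was used, so the sketch below is an assumption for illustration only; the function name and the Ct values are made up.

```python
def fold_change_ddct(ct_target_treated, ct_ref_treated,
                     ct_target_control, ct_ref_control):
    """Relative expression by the 2^-ddCt method (assumed, not stated in the text)."""
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** (-(delta_treated - delta_control))


# Hypothetical Ct values: target and reference gene, stressed vs. control seedlings.
fold = fold_change_ddct(
    ct_target_treated=22.0, ct_ref_treated=18.0,
    ct_target_control=25.0, ct_ref_control=18.0,
)
strongly_induced = fold > 5.0  # threshold mirroring the "more than 5-fold" criterion
print(fold, strongly_induced)
```

A lower Ct in the stressed sample, after normalization to the reference gene, corresponds to higher expression, so a ΔΔCt of -3 gives an 8-fold increase.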