Identification and phylogenetic analysis of LIM gene in wheat genome
LIM proteins are widely found in eukaryotes which act as developmental regulators in basic cellular processes such as regulating the transcription or organizing the cytoskeleton [11]. The genome wide identification of LIMs has been conducted in several plant species, such as Oryza sativa, Populus trichocarpa, Arabidopsis thaliana [11], Zea mays [12] and Chenopodium quinoa [14]. In order to get insights into the function of wheat LIMs (TaLIMs), comprehensive analysis of TaLIMs was performed in the present study. We used two methods to identify the Triticum aestivum LIM genes (TaLIMs) from the wheat reference genome (IWGSC v2.1). One is to use the downloaded Pfam domain sequences (PF00412) as seeds to perform BLASTp search against genome. Another one is to use the sequences of the known LIM proteins, including 14 AtLIMs, 9 OsLIMs and 11 ZmLIMs, as the queries to perform the BLASTp search [18]. Results of both methods were combined and redundant sequences were removed. The obtained candidate sequences were further checked by Pfam and InterProScan to excluded sequences that do not contain the domain of LIM. Finally, 28 TaLIMs were identified from wheat genome (Additional file 1: Table S1). In order to determine the evolutionary relationship between TaLIMs and AtLIMs, OsLIMs, and ZmLIMs, a phylogenetic tree containing 62 LIM protein sequences was constructed by multiple sequence alignment using ClustalW2. Based on the phylogenetic relationship, the 62 LIMs were divided into two groups, Group I containing 36 LIMs and Group II containing 26 LIMs (Fig. 1). Among the 28 TaLIMs, 17 have two LIM conserved domain and belong to Group I; 11 belong to Group II and contain one LIM domain at the N-terminal and another new conserved cysteine-rich domain DA1-like at the C-terminal [19]. Those TaLIMs were named by comprehensive consideration of the evolutionary relationships, the character of conserved domains, and the chromosome position. The name of TaLIMs was shown in Additional file 1: Table S1.
Chromosomal mapping, gene duplication and Ka/Ks analysis of TaLIMs
As showed in Fig. 2, chromosome distribution map of TaLIMs was constructed by using MapInspect according to the physical chromosome location extracted from the GFF3 file. The map showed that the 28 identified TaLIMs were unevenly distributed on 18 chromosomes, whereas no TaLIM was located on chromosome 3. The number of TaLIMs in A, B and D sub-genomes were counted. Results showed that there were 8 TaLIMs in A, 10 TaLIMs in B, and 10 TaLIMs in D sub-genome. Gene copy number variation is the key foundation for the plant genome adaptation to the changing environment since each copy of the gene can evolve independently and diversify its functions [20]. Fragment duplication and tandem duplication are generally considered to be the two main causes of gene copy number variation, which further result in the extension of plant gene family [21]. Therefore, it is necessary to detect the duplication events of TaLIMs. Based on the sequence similarity and chromosome position, a total of 26 pairs of fragment duplications were found among 28 TaLIMs, and no tandem duplication event was found. The result suggested that fragment duplication is a major contributor to the expansion of LIM family in wheat during evolution. In order to further decipher the evolutionary history of TaLIMs, we calculated the Ka/Ks values of 26 pairs of TaLIM fragment duplications using TBtools. Generally, the Ka/Ks value (non-synonymous substitution rate/synonymous substitution rate) is widely used to represent the selection pressure and evolutionary speed of gene evolution. Ka/Ks value = 1 represents the neutral selection effect, Ka/Ks value > 1 represents positive selection of evolution acceleration, and Ka/Ks value < 1 represents purifying selection under functional constraints. In this study, the Ka/Ks values of the 26 duplicate pairs were all less than 1 and the average value is 0.11 (Additional file 1: Table S2), indicating that they underwent strong purification selection in the evolutionary process, which may contribute to functional stability.
Protein characterization and homology modeling of TaLIMs
In order to further determine the protein characteristic of TaLIMs, the online tool ExPASY Server10 was used to predict the protein properties of TaLIMs (Additional file 1: Table S1). Results showed that the amino acid (aa) length of TaLIMs ranged from 196 to 1,279 aa. The instability index showed that the majority of TaLIMs were unstable proteins (instability index > 40). Only Group Ⅰ member TaLIM11-6A, TaLIM11-6B, TaLIM11-6D, TaLIM1-1A, TaLIM1-1B, TaLIM1-1D, TaLIM10-5A, TaLIM10-5B, and TaLIM10-5D were predicted to be stable protein (instability index < 40). The TaLIMs isoelectric point varied significantly between 4.57 and 8.63, and there were both acidic and basic proteins. The grand average of hydropathicity (GRAVY) of every TaLIMs is less than 0, suggesting that they are all hydrophilic proteins. Since no signal peptides were detected, it was speculated that all TaLIMs were non-secreted proteins. The subcellular localization prediction showed that all 11 TaLIMs in Group Ⅱ were located in the nucleus. In Group Ⅰ, 12 TaLIMs were located in the nucleus, while TaLIM11-6A, TaLIM7-4B, TaLIM8-4D, TaLIM10-5B, and TaLIM10-5D were located in both nucleus and golgiosome.
The 3D (three dimension) homology modeling of TaLIMs were completed by SWISS-MODEL and results showed that TaLIMs are mainly comprised by α-helix and β-turn and have considerable simple 3D structures, whereas the proportion of these two kinds of structures are different among TaLIM proteins (Fig. 3). Moreover, the proportion of α-helix (20-38%) was greater than that of β-turn (2.5-14%).
Gene structure and motifs analysis of TaLIMs
Structure determines function and exon/intron structural diversity usually play key roles in the evolution of gene families, which can provide additional evidence to support phylogenetic grouping [22]. Therefore, we used the GFF3 file of TaLIMs to analyze their exon/intron structure and was illustrated by GSDS. Results showed that all TaLIMs contained intron. The number of introns ranged from 4 to 12 and the number of exons ranged from 5 to 12. Most of the TaLIMs (23) contained complete UTR region, except for TaLIM5-4A, TaLIM5-4B, and TaLIM5-4D that only contained 3’-UTR region, and TaLIM2-2D and TaLIM8-4D contained no UTR region (Fig. 4b), which may be one of the reasons for the functional differentiation of TaLIMs. Protein conserved functional domain analysis showed that all of the 17 members in Group Ⅰ contain two LIM domains, and all of the 11 members in Group Ⅱ contain one LIM domain and one DA1-like domain, which is consistent with previous studies (Fig. 4a and c) [23]. Protein motif prediction is an effective method to predict protein function [24]. In order to further study structural diversity and predict the function of TaLIMs proteins, 15 motifs were identified using MEME online software and further annotated using Interproscan5 (Fig. 5b). Detailed information of 15 motifs including specific amino acid sequences of each motif were showed in Additional file 1: Table S4. Meanwhile, WebLogo was used to generate a conservative design logo icon (Fig. 5c). Results showed that TaLIMs closely clustered in phylogenetic tree (Fig. 5a) have similar motif arrangements and positions, suggesting that members in same tree branch may have similar biological function. All the TaLIMs contained motif1 (LIM-domain) and motif6. The function of motif6 is unknown. Group Ⅰ members specifically contained motif3 (LIM-domain), LIM-domain containing proteins have been shown to play roles in cytoskeletal organization and organ development [9, 10]. Group Ⅱ members specifically contained motif4, 5, 7, 8, which comprise the DA1-like conserved domain. But the function of DA1-like domain is unknown. These results showed that the motifs between Group I and II members were different, while the members belong to same Group contain similar motifs, which imply that the functions of different group members might be related to specific motif [25].
Analysis of cis-acting element for TaLIMs
Transcription factors are important in the regulation of plant growth and development, hormones metabolisms and stress responses. While cis-elements existed in the promoter region play important roles in the transcriptional level of genes in response to multiple stimuli [1]. Therefore, the analysis of promoter cis-acting elements is important in the study of genes function [26]. In this study, we identified 48 kinds of cis-acting elements in the upstream 1.5 kb region of 28 TaLIMs and these cis-acting elements were divided into three groups: 28 cis-acting elements belong to “biological/abiotic stress”, 11 belong to “growth and development”, and 9 belong to “phytohormone response”, respectively (Fig.6). The detailed information of these identified cis-acting elements including function and location were displayed in Additional file 1: Table S5. Results showed that the promoter sequences of TaLIMs have various cis-regulatory elements, such as anaerobic-response element (GC-motif and ARE), low temperature responsive element (LTR), cis-acting regulatory element involved in light responsiveness (G-box) and MYB binding site involved in drought-inducibility (MBS), which were related to biotic/abiotic stresses.
Abscisic acid (ABA) plays a vital role in plant growth and development as well as in mediating plant response to a wide range of stresses. The abscisic acid response element (ABRE) was identified in most of TaLIMs (17). Moreover, the other cis-elements involved in osmotic stress, such as MBS and TC-rich repeats, were also observed in TaLIMs promoters. This result suggested that LIM genes in wheat may be regulated by various factors, including drought and ABA, which need to be experimentally proved in further studies.
Transcriptomic profiling revealed various expression patterns of TaLIMs
For multigene family, analyses of gene expression patterns often provided useful clues for determining gene function [27]. The transcriptome data sets downloaded from NCBI were divided into three categories according to biotic stress, abiotic stress, and growth and development (Fig. 7, Additional file 1: Table S6). Expression patterns analysis showed that TaLIMs have variable expression patterns. The expression levels of TaLIM7-4B and TaLIM8-4D were generally high throughout the growth period and under various biotic and abiotic stress conditions. Under various biotic stress including Zymoseptoria tritici, powdery mildew, stripe rust, and Fusarium graminearum, the expression levels of 17 TaLIMs were highly induced. Whereas the expression levels of TaLIM1-1A, TaLIM1-1B, TaLIM1-1D, TaLIM2-2A, TaLIM2-2B, TaLIM2-2D, TaLIM11-6A, TaLIM11-6B, and TaLIM11-6D were only highly induced under the Fusarium graminearum treatment. As shown in Fig. 7, under 17 different growth and development stages, the expression levels of 19 TaLIMs were highly induced. But TaLIM1-1A, TaLIM1-1B, TaLIM1-1D, TaLIM11-6A, TaLIM11-6B, and TaLIM11-6D, which had high expression only in anther-anthesis stage and stigma & ovary-anthesis stage. TaLIM2-2A, TaLIM2-2B, and TaLIM2-2D were highly expressed only in the anther-anthesis stage. Under abiotic stress, most of the genes were expressed, whereas the expression levels were low (the expression levels showed in Additional file 1: Table S6). TaLIM2-2A and TaLIM11-6D were not expressed, while TaLIM2-2B, TaLIM2-2D, TaLIM11-6A, and TaLIM11-6B were expressed only under certain treatments.
Evolutionary conserved TaLIMs were family member expanded by polyploidization
Previous researches showed that the evolutionary history of common wheat is quite complex, and its ancestral origin is influenced by many factors. But it is commonly believed that wild diploid wheat (Triticum urartu) crossed with the B genome ancestor that is the closest relative of goat grass (Aegilops speltoides) to produce wild emmer wheat (Triticum dicoccoides) [28-30]. In order to further understand the evolutionary relationship and homology of the LIM genes among allohexaploid common wheat (AABBDD) and ancestral species, we used BLASTp to identify homologous genes from T. urartu (AA), Aegilops tauschii (DD) and T. dicoccoides (AABB), which resulted in the detection of 8, 10, and 20 homologous genes, respectively (Fig. 8a). Therefore, T. aestivum inherited more LIM genes from T. dicoccoides than T. urartu and Ae. tauschii. Furthermore, we speculate that the expansion of TaLIMs may derived from ancestors and expanded by polyploidization after the crossing between ancestral species. To further confirm this speculation, we identified the homologous gene pairs between T. aestivum and ancestral species, resulting in the detection of 4, 11, and 36 homologous pairs between T. aestivum and T. urartu (Ta vs Tu), T. aestivum and Ae. tauschii (Ta vs Ae), and T. aestivum and T. dicoccoide (Ta vs Td), respectively (Fig. 8b). Among the 51 homologous gene pairs, the Ka/Ks ratios for 49 pairs were less than 1 (Fig. 8c), implying that most duplicated gene pairs underwent purification selection pressure, which helps their function to be maintained and reflects that they do not have much divergence during evolution. To further validate the conservation of LIMs, we identified homologous genes in the Gramineae family and calculate their Ka/Ks. The result showed that 41 pairs of homologous genes among wheat, maize, and rice were identified. The Ka/Ks analysis showed that ratios for homologous pairs were less than 1 and the average value is 0.14 (Fig. 8d, Additional file 1: Table S7), implying that the evolution of LIM genes were conserved in Gramineae family species. The above results indicate that LIM genes were not only conserved in the evolution of wheat, but also highly conserved in grass plants.
Expression analyses of TaLIMs
In order to further understand the role of TaLIM genes in response to biological/abiotic stress, the log2-transformed TPM values of the RNA-seq transcriptomic dataset were mined (Fig. 7). Furthermore, several highly expressed (TaLIM8-4D) and/or differentially expressed (TaLIM1-1A, TaLIM3-2B and TaLIM10-5D) LIM genes were selected to analysis their expression patterns in responding to Fusarium graminearum, heat, drought, ABA, and NaCl stresses using qRT-PCR (Fig. 9).
In the pollen, under heat treatment, the expression levels of TaLIM1-1A and TaLIM3-2B were down-regulated, with the expression levels being the lowest after 48 h. The expression levels of TaLIM8-4D and TaLIM10-5D were up-regulated after 12 h, 24 h, 48 h and 72 h of treatment. The expression levels of TaLIM10-5D were up-regulated after 24 h, 48 h and 72 h of treatment, but the expression level was decreased after 12 h of treatment. Under drought treatment, the expression levels of TaLIM1-1A, TaLIM3-2B and TaLIM10-5D in the pollen were down-regulated. The expression levels of TaLIM8-4D were up-regulated after 12 h, 24 h and 72 h of treatment, but after 48 h of treatment, the expression level was down-regulated compared with control.
In the leaf, under ABA treatment, the expression levels of TaLIM1-1A were up-regulated after 2 h of treatment, but down-regulated after 6 h, 12 h, 24 h and 48 h of treatment. The expression levels of TaLIM8-4D and TaLIM10-5D were down-regulated after 2 h, 6 h, 24 h and 48 h of treatment, but up-regulated after 12 h of treatment. The expression levels of TaLIM3-2B were up-regulated after 2 h, 6 h, 12 h and 48 h of treatment, but down-regulated after 24 h of treatment.
In the leaf, under NaCl treatment, the expression levels of TaLIM1-1A were up-regulated after 2 h of treatment, but down-regulated after 6 h, 12 h, 24 h, and 48 h of treatment. The expression levels of TaLIM3-2B were up-regulated after 2 h and 24 h of treatment, but down-regulated after 6 h, 12 h and 48 h of treatment. The expression levels of TaLIM8-4D were down-regulated after 2 h, 6 h and 48 h of NaCl treatment, but up-regulated after 12 h and 24 h of treatment. The expression levels of TaLIM10-5D were down-regulated after 2 h, 6 h and 24 h of treatment, but up-regulated after 12 h and 48 h of treatment.
In the coleoptiles, under the treatment of Fusarium graminearum, the expression levels of TaLIM1-1A, TaLIM3-2B, TaLIM8-4D and TaLIM10-5D were down-regulated after 1 d, 2 d and 3 d of treatment, but up-regulated after 5 d of treatment. Furthermore, the expression levels of TaLIM1-1A and TaLIM3-2B were lowest after 2 d of treatment.
GO annotation and subcellular localization of TaLIMs
TaLIMs functions were predicted by Gene Ontology (GO) annotation. As shown in Fig. 10a (detailed information listed in Additional file 1: Table S8), all TaLIMs (28) were annotated under GO terms “zinc ion binding” (GO:0008270). Half of TaLIMs (14) were annotated under “actin filament binding” (GO:0051015) and “actin filament bundle assembly” (GO:0051017). The results implying TaLIMs may be an actin binding protein that are involved in various developmental activities and cellular processes in plants as suggested by previous studies [31]. Moreover, some triplets of TaLIMs have the same function annotation; such as TaLIM10-5A, TaLIM10-5B and TaLIM10-5D were annotated under “sequence-specific DNA binding transcription factor activity”, “plasma membrane”, “regulation of transcription, DNA-templated” and “membrane fusion”. TaLIM5-4A, TaLIM5-4B and TaLIM5-4D were annotated under “phloem development” and “root development”. Consistently with the function annotation, the subcellular localization prediction showed that 28 TaLIM proteins were localization in the nucleus, while TaLIM11-6A, TaLIM7-4B, TaLIM8-4D, TaLIM10-5B and TaLIM10-5D were predicted to be located in the nucleus and golgiosome. In order to verify subcellular localization prediction and reveal the possible functions of TaLIMs, we constructed the GFP fusion protein expression vector pART-27:TaLIM8-4D:GFP to do the subcellular analysis. Compared to the control (Fig.10b), TaLIM8-4D is localization in the nucleus (Fig. 10c). These results suggest that TaLIM8-4D may play a biological role in the nucleus.
TaLIM8-4D promoted the infection of pathogen
To preliminary test the biological function and reveal whether TaLIM8-4D participates in the pathogenesis process of plant to pathogen, this gene was cloned and transiently expressed in Nicotiana benthamiana using Agrobacterium-mediated expression followed by a hemi-biotrophic pathogen Phytophthora infestans infection. As shown in Fig. 11a and b, the result suggested TaLIM8-4D promoted the colonization of P. infestans, implying TaLIM8-4D function as a susceptible gene. Meanwhile, weak cell death was observed in N. benthamiana leaf after 5 days transient expression, indicating that TaLIM8-4D is a cell death inducer (leave cell death as shown in Fig. 11c).
Pathogenesis-related (PR) genes were up-regulated by TaLIM8-4D
TaLIM8-4D was transiently overexpressed in N. benthamiana leaves to explore the influence of TaLIM8-4D on the expression of pathogenesis-related (PR) genes. The expression of three PTI related genes (NbWRKY7, NbWRKY8 and NbACRE31), one reactive oxygen related gene (NbRbohA), three SA response genes (NbPR1a, NbPR1b and NbPR2), and three JA-dependent immunity genes (NbPR3, NbPR4, NbLOX) were analyzed. As shown in Fig. 12, PR genes, such as NbPR1b, NbACRE31, NbWRKY7 and NbRbohA, were up-regulated at 1 to 4 d in TaLIM8-4D overexpression plant, but down-regulated at 5 d. The expression of NbPR3, NbPR4, NbPR2 and NbLOX were up-regulated at specific time point. For example, NbLOX and NbPR2 were down-regulated at 1 d, but up-regulated at 2 to 5 d. NbPR4 was up-regulated at 1 to 3 d, but down-regulated at 4 to 5 d; NbPR3 was down-regulated at 1d and 5 d, but up-regulated at 2 to 4 d. These results demonstrate the TaLIM8-4D genes could trigger the expression of PR genes to activate the hypersensitive response (Fig. 11c), which might be the reason why the larger disease lesion was observed in TaLIM8-4D transgenic plant after pathogen infection as shown in Fig. 11a and b. Integrating considering the results of qRT-PCR and overexpression of TaLIM8-4D, it was speculated that TaLIM8-4D was function as susceptible gene in nucleus by up-regulating PR genes and inducing cell death to promote the colonization of hemi-biotrophic agent Fusarium graminearum. However, this function mechanism needs to be further confirmed.