Identification of MsEIN3/EIL genes in alfalfa
To identify MsEIN3/EIL gene members in alfalfa, we used the full-length EIN3/EIL protein sequences from four species, A. thaliana (6), Medicago truncatula (10), Glycine max (12) and O. sativa (6), as queries for a BLASTp search. Eleven putative MsEIN3/EIL genes were identified in the alfalfa genome. Subsequently, combining the Hidden Markov Model (HMM) profile search with the BLAST results yielded ten candidate MsEIN3/EIL proteins. Further analysis with the CDD database (https://www.ncbi.nlm.nih.gov/cdd) and the InterPro website (https://www.ebi.ac.uk/interpro/) confirmed ten MsEIN3/EIL genes encoding the EIN3 domain (PF04873), which were named MsEIL1-MsEIL10 according to their chromosomal order (Table S1). The physicochemical properties of the encoded proteins are listed in Table 1. Protein length ranged from 256 amino acids (MsEIL5) to 810 amino acids (MsEIL1), and molecular weight ranged from 28.7 kDa (MsEIL5) to 91.7 kDa (MsEIL1). MsEIL3 had the lowest isoelectric point (4.95), while MsEIL6 had the highest (9.04). Subcellular localization prediction placed only the MsEIL5 protein in the cytoplasm, whereas the remaining nine proteins were predicted to localize to the nucleus. This suggests that MsEIL5 might have an altered nucleocytoplasmic distribution, potentially leading to functions or activities distinct from those of the nuclear-localized members.
Table 1
Physicochemical properties of identified MsEIN3/EIL genes in alfalfa.
Gene Name | Gene ID | Chr | Amino Acids | Molecular Weight (Da) | Isoelectric Point | Subcellular Localization | Instability Index | Grand Average of Hydropathicity (GRAVY) |
MsEIL1 | MsG0280011087 | Chr2 | 810 | 91682.22 | 6.77 | Nucleus | 61.58 | -0.653 |
MsEIL2 | MsG0380011801 | Chr3 | 566 | 63911.38 | 5.01 | Nucleus | 47.3 | -0.703 |
MsEIL3 | MsG0380011928 | Chr3 | 577 | 64885.43 | 4.95 | Nucleus | 47.3 | -0.678 |
MsEIL4 | MsG0380015861 | Chr3 | 638 | 72458.81 | 5.91 | Nucleus | 47.83 | -0.758 |
MsEIL5 | MsG0580029523 | Chr5 | 256 | 28702.52 | 5.28 | Cytoplasm | 54.81 | -0.504 |
MsEIL6 | MsG0680034184 | Chr6 | 591 | 67376.84 | 9.04 | Nucleus | 45.13 | -0.788 |
MsEIL7 | MsG0680034186 | Chr6 | 619 | 70770.16 | 8.97 | Nucleus | 53.09 | -0.74 |
MsEIL8 | MsG0680034200 | Chr6 | 653 | 74096.42 | 8.96 | Nucleus | 42.16 | -0.698 |
MsEIL9 | MsG0680034213 | Chr6 | 694 | 79272.84 | 8.97 | Nucleus | 46.75 | -0.683 |
MsEIL10 | MsG0880043875 | Chr8 | 628 | 71288.97 | 6.41 | Nucleus | 45.13 | -0.767 |
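The extremes quoted in the text can be re-derived from Table 1. The following is a minimal Python sketch that assumes the rows are transcribed by hand from the table; the names `TABLE1` and `extremes` are illustrative and not part of the original analysis.

```python
# Rows transcribed from Table 1:
# (name, amino acids, molecular weight in Da, isoelectric point, localization)
TABLE1 = [
    ("MsEIL1", 810, 91682.22, 6.77, "Nucleus"),
    ("MsEIL2", 566, 63911.38, 5.01, "Nucleus"),
    ("MsEIL3", 577, 64885.43, 4.95, "Nucleus"),
    ("MsEIL4", 638, 72458.81, 5.91, "Nucleus"),
    ("MsEIL5", 256, 28702.52, 5.28, "Cytoplasm"),
    ("MsEIL6", 591, 67376.84, 9.04, "Nucleus"),
    ("MsEIL7", 619, 70770.16, 8.97, "Nucleus"),
    ("MsEIL8", 653, 74096.42, 8.96, "Nucleus"),
    ("MsEIL9", 694, 79272.84, 8.97, "Nucleus"),
    ("MsEIL10", 628, 71288.97, 6.41, "Nucleus"),
]


def extremes(rows, column):
    """Return the names of the proteins with the minimum and maximum value in a column."""
    lowest = min(rows, key=lambda row: row[column])
    highest = max(rows, key=lambda row: row[column])
    return lowest[0], highest[0]


shortest, longest = extremes(TABLE1, 1)
lightest, heaviest = extremes(TABLE1, 2)
most_acidic, most_basic = extremes(TABLE1, 3)
# Proteins whose predicted localization is not nuclear.
non_nuclear = [row[0] for row in TABLE1 if row[4] != "Nucleus"]

print(shortest, longest, lightest, heaviest, most_acidic, most_basic, non_nuclear)
```

Reading the values from a single data structure keeps the prose summary and the table consistent if the table is later revised.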
Chromosome distribution and phylogenetic analysis of MsEIN3/EIL genes
To determine the chromosome distribution of the identified MsEIN3/EIL genes, a chromosome map was constructed based on the alfalfa genome sequence. The genes were unevenly distributed across five alfalfa chromosomes: Chr2, Chr5, Chr3, Chr6, and Chr8 (Fig. 1A). Chr2, Chr5, and Chr8 each contained one MsEIN3/EIL gene, while Chr3 contained three genes, representing 30.0% of the total. Chr6 contained the most, with four genes accounting for 40.0%. These findings point to the diversity and complexity of the MsEIN3/EIL gene family. To further understand the evolutionary relationships of the MsEIN3/EIL proteins, we conducted a comparative analysis with EIN3/EIL proteins from M. truncatula, A. thaliana, and O. sativa. An unrooted phylogenetic tree was then constructed in MEGA11 using the neighbor-joining method and the JTT model. As shown in Fig. 1B, the EIN3/EIL proteins were divided into four clades, designated A, B, C, and D. Clade A contains the AtEIN3, AtEIL1, AtEIL2, MsEIL4, MsEIL5 and MsEIL10 proteins. Clade B consists of the AtEIL4 and AtEIL5 proteins. Clade C includes the AtEIL3 and MsEIL1 proteins, and Clade D contains the MsEIL2, MsEIL3, MsEIL6, MsEIL7, MsEIL8 and MsEIL9 proteins. Therefore, it is hypothesized that the MsEIL1, MsEIL4, MsEIL5 and MsEIL10 genes evolved from the AtEIN3/EIL genes, while the other genes may be new genes that arose from the MsEIN3/EIL genes in the course of species evolution.
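The per-chromosome shares can be reproduced with a short tally. This is a minimal Python sketch; the gene-to-chromosome mapping is copied from Table 1, and the variable names are illustrative assumptions.

```python
from collections import Counter

# Gene-to-chromosome assignments copied from Table 1.
GENE_CHROMOSOMES = {
    "MsEIL1": "Chr2",
    "MsEIL2": "Chr3", "MsEIL3": "Chr3", "MsEIL4": "Chr3",
    "MsEIL5": "Chr5",
    "MsEIL6": "Chr6", "MsEIL7": "Chr6", "MsEIL8": "Chr6", "MsEIL9": "Chr6",
    "MsEIL10": "Chr8",
}

# Number of MsEIN3/EIL genes on each chromosome.
counts = Counter(GENE_CHROMOSOMES.values())
total = len(GENE_CHROMOSOMES)

# Share of the gene family carried by each chromosome, in percent.
shares = {chrom: 100.0 * n / total for chrom, n in counts.items()}

for chrom in sorted(shares):
    print(f"{chrom}: {counts[chrom]} gene(s), {shares[chrom]:.1f}%")
```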
Gene structure, conserved motif and domain analysis of MsEIN3/EIL proteins
Gene structure analysis is essential for understanding the relationship between evolution and the functional differentiation of gene families. To study the diversity of MsEIN3/EIL protein motifs, the online server MEME was used to analyze the conserved motifs of the MsEIN3/EIL protein sequences. A total of ten distinct conserved motifs were found, named motif 1 to motif 10 (Table S2). Notably, motifs 1, 3, and 4 were present in all ten proteins and located in the N-terminal region (Fig. 2A). This finding aligns with the reported conservation of the N-terminal region among EIN3/EIL family members in Arabidopsis. Furthermore, motifs 5, 7, 8, and 10 were found exclusively in Clade D members, suggesting that these motifs may serve a function that distinguishes these proteins from the other MsEIN3/EIL proteins. In addition, most closely related MsEIN3/EIL proteins had a similar motif composition, suggesting that they may be functionally redundant.
According to the NCBI CDD database, the Clade D members and the MsEIL4 protein belong to the EIN3 family, while the protein domains of the other members belong to the EIN3 superfamily (Fig. 2B). To further explore the structural diversity of the identified MsEIN3/EIL genes, their exon-intron structures were analyzed. As shown in Fig. 2C, the MsEIN3/EIL genes contained between 0 and 9 introns, and closely related genes showed similar exon-intron patterns. Half of the ten genes (MsEIL2, MsEIL3, MsEIL6, MsEIL7, and MsEIL8) contained two introns. In addition, MsEIL1 contained nine introns, MsEIL5 and MsEIL9 each contained three introns, MsEIL10 contained one intron, and MsEIL4 contained none.
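The intron counts reported above can be grouped to make the distribution explicit. This is a minimal Python sketch with counts taken from the text; the exon count is derived under the general assumption that a gene with n introns has n + 1 exons, and the names used are illustrative.

```python
from collections import defaultdict

# Intron counts per gene, as reported in the text for Fig. 2C.
INTRON_COUNTS = {
    "MsEIL1": 9, "MsEIL2": 2, "MsEIL3": 2, "MsEIL4": 0, "MsEIL5": 3,
    "MsEIL6": 2, "MsEIL7": 2, "MsEIL8": 2, "MsEIL9": 3, "MsEIL10": 1,
}

# Group genes by their number of introns.
genes_by_intron_count = defaultdict(list)
for gene, n_introns in INTRON_COUNTS.items():
    genes_by_intron_count[n_introns].append(gene)

# Assumption: each intron separates two exons, so exons = introns + 1.
exon_counts = {gene: n + 1 for gene, n in INTRON_COUNTS.items()}

for n_introns in sorted(genes_by_intron_count):
    print(n_introns, genes_by_intron_count[n_introns])
```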
Gene duplication and synteny analysis of MsEIN3/EIL genes
We examined potential gene duplication events and found one segmental duplication event involving two MsEIN3/EIL genes located on chromosome 3 (Fig. 3, Table S3). This result suggests that some MsEIN3/EIL genes arose through gene duplication, which is a major factor contributing to the expansion of the MsEIN3/EIL gene family. These findings reveal the genomic structure and evolutionary relationships of the MsEIN3/EIL genes in alfalfa and provide insight into their potential functional significance in stress response and plant development.
To gain a deeper understanding of the duplication mechanism of the MsEIN3/EIL gene family, we analyzed the collinearity between the MsEIN3/EIL genes and the genomes of several plant species: the dicotyledons A. thaliana, M. truncatula and G. max, and the monocotyledons O. sativa and Zea mays. The results revealed multiple homologous gene pairs between M. sativa and the dicotyledons (A. thaliana, M. truncatula and G. max), but no homologous pairs with the monocotyledons (O. sativa and Z. mays), providing insight into their evolutionary relationships (Fig. 4, Tables S4-S6). These results suggest that the MsEIN3/EIL genes in alfalfa have diverged substantially from their monocotyledon counterparts while remaining conserved among dicotyledons.
Amino acid sequence alignment and secondary structure analysis of MsEIN3/EIL proteins
To evaluate the similarity of the MsEIN3/EIL protein sequences in alfalfa, a multiple sequence alignment was performed on the AtEIN3 protein and the ten MsEIN3/EIL proteins. The analysis revealed a high degree of conservation among the EIN3/EIL protein sequences. The MsEIN3/EIL proteins displayed the characteristic structural features of EIN3/EIL proteins, including a completely conserved EIN3 domain at the N-terminus, an amino-terminal acidic domain (AD), a proline-rich region (PR), and five small basic domains (BD I-V). In addition, the MsEIN3/EIL proteins have poly-asparagine and poly-glutamine regions near the C-terminus, similar to those observed in other plant species such as Arabidopsis and mung bean (Fig. 5). The N-terminal sequences of the MsEIN3/EIL proteins are highly conserved, whereas the C-terminal sequences show little similarity, suggesting that the differences among MsEIN3/EIL members are mainly due to variation in the C-terminal sequences. Regions enriched in acidic amino acids, proline and glutamate are common transcriptional activation regions in plants, indicating that the acidic region, the basic regions and the proline-rich region are the transcriptional activation and functional regions of the EIN3/EIL gene family.
Studying the secondary structure of proteins is essential for understanding their function. Therefore, we analyzed the secondary structure of all MsEIN3/EIL proteins. Across the ten MsEIN3/EIL proteins, random coils accounted for the largest proportion (37.96-59.72%), followed by α-helices (23.57-41.84%), extended strands (7.05-21.18%), and β-turns (1.88-6.85%) (Table 2).
Figure 5 Sequence alignment of the AtEIN3 protein and all identified MsEIN3/EIL proteins in alfalfa. Sequences were aligned with ClustalX, and identical or similar residues are shaded in color. Black rectangles mark the structural features. AD: acidic domain; BD I-V: basic domains I-V; PR: proline-rich region; poly N/Q: poly-asparagine/glutamine region.
Table 2
The secondary structure of MsEIN3/EIL proteins
Gene Name | Gene ID | α-Helix(%) | Extended Strand(%) | β-Turn(%) | Random Coil(%) |
MsEIL1 | MsG0280011087 | 36.3 | 13.95 | 4.07 | 45.68 |
MsEIL2 | MsG0380011801 | 39.4 | 10.95 | 5.48 | 44.17 |
MsEIL3 | MsG0380011928 | 34.84 | 9.88 | 4.16 | 51.13 |
MsEIL4 | MsG0380015861 | 31.35 | 7.05 | 1.88 | 59.72 |
MsEIL5 | MsG0580029523 | 37.5 | 9.38 | 5.47 | 47.66 |
MsEIL6 | MsG0680034184 | 39.09 | 12.69 | 4.57 | 43.65 |
MsEIL7 | MsG0680034186 | 41.84 | 14.54 | 5.65 | 37.96 |
MsEIL8 | MsG0680034200 | 41.81 | 11.33 | 6.28 | 40.58 |
MsEIL9 | MsG0680034213 | 41.64 | 13.26 | 5.62 | 39.48 |
MsEIL10 | MsG0880043875 | 23.57 | 21.18 | 6.85 | 48.41 |
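The ranges quoted in the text can be checked against Table 2, together with a sanity check that each row sums to roughly 100%. This is a minimal Python sketch; the rows are transcribed from the table, and the 0.05 rounding tolerance is an assumption.

```python
# Rows transcribed from Table 2:
# (name, alpha-helix %, extended strand %, beta-turn %, random coil %)
TABLE2 = [
    ("MsEIL1", 36.3, 13.95, 4.07, 45.68),
    ("MsEIL2", 39.4, 10.95, 5.48, 44.17),
    ("MsEIL3", 34.84, 9.88, 4.16, 51.13),
    ("MsEIL4", 31.35, 7.05, 1.88, 59.72),
    ("MsEIL5", 37.5, 9.38, 5.47, 47.66),
    ("MsEIL6", 39.09, 12.69, 4.57, 43.65),
    ("MsEIL7", 41.84, 14.54, 5.65, 37.96),
    ("MsEIL8", 41.81, 11.33, 6.28, 40.58),
    ("MsEIL9", 41.64, 13.26, 5.62, 39.48),
    ("MsEIL10", 23.57, 21.18, 6.85, 48.41),
]
CLASSES = ("alpha_helix", "extended_strand", "beta_turn", "random_coil")

# Minimum and maximum percentage of each structural class across the ten proteins.
ranges = {
    name: (min(row[i + 1] for row in TABLE2), max(row[i + 1] for row in TABLE2))
    for i, name in enumerate(CLASSES)
}

# Rows whose four percentages do not sum to 100 within the assumed tolerance.
inconsistent_rows = [row[0] for row in TABLE2 if abs(sum(row[1:]) - 100.0) > 0.05]

print(ranges)
print(inconsistent_rows)
```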
Promoter region cis-acting regulatory elements analysis
The analysis of cis-acting regulatory elements identified fourteen major cis-acting elements in the MsEIN3/EIL gene promoter sequences (Fig. 6, Tables S7-S8). Across all MsEIN3/EIL promoter sequences, light-responsive cis-acting elements accounted for the largest proportion (57.2%) and formed the first category. Hormone-responsive cis-acting elements, including those responsive to auxin, abscisic acid, gibberellin, methyl jasmonate and salicylic acid, were the second largest category (15.6%). Anaerobic induction cis-acting elements formed the third largest category (9.2%). The remaining categories, such as low-temperature elements, binding-site-related elements, plant developmental elements and defense and stress elements, accounted for 18% of the total. All ten MsEIN3/EIL promoter sequences contained hormone-responsive cis-acting elements, suggesting that these genes may interact with other hormone pathways to regulate plant growth. The promoter regions of all identified MsEIN3/EIL genes also contained low-temperature response elements and anaerobic induction response elements. The promoter region of MsEIL1 contained defense and stress response elements, while the promoter regions of seven MsEIN3/EIL genes contained MYB binding sites (MBS) associated with drought induction. These results indicate that MsEIN3/EIL genes may respond to various hormones and stresses, and that these elements may directly influence how MsEIN3/EIL genes respond under stress conditions.
Expression pattern of MsEIN3/EIL genes in tissues
Tissue expression pattern analysis is important for understanding the specific functions of MsEIN3/EIL genes in different tissues of alfalfa. The transcript abundance of the MsEIN3/EIL genes in six tissues (leaves, flowers, post-elongated stems, elongated stems, roots, and seeds) was assessed using RNA-Seq data, and the expression profiles were visualized as a heat map using TBtools. The results showed that only MsEIL1, MsEIL4 and MsEIL5 were expressed in the various tissues, while the remaining seven MsEIN3/EIL genes were not detected in any tissue (Fig. 7, Table S9).
To validate the RNA-Seq data, real-time quantitative PCR (qRT-PCR) was performed on the MsEIN3/EIL genes. The results indicate that all MsEIN3/EIL genes except MsEIL2 and MsEIL9 were expressed in various tissues, with expression patterns that varied across tissues. Specifically, MsEIL4 showed high expression in flowers and seeds, while MsEIL5 showed high expression in flowers. In contrast, MsEIL10, which belongs to the same Clade A as MsEIL4 and MsEIL5, displayed elevated expression in roots and stems, suggesting that this gene may have undergone functional changes during polyploidization. Additionally, MsEIL1, which belongs to Clade C, was highly expressed in seeds, whereas the Clade D genes MsEIL3, MsEIL6, MsEIL7, and MsEIL8 were highly expressed in both roots and stems (Fig. 8).
Expression profile analysis of MsEIN3/EIL genes under stresses
The analysis of cis-acting elements in the MsEIN3/EIL genes showed that nearly all genes had elements responsive to cold, drought, and abscisic acid (ABA) (Table S8). To further investigate the expression of MsEIN3/EIL genes under abiotic stresses, we analyzed their expression patterns under cold, drought, salt, and ABA treatments using published transcriptomic data (Fig. 9). To follow the dynamic changes in MsEIN3/EIL expression in response to cold and drought, transcriptome data from alfalfa seedlings subjected to different durations of each treatment were analyzed. Interestingly, only half of the MsEIN3/EIL genes showed changes in expression. During cold treatment, the expression levels of MsEIL1, MsEIL2, MsEIL4 and MsEIL5 increased significantly after 2 h and then decreased after 6 h (Fig. 9A, Table S10). In contrast, MsEIN3/EIL expression followed the opposite trend during drought treatment: MsEIL1, MsEIL4, MsEIL5 and MsEIL6 were significantly down-regulated after 1 h and then up-regulated after 6 h (Fig. 9B, Table S11). This suggests that these genes may be involved in the response to low temperature and drought stress.
In the transcriptome data from salt-treated root tips, the expression levels of MsEIL1, MsEIL4, MsEIL5 and MsEIL6 were significantly down-regulated after 1 h of salt treatment and then up-regulated after 3 h (Fig. 9C, Table S12). In the transcriptome data from ABA-treated root tips, the expression levels of MsEIL1, MsEIL2, MsEIL4 and MsEIL5 increased significantly after 1 h and then decreased after 3 h (Fig. 9D, Table S13). Taken together, the four treatments show that the expression trend of MsEIN3/EIL genes under cold treatment resembled that under ABA treatment, while the trend under drought treatment resembled that under salt treatment. Consequently, it is hypothesized that the MsEIN3/EIL genes may respond to cold stress by regulating the ABA synthesis pathway, and that the responses of MsEIN3/EIL genes to drought and salt stress may share similar characteristics.
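The pairing of treatments described above can be made explicit by grouping treatments that share both the same responsive genes and the same trend. This is a minimal Python sketch built only from the gene lists and trends stated in the text; the trend labels and the function name `group_by_profile` are illustrative assumptions, and no expression values are used.

```python
from collections import defaultdict

# Responsive genes and qualitative trend per treatment, as stated in the text.
RESPONSES = {
    "cold": ("up-then-down", {"MsEIL1", "MsEIL2", "MsEIL4", "MsEIL5"}),
    "drought": ("down-then-up", {"MsEIL1", "MsEIL4", "MsEIL5", "MsEIL6"}),
    "salt": ("down-then-up", {"MsEIL1", "MsEIL4", "MsEIL5", "MsEIL6"}),
    "ABA": ("up-then-down", {"MsEIL1", "MsEIL2", "MsEIL4", "MsEIL5"}),
}


def group_by_profile(responses):
    """Group treatments that share the same trend and the same responsive gene set."""
    groups = defaultdict(list)
    for treatment, (trend, genes) in responses.items():
        groups[(trend, frozenset(genes))].append(treatment)
    return sorted(sorted(members) for members in groups.values())


profile_groups = group_by_profile(RESPONSES)

# Union of all genes that respond to at least one treatment.
responsive_genes = set().union(*(genes for _, genes in RESPONSES.values()))

print(profile_groups)
print(sorted(responsive_genes))
```

Under these inputs the cold and ABA treatments fall into one group and the drought and salt treatments into another, and five of the ten genes respond to at least one treatment.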