Identification of annexin genes in the maize pan-genome
Annexin genes were identified with both an HMMER search and BLASTP. In total, 12 ZmAnn genes were identified in the maize pan-genome, of which 9 were core genes and 3 were near-core genes (Table S1). Although ZmAnn12 was not found in the reference genome, a collinear segment corresponding to the gene was identified; a similar situation was observed in the M162W and NC350 genomes. The CML52 genome lacks the ZmAnn7 and ZmAnn8 genes, and the Ki11 genome lacks the ZmAnn4 gene. The other 24 genomes analyzed in this study contain all 12 ZmAnn genes or their corresponding collinear segments. This suggests that the absence of these genes in the CML52 and Ki11 genomes may be due to genetic variation or deletion events.
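The core and near-core labels above follow from how many genomes carry each gene. A minimal sketch of such a classification, assuming the thresholds commonly used for the 26-genome maize pan-genome (core: all 26 genomes; near-core: 24 to 25; dispensable: 2 to 23; private: 1) and assuming that a collinear segment counts as presence. The function name is illustrative and not part of the original analysis:

```python
def pan_genome_class(n_present, n_genomes=26):
    """Classify a gene by the number of genomes in which it is present.

    Thresholds are an assumption (core = all genomes, near-core = all but
    one or two, dispensable = 2 or more, private = exactly 1).
    """
    if not 0 < n_present <= n_genomes:
        raise ValueError("n_present must be between 1 and n_genomes")
    if n_present == n_genomes:
        return "core"
    if n_present >= n_genomes - 2:
        return "near-core"
    if n_present >= 2:
        return "dispensable"
    return "private"


# ZmAnn4 is missing only in Ki11, so it is present in 25 of 26 genomes.
print("ZmAnn4:", pan_genome_class(25))
# A gene found (or represented by a collinear segment) in every genome.
print("all 26:", pan_genome_class(26))
```

Under these assumed thresholds, ZmAnn4, ZmAnn7 and ZmAnn8 (each absent from a single genome) fall into the near-core class.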
The physicochemical properties of the annexin proteins are shown in Table S2. The annexin genes encode proteins ranging from 85 (B97_Ann3) to 446 (NC350_Ann9) amino acids in length, with isoelectric points (pIs) ranging from 5.4 (HP301_Ann10, Ki11_Ann10, Mo18W_Ann10, M126W_Ann10) to 9.6 (NC350_Ann9) and molecular weights (MWs) ranging from 9.1 (B97_Ann3) to 48.6 kDa (NC350_Ann9). The instability index ranged from 30.57 (CML69_Ann5) to 84.91 (B97_Ann3), and the aliphatic index ranged from 76 (CML103_Ann9) to 104.71 (B97_Ann3). The GRAVY values ranged from −0.534 (M126W_Ann1) to 0.232 (B97_Ann3). Except for B97_Ann3 and M126W_Ann1, all other annexin proteins are hydrophobic. Furthermore, subcellular localization results revealed that the majority of the genes (58.3%) are located in mitochondria, while the remaining genes are distributed among chloroplasts (22.4%), the cytoplasm (18.2%), and the nucleus (0.6%) (Table S2). Specifically, only two genes, CML247-Ann9 and M126W-Ann1, were found to be located in the nucleus. Some ZmAnns, such as ZmAnn2, ZmAnn4, ZmAnn5, and ZmAnn12, have the same localization results across the 26 genomes.
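Two of the indices reported above have simple closed forms. The sketch below computes GRAVY as the mean Kyte-Doolittle hydropathy per residue and the aliphatic index as X(Ala) + 2.9·X(Val) + 3.9·(X(Ile) + X(Leu)), where X is the mole percent of each residue. It is an illustration of the definitions, not the tool used to generate Table S2, and the peptide is a made-up example:

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    seq = seq.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq):
    """Aliphatic index from mole percentages of Ala, Val, Ile and Leu."""
    seq = seq.upper()
    n = len(seq)

    def mole_pct(aa):
        return 100.0 * seq.count(aa) / n

    return mole_pct("A") + 2.9 * mole_pct("V") + 3.9 * (mole_pct("I") + mole_pct("L"))


# Hypothetical peptide, used only to show the calls.
toy_peptide = "MAVILKDE"
print(round(gravy(toy_peptide), 3), round(aliphatic_index(toy_peptide), 2))
```

A positive GRAVY value indicates a hydrophobic protein on average, and a negative value a hydrophilic one.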
Phylogenetic analysis of maize annexin genes
To investigate the phylogenetic relationships among maize annexin genes, a neighbor-joining (NJ) tree was constructed from the protein sequences of the 12 identified maize annexin genes together with 8 annexin genes from Arabidopsis. The ZmAnn genes were categorized into six distinct subgroups (Fig. 1A). With the exception of group VI, each subgroup contains Ann proteins from Arabidopsis. Group I comprises three ZmAnn genes, group VI consists of a single ZmAnn gene, and the remaining four groups each contain two ZmAnn genes. Figure 1B illustrates the presence or absence of ZmAnn genes across the 26 maize varieties. Notably, ZmAnn7 and ZmAnn8 are absent in CML52, while ZmAnn4 is missing in Ki11. All other genes are either present or represented by collinear segments in all 26 genomes. This observation suggests that the Ann gene family is relatively conserved across different maize varieties.
ZmAnn genes are subject to different selection pressures among maize varieties
Variations between aligned sequences may lead to amino acid changes (nonsynonymous substitutions) or maintain the same amino acids (synonymous substitutions). Quantifying these changes provides insight into the extent of sequence alteration. The ratio of nonsynonymous substitutions per nonsynonymous site (Ka) to synonymous substitutions per synonymous site (Ks) serves as an indicator of the selective pressures acting on the protein. The Ka/Ks values of the majority of Ann genes are less than 1, suggesting that these genes have undergone purifying selection during evolution. However, some ZmAnn10 genes exhibited values greater than 1, indicating that they underwent positive selection in some materials (Fig. 2).
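The interpretation of Ka/Ks used above can be written as a small decision rule. This is a minimal sketch with a hypothetical function name and invented Ka and Ks values; it does not estimate Ka or Ks from alignments, it only classifies a pair of values that has already been computed:

```python
def selection_regime(ka, ks):
    """Return (Ka/Ks, label) for one pairwise comparison.

    Ka/Ks < 1 is read as purifying selection, Ka/Ks > 1 as positive
    selection, and Ka/Ks == 1 as neutral evolution.
    """
    if ks <= 0:
        raise ValueError("Ks must be positive for the ratio to be defined")
    omega = ka / ks
    if omega > 1:
        label = "positive selection"
    elif omega < 1:
        label = "purifying selection"
    else:
        label = "neutral"
    return omega, label


# Invented values, for illustration only.
for ka, ks in [(0.02, 0.10), (0.30, 0.20)]:
    print(ka, ks, selection_regime(ka, ks))
```

Pairs with Ks equal to zero (identical synonymous sites) have no defined ratio and are rejected by the sketch rather than assigned a label.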
Cis-element analysis of ZmAnn
The 2-kb sequences upstream of the ZmAnn genes were analyzed with the online PlantCARE database for cis-element prediction. In addition to the core cis-elements, this analysis identified a total of 64 distinct cis-elements within the 2000-bp region upstream of the transcription start sites of the ZmAnn genes. These cis-elements are associated with various biological processes, including stress responses, hormone responses, metabolic regulation, and growth and development. All Ann genes contain varying numbers of light-responsive elements, with the G-box being the most prevalent element, present in each gene. Among hormone-responsive cis-elements, the abscisic acid responsive element (ABRE) is the most abundant, with B97-Ann2, Mo18W-Ann2, Oh43-Ann2 and Tx303-Ann2 containing as many as 15 ABRE elements (Fig. 3, Tables S3 and S4). In addition, the promoters contain various types of stress-responsive elements, such as ARE (anaerobic induction), GC-motif (anoxic specific inducibility), TC-rich repeats (defense and stress responsiveness), MBS (drought inducibility), LTR (low-temperature responsiveness), and TCA-element (salicylic acid responsiveness). All ZmAnn9 and the majority of ZmAnn6 genes in the 26 genomes contain the LTR element (Table S4), suggesting that these two genes may be associated with low-temperature stress tolerance in maize. Among elements related to metabolic regulation, the O2-site, a cis-acting regulatory element involved in zein metabolism regulation, was the only element identified. Notably, meristem expression elements (CAT-box) were identified in all ZmAnn8 genes, while elements involved in seed-specific regulation were present in ZmAnn11.
Multiple MYB binding sites (MBS, MBSI, MRE) were identified in the promoter regions across the 26 genomes (Table S3), suggesting that Ann genes may be regulated by MYB transcription factors and participate in various processes, including light response, drought induction, and regulation of flavonoid biosynthesis genes.
Annexin gene family is affected by structural variations
Abundant structural variants (SVs) were identified by aligning 26 high-quality maize genomes with the reference genome (B73) in the study by Hufford et al. (2021) [20]. Relative to the reference genome, the SVs overlapping the ZmAnn gene regions and their 2-kb upstream and downstream flanking regions were mainly deletions (59), followed by insertions (14) and translocations (3) (Fig. 4A, Table S5). SVs can affect gene expression by altering the composition or position of adjacent cis-regulatory sequences. We compared the expression values of genes with and without SVs, and the results revealed a significant difference (p < 0.05) only for ZmAnn2 and ZmAnn11 (Fig. 4B). This finding suggests that the expression of ZmAnn2 and ZmAnn11 is influenced by the presence of SVs.
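The statistical test behind these p-values is not named here. As one hedged illustration of how expression in accessions with and without an SV could be compared, the sketch below runs an exact two-sided permutation test on the difference in group means. The function name and the expression values are hypothetical, and this is not necessarily the test used for Fig. 4B:

```python
from itertools import combinations
from statistics import mean


def exact_permutation_pvalue(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means.

    Every way of relabeling the pooled values into groups of the original
    sizes is enumerated; the p-value is the fraction of relabelings whose
    absolute mean difference is at least as large as the observed one.
    """
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    pooled_sum = sum(pooled)
    extreme = 0
    total = 0
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (pooled_sum - sum_a) / n_b)
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
        total += 1
    return extreme / total


# Hypothetical expression values for accessions with and without an SV.
with_sv = [1.0, 2.0, 3.0]
without_sv = [10.0, 11.0, 12.0]
print(exact_permutation_pvalue(with_sv, without_sv))
```

Full enumeration grows quickly with sample size (an even split of 26 accessions already gives about ten million relabelings), so a random subset of relabelings is a common substitute for larger groups.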
To investigate the impact of SVs on gene structure within the Ann gene family, we examined the gene structures of Ann genes across the 26 accessions using TBtools. The results indicate that the structure of most Ann genes is consistent with that in the reference genome B73. However, some Ann genes that overlap with SVs have undergone structural changes. For instance, most ZmAnn3 genes contain 5 exons, whereas the gene in the B97 accession has only 1 exon (Fig. 5A). A similar situation is observed for ZmAnn10, where the genes in the accessions M37W, P39, CML33, and CML103 contain two exons, and only 14 materials harbor all 10 motifs (Fig. 5B). Other genes exhibit varying degrees of structural variation, which may result in functional changes.
Atypical Ann genes were widely expressed in maize
SVs are a significant factor contributing to changes in the spatial folding structure of proteins. To further investigate the impact of SVs on the Ann gene family, we quantified the number of typical (containing 4 annexin repeats) and atypical (containing 1–3 annexin repeats) Ann genes in the 26 genomes. In most of the materials, ZmAnn1, ZmAnn7, ZmAnn9, and ZmAnn10 were found to be typical genes, while ZmAnn2, ZmAnn4, ZmAnn9, and ZmAnn11 were identified as atypical genes (Fig. 6A). The remaining genes exhibited both types; a few genes represented only by collinear fragments could not be counted.
The majority of the materials contain 7–8 typical genes, with only a few materials (CML103, M162W and Oh7B) having as few as 5 typical genes (Table S6). To determine whether there is a relationship between the number of Ann genes and their total expression levels among different varieties, we quantified the number and total expression of Ann genes in the 26 materials (Fig. 6B). The number of Ann genes in each material is relatively consistent, ranging from 10 to 12, but the total expression levels vary considerably, with the highest at 374.14 (Ms71) and the lowest at 122.30 (Mo18W). Correlation analysis was conducted to examine the relationship between total expression levels and the number of typical genes, the number of atypical genes, and the total number of Ann genes. The correlation coefficients (r) and significance test p-values were -0.048 and 0.818, respectively, for typical genes, 0.002 and 0.990 for atypical genes, and 0.003 and 0.985 for the total number of Ann genes. These findings indicate that there is no significant correlation between the number of Ann genes (typical, atypical, or total) and their total expression levels.
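The correlation coefficients above can be illustrated with a short calculation. The sketch assumes Pearson's r (the type of correlation is not stated) and uses invented gene counts and expression totals; the t statistic shown is the standard one for testing r against zero with n − 2 degrees of freedom:

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for H0: r = 0, with n - 2 degrees of freedom."""
    if abs(r) >= 1:
        raise ValueError("t is undefined when |r| equals 1")
    return r * sqrt((n - 2) / (1 - r * r))


# Invented data: number of typical genes and total expression per accession.
typical_counts = [7, 8, 5, 7, 8, 6]
total_expression = [210.5, 180.2, 250.0, 140.7, 300.1, 199.9]
r = pearson_r(typical_counts, total_expression)
print(round(r, 3), round(t_statistic(r, len(typical_counts)), 3))
```

With 26 accessions, an r close to zero gives a t statistic close to zero and therefore a p-value close to 1, which matches the pattern of the values reported above.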
Expression profiles of ZmAnn under stress conditions based on RNA-Seq data
Analysis of cis-acting regulatory elements in the promoter regions of Ann genes suggests their potential involvement in various stress response pathways. To further investigate the expression patterns of Ann genes under various biotic and abiotic stress conditions, we obtained transcriptome data for biotic and abiotic stress treatments from public databases (Table S7), including Aspergillus flavus infection (Fig. 7A), aphid infestation (Fig. 7B), salt and mannitol treatments (Fig. 7C), drought treatment (Fig. 7D), cold stress (Fig. 7E) and heat stress (Fig. 7F). The expression levels of ZmAnn6 and ZmAnn8 were upregulated to different extents following Aspergillus flavus infection (Fig. 7A). Additionally, after two hours of aphid infestation, elevated expression levels were observed for ZmAnn5, ZmAnn6, ZmAnn7, and ZmAnn8 (Fig. 7B). During drought treatment, the expression levels of some Ann genes, such as ZmAnn2, ZmAnn5, ZmAnn8, and ZmAnn11, were upregulated (Fig. 7D). ZmAnn4 exhibited higher expression levels during the early stages of cold treatment, specifically at 0.5 h, 1 h, and 2 h, while ZmAnn8 showed increased expression at later time points, namely 16 h and 24 h (Fig. 7E). These findings suggest that members of the Ann gene family respond to cold stress at different time points. Under heat stress conditions, the expression levels of certain Ann genes in the thermosensitive maternal plants of An'nong 591 were upregulated (Fig. 7F). Collectively, these results underscore the multifunctional role of the ZmAnn gene family in various stress response pathways, highlighting their importance in plant defense mechanisms.
Co-expression network and GO enrichment analysis of ZmAnn under cold stress condition
Cold reduces both the seed germination rate and seedling vigor, as exposure to low temperatures during the water absorption phase (imbibition) impairs cell membrane permeability, resulting in the loss of cellular components [21]. To further elucidate the role of ZmAnn in response to cold stress, we obtained expression data from Xue et al. (2021) [22] on various maize tissues subjected to cold stress treatment (Table S8) for Weighted Gene Co-Expression Network Analysis (WGCNA). This analysis revealed 19 co-expression modules, with gene counts ranging from 157 to 2810 (Fig. S1, Table S9). Subsequently, the modules were correlated with the treatment information. Three modules exhibiting higher correlation coefficients, blue, brown, and turquoise, were identified in three distinct tissues (Fig. S2). The genes in these modules have similar expression patterns. A total of 4 Ann genes were found in these three co-expression modules. Specifically, ZmAnn2 and ZmAnn7 were clustered in the turquoise module, which is primarily associated with stress response processes, such as cellular protein modification, carboxylic acid metabolism, ubiquitin-like protein transferase activity, and organic acid metabolism (Fig. 8A). ZmAnn9 was clustered in the blue module, which is mainly related to various biosynthetic processes, such as amide biosynthesis, peptide biosynthesis, and organic nitrogen compound biosynthesis (Fig. 8B). ZmAnn6 is located in the brown module, which is related to carboxylic acid metabolism, response to abiotic stimuli, and cofactor metabolism processes (Fig. 8C). Notably, ZmAnn2 and ZmAnn7 are co-expressed with three genes (Table S10), among which the leucine-rich repeat receptor-like kinase (LRR-RLK) FEI1 is associated with cellulose. The loss-of-function mutant of FEI1 exhibits increased sensitivity to humidity gradients and decreased tolerance to osmotic stress [23].
Among the other co-expressed genes, SR34A targets all alternative splicing event types, including in RNAs encoding known determinants of ABA sensitivity, to prevent ABA-responsive splicing in germinated seeds [24]. Expression of JMJ25 was induced significantly by darkness, suggesting that JMJ25 might be involved in stress responses [25]. FtsH2 may be involved in cold stress response processes by affecting ABA-dependent signaling pathways [26].
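WGCNA groups genes into modules using a weighted network in which pairwise expression correlations are raised to a soft-thresholding power. A minimal sketch of that adjacency step, assuming an unsigned network and a hypothetical power of 6 (the power used for the analysis above is not stated here); gene names and expression values are invented, and the module detection itself is not reproduced:

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def unsigned_adjacency(expr, beta=6):
    """Unsigned weighted adjacency: a_ij = |cor(x_i, x_j)| ** beta.

    expr maps a gene name to its expression values across samples;
    beta is the soft-thresholding power (6 here is an assumed value).
    """
    genes = sorted(expr)
    adjacency = {}
    for g1 in genes:
        for g2 in genes:
            if g1 == g2:
                adjacency[(g1, g2)] = 1.0
            else:
                adjacency[(g1, g2)] = abs(pearson_r(expr[g1], expr[g2])) ** beta
    return adjacency


# Invented expression profiles across four samples.
toy_expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [1.0, -1.0, -1.0, 1.0],
}
adj = unsigned_adjacency(toy_expr)
print(adj[("geneA", "geneB")], adj[("geneA", "geneC")])
```

Raising the correlation to a power keeps strongly correlated pairs near 1 while pushing weak correlations toward 0, which is why genes in the same module show similar expression patterns.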