Identification of SBTs in eight plant species
Using a HMMER search based on the Peptidase S8 domain, we identified a total of 702 SBT genes in eight plant species: Arabidopsis thaliana, Oryza sativa, Zea mays, Triticum aestivum, Asparagus officinalis, Dendrobium catenatum, Phalaenopsis equestris and Rosa chinensis. Detailed information on the identified SBT genes is available in Additional file 1. We identified 57 genes in Arabidopsis thaliana, 63 in Oryza sativa, 66 in Zea mays, 255 in Triticum aestivum, 75 in Asparagus officinalis, 71 in Dendrobium catenatum, 51 in Phalaenopsis equestris and 64 in Rosa chinensis. Notably, the numbers identified by our method in Arabidopsis and rice are close to or the same as those reported in two previous studies: 56 in Arabidopsis (Rautengarten et al. 2005) and 63 in rice (Tripathi & Sowdhamini 2006).
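As a quick consistency check on the figures above, the per-species counts can be summed and the largest contributor identified. The sketch below is illustrative only: the dictionary name `sbt_counts` is our own, and the values are simply the counts reported in this section.

```python
# Per-species SBT gene counts as reported in the text above.
sbt_counts = {
    "Arabidopsis thaliana": 57,
    "Oryza sativa": 63,
    "Zea mays": 66,
    "Triticum aestivum": 255,
    "Asparagus officinalis": 75,
    "Dendrobium catenatum": 71,
    "Phalaenopsis equestris": 51,
    "Rosa chinensis": 64,
}

# Total number of SBT genes across the eight species.
total = sum(sbt_counts.values())

# Species contributing the most genes, and its share of the total.
largest = max(sbt_counts, key=sbt_counts.get)
largest_share = sbt_counts[largest] / total

print(total)  # 702
print(largest, round(largest_share, 3))
```

The sum reproduces the reported total of 702, with wheat contributing roughly a third of all identified genes.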
Multiple sequence alignment of SBTs in eight plant species
The sequences of all 702 SBTs were compared, and the multiple sequence alignment result is shown in Fig 1. All SBT protein sequences are listed in Additional file 2, and the similarity matrix used to draw the alignment heatmap is given in Additional file 3. The heatmap (Fig 1) showed that many SBTs share high sequence similarity, although some sequences showed lower similarity to the others. This suggests that some SBT genes may have specialized functions. In addition, we compared the Peptidase S8 domains of Arabidopsis and rice SBTs (Fig S1). The domain was highly similar in some regions but showed very low similarity in others, indicating that the Peptidase S8 domain is conserved overall yet contains variable regions in Arabidopsis and rice.
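To illustrate how a similarity matrix of the kind used for the heatmap can be derived from an alignment, the following minimal sketch computes pairwise identity between aligned sequences. It is not the procedure used to produce Additional file 3: the identity definition (matches over columns where neither sequence has a gap), the function names and the toy sequences are all assumptions made for illustration.

```python
from itertools import combinations


def pairwise_identity(a: str, b: str) -> float:
    """Fraction of identical residues over columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if x == y:
            matches += 1
    return matches / compared if compared else 0.0


def identity_matrix(aligned: dict) -> dict:
    """Symmetric nested dict of pairwise identities for all aligned sequences."""
    names = list(aligned)
    matrix = {name: {name: 1.0} for name in names}
    for p, q in combinations(names, 2):
        value = pairwise_identity(aligned[p], aligned[q])
        matrix[p][q] = value
        matrix[q][p] = value
    return matrix


# Hypothetical toy alignment (not real SBT sequences).
toy_alignment = {
    "seqA": "GTHTS-TAAG",
    "seqB": "GTHTSSTAAG",
    "seqC": "GSHVA-TVAG",
}
matrix = identity_matrix(toy_alignment)
for name, row in matrix.items():
    print(name, {other: round(value, 2) for other, value in row.items()})
```

A matrix in this form can be passed to any plotting library to draw a heatmap; other similarity measures (for example, substitution-matrix scores) would give different values.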
Phylogenetic analysis and classification of SBT genes in eight plant species
Phylogenetic analysis indicated that the SBTs of the eight plants can be divided into nine clades (Fig 2). The subtrees with gene names are shown in Fig S2. Clade Ⅰ mainly contains SBTs from Asparagus officinalis, Dendrobium catenatum and Phalaenopsis equestris, three species belonging to Asparagales. Clade Ⅱ mainly consists of SBT subfamilies from wheat and maize. Clade Ⅲ contains fewer members, with some genes from wheat, rice, maize, Asparagus officinalis, Dendrobium catenatum and Phalaenopsis equestris, together with Arabidopsis AtSBT1.2. Clade Ⅳ contains SBT genes from seven species, including the rice OsSub19 gene and the Arabidopsis Acetyl-CoA Carboxylase 2 (ACC2) gene, which has a protein domain similar to the Peptidase S8 domain. Clade Ⅴ is a larger clade with SBT genes mainly from monocotyledons, and it also includes Arabidopsis SBT6.1. Clade Ⅵ consists mainly of Arabidopsis and wheat SBT genes, including the Arabidopsis genes AtSBT3.5, AtSBT3.11 and AtSBT5.1 and the wheat gene TraesCS7B02G015300. Clade Ⅶ is formed mainly by Arabidopsis “AtSBT4” genes (such as AtSBT4.1 and AtSBT4.9) together with wheat and Rosa chinensis SBT genes. Clade Ⅷ contains SBT genes from seven plants, notably the Arabidopsis gene AtSBT2.5. Clade Ⅸ mainly contains SBT genes from wheat. We also compared the SBT sequences from Arabidopsis and rice (Fig S2). The sequences showed several conserved domains in the front and latter parts of the SBTs, indicating that SBT function is conserved to some extent.
Motif and gene structure analysis of the SBT gene family in Arabidopsis and rice
Because abundant research resources are available for the two model plants Arabidopsis and rice, we selected their SBT genes for further analysis of sequence structure. We used MEME to analyze the motifs of the Arabidopsis and rice SBTs and examined their gene structure (Fig 3). Ten motifs were found, and detailed motif information is provided in Additional file 4. Most SBTs have a conserved region comprising motifs 6, 3, 7, 4 and 8, located in the front part of the SBTs, whereas motifs 2, 5, 1, 9 and 10 are variably arranged in the latter part. Many Arabidopsis SBT genes share similar motifs; for example, AT1G32940, AT4G10550 and AT4G10540 all contain motifs 6, 3, 7, 4, 8, 1 and 5. Likewise, some rice SBT genes, such as Os01g0795100, Os01g0794800 and Os01g0795000, share the same motifs (6, 3, 7, 4, 8, 1 and 5), which suggests that they probably have similar biological functions.
Gene structure analysis showed that SBTs in rice and Arabidopsis differ in gene structure (Fig 3). Overall, rice SBT genes contained fewer introns in many isoforms; for example, Os03g0430500, Os04g0558900 and Os05g0435800 have no introns. In contrast, some Arabidopsis SBT genes, such as AT1G20160, AT4G10550 and AT1G32960, contained many more introns, which may indicate that SBT transcripts are processed differently in Arabidopsis and rice.
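The motif comparison described in this section amounts to grouping genes by their motif composition. The sketch below shows one way to do this; it is not part of the MEME workflow itself. The motif lists for the five real genes are taken from the text (treating the listed motifs as an ordered tuple is an assumption), and `hypothetical_gene` is an invented entry added only to show a second group.

```python
from collections import defaultdict

# Motif composition per gene. Real gene IDs use the motif list given in the text;
# "hypothetical_gene" is invented for illustration and is not a real SBT gene.
motif_order = {
    "AT4G10550": (6, 3, 7, 4, 8, 1, 5),
    "AT4G10540": (6, 3, 7, 4, 8, 1, 5),
    "Os01g0795100": (6, 3, 7, 4, 8, 1, 5),
    "Os01g0794800": (6, 3, 7, 4, 8, 1, 5),
    "Os01g0795000": (6, 3, 7, 4, 8, 1, 5),
    "hypothetical_gene": (6, 3, 7, 4, 8, 2),
}


def group_by_motifs(motifs_per_gene: dict) -> dict:
    """Group gene IDs that share exactly the same ordered motif tuple."""
    groups = defaultdict(list)
    for gene, motifs in motifs_per_gene.items():
        groups[motifs].append(gene)
    return dict(groups)


groups = group_by_motifs(motif_order)
for motifs, genes in groups.items():
    print(motifs, sorted(genes))
```

Genes falling into the same group are candidates for similar function, although motif composition alone does not establish this.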
Chromosomal location of SBTs in Arabidopsis and rice
We further analyzed the chromosomal location of SBTs in Arabidopsis and rice. The results showed that 14, 4, 4, 17 and 18 SBT genes are located on Arabidopsis chromosomes 1, 2, 3, 4 and 5, respectively (Fig 4). In rice, 11, 14, 7, 15, 2, 3, 2, 2, 4, 2, 1 and 1 SBT genes are located on chromosomes 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 and 12, respectively. Notably, many SBT genes are arranged in clusters on the chromosomes. For example, AT1G32940, AT1G32950, AT1G32960, AT1G32970 and AT1G32980 are closely arranged on Arabidopsis chromosome 1, and Os01g0794800, Os01g0795000, Os01g0795100, Os01g0795200 and Os01g0795400 are closely arranged on rice chromosome 1. This suggests that these genes may have arisen by tandem duplication and may share similar sequences and biological functions.
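The clustered arrangement noted above can be screened for in a simple way from Arabidopsis locus identifiers. The sketch below is a rough heuristic under stated assumptions, not the method used for Fig 4: it assumes that AGI locus numbers increase along the chromosome so that neighbouring numbers approximate physical adjacency, and the `max_step` threshold of 10 is our own choice. A real analysis would use genomic coordinates from an annotation file.

```python
import re
from collections import defaultdict

# AGI locus pattern: "AT", chromosome digit 1-5, "G", five-digit locus number.
AGI_PATTERN = re.compile(r"^AT([1-5])G(\d{5})$", re.IGNORECASE)


def parse_agi(gene_id: str) -> tuple:
    """Return (chromosome, locus number) for an Arabidopsis AGI identifier."""
    match = AGI_PATTERN.match(gene_id)
    if match is None:
        raise ValueError(f"not a nuclear AGI identifier: {gene_id}")
    return int(match.group(1)), int(match.group(2))


def candidate_clusters(gene_ids, max_step: int = 10) -> list:
    """Find runs of genes whose consecutive locus numbers differ by <= max_step.

    Locus-number proximity is only a proxy for physical adjacency (assumption).
    """
    by_chrom = defaultdict(list)
    for gene_id in gene_ids:
        chrom, number = parse_agi(gene_id)
        by_chrom[chrom].append((number, gene_id.upper()))

    clusters = []
    for chrom in sorted(by_chrom):
        loci = sorted(by_chrom[chrom])
        run = [loci[0][1]]
        for (prev_number, _), (number, name) in zip(loci, loci[1:]):
            if number - prev_number <= max_step:
                run.append(name)
            else:
                if len(run) > 1:
                    clusters.append(run)
                run = [name]
        if len(run) > 1:
            clusters.append(run)
    return clusters


# Gene IDs mentioned in this paper.
genes = [
    "AT1G32940", "AT1G32950", "AT1G32960", "AT1G32970", "AT1G32980",
    "AT1G20160", "AT4G10550", "AT4G10540",
]
clusters = candidate_clusters(genes)
print(clusters)
```

With these inputs the five chromosome 1 genes form one candidate cluster and AT4G10540/AT4G10550 form another, while AT1G20160 stands alone.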
Collinearity analysis of SBT genes in Arabidopsis and rice
We additionally analyzed the collinearity of SBTs in Arabidopsis and rice using MCScanX (Fig 5). All collinear gene pairs in the genomes are listed in Additional file 5 and marked with gray lines in Fig 4. The analysis identified four and three collinear SBT gene pairs in Arabidopsis and rice, respectively. In Arabidopsis, the four pairs were AT1G32950/AT4G10520, AT2G19170/AT4G30020, AT3G46840/AT5G59090 and AT4G20430/AT5G44530. In rice, the three pairs were Os01g0868800/Os05g0435800, Os02g0665300/Os04g0558900 and Os03g0119300/Os10g0524600. Detailed gene information is given in Additional file 5. These results indicate that these genes may have arisen through segmental duplication.
Synteny analysis of SBT genes among Arabidopsis, rice, wheat, and maize
To further explore the evolutionary mechanisms of SBT genes, we constructed comparative syntenic maps of SBT genes among four representative plant species: Arabidopsis (a dicot) and three monocots (wheat, rice and maize). The total numbers of gene pairs for Arabidopsis/rice, Arabidopsis/wheat, Arabidopsis/maize, rice/wheat and rice/maize were 2959, 6740, 2088, 44263 and 8873, respectively (Fig 6). The comparisons among monocot species (rice/wheat and rice/maize) yielded more gene pairs, consistent with greater genome similarity among the monocots. The synteny maps of rice/wheat and rice/maize are likely the result of chromosome duplication (Fig S4). The numbers of SBT orthologous pairs for Arabidopsis/rice, Arabidopsis/wheat and Arabidopsis/maize were 3, 10 and 3, respectively, whereas those for rice/wheat and rice/maize were 92 and 21, respectively. The larger numbers of orthologous pairs in rice/wheat and rice/maize suggest that SBTs have undergone more duplication events in monocots.
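One way to put the SBT orthologous pair counts in context is to express them relative to the total number of gene pairs in each comparison. The sketch below performs this normalization using only the numbers reported above; the per-thousand scaling and the variable names are our own choices, and this calculation is offered as an illustration rather than as part of the original analysis.

```python
# (total gene pairs, SBT orthologous pairs) for each comparison, from the text.
synteny = {
    "Arabidopsis/rice": (2959, 3),
    "Arabidopsis/wheat": (6740, 10),
    "Arabidopsis/maize": (2088, 3),
    "rice/wheat": (44263, 92),
    "rice/maize": (8873, 21),
}

# SBT orthologous pairs per 1000 gene pairs in each comparison.
sbt_per_thousand = {
    pair: 1000 * sbt_pairs / total_pairs
    for pair, (total_pairs, sbt_pairs) in synteny.items()
}

# Print comparisons from the highest to the lowest normalized value.
for pair, value in sorted(sbt_per_thousand.items(), key=lambda item: -item[1]):
    print(f"{pair}: {value:.2f} SBT pairs per 1000 gene pairs")
```

After normalization, the two monocot comparisons still show the highest values, so the larger SBT pair counts are not explained solely by the larger number of total gene pairs.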
Expression analysis of SBT genes in different organs and developmental stages of Arabidopsis
The expression patterns of SBT genes in the model plant Arabidopsis were explored using the Arabidopsis eFP Browser (http://bar.utoronto.ca/efp_arabidopsis/cgi-bin/efpWeb.cgi). Different SBTs showed different expression patterns across developmental stages and organs of Arabidopsis. The detailed data are given in Additional file 6. Fig 7 shows that SBT genes displayed a range of distinct expression patterns in Arabidopsis. For instance, AT5G58820, AT4G21630 and AT4G15040 were mainly expressed in mature flowers and young siliques, indicating that they might be involved in the development of these organs. In contrast, some SBT genes, such as AT2G05920 and AT4G20430, showed high expression in hypocotyl, root and shoot, suggesting possibly diverse roles in biological processes. Other members showed further patterns: AT5G45640 was highly expressed, relative to other SBTs, in most organs except flowers, and AT2G04160 and AT1G32960 exhibited high expression levels in the rosette. By contrast, AT4G34980, AT5G51750, AT2G19170, AT5G59120, AT3G46850, AT5G59810, AT5G58830 and AT4G21323 displayed lower expression levels in most organs except flowers and young siliques. These results indicate that different SBTs have different expression patterns across organs and developmental stages.
Response of Arabidopsis SBT genes to exogenous stress conditions
Furthermore, to better understand how stress conditions influence SBTs, we downloaded expression data for SBTs under multiple exogenous conditions. Several stress conditions, including salt, heat, osmotic, cold, oxidative, wounding and drought stress, are presented (Fig 8). The complete data are given in Additional file 7. Different SBT genes showed different response patterns to stress. Some genes, such as AT4G34980 and AT5G51750, showed consistently high expression under all conditions. Others, such as AT3G14087, AT2G39850 and AT1G01990, showed altered expression under certain conditions (cold, drought and heat). Still others, such as AT1G32960 and AT5G58840, displayed comparatively low expression under all conditions. These results indicate that SBTs exhibit diverse expression patterns under different conditions and suggest that they might perform different functions in different biological processes.