Identification and phylogenetic analysis of the GASA gene family in the plant kingdom
GASA protein sequences from different plants were collected using BLASTP and the Query Sequence Searching function of TBtools, and were then checked for correctness. To study the evolutionary relationships of GASA genes, a phylogenetic tree was constructed from the protein sequences of Arabidopsis, G. darwinii, G. mustelinum, G. tomentosum, G. raimondii, G. arboreum, G. barbadense, and G. hirsutum (Fig. 1A). The phylogenetic relationships of GASA proteins were highly conserved between Arabidopsis and Gossypium, and the proteins were classified into three subfamilies, viz. GASA1/2/3/9/11/14, GASA4/5/6/12/13, and GASA7/8/8L/10. An ortholog of each AtGASA gene could be found in the different Gossypium species. Compared with Arabidopsis, most GASA genes (GASA1/2/3/4/5/6/9/14/12) were not duplicated in diploid cotton, whereas GASA7/8/8L/10/11/13 were duplicated. These results suggest that GASA7/8/8L/10/11/13 might play a key role in cotton. Allotetraploid cotton would be expected to carry twice as many genes as diploid cotton, but its number of GASA genes was less than twice the diploid number. This indicates that some GASA genes were lost during evolution, which is in line with previously published statistics showing higher gene loss in allotetraploid cotton than in diploid cotton [31]. Subsequently, GASA genes were surveyed in 20 species, ranging from lower to higher plants, to determine the origin and evolutionary relationships of these genes (Fig. 1B). Based on gene-number analysis, GASA genes were present in the lower fern plant Selaginella moellendorffii, but not in the lower algae Micromonas pusilla, Ostreococcus tauri, and Volvox carteri or in the moss Physcomitrella patens, indicating that GASA genes might have originated in ferns. From ferns to angiosperms, the number of GASA genes has hardly changed in diploid plants.
This result indicates that the number of GASA family genes is conserved in most plants.
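The copy-number argument above (an allotetraploid is expected to carry about as many genes as its two diploid progenitor genomes combined) can be sketched as a small calculation. This is only an illustration: the function names and the gene counts used in the example are hypothetical and are not taken from this study.

```python
def retention_ratio(a_diploid: int, d_diploid: int, tetraploid: int) -> float:
    """Observed tetraploid gene count divided by the count expected
    from simply adding the A-genome and D-genome diploid counts."""
    expected = a_diploid + d_diploid
    if expected <= 0:
        raise ValueError("the diploid counts must sum to a positive number")
    return tetraploid / expected


def inferred_losses(a_diploid: int, d_diploid: int, tetraploid: int) -> int:
    """Number of gene copies missing relative to the additive expectation
    (never negative; extra copies are reported as zero losses)."""
    return max(a_diploid + d_diploid - tetraploid, 0)


# Hypothetical counts, for illustration only (NOT the values from this study).
a_count, d_count, ad_count = 20, 20, 36
print(inferred_losses(a_count, d_count, ad_count))   # 4
print(retention_ratio(a_count, d_count, ad_count))   # 0.9
```

A ratio below 1 corresponds to the situation described in the text, where the allotetraploid carries fewer GASA genes than twice the diploid number.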
Structural characterization of GhGASA genes and motif analyses
To explore the evolutionary relationships of the GhGASA family genes together with their structure and function, an unrooted phylogenetic tree was constructed from GhGASA protein sequences. In general, GhGASA genes possessed one to three exons. Furthermore, gene structure was highly correlated with phylogenetic relationship. In total, 20 conserved motifs were identified in the GhGASA protein sequences (Fig. S1). The number of conserved motifs in each GhGASA protein varied from 3 to 11. Most GASA proteins contained the conserved Motifs 1–3, suggesting that these three motifs may be important for the functional conservation of GASA proteins.
To study the evolutionary relationships and functional divergence of the prominent GASA gene family members, we extracted and examined the 2.0 kb promoter regions upstream of the genes. Many cis-acting regulatory elements were analyzed, including 13 elements related to plant growth (photo-responsive, cell cycle, and seed-specific regulation) and stress responses (hormone response, wound response, and defense response to stresses) (Fig. 2B).
We focused mainly on cis-acting regulatory elements to infer gene functions in cotton fiber development. The promoter elements of GhGASA genes, together with cotton transcriptomic data, revealed that fiber development might not be linked to cell cycle regulation or seed-specific regulation, but might instead be linked to hormone-responsive elements such as those for IAA and GAs (Fig. 2A). This suggests that IAA and GAs play important roles in fiber development and cell elongation.
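Extracting the 2.0 kb upstream promoter region described above has to take the gene's strand into account. The following is a minimal sketch, assuming GFF-style 1-based inclusive gene coordinates and a chromosome sequence held as a plain string. The function names and the toy sequence are hypothetical and do not represent the pipeline used in this study.

```python
# Translation table for complementing DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def upstream_region(chrom_seq: str, gene_start: int, gene_end: int,
                    strand: str, length: int = 2000) -> str:
    """Return up to `length` bp upstream of a gene, written 5'->3'
    relative to the gene. Coordinates are 1-based and inclusive
    (assumed GFF-style); the region is truncated at chromosome ends."""
    if strand == "+":
        # Upstream lies to the left of the gene start.
        left = max(gene_start - 1 - length, 0)
        return chrom_seq[left:gene_start - 1]
    if strand == "-":
        # Upstream lies to the right of the gene end, on the other strand.
        right = min(gene_end + length, len(chrom_seq))
        return reverse_complement(chrom_seq[gene_end:right])
    raise ValueError("strand must be '+' or '-'")


# Toy chromosome, for illustration only.
toy = "TTTTACGTGGGGCAAG"
print(upstream_region(toy, 9, 12, "+", length=4))  # ACGT
print(upstream_region(toy, 9, 12, "-", length=4))  # CTTG
```

For a minus-strand gene the promoter sits at higher coordinates than the gene body, so the slice is taken after the gene end and reverse-complemented before any motif search.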
RNA-seq expression profile of GASA genes in three major cotton species
Expression profiles and tissue specificity were explored using transcriptomic data of G. arboreum, G. barbadense, and G. hirsutum. First, we constructed a phylogenetic tree of the GASA family genes of the three major cotton species using the relative expression profiles with TBtools. The majority of GASA family genes from the same subfamily had similar expression patterns in the six varieties of the three major cotton species (Fig. 3, Fig. S2, and Fig. S3).
In G. hirsutum (Fig. 3), only four genes were highly expressed at fiber development stages. Gh.A09G018000 and Gh.D04G053600 showed relatively high expression at the early stages of fiber development, whereas Gh.A04G144000 and Gh.D04G1827 were highly expressed throughout fiber development, especially during the critical period of fiber elongation at 5–15 DPA.
In G. arboreum (Fig. S2), Ga14G0224.1 showed a significantly higher expression level in all tissues. Ga07G1350.1 was highly expressed only in leaves, whereas Ga04G0326.1 was highly expressed only at the fiber development stages, especially during the critical period of fiber elongation.
In G. barbadense (Fig. S3), only three genes were highly expressed at crucial fiber development stages; in particular, two genes, viz. Gbar.D04G017490 and Gbar.A04G012790, showed significant expression levels at the critical period of fiber elongation (10 DPA).
Interestingly, the five genes Ga04G0326.1, Gbar.D04G017490, Gbar.A04G012790, Gh.A04G144000, and Gh.D04G182700 were all orthologs of AtGASA10 in the three cotton species. Their evolutionary relationships, gene structures, and expression changes were all consistent among the six varieties of G. arboreum, G. barbadense, and G. hirsutum. These results suggest that these five genes, with their high expression and tissue specificity, might play a direct and critical role in fiber development and fiber cell elongation.
In this study, we identified five GASA genes located on A04/D04 that were specifically expressed at critical fiber development stages in the three major cotton species, which suggests that GASA10 might have a vital role in fiber development, specifically in fiber cell elongation. To understand the genetic basis, characteristics, and functions of GhGASA10-1 (Gh.D04G182700), we further performed functional verification.
Subcellular localization of GhGASA10-1
Among the online prediction tools, TMHMM2.0 (http://www.cbs.dtu.dk/services/TMHMM/) predicted that the N-terminal 26 residues of GhGASA10-1 form a signal sequence containing a transmembrane segment and that the rest of the protein lies outside the membrane. CELLO version 2 [32] and Euk-mPLoc 2.0 [33] predicted that the subcellular localization of GhGASA10-1 is extracellular. YLoc [34] and BaCelLo [35] predicted that GhGASA10-1 is localized to the secretory pathway.
To verify this prediction, the full-length CDS of GhGASA10-1 was ligated into the 35S-1300-GFP vector. The construct was infiltrated into mature N. benthamiana leaves and visualized by confocal microscopy. The fluorescence of 35S-GFP was detected in the nucleus and the cell membrane (Fig. 4). In contrast, the green fluorescence of the GhGASA10-1::GFP fusion protein was localized to the cell membrane, which showed red fluorescence when stained with the cell membrane marker DiI. In the merged image, the cell membrane appeared yellow owing to the overlap of the GhGASA10-1::GFP signal and the DiI signal. These results demonstrated that GhGASA10-1 is localized to the cell membrane. It may therefore be a secreted protein that is transported to the cell wall through the secretory pathway, where it could take part in cell wall synthesis and promote cell wall development in cotton fibers.
Overexpression and tissue-specificity analysis of GhGASA10-1 in Arabidopsis
To further confirm the gene function, GhGASA10-1 was overexpressed in Arabidopsis. Among 10 GhGASA10-1-overexpressing transgenic Arabidopsis lines of the T3 generation, 3 lines were selected for further analysis. Tissue-specific expression analysis (Fig. 5A) showed that GhGASA10-1 was specifically expressed in the roots at the seedling stage. At the flowering stage, however, GhGASA10-1 was significantly down-regulated in the roots and specifically expressed in the flower buds.
When grown on 1/2 MS medium, the wild type and the 3 transgenic lines showed a noticeable phenotypic difference at the seedling germination stage. After 14 days of standard cultivation (Fig. 5B), GhGASA10-1-overexpressing seedlings had germinated more vigorously than wild-type seedlings. Germination statistics showed that the germination rate of transgenic seeds was significantly higher than that of the wild type, especially on the third day (Fig. 5C). Seeds from wild-type and transgenic plants (OE1, OE2, and OE3) were also germinated on MS-agar medium to further verify the transgenic status. After 14 days of vertical cultivation (Fig. 5D), root growth was significantly greater in the transgenic lines than in wild-type plants, with GhGASA10-1-overexpressing seedlings forming longer taproots. Statistical analysis (Fig. 5E) showed that the roots of the transgenic lines were twice as long as those of the wild type at the seedling stage. These results showed that GhGASA10-1 promotes seed germination and root extension in Arabidopsis.
As mentioned above, GhGASA10-1 was screened as a putative candidate gene for fiber cell elongation. However, its tissue specificity in Arabidopsis pointed to marked changes in root elongation. To test our hypothesis that GhGASA10-1 plays a crucial role in cell elongation, we further examined Arabidopsis roots and compared the transgenic lines with the wild type. Interestingly, overexpression of GhGASA10-1 markedly increased root length through the elongation of root cells rather than an increase in cell number (Fig. 5F/G). These results strengthen our hypothesis that GhGASA10-1 may be important in fiber cell elongation.
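The distinction drawn above between cell elongation and cell number can be made explicit with a simple decomposition: along a single cell file, root length is approximately the number of cells multiplied by the mean cell length, so the length ratio between two genotypes factors into a cell-number ratio and a cell-length ratio. The sketch below assumes this simplified model, and all numbers in it are hypothetical rather than measurements from this study.

```python
def length_ratio_components(n_cells_wt: int, mean_len_wt: float,
                            n_cells_oe: int, mean_len_oe: float):
    """Split the overexpressor/wild-type root length ratio into a
    cell-number component and a cell-length component.
    Assumes root length ~= cell count x mean cell length along one file."""
    number_ratio = n_cells_oe / n_cells_wt
    cell_length_ratio = mean_len_oe / mean_len_wt
    total_ratio = number_ratio * cell_length_ratio
    return number_ratio, cell_length_ratio, total_ratio


# Hypothetical values, for illustration only: same cell count,
# but cells twice as long in the overexpression line.
number_ratio, cell_length_ratio, total_ratio = length_ratio_components(
    n_cells_wt=100, mean_len_wt=50.0, n_cells_oe=100, mean_len_oe=100.0)
print(number_ratio, cell_length_ratio, total_ratio)  # 1.0 2.0 2.0
```

A cell-number ratio near 1 together with a cell-length ratio close to the overall length ratio is the pattern expected when longer roots arise from cell elongation rather than additional cell division.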
GhGASA10-1 expression level is associated with cellulose synthesis
Because overexpression of GhGASA10-1 in Arabidopsis promoted cell elongation and thereby root elongation, we speculated that GhGASA10-1 might activate downstream transcription factors, leading to high expression of cellulose synthase genes and further promoting cell elongation. Comparison of the expression of cellulose synthase genes (AtCesAs) (Fig. 6) between wild-type and overexpressing Arabidopsis showed that five members of the AtCesA family were upregulated. Among them, AtCesA5b/9 were upregulated twofold, AtCesA4/7 threefold, and AtCesA10 more than tenfold. Taken together, these data provide strong evidence that overexpression of GhGASA10-1 strongly induces cellulose synthesis-associated genes and promotes cell elongation.
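Fold changes such as those reported above are commonly derived from qRT-PCR Ct values. The text does not state how they were calculated here, so the following sketch assumes the widely used 2^-ΔΔCt approach with roughly 100% amplification efficiency. The function name and all Ct values are hypothetical.

```python
def fold_change(ct_target_sample: float, ct_ref_sample: float,
                ct_target_control: float, ct_ref_control: float) -> float:
    """Relative expression by the 2^-ddCt method (assumed here, not
    confirmed by the source). `ref` is the reference (housekeeping) gene;
    `control` is the baseline sample, e.g. wild type."""
    delta_sample = ct_target_sample - ct_ref_sample      # dCt in the sample
    delta_control = ct_target_control - ct_ref_control   # dCt in the control
    delta_delta = delta_sample - delta_control            # ddCt
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values, for illustration only.
print(fold_change(22.0, 18.0, 23.0, 18.0))  # 2.0 (twofold upregulation)
print(fold_change(21.0, 18.0, 24.0, 18.0))  # 8.0 (eightfold upregulation)
```

Because Ct is on a log2 scale, a target that reaches threshold one cycle earlier in the overexpression line than in the wild type, after normalization to the reference gene, corresponds to a twofold increase.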
GhGASA10-1 is induced by IAA but not GA3
GASA family genes have been shown to be involved in phytohormone regulation in different plants and to carry hormone-responsive promoter elements in upland cotton. Because IAA or GA3 might regulate GhGASA10-1, we used hormone-treated ovules cultured in vitro to examine the GhGASA10-1 expression level. Surprisingly, qRT-PCR results (Fig. 7) showed that GhGASA10-1 was upregulated during crucial fiber elongation stages. Moreover, when cotton ovules were treated with different concentrations of IAA, GA3, and their inhibitors, GhGASA10-1 was regulated by IAA but not by GA3 during cotton fiber development (Fig. S4).