Phylogenetic analysis of the tubulin protein family
To investigate the evolutionary relationships of tubulin genes, the tubulin protein sequences from O. sativa, Arabidopsis, L. usitatissimum, G. arboreum, G. barbadense, G. raimondii, and G. hirsutum were used to generate an unrooted phylogenetic tree (Fig.1). In those species, the tubulin proteins were mainly classified into 3 bunches e.g. α, β and γ-tubulins, which revealed similar results to previous studies in different plants (Jayaswal et al. 2019). Most of the orthologous genes between the allotetraploid and the corresponding diploids were clustered into the same clade based on phylogenetic analysis with tubulin protein sequences, which indicates that before the separation of monocots and dicotyledons, the genes had been differentiated into 3 different subfamilies. In each bunch, most tubulin proteins from four cotton species were relatively farther than tubulin proteins from O. sativa, Arabidopsis, L. usitatissimum, but a few had special case including some genes in O. sativa, which closely related to cotton species. γ tubulins are abundant gathering together from O. sativa, which may play a vital role in development of rice, but which are relatively scarce from Arabidopsis, only AT.5G05620 and AT.3G61650. Interestingly, the α, β-tubulins subfamilies are extended, but γ-tubulin subfamily is relatively small in cotton. The above results show that tubulin genes are lost or expanded during evolution, according to the needs of plant growth.
Gene structure conserved motifs analysis
In this study, we analyzed the conserved motif and structural characterizations to further investigate the phylogenetic relationships among different members of the tubulin family of G. hirsutum (Fig.2). Members of α, β-tubulin genes are quite conservative, yet γ-tubulin genes are not retained in protein motifs and gene structure analysis. There was a considerable distinction in the exon-intron structure in different branches, but similar in the exon-intron structure in same branches. Most of the genes had four introns in α-tubulin subfamily, and had three introns in β-tubulin subfamily, yet had five to eleven introns in γ-tubulin subfamily. The MEME program was used to identify 10 conserved motifs. There are almost all conserved sites in α, β-tubulin subfamilies. However, the genes of Gh.D05G257500, Gh.D12G088700 and Gh.D04G181300 may lose one or two of motifs 4/5/8 in α, β-tubulin subfamily. Most γ-tubulin genes are poorly conserved in compared to member of α, β-tubulin genes. Strangely, the Gh.A11G102000, Gh.D11G102600 and Gh.A12G127300 contain ten conserved motifs in γ-tubulin subfamily. The above results show that most members from the same subfamily have similar motif features, and exon-intron structure, supporting the close evolutionary relationships.
As the (table 2/Fig. S1), the results showed that the most α, β-tubulin proteins have conserved domains with Tubulin and Tubulin_C; pIs of the most tubulin proteins are weakly acidic, however, when the proteins loss the Tubulin_C, the pIs of proteins are alkaline. Conserved site analysis of tubulin gene family were found that some amino acid are very conserved (Fig. S2), The BaCelLo (Pierleoni et al. 2006), EpiLoc (http://epiloc.cs.queensu.ca/) and Plant-mPLoc (Chou and Shen, 2010) predicted that the subcellular localization of all tubulin proteins are Cytoplasmic/Chloroplast. TMHMM2.0 (http://www.cbs.dtu.dk/services/TMHMM/) and TOPCONS (Tsirigos et al. 2015) predicted that all tubulin proteins are located outside the membrane. This results indicated that the tubulin genes may synthesize the cytoskeleton in plant cells cytoplasm.
Chromosomal location, gene duplication, and collinearity relationships
Chromosomal location showed all tubulin genes scattered unequally on 24 chromosomes except At10 and Dt10 in G. hirsutum (Fig.3), which are mainly situated on the proximate ends of the chromosomes. At03 and Dt05 of G. hirsutum contained the most number of tubulin genes (7). By contrast, At02, At04, At06, Dt04, Dt07, Dt12 of G. hirsutum only included one or two of tubulin genes. There are deduced that genetic variation existed in the progress of evolution, and certified by this uneven distribution of the tubulin genes on the chromosomes. We further analyze the collinearity relationships for all tubulin genes on the chromosomes At and Dt of G. hirsutum with other cotton species. Among 42 tubulin genes on the chromosomes At of G. hirsutum, 33 tubulin had intergenomic homologous genes in chromosomes Dt of G. hirsutum (Fig.4); 40 tubulin homologous genes in G. arboretum (Fig. S3); among 48 tubulin genes on the chromosomes Dt of G. hirsutum, 48 tubulin had intergenomic homologous genes in G. raimondii (Fig. S4); and among all the 91 tubulin genes of G. hirsutum, 77 tubulin had intergenomic homologous genes in G. barbadense (Fig. S5), 91 tubulin homologous genes in G. darwinii and G. tomentosum(Fig. S6/7), respectively. Above results demonstrated that it may be an independent evolutionary process for At subgroup and Dt subgroup after the formation of allotetraploid, while it may be a same evolutionary direction for different allotetraploid cotton species upland cotton and wild cotton after the formation of allotetraploid.
Expression patterns of tubulin genes in different upland cotton
Although most tubulin superfamily genes have redundant functionality, they have similar functions together to promote cell elongation, supporting the close evolutionary relationships. To research the particular features of tubulin genes in fiber elongation and development, the expression patterns analyses was performed. Firstly, our RNA-sequencing data were used to detect the expression patterns of 98 tubulin genes in root, stem, leaf, ovule, and fiber tissues of G. hirsutum (J02-508 and ZRI-015) using a heatmap. The results showed that most α, β-tubulin genes had higher expression levels relative to γ-tubulin genes in all tissues, and the expression levels of most α, β-tubulin genes tubulins subfamilies give obvious intimation to functional divergence (Fig.5). Secondly, the expression levels of tubulin genes in the fiber development level had higher than other tissues e.g. Gh.A08G206600, Gh.D08G202200, Gh.A11G351700, Gh.D11G355300, Gh.A08G207900, Gh.A03G027200, Gh.D03G169300, Gh.A11G082000, Gh.D11G082500, Gh.A13G236900, Gh.A11G258900, Gh.A05G199500, Gh.D05G216200, Gh.D02G226100, Gh.A03G209100, Gh.A04G142400, Gh.D04G181300, Gh.A02G095500 and Gh.D02G097200, which may contribute to fiber development. Among them, only Gh.A03G027200, Gh.D03G169300 and Gh.A11G258900 had clear varies at distinct stages of fiber development in the two varieties, and they were significant differences at 15 DPA, which was a critical time for fiber length and strength in J02-508 compared to ZRI-015. Above results indicated the three genes may inhibit fiber length and strength in key stages of fiber development.
The expression levels of Gh.A03G027200, Gh.D03G169300 and Gh.A11G258900 in different fiber developments for J02508 and ZRI015 were detected using qRT-PCR. As shown in Fig 6, the relative expression data revealed that the expression levels of Gh.A03G027200 and Gh.D03G169300 were the higher from 10 to 25 DPA in J02508 and ZRI015, and the expression levels of Gh.A11G258900 were higher from 3 to 15 DPA in the as well as twice in ZRI015 compared to J02508.