Phylogenetic analysis of the DUF668 gene family in cotton
To explore the phylogenetic relationships among the cotton DUF668 genes, a phylogenetic tree was constructed from the DUF668 protein sequences of four cotton species (Table S1). The DUF668 proteins fell into four subgroups (Fig. 2). In each subgroup, the number of DUF668 genes in G. hirsutum and G. barbadense was roughly twice the number in G. arboreum and G. raimondii. This was consistent with the results of the previous analysis and with the known evolutionary relationships in cotton. The results showed that the DUF668 genes were relatively conserved during cotton evolution. Although the third subgroup had relatively few members, they were retained during evolution, which suggests that they may play an important role in biological processes.
Based on gene number, chromosomal location and phylogenetic analysis, the DUF668 family was predicted to be relatively conserved in cotton. To study the evolutionary relationships of DUF668, we took G. hirsutum as the core and analyzed its collinearity with the other cotton species (Fig. 3). We found that 13 DUF668 family genes from subgenome A of G. hirsutum were collinear with 17 sequences in G. arboreum and G. barbadense. Except for GhDUF668-30, each DUF668 family gene in subgenome D of G. hirsutum was collinear with one sequence in G. raimondii and G. hirsutum. However, 11 sequences in G. barbadense and 13 sequences in G. raimondii were collinear with 15 and 14 sequences in G. hirsutum, respectively. This was broadly consistent with the results for the DUF668 family genes in subgenome A. Surprisingly, except for GhDUF668-30, each DUF668 family gene in either subgenome A or D of G. hirsutum corresponded to only one sequence in G. arboreum and G. barbadense. This suggests that DUF668 family genes may have been lost during the evolution of G. hirsutum and later duplicated owing to functional requirements, bringing their number in line with that in G. arboreum. This illustrates the complexity of DUF668 family gene functions.
Phylogenetic tree, motifs and gene structure of the DUF668 genes in G. hirsutum
The phylogenetic tree, gene structures and motifs of the GhDUF668 genes were analyzed using their full-length coding sequences (CDS) and protein sequences (Fig. 4). Based on the distribution and number of gene structures and motifs, the DUF668 members in G. hirsutum were divided into two broad groups, which was consistent with research in rice. Except for GhDUF668-06, -24 and -32, all members shared the same motifs (1, 2, 3, 4, 5, 6, 7, 10), suggesting that members of the same family have similar functions. Members of the first group contained one exon, no introns and four identical motifs (1, 2, 3, 10), although exon length differed among them. Except for GhDUF668-06, which contained 6 motifs (1, 2, 3, 5, 9, 10) and 6 exons, members of the second broad group contained 10 motifs and 12 exons. The structural differences between GhDUF668-06, -24 and -32 and the other members of the same group might be due to changes in gene function or to errors in genome annotation; further study is required. A motif is a structural component of a protein with a specific spatial conformation; it is a subunit of a structural domain and is associated with a specific function. This result suggests that the second broad group might have changed its gene structure during evolution and might have a more important function in cotton growth and development than originally thought.
Cis-acting element analysis of the DUF668 gene in G. hirsutum
The 2000 bp promoter region upstream of each GhDUF668 gene was analyzed. Various cis-acting elements related to plant hormones and environmental stress were found, including elements associated with defense mechanisms, stress responses, salicylic acid, ABA, gibberellin, auxin, jasmonic acid, light responses, drought induction, MYB binding sites for flavonoid synthesis, and responses to low temperature (Fig. 5, Table S2). Based on the grouping described above, the second group contained more cis-acting elements than the first group, suggesting that the second group might have a more important function under adverse stress conditions during cotton growth and development. Each GhDUF668 promoter contained different numbers and types of cis-acting elements, indicating that these genes might participate in different biotic and abiotic stress responses through different signaling pathways.
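The 2000 bp upstream window depends on the strand of each gene. The following is a minimal sketch of how such a window could be computed from gene coordinates; it is not the authors' pipeline. The function name `promoter_window`, the 1-based inclusive coordinate convention and the example coordinates are all assumptions made for illustration.

```python
def promoter_window(start, end, strand, length=2000, chrom_len=None):
    """Return the (lo, hi) window upstream of a gene, or None if empty.

    Assumes 1-based inclusive coordinates with start <= end, as in GFF3.
    For a '+' gene the promoter lies before `start`; for a '-' gene it
    lies after `end`, because transcription runs in the other direction.
    """
    if strand == "+":
        lo, hi = start - length, start - 1
    elif strand == "-":
        lo, hi = end + 1, end + length
    else:
        raise ValueError(f"unknown strand: {strand!r}")

    # Clip the window to the chromosome boundaries.
    lo = max(lo, 1)
    if chrom_len is not None:
        hi = min(hi, chrom_len)
    return (lo, hi) if lo <= hi else None


# Hypothetical gene coordinates, for illustration only.
print(promoter_window(5001, 8000, "+"))
print(promoter_window(5001, 8000, "-"))
```

The resulting interval would then be used to extract the sequence (reverse-complemented for genes on the minus strand) before scanning it for cis-acting elements.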
Tissue-specific expression analysis of DUF668 genes in G. hirsutum
Gene expression patterns are usually related to gene functions. Our analysis of the expression of the GhDUF668 genes in cotton roots, stems, leaves, pistils, stamens, calyxes, petals and receptacles showed that most of the 32 selected GhDUF668 genes were expressed in a tissue-specific manner (Fig. 6) and could be divided into three expression patterns. Seven genes (GhDUF668-01, -05, -09, -18, -22, -24, -27) belonged to the first expression pattern; they were expressed mainly in pistils, roots, stems and receptacles, with the lowest expression in petals and stamens. Nine genes (GhDUF668-02, -08, -10, -11, -15, -19, -26, -28, -31) belonged to the second expression pattern and were expressed mostly in the stems. The remaining genes were classified into the third expression pattern, in which expression was low in all eight tissues. Thus, expression of the GhDUF668 genes was tissue-specific, which suggests that their functions are complex.
Expression analysis of the DUF668 gene in G. hirsutum in response to stress
Expression analysis of the GhDUF668 genes after cold, heat, drought and salt treatment showed that their expression patterns can be divided into three categories (Fig. 7). After cold treatment, the expression of nine genes (GhDUF668-01, -02, -05, -09, -18, -19, -22, -24, -27) from the second and third categories changed significantly (Fig. 7A), indicating that these nine genes could be induced by cold stress and might play a role in the cold stress response of G. hirsutum. Under heat stress, the expression of fourteen genes (GhDUF668-01, -02, -05, -08, -09, -11, -14, -18, -19, -22, -24, -26, -27, -28) from the second and third categories was clearly upregulated and reached a maximum at 12 hours (h) (Fig. 7B), indicating that these genes might play a role in heat resistance in G. hirsutum. Polyethylene glycol (PEG) was used to simulate drought stress. The expression of sixteen genes (GhDUF668-01, -02, -05, -08, -09, -10, -11, -14, -18, -19, -22, -23, -24, -26, -27, -28) was upregulated and reached a maximum at 12 h (Fig. 7C), indicating that these sixteen genes might play a role in drought resistance in cotton. After salt stress treatment, the expression of fifteen genes (GhDUF668-01, -02, -05, -08, -09, -10, -11, -14, -18, -19, -22, -24, -26, -27, -28) from the second and third categories was upregulated and reached a maximum at 12 h (Fig. 7D), indicating that these fifteen genes might play a role in salt tolerance in cotton. In summary, nine genes (GhDUF668-01, -02, -05, -09, -18, -19, -22, -24, -27) were induced by all four abiotic stresses, implying that they might respond broadly to abiotic stress in cotton. Although most of the genes showed clear expression changes and reached their maximum at 12 h, the RNA-seq measurements extended only to 12 h.
We speculate that the expression of these genes might continue to increase after 12 h under abiotic stress conditions in G. hirsutum. Overall, the expression levels of approximately half of the GhDUF668 genes, which also showed higher tissue-specific expression levels, changed significantly under abiotic stress conditions. This result indicates that the GhDUF668 genes might play a role in the abiotic stress response of G. hirsutum. However, further verification is needed.
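The set of genes induced under all four treatments is the intersection of the four gene lists reported for Fig. 7. The sketch below reproduces that intersection from the gene identifiers given in the text; the helper `ids` and the dictionary layout are illustrative choices, not part of the original analysis.

```python
def ids(suffixes):
    """Expand '01 02 ...' into full identifiers such as 'GhDUF668-01'."""
    return {f"GhDUF668-{s}" for s in suffixes.split()}


# Genes reported as induced under each treatment (Fig. 7A-D).
induced = {
    "cold":    ids("01 02 05 09 18 19 22 24 27"),
    "heat":    ids("01 02 05 08 09 11 14 18 19 22 24 26 27 28"),
    "drought": ids("01 02 05 08 09 10 11 14 18 19 22 23 24 26 27 28"),
    "salt":    ids("01 02 05 08 09 10 11 14 18 19 22 24 26 27 28"),
}

# Genes shared by every treatment.
shared = sorted(set.intersection(*induced.values()))

for stress, genes in induced.items():
    print(f"{stress}: {len(genes)} genes")
print(f"shared by all four: {len(shared)} genes")
print(", ".join(shared))
```

Because every cold-induced gene also appears in the heat, drought and salt lists, the shared set is identical to the cold-induced set.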
Expression analysis of the DUF668 gene in G. hirsutum in response to Verticillium wilt stress
Cotton production is restricted by V. dahliae, which has a serious impact and causes great economic losses every year. This analysis was based on transcriptomic data from cotton roots after inoculation with V. dahliae. The results (Fig. 8) showed that the GhDUF668 genes could be clustered into three expression patterns over 0-120 h. The expression of twenty genes (GhDUF668-01, -02, -05, -06, -08, -09, -10, -11, -14, -15, -18, -19, -21, -22, -23, -24, -26, -27, -28 and -31) from the second and third expression patterns changed. Among them, the expression levels of six genes (GhDUF668-06, -11, -15, -23, -28 and -31) reached a maximum at 6 h and then declined steadily, indicating that these six genes might be involved in the early response of cotton to V. dahliae. The expression levels of GhDUF668-08 and GhDUF668-26 reached a maximum at 48 h and then decreased, indicating that these two genes may play a role in the middle and late stages of the response to V. dahliae infection in cotton.
Response of the DUF668 gene in G. hirsutum to drought and Verticillium wilt stress by qRT-PCR
Adverse stresses can cause transcriptome reprogramming events. Our analysis above showed that the expression of most GhDUF668 genes changes significantly under stress treatment. Based on the transcriptome expression profiles and promoter cis-acting elements, we speculated that five GhDUF668 genes (GhDUF668-05, -08, -11, -23 and -28) might be involved in resistance to stress. We selected KK 1543 (drought-resistant), Xinluzao 26 (drought-susceptible), Zhongzhimian 2 (disease-resistant)[25] and Simian 3 (disease-susceptible) (unpublished) to determine whether these genes were involved in the response to adverse stress. qRT-PCR was used to measure the transcript levels of these five GhDUF668 genes in roots under drought and Verticillium wilt treatment at the seedling stage. Compared with that at 0 h, the expression of all five genes differed significantly after the roots were stressed (Fig. 9), indicating that their transcription was significantly induced by adverse stress treatment.
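Expression relative to the 0 h sample is commonly derived from qRT-PCR Ct values with the 2^-ΔΔCt method. This section does not state which quantification method or reference gene was used, so the sketch below is only an assumption about how such a comparison might be computed; the function name and all Ct values are hypothetical.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene versus the control sample (2^-ddCt).

    Each dCt normalizes the target gene to a reference gene within one
    sample; ddCt then compares the treated sample with the control.
    Assumes roughly 100% amplification efficiency for both primer pairs.
    """
    d_ct_treated = ct_target_treated - ct_ref_treated
    d_ct_control = ct_target_control - ct_ref_control
    dd_ct = d_ct_treated - d_ct_control
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values: target gene at 12 h versus 0 h, with a
# reference gene whose Ct is unchanged between the two samples.
fold = relative_expression(
    ct_target_treated=22.0, ct_ref_treated=18.0,
    ct_target_control=25.0, ct_ref_control=18.0,
)
print(f"fold change vs 0 h: {fold:.1f}")
```

A lower Ct in the treated sample means more transcript, so a ΔΔCt of -3 corresponds to an eightfold increase over the 0 h control.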
After the roots of the disease-resistant and disease-susceptible materials were inoculated with V. dahliae, the transcript levels of all selected genes were significantly induced at different time points in both materials, suggesting that these genes might participate in the response of cotton to invasion by V. dahliae V991. Among them, GhDUF668-08 and GhDUF668-23 reached their maximum values at 12 h in the disease-resistant material. The maximum expression of GhDUF668-08 in the susceptible material was observed at 24 h and was lower than that in the disease-resistant material. Although GhDUF668-08 and GhDUF668-23 were significantly induced in both materials, their expression levels in the disease-resistant material were dramatically higher than those in the susceptible material. Only minor changes in the expression levels of the other three genes were observed. In summary, these five genes might play a role in the response of cotton to invasion by V. dahliae, and GhDUF668-08 and GhDUF668-23 might play leading roles in disease resistance.
After PEG-simulated drought treatment, the transcript levels of all selected genes were significantly induced at different time points in the two materials, suggesting that these genes may be involved in the response to drought in G. hirsutum. GhDUF668-08 and GhDUF668-23 were significantly induced at 6 h, and their expression increased sharply, reaching a maximum at 12 h. The expression trends of GhDUF668-08 and GhDUF668-23 under drought treatment were consistent with those after inoculation with V991. However, the expression level of GhDUF668-08 in G. hirsutum under drought treatment was not significantly different from that in G. hirsutum inoculated with V. dahliae. In summary, these five genes might have a role in the drought stress response in cotton, and GhDUF668-23 might play a leading role in this process.
The five genes (GhDUF668-05, -08, -11, -23 and -28) showed the same expression pattern in the different materials under biotic and abiotic stress. Compared with expression at 0 h, the expression of all five genes changed significantly at the other time points. This implies that the expression of these genes is regulated by adverse stresses and that they might play a role in the response of G. hirsutum to adverse stress. Among them, the expression levels of both GhDUF668-08 and GhDUF668-23 increased sharply at 6 h and reached a maximum at 12 h under both drought stress and V. dahliae inoculation. Within the same period, GhDUF668-08 and GhDUF668-23 were the most strongly upregulated of the five genes. This further suggests that GhDUF668-08 and GhDUF668-23 might play roles in the response to biotic and abiotic stress in cotton.
Tissue-specific expression analysis of the DUF668 gene in G. hirsutum
We showed that the expression of GhDUF668-08 and GhDUF668-23 changed dramatically under different stress conditions and that their expression differed significantly among the materials. Moreover, because genes are expressed in specific tissues, their molecular functions may differ. The root is the first part of the plant to sense drought stress and pathogen invasion. For this reason, we selected the treatment time points best suited to verifying the tissue-specific expression of these five candidate genes in the four materials. We found that expression in roots was significantly higher than that in stems and leaves (Fig. 10). Interestingly, the expression of these five genes changed significantly in the roots before and after the different stresses, but no severe changes occurred in the stems and leaves. Among them, GhDUF668-08 and GhDUF668-23 changed significantly in the roots. This result suggests that increased expression of the GhDUF668 genes in the roots is likely to enhance cotton resistance. These results further support a role for the GhDUF668 genes in stress resistance and lay a molecular foundation for further investigation of the function and molecular mechanism of these genes in stress tolerance.