Identification of CmbHLH proteins and conserved domain alignment
HMM analysis yielded 169 candidate CmbHLH protein sequences and a BLAST search yielded 213; duplicate sequences were then removed. The remaining sequences were searched against the CmbHLH proteins in the PlantTFDB database, after which 214 sequences were retained and submitted to a CDD domain search. Of these, 159 sequences contained a bHLH conserved domain above the minimum domain-hit threshold. After redundant sequences among the 159 proteins were removed, 118 sequences remained as the CmbHLH gene models for analysis and were renamed according to their chromosomal locations (Supplementary Table S1).
The putative CmbHLH proteins ranged from 84 to 707 aa in length, from 9.48 kDa to 75.97 kDa in molecular weight, and from 4.52 to 10.27 in theoretical isoelectric point (pI) (Supplementary Table S1). The CmbHLH protein density (0.442) was lower than that in Cucumis sativus (0.527), which also belongs to the Cucurbitaceae. This may reflect the smaller genome of Cucumis sativus (203.0 Mb) compared with Cucumis melo (364.0 Mb) [27]. The CmbHLH density was slightly higher than that in Citrus sinensis (0.420), which has a similar genome size (367.0 Mb).
Multiple sequence alignment of the bHLH domains of the CmbHLH proteins showed that 24 amino acid residues were conserved with a consensus ratio above 50%. The bHLH domain is highly conserved and comprises two functionally distinct regions, the basic region and the HLH region. The basic region determines the DNA-binding activity toward target genes. The sequence logo of the CmbHLH bHLH domain is shown in Fig. 1: the basic region contains 5 conserved amino acids and the HLH region contains 19. A previously described scheme classifies bHLH proteins into four groups (A, B, C and D) on the basis of DNA-binding specificity and the conservation of amino acids at certain positions [13]. According to this criterion, 8 CmbHLH proteins were classified into group A, 69 into group B, 16 into group C and 33 into group D.
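The consensus-ratio criterion used above can be illustrated with a minimal sketch. It assumes that the consensus ratio of an alignment column is the share of sequences carrying the most common residue and that gaps count against conservation; the text does not state how gaps were treated, so that choice is an assumption. The function name `conserved_columns` and the toy alignment are hypothetical and are not CmbHLH data.

```python
from collections import Counter


def conserved_columns(aligned, threshold=0.5, gap="-"):
    """Return (column index, residue, ratio) for columns whose most common
    residue is present in more than `threshold` of all sequences."""
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    hits = []
    for i in range(length):
        column = [seq[i] for seq in aligned]
        counts = Counter(res for res in column if res != gap)
        if not counts:
            continue  # column consists only of gaps
        residue, n = counts.most_common(1)[0]
        ratio = n / len(column)  # assumption: gaps stay in the denominator
        if ratio > threshold:
            hits.append((i, residue, ratio))
    return hits


# Toy alignment of four short, made-up sequences (not real bHLH domains).
toy_alignment = ["RERRR", "RELKR", "REAKR", "KDL-R"]
for index, residue, ratio in conserved_columns(toy_alignment):
    print(index, residue, ratio)
```

Columns in which the most common residue reaches exactly 50% are excluded here because the text specifies "more than 50%".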
Domain analysis showed that, in addition to the bHLH domain, two other domain types were present in CmbHLH proteins (Fig. 2). The first, bHLH_MYC_N, was found in 11 CmbHLH proteins. In all of them the bHLH_MYC_N domain was located at the N-terminus and the bHLH domain at the C-terminus. The second, the ACT (Aspartokinase, Chorismate mutase and TyrA) domain, was identified in eight CmbHLH proteins (CmbHLH32, CmbHLH37, CmbHLH56, CmbHLH60, CmbHLH68, CmbHLH97, CmbHLH100 and CmbHLH114), and in every case it was located C-terminal to the bHLH domain.
Phylogenetic, motif and gene structure analysis of CmbHLH
To evaluate the evolutionary relationships among the CmbHLH proteins, we conducted a phylogenetic analysis based on full-length protein sequences. Using the maximum likelihood (ML) method, we assigned the CmbHLH genes to 16 subfamilies, with 4 orphan genes (Fig. 3). Subfamilies A, D and J were the largest, and the smallest subfamily (L) had only 2 members. In the phylogenetic tree, the CmbHLH proteins clustered according to their DNA-binding groups, consistent with the previous report. For example, members of nine subfamilies (A, B, C, H, I, J, K, M and O) belonged to group B, and members of four subfamilies (E, F, G and M) were classified into group D.
The evolutionary relationships of the CmbHLH proteins were also assessed using conserved motifs. A total of 10 conserved motifs were identified in the CmbHLH proteins (Fig. 4A, B). Motifs 1 and 2 were annotated as the bHLH domain (IPR011598, IPR036638), and motifs 7 and 10 as the transcription factor bHLH-MYC N-terminal domain (IPR025610). Subfamily L contained the highest number of motifs (six). CmbHLH14 and CmbHLH94 possessed two and three copies of motif 2, respectively. Motif composition and arrangement were similar within subfamilies.
The exon-intron organization of the CmbHLH genes was examined to gain more insight into the evolution of the bHLH family in melon. The number of exons varied from 1 to 11 (Fig. 4C), and exon-intron organization was phylogenetically related. For example, the CmbHLH genes with a single exon clustered in two subfamilies (D and K), and all members of subfamily I had two exons. Analysis of intron distribution within the bHLH domain revealed 13 intron distribution patterns, which were strongly related to the CmbHLH subfamilies (Fig. 4D). As shown in Fig. 4, 85% of CmbHLH genes had intron insertions in the region encoding the bHLH domain. Although intron positions and lengths varied, only five intron insertion positions were not conserved. Overall, the arrangement and composition of the conserved motifs and the gene structures, together with the phylogenetic analysis, strongly support the reliability of the classification.
Chromosomal distribution and collinearity analysis of CmbHLH
CmbHLH genes were distributed unevenly across the twelve melon chromosomes (Fig. 5). However, their distribution showed no correlation with either chromosome length or phylogenetic relationship. Tandem gene duplication may be involved in gene family expansion and in the maintenance of gene copy number, so we analyzed tandem duplication events among the CmbHLH genes. Five genes were confirmed to be tandemly duplicated: three (CmbHLH81, CmbHLH82 and CmbHLH83) on chromosome 8 and two (CmbHLH44 and CmbHLH45) on chromosome 4.
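The criterion used to call tandem duplicates is not given in this section. As a simplified illustration only, the sketch below groups family members that lie on the same chromosome within a fixed distance of the previous member. The 100 kb window, the function name `tandem_clusters`, and the gene names and coordinates are all hypothetical; real analyses usually also require sequence similarity between neighbours.

```python
from collections import defaultdict


def tandem_clusters(genes, max_gap=100_000):
    """Group genes on the same chromosome whose start positions lie within
    `max_gap` bp of the previous member. `genes` holds (name, chrom, start).
    Positional adjacency only; sequence similarity is NOT checked here."""
    by_chrom = defaultdict(list)
    for name, chrom, start in genes:
        by_chrom[chrom].append((start, name))

    clusters = []
    for chrom in sorted(by_chrom):
        members = sorted(by_chrom[chrom])
        current = [members[0]]
        for item in members[1:]:
            if item[0] - current[-1][0] <= max_gap:
                current.append(item)
            else:
                if len(current) > 1:
                    clusters.append((chrom, [n for _, n in current]))
                current = [item]
        if len(current) > 1:
            clusters.append((chrom, [n for _, n in current]))
    return clusters


# Hypothetical genes and coordinates, chosen only to exercise the function.
toy_genes = [
    ("g1", "chr04", 1_000_000), ("g2", "chr04", 1_050_000),
    ("g3", "chr04", 9_000_000),
    ("g4", "chr08", 500_000), ("g5", "chr08", 560_000),
    ("g6", "chr08", 600_000),
]
print(tandem_clusters(toy_genes))
```

Changing `max_gap` changes which neighbours are merged, so any threshold should be reported alongside the resulting clusters.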
To further infer the origin and phylogenetic relationships of the bHLH genes, we conducted a comparative collinearity analysis between Cucumis melo and other cucurbit species. Figure 6 shows the collinearity of CmbHLH genes with genes in bottle gourd, cucumber, watermelon and Cucurbita maxima (Rimu). A total of 115 CmbHLH genes had orthologs in the four species, of which 95 were common to all four. Interestingly, 43 CmbHLH genes had at least two orthologs in Rimu, and these genes were spread across the 12 melon chromosomes. This may be because the Rimu genome underwent a whole-genome duplication (WGD) event that was not observed in the other four cucurbits (cucumber, melon, watermelon, and bitter gourd) [28]. Gene duplication events of CmbHLH within melon were also examined. In total, 38 CmbHLH genes were duplicated (Supplementary Figure S1). All chromosomes except chromosome 9 carried duplicated CmbHLH genes. Most of the duplicated genes were on chromosome 2, whereas only one was on chromosome 6, illustrating an uneven distribution of the duplicated genes.
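The two ortholog counts reported above can be read as a union (genes with an ortholog in at least one of the four species) and an intersection (genes with an ortholog in every species). That reading is an interpretation of the text, not a stated method. A minimal sketch with hypothetical gene identifiers and a hypothetical function name `ortholog_summary`:

```python
def ortholog_summary(orthologs_by_species):
    """Return (genes with an ortholog in any species,
               genes with an ortholog in every species)."""
    sets = [set(genes) for genes in orthologs_by_species.values()]
    if not sets:
        return set(), set()
    in_any = set().union(*sets)
    in_all = set.intersection(*sets)
    return in_any, in_all


# Hypothetical melon gene IDs that have an ortholog in each species.
toy = {
    "bottle_gourd": {"g1", "g3"},
    "cucumber": {"g1", "g2", "g3"},
    "watermelon": {"g1", "g2"},
    "cucurbita_maxima": {"g1", "g2", "g4"},
}
in_any, in_all = ortholog_summary(toy)
print(len(in_any), len(in_all))
```

In this toy input four genes have an ortholog somewhere, but only one is shared by all four species, which mirrors the gap between the two counts in the text.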
Expression pattern of CmbHLH in melon fruit development
Analysis of the expression data from PRJNA543288 showed that 161 transcripts of 98 CmbHLH genes were effectively expressed (expressed in at least two replicate libraries) in melon fruit samples at the growth (G), ripening (R), climacteric (C) and post-climacteric (P) stages. However, most were expressed at low levels: 45 CmbHLH genes had an average expression above 10 FPKM, and only 5 genes (CmbHLH23, CmbHLH32, CmbHLH41, CmbHLH67 and CmbHLH79) exceeded 100 FPKM (Fig. 7A). Differential expression analysis identified 32 CmbHLH genes that were differentially expressed in the G vs R, R vs C and C vs P comparisons. A total of 21 CmbHLH genes were differentially expressed in G vs R (7 upregulated and 14 downregulated), among which CmbHLH32 had the highest expression. In R vs C, 26 CmbHLH genes were differentially expressed; only 2 (CmbHLH9 and CmbHLH114) were upregulated and the rest were downregulated. Six CmbHLH genes were differentially expressed in C vs P. Taken together, CmbHLH expression showed a downward trend from the G stage to the P stage, suggesting that most CmbHLH genes may function in early fruit development. By contrast, CmbHLH9 and CmbHLH114 were upregulated in R vs C, indicating that they may be involved in regulating the fruit climacteric.
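The expression filter and the two FPKM cut-offs described above can be sketched as follows. The sketch assumes that "expressed" in a library means FPKM greater than zero, which the text does not specify, and it uses made-up gene names and values rather than the PRJNA543288 data. The function names `is_expressed` and `summarize` are hypothetical.

```python
def is_expressed(fpkm_by_library, min_libraries=2, min_fpkm=0.0):
    """True if the gene exceeds `min_fpkm` in at least `min_libraries` libraries.
    Assumption: any FPKM above zero counts as detected."""
    detected = sum(1 for value in fpkm_by_library if value > min_fpkm)
    return detected >= min_libraries


def summarize(expression, cutoffs=(10.0, 100.0)):
    """Return expressed genes, their mean FPKM, and genes above each cut-off."""
    expressed = sorted(g for g, v in expression.items() if is_expressed(v))
    means = {g: sum(expression[g]) / len(expression[g]) for g in expressed}
    above = {c: sorted(g for g, m in means.items() if m > c) for c in cutoffs}
    return expressed, means, above


# Made-up FPKM values across four libraries for four hypothetical genes.
toy_expression = {
    "geneA": [0.0, 0.0, 1.2, 0.0],          # detected in one library only
    "geneB": [12.0, 15.0, 9.0, 8.0],
    "geneC": [150.0, 120.0, 90.0, 110.0],
    "geneD": [0.5, 0.7, 0.0, 0.0],
}
expressed, means, above = summarize(toy_expression)
print(expressed)
print({cutoff: len(genes) for cutoff, genes in above.items()})
```

Keeping the detection rule and the cut-offs as parameters makes it easy to check how sensitive the gene counts are to those choices.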
Overexpression of CmbHLH32 leads to early ripening of melon fruit
To further investigate the function of CmbHLH genes in fruit ripening, we generated transgenic plant lines overexpressing CmbHLH32 (CmbHLH32-OE). CmbHLH32 was chosen for four reasons. First, it was one of the most highly expressed CmbHLH genes. Second, it had the highest expression among the genes differentially expressed in G vs R. Third, tissue expression analysis showed that CmbHLH32 was highly expressed in female flowers and at early stages of fruit development (Fig. 7B). Fourth, CmbHLH32 was identified as a homolog of AtbHLH93, which has been shown to control flowering in Arabidopsis by repressing MAF5 [29]. However, BLAST analysis failed to find a homolog of MAF5 in melon, and CmbHLH32 also contains an ACT domain that is not found in AtbHLH93, suggesting that CmbHLH32 may have a different function in melon flowering.
Transgenic T1 seeds overexpressing CmbHLH32 were generated by the ovary injection method. Observation of ripening-related phenotypes in PCR-positive CmbHLH32-OE T1 plants indicated that overexpression of CmbHLH32 resulted in earlier fruit ripening than in wild-type (WT) melon (Fig. 7C). Fruit of the CmbHLH32-OE line ripened at about 38.7 ± 1.1 DAP, on average about 4 days earlier than WT fruit (about 42.6 ± 0.8 DAP). Quantitative RT-PCR analysis showed that CmbHLH32 expression was on average about 5.5-fold higher than in WT fruit (Fig. 7D). The transcriptional activity of CmbHLH32 was also examined in a yeast two-hybrid assay; however, neither the monomer nor the homodimer of CmbHLH32 showed transcriptional activation activity in yeast (Fig. 8A).
Fruit weight, length, width and soluble solids content did not differ between WT and CmbHLH32-OE fruits. However, CmbHLH32-OE fruits were less firm than WT fruits (Table 1). Expression correlation network analysis of the transcriptome data identified 94 genes correlated with CmbHLH32 (Supplementary Table S2). Gene Ontology (GO) analysis of these correlated genes showed that the term plant-type cell wall biogenesis (GO:0009832) was enriched (Fig. 8B), suggesting that CmbHLH32 may affect fruit softening through plant cell wall synthesis.
Table 1 Phenotypes of wild-type (WT) and CmbHLH32-overexpressing (CmbHLH32-OE) transgenic melon plants.

| Parameter | WT | CmbHLH32-OE L1 | CmbHLH32-OE L2 |
|---|---|---|---|
| Fruit weight (g) | 961.8±155.3 | 954.1±169.4 | 890.5±113.1 |
| Horizontal diameter of fruit (cm) | 12.7±0.9 | 12.7±0.9 | 12.6±0.7 |
| Vertical diameter of fruit (cm) | 13.6±1.0 | 13.6±0.7 | 13.1±0.7 |
| Soluble solids content (%) | 13.7±1.0 | 12.7±2.9 | 11.6±1.9 |
| Firmness of fruit (kg) | 5.5±0.6a | 4.5±0.4b | 4.0±0.2b |

Values are means ± SE of 3-10 plants. The statistical significance of mean differences was analyzed using Student's t-test (P < 0.05).