Identification and basic information of the CsPP2C gene family
In this study, we identified 56 putative CsPP2C genes in the cucumber genome by BLASTP, using 80 AtPP2C protein sequences as queries. Analysis of their physical and chemical properties (Table 1) showed that the 56 CsPP2C genes encode proteins ranging from 233 to 813 amino acids in length, with isoelectric point (pI) values from 4.5 to 9.61 and molecular weights from about 26 kDa to 89 kDa. The grand average of hydropathicity of all 56 CsPP2C proteins was less than zero, indicating that they are hydrophilic proteins. Subcellular localization prediction indicated that most CsPP2C proteins are likely located in the nucleus, chloroplast or cytoplasm, whereas only CsPP2C48 was predicted to be located in the endoplasmic reticulum and only CsPP2C50 in the cytoskeleton.
Table 1
List of the 56 CsPP2C genes and their basic characteristics
Gene identifier | Gene name | Size (aa) | Mass (kDa) | pI | Instability index | Aliphatic index | Grand average of hydropathicity | Subcellular localization
CsaV3_1G004080.1 | CsPP2C1 | 348 | 37.934 | 7.67 | 28.63 | 89.97 | -0.299 | Nuclear
CsaV3_1G034580.1 | CsPP2C2 | 426 | 46.884 | 5.06 | 57.49 | 73.66 | -0.45 | Nuclear
CsaV3_1G035680.1 | CsPP2C3 | 358 | 39.549 | 5.26 | 69.22 | 72.4 | -0.499 | Chloroplast
CsaV3_1G036330.1 | CsPP2C4 | 392 | 42.073 | 5.97 | 52.12 | 83.93 | -0.194 | Chloroplast
CsaV3_1G038430.1 | CsPP2C5 | 370 | 41.493 | 8.01 | 38.05 | 84.84 | -0.254 | Nuclear
CsaV3_1G038460.1 | CsPP2C6 | 428 | 46.271 | 7.96 | 32.07 | 87.01 | -0.173 | Chloroplast
CsaV3_1G039750.1 | CsPP2C7 | 380 | 41.393 | 5.24 | 54.5 | 78.47 | -0.323 | Chloroplast
CsaV3_2G003570.1 | CsPP2C8 | 357 | 39.617 | 5.27 | 48.44 | 76.78 | -0.341 | Nuclear
CsaV3_2G006810.1 | CsPP2C9 | 367 | 40.752 | 6.08 | 40.34 | 78.88 | -0.283 | Nuclear
CsaV3_2G010540.1 | CsPP2C10 | 484 | 53.551 | 5.74 | 42.41 | 74.32 | -0.508 | Chloroplast
CsaV3_2G012660.1 | CsPP2C11 | 275 | 31.235 | 9.51 | 43.29 | 76.22 | -0.471 | Chloroplast
CsaV3_2G016210.1 | CsPP2C12 | 397 | 44.019 | 8.96 | 46.3 | 87.41 | -0.313 | Chloroplast
CsaV3_2G024970.1 | CsPP2C13 | 424 | 46.836 | 8.25 | 53.61 | 85.94 | -0.302 | Cytoplasmic
CsaV3_2G033210.1 | CsPP2C14 | 309 | 34.579 | 6.29 | 46.91 | 82.33 | -0.384 | Cytoplasmic
CsaV3_3G000550.1 | CsPP2C15 | 390 | 42.936 | 7.7 | 46.19 | 90.44 | -0.219 | Nuclear
CsaV3_3G001890.1 | CsPP2C16 | 813 | 89.039 | 5.29 | 46.54 | 76.17 | -0.473 | Nuclear
CsaV3_3G003600.1 | CsPP2C17 | 523 | 57.737 | 5.39 | 50.39 | 75.3 | -0.409 | Nuclear
CsaV3_3G013890.1 | CsPP2C18 | 414 | 45.372 | 5.27 | 33.49 | 84.69 | -0.282 | Chloroplast
CsaV3_3G014600.1 | CsPP2C19 | 521 | 56.883 | 5.31 | 42.17 | 89.08 | -0.227 | Nuclear
CsaV3_3G016530.1 | CsPP2C20 | 421 | 45.192 | 5.61 | 68.05 | 74.8 | -0.302 | Nuclear
CsaV3_3G019720.1 | CsPP2C21 | 387 | 42.372 | 5.39 | 42 | 90.03 | -0.105 | Chloroplast
CsaV3_3G022030.1 | CsPP2C22 | 349 | 38.668 | 5.54 | 36.64 | 78.51 | -0.406 | Chloroplast
CsaV3_3G027970.1 | CsPP2C23 | 233 | 26.258 | 6.76 | 36.69 | 81.2 | -0.488 | Chloroplast
CsaV3_3G035300.1 | CsPP2C24 | 370 | 41.113 | 8.84 | 39.84 | 91.68 | -0.326 | Nuclear
CsaV3_3G038810.1 | CsPP2C25 | 390 | 43.875 | 7.18 | 7.18 | 88.46 | -0.299 | Chloroplast
CsaV3_3G043720.1 | CsPP2C26 | 424 | 46.135 | 5.61 | 41.4 | 91.46 | -0.119 | Chloroplast
CsaV3_3G047510.1 | CsPP2C27 | 281 | 31.084 | 6.51 | 36.8 | 88.86 | -0.351 | Chloroplast
CsaV3_4G009460.1 | CsPP2C28 | 236 | 26.175 | 6.07 | 40.16 | 87.63 | -0.453 | Cytoplasmic
CsaV3_4G025000.1 | CsPP2C29 | 712 | 79.584 | 5.76 | 40.03 | 72.46 | -0.574 | Nuclear
CsaV3_4G026720.1 | CsPP2C30 | 446 | 49.889 | 8.55 | 39.04 | 71.41 | -0.591 | Nuclear
CsaV3_4G033570.1 | CsPP2C31 | 283 | 31.385 | 5.88 | 37.1 | 81.98 | -0.398 | Cytoplasmic
CsaV3_4G034220.1 | CsPP2C32 | 428 | 46.643 | 5.49 | 48.53 | 79.25 | -0.303 | Chloroplast
CsaV3_4G035500.1 | CsPP2C33 | 325 | 35.755 | 5.32 | 44.62 | 86.4 | -0.26 | Nuclear
CsaV3_4G036320.1 | CsPP2C34 | 293 | 31.576 | 5.07 | 43.56 | 78.26 | -0.313 | Cytoplasmic
CsaV3_4G036470.1 | CsPP2C35 | 364 | 39.767 | 5.22 | 33.4 | 73.46 | -0.374 | Chloroplast
CsaV3_4G037450.1 | CsPP2C36 | 386 | 41.981 | 5.31 | 54.25 | 82.9 | -0.199 | Nuclear
CsaV3_5G006460.1 | CsPP2C37 | 389 | 43.513 | 7.24 | 42.56 | 93.7 | -0.23 | Chloroplast
CsaV3_5G010270.1 | CsPP2C38 | 363 | 39.279 | 6.68 | 58.25 | 80.83 | -0.269 | Chloroplast
CsaV3_5G034510.1 | CsPP2C39 | 433 | 46.590 | 8.64 | 39.69 | 85.36 | -0.194 | Chloroplast
CsaV3_6G000080.1 | CsPP2C40 | 372 | 40.917 | 5.27 | 54.21 | 91.48 | -0.177 | Cytoplasmic
CsaV3_6G001340.1 | CsPP2C41 | 402 | 44.641 | 5.91 | 61.87 | 73.71 | -0.533 | Chloroplast
CsaV3_6G003780.1 | CsPP2C42 | 400 | 44.337 | 8.83 | 32.12 | 78 | -0.416 | Cytoplasmic
CsaV3_6G005520.1 | CsPP2C43 | 403 | 44.260 | 4.5 | 31.29 | 83.4 | -0.268 | Cytoplasmic
CsaV3_6G016880.1 | CsPP2C44 | 715 | 79.492 | 5.73 | 37.16 | 82.64 | -0.473 | Nuclear
CsaV3_6G022710.1 | CsPP2C45 | 349 | 38.860 | 8.59 | 42.43 | 77.59 | -0.406 | Nuclear
CsaV3_6G028560.1 | CsPP2C46 | 377 | 42.560 | 6.34 | 41.69 | 89.2 | -0.294 | Cytoplasmic
CsaV3_6G031110.1 | CsPP2C47 | 275 | 30.192 | 5.24 | 36.02 | 81.93 | -0.268 | Chloroplast
CsaV3_6G032130.1 | CsPP2C48 | 553 | 59.388 | 4.7 | 48.54 | 85.17 | -0.158 | Endoplasmic reticulum
CsaV3_6G047490.1 | CsPP2C49 | 398 | 44.604 | 7.27 | 53.52 | 88.87 | -0.24 | Chloroplast
CsaV3_6G052400.1 | CsPP2C50 | 275 | 30.748 | 4.9 | 51.66 | 84.73 | -0.446 | Cytoskeleton
CsaV3_7G001180.1 | CsPP2C51 | 287 | 31.871 | 9.61 | 46.22 | 97.42 | -0.137 | Chloroplast
CsaV3_7G004290.1 | CsPP2C52 | 471 | 51.927 | 5.12 | 50.33 | 69.94 | -0.417 | Chloroplast
CsaV3_7G005840.1 | CsPP2C53 | 291 | 31.923 | 8.76 | 30.42 | 90.1 | -0.331 | Cytoplasmic
CsaV3_7G007970.1 | CsPP2C54 | 382 | 42.507 | 8.51 | 47.81 | 89.06 | -0.277 | Chloroplast
CsaV3_7G030300.1 | CsPP2C55 | 393 | 43.077 | 4.64 | 52.99 | 89.52 | -0.161 | Cytoplasmic
CsaV3_7G031840.1 | CsPP2C56 | 367 | 40.317 | 5.05 | 38.99 | 68.88 | -0.496 | Chloroplast
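The grand average of hydropathicity reported in Table 1 is, by its standard definition, the mean Kyte-Doolittle hydropathy value over all residues of a protein, and a negative value marks a protein as hydrophilic. The Python sketch below illustrates that calculation. The function names and the example peptides are hypothetical, and the tool actually used to produce the table values is not stated in this section.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Return the mean Kyte-Doolittle hydropathy per residue.

    Non-standard symbols (for example X, B, Z or '*') are skipped,
    which is a simplifying assumption of this sketch.
    """
    residues = [aa for aa in sequence.upper() if aa in KYTE_DOOLITTLE]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    return sum(KYTE_DOOLITTLE[aa] for aa in residues) / len(residues)


def is_hydrophilic(sequence: str) -> bool:
    """A protein is called hydrophilic when its GRAVY value is below zero."""
    return gravy(sequence) < 0


if __name__ == "__main__":
    # Hypothetical toy peptides, not CsPP2C sequences.
    for peptide in ("KDEKDE", "ILVILV"):
        print(peptide, round(gravy(peptide), 3), is_hydrophilic(peptide))
```

Applied to the full-length CsPP2C protein sequences, a check of this kind would be expected to return a negative value for every family member, in line with the last numeric column of Table 1.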
Chromosome distribution and collinearity analysis of the PP2C gene family in cucumber
To determine the chromosomal positions of the CsPP2C genes, we mapped them using TBtools (Fig. 1). A total of 56 PP2C genes were anchored to their corresponding chromosomes and designated CsPP2C1–CsPP2C56 according to their order on the chromosomes. Chromosomes 3 and 6 carried the most genes, and chromosome 5 carried the fewest, with only 3 PP2C genes. Closely related genes located within 200 kb of each other on the same chromosome are defined as tandem duplications; otherwise they are regarded as segmental duplications [33]. To further understand the expansion mechanism of the CsPP2Cs, we examined segmental and tandem duplications within the cucumber genome. The PP2C gene family had no tandem duplicated gene pairs, but 7 segmentally duplicated gene pairs were detected (Fig. 2a). Among the seven collinear pairs, CsPP2C49 was paired with both CsPP2C15 and CsPP2C12, whereas the other pairings were one-to-one.
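The tandem versus segmental rule stated above can be written as a short Python sketch. The `GeneLocus` type, the function name and all coordinates are hypothetical, and measuring the distance as the gap between the end of one gene and the start of the next is an assumption, since the text only gives the 200 kb threshold.

```python
from dataclasses import dataclass

TANDEM_MAX_DISTANCE_BP = 200_000  # 200 kb threshold given in the text


@dataclass(frozen=True)
class GeneLocus:
    name: str
    chromosome: str
    start: int  # bp, 1-based
    end: int    # bp, inclusive


def classify_duplication(a: GeneLocus, b: GeneLocus,
                         max_distance: int = TANDEM_MAX_DISTANCE_BP) -> str:
    """Classify a pair of closely related genes as 'tandem' or 'segmental'."""
    if a.chromosome != b.chromosome:
        return "segmental"
    first, second = sorted((a, b), key=lambda gene: gene.start)
    # Overlapping genes get a gap of zero (assumption of this sketch).
    gap = max(0, second.start - first.end)
    return "tandem" if gap < max_distance else "segmental"


if __name__ == "__main__":
    # Hypothetical loci, not real cucumber coordinates.
    gene_a = GeneLocus("geneA", "chr3", 1_000_000, 1_004_000)
    gene_b = GeneLocus("geneB", "chr3", 1_050_000, 1_053_000)
    gene_c = GeneLocus("geneC", "chr6", 2_000_000, 2_003_000)
    print(classify_duplication(gene_a, gene_b))
    print(classify_duplication(gene_a, gene_c))
```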
In addition, we detected homologous PP2C gene pairs between cucumber and Arabidopsis. There were 59 collinear gene pairs between 41 CsPP2Cs and 48 AtPP2Cs (Fig. 2b). The largest number of homologous gene pairs (11) involved cucumber chromosome 3, and the smallest number (3) involved chromosome 5. Based on this result, we speculate that the PP2C genes of cucumber and Arabidopsis may be highly homologous and may share common ancestors.
Analysis of dN/dS values of PP2Cs within cucumber and between cucumber and Arabidopsis
To further investigate divergence and selection following duplication of PP2C genes, the non-synonymous substitution rate (dN), the synonymous substitution rate (dS) and the dN/dS ratio were evaluated for homologous gene pairs within cucumber and between cucumber and Arabidopsis (Table S1). A ratio of dN/dS > 1 indicates positive selection, dN/dS = 1 indicates neutral selection, and 0 < dN/dS < 1 indicates purifying selection [34]. The dN/dS values of all cucumber gene pairs were less than 1. Similarly, the dN/dS values of all collinear gene pairs between cucumber and Arabidopsis were less than 1. These data suggest that these genes were mainly under purifying selection during evolution, which could help to maintain the basic functions of the gene family.
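The interpretation rule for dN/dS can be expressed as a small Python sketch. The function name, the tolerance used to decide that a ratio equals 1 and the example values are hypothetical; the actual estimates are those in Table S1.

```python
def classify_selection(dn: float, ds: float, tol: float = 1e-9) -> str:
    """Interpret a dN/dS ratio following the thresholds given in the text.

    dn and ds are the non-synonymous and synonymous substitution rates of
    one homologous gene pair. A ratio below 1 is reported as purifying
    selection, a ratio of 1 (within tol) as neutral, and above 1 as positive.
    """
    if dn < 0 or ds <= 0:
        raise ValueError("dn must be >= 0 and ds must be > 0")
    ratio = dn / ds
    if abs(ratio - 1.0) <= tol:
        return "neutral"
    return "positive" if ratio > 1.0 else "purifying"


if __name__ == "__main__":
    # Hypothetical (dN, dS) pairs, not values from Table S1.
    for dn, ds in [(0.05, 0.80), (0.50, 0.50), (0.90, 0.30)]:
        print(dn, ds, classify_selection(dn, ds))
```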
Phylogenetic analysis of CsPP2C genes
To investigate the phylogenetic relationships of PP2C genes between cucumber and Arabidopsis, we constructed a phylogenetic tree using the maximum likelihood (ML) method, based on 80 PP2C genes from Arabidopsis and 56 from cucumber (Fig. 3). The phylogenetic analysis indicated that the subfamilies generally include PP2C proteins from both cucumber and Arabidopsis, and that within each subgroup the genes of the two species tend to form separate branches; that is, cucumber genes cluster together and Arabidopsis genes cluster together. The 56 CsPP2C proteins were assigned to 13 subgroups (A-L, with group F split into F1 and F2), whereas CsPP2C1, 21 and 11 did not cluster with any group. This is similar to the grouping of PP2Cs in Arabidopsis and rice. The subgroups contained 7, 4, 4, 9, 9, 5, 3, 4, 3, 3, 0, 0 and 2 CsPP2C genes, respectively. Except for subgroups J and K, which contained only AtPP2C genes, the distribution of PP2Cs among subgroups was similar in cucumber and Arabidopsis. This suggests that the PP2C gene family may have evolved from a common ancestor.
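As a quick consistency check, the subgroup sizes listed above can be tallied against the family size of 56. The Python sketch below assumes the 13 counts are given in the order A, B, C, D, E, F1, F2, G, H, I, J, K, L; that ordering is an assumption, inferred from the F1/F2 split mentioned in the gene structure analysis and from J and K being the groups without cucumber members.

```python
# Assumed subgroup order (not stated explicitly in the text).
SUBGROUPS = ["A", "B", "C", "D", "E", "F1", "F2", "G", "H", "I", "J", "K", "L"]
COUNTS = [7, 4, 4, 9, 9, 5, 3, 4, 3, 3, 0, 0, 2]
UNGROUPED = ["CsPP2C1", "CsPP2C21", "CsPP2C11"]  # not clustered with any group

members_per_group = dict(zip(SUBGROUPS, COUNTS))

# Grouped genes plus the three ungrouped genes should cover the whole family.
total_genes = sum(COUNTS) + len(UNGROUPED)

# Subgroups without any cucumber member.
empty_groups = [group for group, n in members_per_group.items() if n == 0]

if __name__ == "__main__":
    print("total CsPP2C genes accounted for:", total_genes)
    print("subgroups without cucumber members:", empty_groups)
```

Under that ordering, the 53 grouped genes plus the 3 ungrouped genes account for all 56 family members, and the two empty subgroups are J and K.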
Gene structure and protein domain analyses of CsPP2C genes
Since the diversity of exon/intron structures and protein domains plays an important role in the evolution of gene families, we examined the exon/intron structures and conserved domains of the CsPP2C genes in the context of their phylogenetic relationships (Fig. 4a). Most members of the same subfamily had similar numbers of exons and introns but differed in their lengths (Fig. 4b). Gene structure was broadly similar within each group, although the exon/intron arrangement differed for some genes. For example, in group F, CsPP2C50 had the longest non-coding region at its 3' end, and the number of exons in group F2 (8 exons) was almost twice that in group F1 (4-5 exons). In addition, CsPP2C35 in group I had 10 exons and was the longest gene, spanning more than 12 kb. CsPP2C40 in group G had no non-coding regions and only two exons, the fewest among all CsPP2C genes. Seven CsPP2C genes had no non-coding regions (CsPP2C14, 40, 3, 24, 37, 15 and 51), and most of them belonged to group D. These results indicate that the structure of CsPP2C genes was relatively conserved during evolution, which preserves the integrity of the gene structure and likely limits changes in function.
To identify common motifs among different groups of CsPP2C proteins, we used the MEME motif search tool to identify 10 conserved motifs (Table 2). As shown in Fig. 4c, proteins in the same group exhibited similar motif distribution patterns. Motifs 1, 2 (except in CsPP2C3 and 23), 3 (except in CsPP2C51), 4, 6 and 7 were found in all CsPP2C proteins. In addition to these common motifs, some motifs were specific to certain groups. For example, motif 8 was absent from group C but present in all other groups. Motif 5 was present in groups C and D but not in the other groups. Motif 9 was absent from groups C, D and H but present in all other groups. According to these results, CsPP2C proteins in the same subgroup had similar conserved motif composition and distribution, suggesting that CsPP2C members in the same cluster likely share similar functions.
Table 2
Conserved motifs in the amino acid sequences of CsPP2C proteins.
Motif | Width | Multilevel consensus sequence
1 | 29 | SGSTALVALIQGDTLYVANVGDSRAVLAR
2 | 29 | LTPEDEFLILASDGLWDVLSNZEAVDIVR
3 | 15 | AFFGVFDGHGGPGAA
4 | 17 | GGLAVSRAIGDFYLKQY
5 | 50 | PRNGSAKRLVKAALQEAAKKREMRYSDLKKIDRGVRRHFHDDITVIVVFL
6 | 15 | AIQLSVDHKPSREDE
7 | 20 | WEKAJKKAFLKTDEEFLSLV
8 | 15 | RGSKDBISVIVVQFK
9 | 16 | QGKRGEMEDAHIVWED
10 | 27 | AERIKQCKGRVFALQDEPEVYRVWLPN
Cis-element analysis of CsPP2C promoters in cucumber
Abundant responsive regulatory elements were found in the promoter regions of CsPP2Cs by PlantCARE analysis (Fig. 5). The cis-elements identified could be divided into two categories. The first comprised hormone-responsive elements, such as the TCA-element (salicylic acid-responsive element), the ABRE (ABA-responsive element), the TGA-element (auxin-responsive element), the CGTCA-motif and TGACG-motif (MeJA-responsive elements) and the P-box (gibberellin-responsive element), among others. The second comprised stress-responsive elements, such as LTR, MYC, TC-rich repeats and MBS. ABA-responsive elements (ABREs) were abundant in the promoter regions of CsPP2Cs; CsPP2C2 and CsPP2C3 contained 12 ABREs, the most of any promoter. The next most abundant was the MYC element, which was absent only from CsPP2C12-13, CsPP2C32-34, CsPP2C42-43, CsPP2C45, 46 and 48, and was most numerous in CsPP2C1. This suggests that most CsPP2C genes may respond to various abiotic stresses.
Tissue-specific expression profiles of CsPP2C genes
To better understand the roles of CsPP2C genes in cucumber growth and development, their temporal and spatial expression patterns were analyzed using RNA-seq data from different tissues of the cucumber cultivar Chinese Long '9930' (Fig. 6). Only CsPP2C11, 41, 5, 33 and 50 showed low expression in all ten tissues. In contrast, other CsPP2C genes, such as CsPP2C12, 51, 15, 46, 37, 31, 22 and 47, were highly expressed in fertilized ovaries, male flowers, female flowers and leaves but weakly expressed in other organs. Moreover, CsPP2C49 was moderately expressed in male flowers but weakly expressed in other tissues. Similarly, CsPP2C53 was moderately expressed in female and male flowers but weakly expressed in other tissues. In general, most CsPP2C genes showed similar expression patterns across tissues.
Response of CsPP2C gene expression to various abiotic stresses and ABA treatment
Several members of group A PP2Cs have been shown to function as negative regulators of the ABA signaling pathway in Arabidopsis. The expression of 7 PP2C genes in Arabidopsis was suppressed by ABA treatment, and 2 of them are members of subfamily D, so PP2C genes in different subfamilies might play different functional roles in distinct signaling pathways. We therefore selected 12 PP2C genes in cucumber that are homologous to stress-responsive genes in Arabidopsis and analyzed their expression under ABA, salt, drought and cold treatments by qRT-PCR.

Under ABA treatment (Fig. 7a), the expression of CsPP2C3, 15, 18, 22 and 39 increased continuously as the treatment time was extended. Among them, CsPP2C3 showed the highest expression at 24 h, 35 times that at 0 h. In contrast, the expression of CsPP2C4, 5 and 52 tended to decrease. The expression of CsPP2C19 increased at 6 and 12 h but decreased at 24 h. Compared with 0 h, the expression of CsPP2C28 was up-regulated at 6 h and 24 h, whereas the expression of CsPP2C2 was almost unchanged.

As shown in Fig. 7b, under 10% PEG treatment, CsPP2C3, 15, 19, 22, 39 and 40 were up-regulated. CsPP2C3 and CsPP2C15 were up-regulated most markedly, reaching more than 5 times the 0 h level after 6 h and 24 h of treatment. The relative expression of the other genes was down-regulated, especially that of CsPP2C18.

Under salt treatment (Fig. 7c), the expression of CsPP2C18 was up-regulated and that of CsPP2C39 was down-regulated as the treatment time was extended. The relative expression of the other genes was similar to that under drought treatment.

Interestingly, under cold stress (Fig. 7d), the relative expression of most PP2C genes, such as CsPP2C3-5, 18, 19, 22, 28 and 52, was first up-regulated and then gradually down-regulated. Compared with 0 h, the up-regulation of these genes was modest; CsPP2C3 showed the largest increase, more than 3 times the 0 h level at 6 h, whereas CsPP2C2 showed the most pronounced down-regulation.

Under ABA, drought and salt treatments, the expression of CsPP2C3 was significantly up-regulated, suggesting that it may play an important role in the response to abiotic stress. These results provide a basis for future functional studies of PP2C genes.
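The fold changes quoted in this section (for example, 35 times the 0 h level) are relative expression values from qRT-PCR. This section does not state how they were computed; a common choice is the 2^-ΔΔCt method, which the Python sketch below illustrates under that assumption. The function name and all Ct values are hypothetical.

```python
def relative_expression(ct_target_treated: float, ct_reference_treated: float,
                        ct_target_control: float, ct_reference_control: float) -> float:
    """Fold change of a target gene by the 2^-ddCt method.

    Each Ct of the target gene is first normalized to a reference
    (housekeeping) gene measured in the same sample, then the treated
    sample is compared with the control sample (here, the 0 h time point).
    """
    delta_ct_treated = ct_target_treated - ct_reference_treated
    delta_ct_control = ct_target_control - ct_reference_control
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


if __name__ == "__main__":
    # Hypothetical Ct values: the target amplifies 5 cycles earlier after
    # treatment while the reference gene is unchanged.
    fold = relative_expression(
        ct_target_treated=20.0, ct_reference_treated=15.0,
        ct_target_control=25.0, ct_reference_control=15.0,
    )
    print(f"fold change relative to 0 h: {fold:.1f}")
```

A value above 1 corresponds to up-regulation relative to 0 h and a value below 1 to down-regulation, which is how the trends in Fig. 7 are described.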