Identification of GGPS gene family members and analysis of protein physicochemical properties in potatoes
Eleven GGPS (StGGPS) gene family members were identified in potato using HMMER 3.1 in conjunction with the SMART, Pfam, and NCBI CDD databases, and were named sequentially (StGGPS1-StGGPS11). BioPerl analysis showed that the StGGPS proteins range from 294 to 398 amino acid residues in length, with an average of 351 residues. Their molecular weights vary from 32,784.6 to 43,490.4 Da, with an average of 38,973.8 Da, and their theoretical isoelectric points range from 4.87 to 8.38, with an average of 6.41. Six StGGPS proteins are predicted to localize primarily to the chloroplast, two to the cytoplasm, and one each to the nucleus, the mitochondrion, and the cytoskeleton (Table 1).
Table 1 Analysis of potato GGPS gene family members and physicochemical properties
Gene ID | Gene name | Chromosome | Start (bp) | End (bp) | Strand | Amino acid length (aa) | Molecular weight (Da) | Theoretical isoelectric point | Subcellular localization
PGSC0003DMG400047044 | StGGPS1 | chr02 | 40939136 | 40940224 | + | 362 | 39321 | 6.35 | Chloroplast
PGSC0003DMG400041508 | StGGPS2 | chr02 | 40941168 | 40942286 | + | 372 | 41120.1 | 8.38 | Mitochondrion
PGSC0003DMG400043267 | StGGPS3 | chr02 | 40945073 | 40946191 | + | 372 | 40731.9 | 7.92 | Chloroplast
PGSC0003DMG400027856 | StGGPS4 | chr04 | 69344146 | 69345467 | - | 375 | 40657.5 | 4.99 | Chloroplast
PGSC0003DMG400007081 | StGGPS5 | chr07 | 52190753 | 52195395 | - | 398 | 43490.4 | 6.01 | Chloroplast
PGSC0003DMG400022214 | StGGPS6 | chr07 | 54896132 | 54897257 | - | 294 | 32784.6 | 7.01 | Cytoskeleton
PGSC0003DMG400002687 | StGGPS7 | chr09 | 4028005 | 4032971 | + | 334 | 36484.6 | 5.26 | Chloroplast
PGSC0003DMG400008690 | StGGPS8 | chr10 | 854194 | 861002 | + | 342 | 39650.3 | 6.12 | Cytoplasm
PGSC0003DMG400014369 | StGGPS9 | chr10 | 871271 | 879652 | + | 306 | 35198.2 | 6.77 | Nucleus
PGSC0003DMG400015673 | StGGPS10 | chr11 | 1682159 | 1683721 | - | 365 | 39976.7 | 6.85 | Chloroplast
PGSC0003DMG400029788 | StGGPS11 | chr12 | 10169940 | 10174778 | - | 342 | 39296.9 | 4.87 | Cytoplasm
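The ranges, averages, and localization counts quoted for Table 1 can be re-derived from the table itself. The following is a minimal sketch in Python, not the BioPerl workflow used in the study; the variable and function names (`TABLE1`, `summarize`) are illustrative assumptions, and the values are copied directly from Table 1.

```python
from collections import Counter
from statistics import mean

# (gene name, length in aa, molecular weight in Da, theoretical pI, localization)
# Values copied from Table 1.
TABLE1 = [
    ("StGGPS1", 362, 39321.0, 6.35, "Chloroplast"),
    ("StGGPS2", 372, 41120.1, 8.38, "Mitochondrion"),
    ("StGGPS3", 372, 40731.9, 7.92, "Chloroplast"),
    ("StGGPS4", 375, 40657.5, 4.99, "Chloroplast"),
    ("StGGPS5", 398, 43490.4, 6.01, "Chloroplast"),
    ("StGGPS6", 294, 32784.6, 7.01, "Cytoskeleton"),
    ("StGGPS7", 334, 36484.6, 5.26, "Chloroplast"),
    ("StGGPS8", 342, 39650.3, 6.12, "Cytoplasm"),
    ("StGGPS9", 306, 35198.2, 6.77, "Nucleus"),
    ("StGGPS10", 365, 39976.7, 6.85, "Chloroplast"),
    ("StGGPS11", 342, 39296.9, 4.87, "Cytoplasm"),
]


def summarize(rows):
    """Return (min, max, mean) per property plus localization counts."""
    lengths = [r[1] for r in rows]
    weights = [r[2] for r in rows]
    pis = [r[3] for r in rows]
    return {
        "length": (min(lengths), max(lengths), round(mean(lengths))),
        "mw": (min(weights), max(weights), round(mean(weights), 1)),
        "pi": (min(pis), max(pis), round(mean(pis), 2)),
        "localization": Counter(r[4] for r in rows),
    }


summary = summarize(TABLE1)
for key in ("length", "mw", "pi"):
    print(key, summary[key])
print(dict(summary["localization"]))
```

Keeping the table as a list of tuples makes it easy to check the text against the data whenever a row is corrected.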
Potato GGPS family evolution analysis, gene structure and conserved motifs
To explore the evolution of the GGPS gene family, a phylogenetic tree was constructed using GGPS genes from Arabidopsis (Arabidopsis thaliana), potato (Solanum tuberosum), tobacco (Nicotiana attenuata), tomato (Solanum lycopersicum), and hot pepper (Capsicum annuum) (Fig.1). The GGPS genes were divided into three subfamilies: Group 1, with one potato StGGPS member; Group 2, with three; and Group 3, with seven. Members within the same branch are closely related and likely have similar functions.
To better understand the evolutionary relationships among the StGGPS genes, we further analyzed their exon-intron structures and conserved motifs (Fig.1). Gene structure varies among StGGPS members: some genes contain no introns, whereas others contain up to 11. This diversity suggests that the StGGPS genes may have undergone different selection events during evolution. StGGPS genes in the same subfamily generally had similar gene structures, indicating that exon-intron structure is closely correlated with phylogenetic relationship. Because conserved motifs are usually associated with protein function, MEME was used to identify 10 motifs (Motif1 to Motif10). The number of motifs per StGGPS member ranged from three to seven: StGGPS5 contains the fewest (three), whereas StGGPS2 and StGGPS3 contain the most (seven). Although some motifs are missing in certain StGGPS members, all members share a conserved pattern; in particular, Motif3 is highly conserved across the family. Analysis of the three-dimensional structures revealed that proteins within the same subfamily are structurally more similar to one another than to proteins from other subfamilies. For example, StGGPS8, StGGPS9, and StGGPS11 from Group 2 have similar structures, as do StGGPS1, StGGPS2, StGGPS3, StGGPS4, and StGGPS10 from Group 3 (Fig.2). Taken together, the phylogenetic tree, gene structures, conserved motifs, and protein structures suggest that StGGPS members within the same subfamily are likely to have similar functions and roles.
Fig.1 Phylogenetic, gene structure, and conserved motif analysis of the GGPS gene family. A: Phylogenetic tree of the GGPS gene family in Arabidopsis thaliana, Solanum tuberosum, Nicotiana attenuata, Solanum lycopersicum, and Capsicum annuum. B: Phylogenetic tree of the StGGPS gene family. C: Conserved motif analysis of the StGGPS genes. D: Gene structure analysis of the StGGPS genes.
Fig.2 Three-dimensional structural modeling of proteins of the StGGPS gene family.
Chromosome localization of genes and analysis of gene duplication events
Chromosomal location information for the StGGPS family members was extracted from the genome annotation file using TBtools-II. Chromosome 2 contained three StGGPS genes; chromosomes 7 and 10 each contained two; and chromosomes 4, 9, 11, and 12 each contained one (Fig.3A).
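The per-chromosome distribution can be tallied directly from the chromosome column of Table 1. This is a small illustrative sketch in Python rather than the TBtools-based procedure used in the study; the dictionary name `GENE_CHROM` is an assumption introduced here.

```python
from collections import Counter

# Chromosome assignment of each StGGPS gene, copied from Table 1.
GENE_CHROM = {
    "StGGPS1": "chr02", "StGGPS2": "chr02", "StGGPS3": "chr02",
    "StGGPS4": "chr04",
    "StGGPS5": "chr07", "StGGPS6": "chr07",
    "StGGPS7": "chr09",
    "StGGPS8": "chr10", "StGGPS9": "chr10",
    "StGGPS10": "chr11",
    "StGGPS11": "chr12",
}

# Count how many StGGPS genes fall on each chromosome.
genes_per_chrom = Counter(GENE_CHROM.values())

for chrom in sorted(genes_per_chrom):
    print(chrom, genes_per_chrom[chrom])
```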
To explore the structural and functional characteristics of the StGGPS genes, we examined tandem and segmental duplication events. One collinear (segmentally duplicated) gene pair was identified, StGGPS1/StGGPS4 (Fig.3B), together with three pairs of tandemly duplicated genes: StGGPS1/StGGPS2, StGGPS2/StGGPS3, and StGGPS8/StGGPS9. Tandem and segmental duplication therefore likely contributed to the evolution of the StGGPS genes. The Ka/Ks ratios of these pairs ranged from 0.0634 (StGGPS1/StGGPS4) to 0.6052 (StGGPS8/StGGPS9); all values are below 1, which is consistent with purifying selection acting on these genes.
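As a rough illustration of how tandem candidates and Ka/Ks values can be read, the sketch below flags neighbouring StGGPS genes on the same chromosome and classifies a Ka/Ks ratio. It is a simplification under stated assumptions, not the method used in the study: the 100 kb distance threshold and the function names are hypothetical, only the StGGPS genes themselves are considered, and dedicated duplication tools additionally take sequence similarity and intervening genes into account.

```python
# (gene name, chromosome, start, end), copied from Table 1.
GENES = [
    ("StGGPS1", "chr02", 40939136, 40940224),
    ("StGGPS2", "chr02", 40941168, 40942286),
    ("StGGPS3", "chr02", 40945073, 40946191),
    ("StGGPS4", "chr04", 69344146, 69345467),
    ("StGGPS5", "chr07", 52190753, 52195395),
    ("StGGPS6", "chr07", 54896132, 54897257),
    ("StGGPS7", "chr09", 4028005, 4032971),
    ("StGGPS8", "chr10", 854194, 861002),
    ("StGGPS9", "chr10", 871271, 879652),
    ("StGGPS10", "chr11", 1682159, 1683721),
    ("StGGPS11", "chr12", 10169940, 10174778),
]


def tandem_candidates(genes, max_gap=100_000):
    """Pair neighbouring family members on one chromosome within max_gap bp.

    max_gap is an illustrative threshold, not a value from the study.
    """
    by_chrom = {}
    for name, chrom, start, end in genes:
        by_chrom.setdefault(chrom, []).append((start, end, name))
    pairs = []
    for items in by_chrom.values():
        items.sort()  # order genes by start coordinate
        for (_, end1, name1), (start2, _, name2) in zip(items, items[1:]):
            if start2 - end1 <= max_gap:
                pairs.append((name1, name2))
    return pairs


def selection_mode(ka_ks):
    """Conventional reading of a Ka/Ks ratio."""
    if ka_ks < 1:
        return "purifying"
    if ka_ks > 1:
        return "positive"
    return "neutral"


print(tandem_candidates(GENES))
print(selection_mode(0.0634), selection_mode(0.6052))
```

With the coordinates in Table 1, the proximity rule recovers the same three tandem pairs reported above, while StGGPS5 and StGGPS6 on chromosome 7 lie about 2.7 Mb apart and are not paired.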
To understand the evolutionary mechanisms of the potato GGPS family, we performed a collinearity analysis between potato and Arabidopsis and between potato and other Solanaceae species (tobacco, tomato, and pepper) (Fig.3C). Four pairs of homologous genes were identified between potato and Arabidopsis thaliana, two between potato and tobacco, eight between potato and tomato, and seven between potato and pepper. These homologous genes likely existed before the divergence of the respective lineages.
Fig.3 Gene duplication events in the GGPS family of S. tuberosum. A: Chromosomal localization map of the StGGPS genes. B: Collinearity analysis of the StGGPS genes. C: Collinearity analysis of GGPS genes in multiple species.
Analysis of cis-acting elements in StGGPS gene promoters
To explore the potential functions of the StGGPS genes, we analyzed the 2000 bp promoter sequence upstream of each gene. We identified 10 types of cis-acting elements (Fig.4) associated with various physiological processes. Light-responsive elements, namely Box4, G-box, GATA-motif, GT1-motif, and TCT-motif, were prevalent, suggesting that the StGGPS gene family may be involved in light-dependent processes such as photosynthesis. The promoters also contained elements related to hormone and stress responses: ABRE (abscisic acid response), ARE (anaerobic response), CGTCA-motif and TGACG-motif (jasmonic acid response), and TGA (auxin response). Overall, the StGGPS genes likely play significant roles in light and abiotic stress responses in potato.
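Extracting a 2000 bp upstream region depends on the strand of each gene. The sketch below shows one way to compute the promoter window from the coordinates in Table 1. It assumes 1-based inclusive coordinates and treats the annotated gene boundary as the reference point; both are assumptions, since the study does not state how the window was defined, and the function names are hypothetical.

```python
def promoter_window(start, end, strand, length=2000):
    """Return (start, end) of the region `length` bp upstream of a gene.

    Coordinates are assumed to be 1-based and inclusive. For '+' genes the
    window lies before the gene start; for '-' genes it lies after the gene
    end. The window is clipped at position 1 but not at the chromosome end,
    because chromosome lengths are not available here.
    """
    if strand == "+":
        return max(1, start - length), start - 1
    if strand == "-":
        return end + 1, end + length
    raise ValueError(f"unknown strand: {strand!r}")


def reverse_complement(seq):
    """Reverse-complement a DNA string; needed for '-' strand promoters."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


# StGGPS1 (chr02, + strand) and StGGPS4 (chr04, - strand) from Table 1.
print(promoter_window(40939136, 40940224, "+"))
print(promoter_window(69344146, 69345467, "-"))
```

For minus-strand genes the extracted sequence would then be reverse-complemented so that every promoter reads 5' to 3' relative to its gene before motif scanning.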
Fig.4 Analysis of cis-acting elements in the 2000 bp promoter regions of the StGGPS genes.
Expression patterns of genes in tubers of different colors
Potato tubers of different colors differ markedly in carotenoid content[33]. To investigate StGGPS expression in relation to tuber color, RNA-seq data were used to analyze the expression of the StGGPS genes in tubers of four different colors (Fig.5). The StGGPS genes were differentially expressed according to tuber color. In yellow tubers, StGGPS1, StGGPS3, StGGPS4, and StGGPS7 were upregulated. In white tubers, StGGPS2, StGGPS5, StGGPS8, and StGGPS9 showed increased expression. In red tubers, StGGPS6, StGGPS10, and StGGPS11 were upregulated, whereas in purple tubers StGGPS gene expression was downregulated. This suggests that yellow and red tubers may have a higher carotenoid content, whereas purple tubers may have a lower carotenoid content and a higher anthocyanin content than the other tubers.
Fig.5 Expression of StGGPS genes in tubers of different colors (P: purple potato; R: red potato; W: white potato; Y: yellow potato).
Gene expression patterns in response to abiotic stress
Abiotic stresses, such as drought and salt stress, significantly impact potato production. To investigate the response of the StGGPS family to these stresses, we analyzed RNA-seq data from NCBI and validated the results by RT-qPCR. Under drought stress, the RNA-seq data revealed differential expression of the StGGPS genes: StGGPS9 and StGGPS10 were upregulated, whereas the other StGGPS members were downregulated. RT-qPCR confirmed that StGGPS10 was particularly sensitive to drought stress, suggesting that it may play an important role in drought resistance in potato (Fig.6). Under salt stress, RNA-seq data indicated differential expression of several StGGPS genes, including StGGPS3, StGGPS4, StGGPS9, StGGPS10, StGGPS2, StGGPS5, StGGPS6, and StGGPS11. RT-qPCR confirmed that StGGPS6 also responded to salt stress (Fig.7). Based on these findings, we hypothesize that StGGPS6 and StGGPS10 play important roles in the abiotic stress response of potato.
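Relative expression from RT-qPCR is commonly calculated with the 2^-ΔΔCt method. The text above does not state which quantification method was used, so the sketch below is only an illustration of that common approach; the function name and all Ct values are made up for the example and are not data from the study.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^-ddCt method.

    Each dCt normalizes the target gene to a reference gene in the same
    sample; ddCt then compares the treated sample with the control.
    """
    d_ct_treated = ct_target_treated - ct_ref_treated
    d_ct_control = ct_target_control - ct_ref_control
    dd_ct = d_ct_treated - d_ct_control
    return 2 ** (-dd_ct)


# Hypothetical Ct values, for illustration only (not measured data).
fold_change = relative_expression(
    ct_target_treated=22.0, ct_ref_treated=18.0,
    ct_target_control=24.0, ct_ref_control=18.0,
)
print(fold_change)
```

A fold change above 1 indicates higher expression under stress than in the control, and a value below 1 indicates lower expression.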
Fig.6 Expression of StGGPS genes under drought stress. A: Expression of StGGPS genes under drought stress. B: Relative expression of StGGPS genes under drought stress.
Fig.7 Expression of StGGPS genes under salt stress. A: Expression of StGGPS genes under salt stress. B: Relative expression of StGGPS genes under salt stress.