Genome-wide identification and sequence analyses of SATs
A total of six non-redundant SAT genes were identified in the rice genome using five AtSAT protein sequences as references: LOC_Os01g52260 (OsSAT1;1), LOC_Os02g10830 (OsSAT1;2), LOC_Os03g04140 (OsSAT3), LOC_Os03g08660 (OsSAT2;1), LOC_Os03g10050 (OsSAT2;2), and LOC_Os05g45710 (OsSAT1;3) (Table 2). Protein lengths ranged from 298 to 391 amino acid residues, and molecular weights from 30.44 to 42.72 kDa. All SATs were predicted to be acidic (pI ≤ 7) except AtSAT3. Exon numbers ranged from 1 to 10; AtSAT2 and AtSAT4 (SAT3 family) and OsSAT3 each have 10 exons. All SAT proteins in the study contain the serine acetyltransferase N-terminal domain (SATase_N, PF06426). In addition, the bacterial transferase hexapeptide domain (PF00132) was identified as one or two repeats.
Table 2
Properties of SAT members in Arabidopsis and rice.
Phytozome ID | Species | Exon no. | Protein length (aa) | Domain family | Repeat number | Mol. wt. (kDa) | pI
At1g55920 (SAT1, SERAT2;1) | A. thaliana | 1 | 314 | PF06426 | 2 | 34.25 | 6.45
At2g17640 (SAT2, SERAT3;1) | A. thaliana | 10 | 323 | PF06426 | 2 | 34.53 | 5.81
At3g13110 (SAT3, SERAT2;2) | A. thaliana | 1 | 391 | PF06426 | 1 | 42.72 | 7.73
At4g35640 (SAT4, SERAT3;2) | A. thaliana | 10 | 355 | PF06426 | 1 | 38.42 | 5.56
At5g56760 (SAT5, SERAT1;1) | A. thaliana | 2 | 312 | PF06426 | 2 | 32.77 | 6.70
LOC_Os01g52260 (OsSAT1;1) | O. sativa | 1 | 303 | PF06426 | 2 | 31.85 | 6.66
LOC_Os02g10830 (OsSAT1;2) | O. sativa | 1 | 298 | PF06426 | 2 | 30.44 | 5.33
LOC_Os03g04140 (OsSAT3) | O. sativa | 10 | 354 | PF06426 | 1 | 37.52 | 6.37
LOC_Os03g08660 (OsSAT2;1) | O. sativa | 1 | 301 | PF06426 | 1 | 31.50 | 5.91
LOC_Os03g10050 (OsSAT2;2) | O. sativa | 1 | 315 | PF06426 | 1 | 32.92 | 6.20
LOC_Os05g45710 (OsSAT1;3) | O. sativa | 1 | 314 | PF06426 | 1 | 33.09 | 6.38
PF06426 (SATase_N): Serine acetyltransferase, N-terminal; Repeat (PF00132): Bacterial transferase hexapeptide (six repeats)
To gain more insight into the protein sequence diversity of SATs, a sequence identity matrix was constructed (Table 3). The lowest identity value (0.325) was found between AtSAT2 and AtSAT3, whereas the highest (0.914) was found between OsSAT2;1 and OsSAT2;2. The mean identity value of SAT proteins was 0.476.
Table 3
Sequence identity matrix of SATs in Arabidopsis and rice
Protein | AtSAT1 | AtSAT2 | AtSAT3 | AtSAT4 | AtSAT5 | Os1 | Os2 | Os3 | Os4 | Os5 | Os6
AtSAT1 | ID | | | | | | | | | |
AtSAT2 | 0.390 | ID | | | | | | | | |
AtSAT3 | 0.611 | 0.325 | ID | | | | | | | |
AtSAT4 | 0.393 | 0.700 | 0.341 | ID | | | | | | |
AtSAT5 | 0.495 | 0.425 | 0.409 | 0.407 | ID | | | | | |
Os1 | 0.503 | 0.417 | 0.426 | 0.422 | 0.685 | ID | | | | |
Os2 | 0.477 | 0.398 | 0.393 | 0.393 | 0.609 | 0.741 | ID | | | |
Os3 | 0.386 | 0.547 | 0.358 | 0.563 | 0.408 | 0.405 | 0.400 | ID | | |
Os4 | 0.473 | 0.367 | 0.383 | 0.356 | 0.444 | 0.506 | 0.498 | 0.356 | ID | |
Os5 | 0.496 | 0.374 | 0.406 | 0.369 | 0.476 | 0.523 | 0.507 | 0.372 | 0.914 | ID |
Os6 | 0.498 | 0.398 | 0.407 | 0.407 | 0.703 | 0.818 | 0.700 | 0.400 | 0.481 | 0.501 | ID
Os1: LOC_Os01g52260 (OsSAT1;1), Os2: LOC_Os02g10830 (OsSAT1;2), Os3: LOC_Os03g04140 (OsSAT3), Os4: LOC_Os03g08660 (OsSAT2;1), Os5: LOC_Os03g10050 (OsSAT2;2), Os6: LOC_Os05g45710 (OsSAT1;3).
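The summary statistics quoted for Table 3 (lowest, highest, and mean pairwise identity) can be re-derived from the published values. The following is a minimal sketch, assuming the lower triangle of Table 3 is transcribed correctly and that the mean is taken over the 55 off-diagonal pairs; the variable and function names are illustrative and not part of the original analysis.

```python
# Lower triangle of Table 3: row i holds identities to proteins 0..i-1.
labels = ["AtSAT1", "AtSAT2", "AtSAT3", "AtSAT4", "AtSAT5",
          "Os1", "Os2", "Os3", "Os4", "Os5", "Os6"]
lower = [
    [],
    [0.390],
    [0.611, 0.325],
    [0.393, 0.700, 0.341],
    [0.495, 0.425, 0.409, 0.407],
    [0.503, 0.417, 0.426, 0.422, 0.685],
    [0.477, 0.398, 0.393, 0.393, 0.609, 0.741],
    [0.386, 0.547, 0.358, 0.563, 0.408, 0.405, 0.400],
    [0.473, 0.367, 0.383, 0.356, 0.444, 0.506, 0.498, 0.356],
    [0.496, 0.374, 0.406, 0.369, 0.476, 0.523, 0.507, 0.372, 0.914],
    [0.498, 0.398, 0.407, 0.407, 0.703, 0.818, 0.700, 0.400, 0.481, 0.501],
]

def summarize(labels, lower):
    """Return the lowest pair, highest pair, mean identity, and all pairs."""
    pairs = {(labels[i], labels[j]): value
             for i, row in enumerate(lower)
             for j, value in enumerate(row)}
    lowest = min(pairs, key=pairs.get)
    highest = max(pairs, key=pairs.get)
    mean = sum(pairs.values()) / len(pairs)
    return lowest, highest, mean, pairs

lowest_pair, highest_pair, mean_identity, pairs = summarize(labels, lower)
print("lowest :", lowest_pair, pairs[lowest_pair])
print("highest:", highest_pair, pairs[highest_pair])
print("mean   :", round(mean_identity, 3))
```

Here Os4 and Os5 correspond to OsSAT2;1 and OsSAT2;2, following the key beneath Table 3.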
Conserved motif analysis
Ten conserved motifs were detected (Table 4 and Fig. 1). Motif 3 was related to PF00132 (bacterial transferase hexapeptide), and motifs 4 and 5 were related to the PF06426 (serine acetyltransferase) domain. Motifs 1, 2, 3, 4, and 5 were present in all SATs, whereas the other motifs were found in different numbers.
Table 4
The details of conserved motif sequences in SAT proteins
No. | Motif sequence | S | W | Domain
1 | IGKGILLDHATGVVIGETAVVGBNVSILHGVTLGGTGKESGDRHPKIGDG | 11 | 50 | NF
2 | YSHCLLNYKGFLALQAHRVAHKLWAQGRKALALALQSRVSEVFAVDIHPA | 11 | 50 | NF
3 | VLIGAGATILGNVKIGAGAKIGAGSVVLKDVPPRTTAVGNPARLIGGKDE | 11 | 50 | PF00132
4 | WDQIKAEAKRDAEKEPILSSFLYASVLSHPSLERALAFHLANKLCNPTLL | 11 | 50 | PF06426
5 | FAGVLAAHPEJRAAVRADLLAAKDRDPAC | 11 | 29 | PF06426
6 | IPGESMDHTSFISEWSDYTI | 6 | 20 | NF
7 | QDPSLTMKHDATREFFQHVAVAYKDDKPN | 3 | 29 | NF
8 | MAACIDKWPTGKPQ | 4 | 14 | NF
9 | RLPEKFYCVLPDCTATDRPV | 2 | 20 | NF
10 | TQLYDL | 11 | 6 | NF
W: width, S: sites, NF: not found, PF00132: Bacterial transferase hexapeptide (six repeats), PF06426: Serine acetyltransferase, N-terminal
Nucleotide and phylogenetic analyses
The R value, the ratio of transitions to transversions, was estimated to provide insight into DNA sequence evolution and phylogeny reconstruction. In addition, the G+C contents of OsSATs were calculated to predict probable functional variations of the genes, which are important for an organism in adapting to its environment. The estimated transition/transversion bias (R) was 0.71, indicating genetic variation. G+C contents were 71.82%, 75.25%, 51.92%, 72.63%, 72.47%, and 68.68% for OsSAT1;1, OsSAT1;2, OsSAT3, OsSAT2;1, OsSAT2;2, and OsSAT1;3, respectively.
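To illustrate the two quantities discussed above, the following sketch computes G+C content for one sequence and counts transitions and transversions between two aligned sequences. It is a simplified illustration only: the sequences are short hypothetical examples rather than OsSAT sequences, and the R value reported here may have been estimated with a different (for example, model-based) procedure across all sequences.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def gc_content(seq):
    """Percentage of G and C among unambiguous bases (A, C, G, T)."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    return 100.0 * sum(b in "GC" for b in bases) / len(bases)

def ts_tv_counts(seq1, seq2):
    """Count transitions and transversions between two aligned sequences.

    Positions with gaps or ambiguous bases are skipped.
    """
    transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1    # purine<->pyrimidine
    return transitions, transversions

# Hypothetical aligned fragments, for illustration only.
seq_a = "ATGGCCGCGTGCATCGAC"
seq_b = "ATGGCTGCGAGCATTGAC"

ts, tv = ts_tv_counts(seq_a, seq_b)
print("G+C content of seq_a (%):", round(gc_content(seq_a), 2))
print("transitions:", ts, "transversions:", tv)
if tv:
    print("transition/transversion ratio:", ts / tv)
```

A simple count ratio of this kind ignores multiple substitutions at the same site, so it should be read as a rough descriptive statistic rather than a substitute for a model-based estimate.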
Phylogenetic analysis indicated that AtSATs and OsSATs split into two major clades (Groups A and B), and that Group A further divided into two subclades (Fig. 2). OsSAT2;1 (LOC_Os03g08660) and OsSAT2;2 (LOC_Os03g10050) clustered with AtSAT1 (SAT2;1) and AtSAT3 (SAT2;2) in subclade A1, whereas OsSAT1;3 (LOC_Os05g45710), OsSAT1;1 (LOC_Os01g52260), and OsSAT1;2 (LOC_Os02g10830) grouped with AtSAT5 (SAT1;1) in subclade A2. OsSAT3 (LOC_Os03g04140) clustered with AtSAT2 (SAT3;1) and AtSAT4 (SAT3;2) in Group B and was separated from the rest of the SATs (100%), indicating genetic divergence. This separation is consistent with OsSAT3 having its highest identity scores with AtSAT4 and AtSAT2, in that order.
Selection, gene duplication and synteny analyses
Nucleotide variations of the six OsSAT genes were analyzed using two selection analyses: Tajima’s D and Ka/Ks tests. In Tajima’s D test, 516 polymorphic (segregating) sites were identified in OsSAT genes, of which 47.1% (243/516) were singleton variable sites and 52.9% (273/516) were parsimony-informative sites. A further 306 sites were invariable (monomorphic), and nucleotide diversity was 0.33 and 0.27 for the π and θ parameters, respectively. Tajima’s D was 1.22, indicating purifying (negative) selection.
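The relationship between the segregating sites, π, θ, and Tajima’s D reported above can be sketched with the standard Tajima (1989) formulas. This is an approximate reconstruction under stated assumptions, not the original computation: the sample size is assumed to be n = 6 (the six OsSAT genes), the alignment length is assumed to be 516 + 306 = 822 sites, and π is taken at its rounded value of 0.33, so the resulting D only approximates the reported 1.22.

```python
import math

def tajimas_d(n, seg_sites, mean_pairwise_diff):
    """Tajima's D from sample size, segregating sites, and mean pairwise differences.

    Returns (D, Watterson's estimator S/a1 for the whole alignment).
    """
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    theta_w = seg_sites / a1
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (mean_pairwise_diff - theta_w) / math.sqrt(variance), theta_w

n = 6                   # assumption: one sequence per OsSAT gene
S = 516                 # segregating sites reported in the text
length = 516 + 306      # assumption: segregating + invariable sites
pi_per_site = 0.33      # rounded value reported in the text

d, theta_w = tajimas_d(n, S, pi_per_site * length)
theta_per_site = theta_w / length
print("theta per site:", round(theta_per_site, 3))
print("Tajima's D (approximate):", round(d, 2))
```

Under these assumptions the per-site θ comes out close to the reported 0.27, which supports the inferred sample size and alignment length; the small difference in D is expected from rounding of π.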
In the second selection test, the nonsynonymous (Ka) and synonymous (Ks) substitution rates between the duplicated gene pairs were calculated. Ka/Ks values for all duplicated OsSAT gene pairs were less than one, supporting the preceding finding that OsSAT genes have been subject to purifying selection. Moreover, gene duplication analyses indicated that three segmental duplications and one tandem duplication occurred, suggesting that these duplications are the major force for SAT gene expansion (Table 5).
Table 5
Segmental and tandem duplications of SAT paralogous pairs in rice genome
SAT group | Chr. location | Duplication type | Ka | Ks | Ka/Ks
OsSAT1;1 / OsSAT1;2 | Chr1 / Chr2 | Segmental | 0.114 | 0.184 | 0.619
OsSAT1;1 / OsSAT1;3 | Chr1 / Chr5 | Segmental | 0.073 | 0.296 | 0.247
OsSAT1;2 / OsSAT1;3 | Chr2 / Chr5 | Segmental | 0.146 | 0.363 | 0.402
OsSAT2;1 / OsSAT2;2 | Chr3 / Chr3 | Tandem | 0.018 | 0.023 | 0.783
Non-synonymous (Ka) and synonymous (Ks) indicate the substitution rates. Ka/Ks is the non-synonymous/synonymous mutation ratio.
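The Ka/Ks column of Table 5 follows directly from the Ka and Ks columns. As a minimal sketch, the ratios can be recomputed and labeled with the conventional interpretation (below one: purifying selection; above one: positive selection); the function names are illustrative, and small differences in the third decimal place are expected because the published Ka and Ks values are rounded.

```python
# Ka and Ks values transcribed from Table 5.
duplicated_pairs = {
    ("OsSAT1;1", "OsSAT1;2"): (0.114, 0.184),
    ("OsSAT1;1", "OsSAT1;3"): (0.073, 0.296),
    ("OsSAT1;2", "OsSAT1;3"): (0.146, 0.363),
    ("OsSAT2;1", "OsSAT2;2"): (0.018, 0.023),
}

def ka_ks(ka, ks):
    """Ratio of non-synonymous to synonymous substitution rates."""
    if ks == 0:
        raise ValueError("Ks is zero; Ka/Ks is undefined")
    return ka / ks

def interpret(ratio):
    """Conventional reading of a Ka/Ks ratio."""
    if ratio < 1:
        return "purifying selection"
    if ratio > 1:
        return "positive selection"
    return "neutral evolution"

ratios = {pair: ka_ks(ka, ks) for pair, (ka, ks) in duplicated_pairs.items()}
for pair, ratio in ratios.items():
    print(" / ".join(pair), round(ratio, 3), interpret(ratio))
```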
To understand the gene duplication dynamics of OsSAT genes, four comparative syntenic maps of rice with four representative species were generated (Fig. 3). The six OsSAT genes had syntenic relationships with five genes in Arabidopsis, followed by four genes in B. distachyon, four genes in tomato, and three genes in maize.
Distinct expression profiles of OsSAT genes
To better understand the functions of OsSAT genes in cell metabolism, gene expression levels in different tissues and organs were displayed as a heatmap (Fig. 4). For digital expression analysis, data for five OsSAT genes, all except OsSAT1;2 (LOC_Os02g10830), were obtained from the Rice Expression Profile Database (RiceXPro). In general, OsSAT genes were expressed at low levels in stem, inflorescence, anther, pistil, lemma, palea, and embryo tissues/organs. In particular, OsSAT2;1 and OsSAT2;2 were downregulated by about three-fold. These two genes were also downregulated in inflorescence and anther tissues. In leaf blade and sheath, OsSAT genes were commonly upregulated; OsSAT2;1 and OsSAT2;2 were upregulated about three-fold in leaf blade.
In the second step, the expression profiles of OsSAT genes were evaluated at five time points following application of six plant hormones (abscisic acid, gibberellin, auxin, brassinosteroid, cytokinin, and jasmonic acid) (Fig. 5). Jasmonic acid treatment in particular upregulated OsSAT genes. OsSAT1;1 was upregulated nearly four-fold by jasmonic acid, and its expression increased most at one hour of jasmonic acid treatment. OsSAT1;1 also responded to auxin and abscisic acid treatments at the third hour of exposure, although these responses were weaker than its response to jasmonic acid. OsSAT2;1 and OsSAT2;2 were upregulated at the first hour of exposure to jasmonic acid. Overall, expression levels under hormone treatments indicated that OsSAT1;1, OsSAT2;1, and OsSAT2;2 were positively regulated by hormone treatments.
Co-expression analysis of OsSAT genes
The co-expression network of OsSAT genes was constructed using the RiceFREND database. The network showed that seven first-neighbor genes were co-expressed with OsSATs: Os04g0577500, Os11g0524300, Os06g0167400, Os06g0690700, Os12g0641300, Os04g0488700, and Os07g0589000 (Fig. 6). Os04g0488700 (similar to PHY3, AGC kinase) was co-expressed with OsSAT1;1. The AGC kinase family is one of seven kinase families and is conserved in all eukaryotic genomes. AGC kinases in plants play roles in the modulation of kinase activity by external stimuli [34]. Os12g0641300 (similar to Zn-dependent hydrolases of the beta-lactamase fold) was identified as a gene co-expressed with OsSAT1;3. Os07g0589000 (lateral organ boundaries, LOB domain containing protein) was co-expressed with OsSAT2;1 and OsSAT2;2. Lateral organ boundaries domain (LBD) proteins contain a lateral organ boundaries (LOB) domain and are key regulators of plant developmental processes such as photomorphogenesis, plant regeneration, and pollen development [35]. OsSAT3 was co-expressed with Os04g0577500 (TatD-related deoxyribonuclease family protein), Os11g0524300 (protein of unknown function DUF1001 family protein), Os06g0167400 (di-trans-poly-cis-decaprenylcistransferase family protein), and Os06g0690700 (similar to potential cadmium/zinc-transporting ATPase HMA1). TatD is a conserved protein found in all living organisms and participates in DNA fragmentation during apoptosis in eukaryotic cells [36]. Heavy metal pumps (P1B-ATPases) are vital for cellular heavy metal homeostasis. Arabidopsis thaliana contains eight P1B-ATPase genes, the heavy metal ATPases 1–8 (HMA1–HMA8) [37].
Annotations of SAT proteins
The gene ontology (GO) analyses of OsSAT proteins were performed using the PANNZER server in terms of biological process, molecular function, and cellular component (Fig. 7). Sulfur amino acid biosynthetic process (GO:0000097), L-serine metabolic process (GO:0006563), cysteine biosynthetic process from serine (GO:0006535), cellular amino acid biosynthetic process (GO:0008652), sulfate assimilation (GO:0000103), and response to sulfate starvation (GO:0009970) were identified as biological processes in which OsSATs are involved (Fig. 7A). Serine O-acetyltransferase activity (GO:0009001), terpene synthase activity (GO:0010333), magnesium ion binding (GO:0000287), zinc ion binding (GO:0008270), and protein binding (GO:0005515) were the predicted molecular functions carried out by OsSATs (Fig. 7B). In addition, cytoplasm (GO:0005737), intrinsic component of membrane (GO:0031224), intracellular part (GO:0044424), and membrane-bounded organelle (GO:0043227) were identified as cellular components in which OsSATs function (Fig. 7C). Taken together, the GO analyses indicated that amino acid synthesis is the most prominent biological process for OsSATs.
Secondary and tertiary structure analyses of OsSATs
According to the secondary structure analyses, there are structural variations among OsSAT proteins. The alpha helix, extended strand, beta turn, and random coil percentages ranged from 36.94 to 42.62, 17.46 to 21.19, 7.38 to 10.45, and 28.25 to 34.65, respectively (Supp. Table 1). The predicted 3D structures of OsSAT proteins (Fig. 8) were considered reliable, with Ramachandran values ranging from 96% to 99% in core and allowed regions. Homologous proteins from different organisms can be recognized using sequence comparison because amino acid substitutions in particular positions are prevented by strong selective constraints [38]. These structural variations at secondary and tertiary levels may be associated with the functional flexibility of SAT proteins.
The 3D structural similarities (%) among six rice, five Arabidopsis, and one soybean SAT proteins were determined on the CLICK structure comparison server (Table 6). Overall, the 3D structural similarity values ranged from 68.36% to 88.45%. The highest (88.45%) and lowest (68.36%) values were found between OsSAT1;1 and AtSAT4 and between OsSAT3 and GmSAT, respectively. When Arabidopsis and rice proteins were compared, most similarity values exceeded 80%, indicating a well-conserved SAT protein structure. The protein with the lowest similarity values to other SATs was OsSAT3, indicating structural divergence among SATs [9].
Table 6
The 3D structure overlap (%) of rice, Arabidopsis, and soybean SATs using CLICK structure comparison server
Protein | Os1 | Os2 | Os3 | Os4 | Os5 | Os6
AtSAT1 | 82.84 | 87.58 | 79.30 | 85.71 | 82.80 | 79.94
AtSAT2 | 83.50 | 87.25 | 84.21 | 85.38 | 84.76 | 81.85
AtSAT3 | 87.79 | 88.26 | 73.45 | 86.05 | 86.03 | 87.90
AtSAT4 | 88.45 | 88.26 | 72.88 | 86.71 | 84.44 | 84.71
AtSAT5 | 79.87 | 82.21 | 84.62 | 80.73 | 79.17 | 81.73
GmSAT | 79.87 | 81.21 | 68.36 | 81.73 | 77.14 | 77.71
Os1: LOC_Os01g52260 (OsSAT1;1), Os2: LOC_Os02g10830 (OsSAT1;2), Os3: LOC_Os03g04140 (OsSAT3), Os4: LOC_Os03g08660 (OsSAT2;1), Os5: LOC_Os03g10050 (OsSAT2;2), Os6: LOC_Os05g45710 (OsSAT1;3), Gm: Glycine max (PDB code: 4N69)
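The comparisons drawn from Table 6 can be checked with a short script. This sketch, which assumes the table values are transcribed correctly, locates the highest and lowest overlaps, lists the Arabidopsis versus rice comparisons that fall below 80%, and ranks rice proteins by their mean overlap; using the column mean as a measure of divergence is an illustrative choice, not a method stated in the study.

```python
rice = ["Os1", "Os2", "Os3", "Os4", "Os5", "Os6"]

# Structure overlap (%) transcribed from Table 6; columns follow `rice`.
overlap = {
    "AtSAT1": [82.84, 87.58, 79.30, 85.71, 82.80, 79.94],
    "AtSAT2": [83.50, 87.25, 84.21, 85.38, 84.76, 81.85],
    "AtSAT3": [87.79, 88.26, 73.45, 86.05, 86.03, 87.90],
    "AtSAT4": [88.45, 88.26, 72.88, 86.71, 84.44, 84.71],
    "AtSAT5": [79.87, 82.21, 84.62, 80.73, 79.17, 81.73],
    "GmSAT":  [79.87, 81.21, 68.36, 81.73, 77.14, 77.71],
}

# Flatten to {(reference protein, rice protein): overlap}.
cells = {(ref, os_name): value
         for ref, row in overlap.items()
         for os_name, value in zip(rice, row)}

highest = max(cells, key=cells.get)
lowest = min(cells, key=cells.get)

# Arabidopsis-rice comparisons that fall below the 80% mark.
below_80 = sorted(pair for pair, value in cells.items()
                  if pair[0].startswith("At") and value < 80.0)

# Mean overlap per rice protein across all reference structures.
column_mean = {os_name: sum(row[i] for row in overlap.values()) / len(overlap)
               for i, os_name in enumerate(rice)}
most_divergent = min(column_mean, key=column_mean.get)

print("highest:", highest, cells[highest])
print("lowest :", lowest, cells[lowest])
print("Arabidopsis-rice pairs below 80%:", below_80)
print("lowest mean overlap:", most_divergent, round(column_mean[most_divergent], 2))
```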
Predicted active sites of OsSATs
The identification of catalytic residues is an indispensable step in understanding the functions of enzymes [39]. In this study, active site predictions for OsSATs were performed using the InterPro 74.0 server (Table 7). In particular, Asp (D), His (H), Gly (G), Thr (T), Arg (R), Ala (A), and Leu (L) residues were conserved at different positions in all OsSATs; in contrast, some residues, such as 248M (Met), 249Q (Gln), and 292A (Ala), were identified only in the LOC_Os03g04140 (OsSAT3) protein, suggesting functional divergence of the SAT3 protein in rice. Overall, similar amino acid residues were present in the predicted active sites.
Table 7
Predicted active sites of OsSAT proteins
Protein name | Active sites
OsSAT1;1 (LOC_Os01g52260) | 189D, 190H, 209L, 210H, 216G, 217T, 224R, 225H, 236A, 251K, 253G, 254A, 257L, 259L, 267T, 272P
OsSAT1;2 (LOC_Os02g10830) | 182D, 183H, 202L, 203H, 209G, 210T, 217R, 218H, 229A, 244K, 246G, 247A, 250L, 252L, 260T, 265P
OsSAT3 (LOC_Os03g04140) | 228D, 229H, 248M, 249Q, 255G, 256T, 263R, 264H, 275A, 290M, 292A, 293A, 296L, 298L, 306M, 311P
OsSAT2;1 (LOC_Os03g08660) | 197D, 198H, 217L, 218H, 224G, 225T, 232R, 233H, 244A, 259E, 261G, 262A, 265I, 267L, 275T
OsSAT2;2 (LOC_Os03g10050) | 203D, 204H, 223L, 224H, 230G, 231T, 238R, 239H, 250A, 265K, 267G, 268A, 271V, 273L, 281T, 286P
OsSAT1;3 (LOC_Os05g45710) | 195D, 196H, 215L, 216H, 222G, 223T, 230R, 231H, 242A, 257K, 259G, 260A, 263V, 265L, 273T, 278P
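The conserved and OsSAT3-specific residues described for Table 7 can be recovered by comparing the predicted sites column by column. The sketch below rests on one explicit assumption that is not stated in the study: the n-th listed site of each protein is treated as positionally equivalent across proteins. Because OsSAT2;1 lists 15 sites rather than 16, the final proline column is left out of the comparison.

```python
import re

# Predicted active sites transcribed from Table 7.
sites = {
    "OsSAT1;1": "189D, 190H, 209L, 210H, 216G, 217T, 224R, 225H, 236A, 251K, 253G, 254A, 257L, 259L, 267T, 272P",
    "OsSAT1;2": "182D, 183H, 202L, 203H, 209G, 210T, 217R, 218H, 229A, 244K, 246G, 247A, 250L, 252L, 260T, 265P",
    "OsSAT3":   "228D, 229H, 248M, 249Q, 255G, 256T, 263R, 264H, 275A, 290M, 292A, 293A, 296L, 298L, 306M, 311P",
    "OsSAT2;1": "197D, 198H, 217L, 218H, 224G, 225T, 232R, 233H, 244A, 259E, 261G, 262A, 265I, 267L, 275T",
    "OsSAT2;2": "203D, 204H, 223L, 224H, 230G, 231T, 238R, 239H, 250A, 265K, 267G, 268A, 271V, 273L, 281T, 286P",
    "OsSAT1;3": "195D, 196H, 215L, 216H, 222G, 223T, 230R, 231H, 242A, 257K, 259G, 260A, 263V, 265L, 273T, 278P",
}

def parse(site_string):
    """Turn '189D, 190H' into [(189, 'D'), (190, 'H')]."""
    return [(int(m.group(1)), m.group(2))
            for m in re.finditer(r"(\d+)([A-Z])", site_string)]

parsed = {name: parse(text) for name, text in sites.items()}
names = list(parsed)

# zip() stops at the shortest list, so only the first 15 columns are compared.
columns = list(zip(*parsed.values()))

# Columns where every protein carries the same residue.
conserved = "".join(col[0][1] for col in columns
                    if len({residue for _, residue in col}) == 1)

# Columns where the OsSAT3 residue occurs in no other protein.
sat3 = names.index("OsSAT3")
unique_to_sat3 = []
for col in columns:
    position, residue = col[sat3]
    others = {r for i, (_, r) in enumerate(col) if i != sat3}
    if residue not in others:
        unique_to_sat3.append(f"{position}{residue}")

print("conserved residues:", conserved)
print("OsSAT3-specific sites:", unique_to_sat3)
```

Under this column assumption, the script flags 248M, 249Q, and 292A as OsSAT3-specific, in agreement with the text, and additionally flags 290M and 306M.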
The expression of OsSATs under salt stress
In this study, the responses of OsSATs to 3-, 12-, and 24-h salt treatments were investigated (Fig. 9). OsSAT2;1, OsSAT2;2, and OsSAT3 were downregulated at all exposure times. OsSAT2;1 showed the lowest expression of all OsSATs under every salt treatment, whereas OsSAT2;2 and OsSAT3 responded to the exposure times in a similar way. The expression of OsSAT1;1, OsSAT1;2, and OsSAT1;3 increased under the 3-, 12-, and 24-h NaCl treatments. OsSAT1;2 was generally expressed at the highest level at all exposure times. The responses of OsSAT1;2 and OsSAT1;3 were highest at the 24-h NaCl treatment. Overall, OsSAT1;1, OsSAT1;2, and OsSAT1;3 were responsive to the different salt exposure times, and OsSAT1;2 and OsSAT1;3 were particularly upregulated by the 24-h salt treatment.