Related QTLs have been reported for arabinose, xylose, mannose, rhamnose, galactose, and fucose in different cereals (Hazen et al. 2003; Stombaugh et al. 2004; Zhang et al. 2012b; Serba et al. 2016; Xu et al. 2017; Warwick Vesztrocy et al. 2018). Because no GWAS has been performed on monosaccharide contents in rice grains, we set out to address this gap in the literature and to pave the way for future functional analyses. Association mapping is an effective method for identifying genomic regions that control quantitative traits based on linkage disequilibrium (LD) (Alqudah et al. 2020). In association mapping, the resolution and accuracy of the resulting map depend on the extent of LD and the rate of its decay (Kim et al. 2007; Bastien et al. 2014a, b). LD measures in a mapping population also provide an estimate of the number of markers required to identify QTLs (Kim et al. 2007; Bastien et al. 2014a, b). LD decayed very slowly in our population, dropping to 0.2 at ~570 kb (~2 cM) and to 0.1 at ~1.65 Mbp (> 5 cM) (data not shown). Bioinformatic analysis of trait-associated SNPs is considered a method of choice for identifying candidate genes for multiple complex traits (Liu et al. 2017). Below, the candidate genes containing the identified markers are discussed for each monosaccharide.
Arabinose is produced from UDP-xylose by a UDP-Xyl 4-epimerase (Kotake et al. 2016; Supplementary figure S2). Rhamnogalacturonan I, arabinan, (glucurono)arabinoxylan, xyloglucan (Anders et al. 2012), and arabinogalactan-proteins (Tryfona et al. 2010) all carry arabinose in their structure. Previous studies have shown that a large number of glycosyltransferases (GTs), such as GT47, GT64, GT8, and GT92, along with methyltransferases and acetyltransferases, are involved in pectin biosynthesis (Takenaka et al. 2018). Members of GT families 1, 47, 61, 77, and 95 are involved in the transfer of arabinosyl moieties to acceptor molecules in plants (Harholt et al. 2012; Møller et al. 2017). The arabinose content in our study averaged 15 µg/mg, with a heritability (hb2) of 0.724. The candidate genes associated with arabinose content were mostly involved in the metabolism, catabolism, and phosphorylation of carbohydrates and cell wall components (Wu et al. 2016; Marzin et al. 2016; Mouyna et al. 2016). Overexpression of an α-arabinofuranosidase led to reduced arabinose and increased glucose contents, with more cellulose accumulation and greater saccharification (Sumiyoshi et al. 2013). Thus, this associated enzyme seems to have a defining role in maintaining the cell wall network by balancing arabinoxylan and cellulose. In an Arabidopsis line overexpressing pectin methylesterase inhibitor (PMEI), another gene that we show to be associated with arabinose content, the amounts of neutral sugars, including arabinose, remained unchanged in seeds (Muller et al. 2013). Mutant analysis of an Arabidopsis seed coat-specific PMEI (pmei6) showed a pronounced reduction of mucilage (Saez-Aguayo et al. 2013), which is made up of rhamnogalacturonan I decorated with arabinan and galactan modifications (Macquet et al. 2007; Arsovski et al. 2009). PMEI was shown to affect the arabinose content of the water-soluble mucilage of the Arabidopsis seed coat (Saez-Aguayo et al. 2013).
The rice genome encodes 49 PMEIs, and their functions have started to be unraveled in recent years (Nguyen et al. 2017). In contrast to PMEI, overexpression of Aspergillus nidulans pectin methylesterase (AnPME) in Arabidopsis thaliana led to a significant increase in arabinose content and significantly less galacturonic acid (Reem et al. 2020). A rice line overexpressing a PMEI (OX-OsPMEI28) was shown to have a moderate reduction in arabinose, as well as in glucose and xylose, in culm tissue, while the remaining neutral sugars were unchanged (Nguyen et al. 2017). Another candidate gene for arabinose content is glucosamine-fructose-6-phosphate aminotransferase 1 (LOC_Os11g03900, Table 1). Glucosamine-fructose-6-phosphate aminotransferase (GFAT) catalyzes the formation of glucosamine 6-phosphate and is the first and rate-limiting enzyme in the hexosamine biosynthesis pathway (Yuzwa and Vocadlo 2014), a pathway that plays important roles in plant growth and development (Denzel et al. 2014). Mutations in this gene were recently shown to reduce pectin and callus in the Arabidopsis cell wall (Vu et al. 2019). The role of this gene in arabinose metabolism has not been studied. However, because it affects the amount of pectin in the cell wall (Vu et al. 2019) and because rhamnogalacturonan I in pectin carries arabinose (Anders et al. 2012), it is plausible that this gene affects the amount of pectin through its effect on the amount of arabinose.
Xylose is a pentose that is often used as a food browning agent (Can and Yilmaz 2002). It can be chemically reduced to xylitol (Trivedi et al. 2020), a valuable food additive (Chandel et al. 2018). Xylose is mostly found in xylan polysaccharides. The xylose content in our rice varieties was 18 µg/mg on average, with hb2 = 0.89. Elongation of the xylan backbone has been shown to depend on the cooperative actions of GT43 (irregular xylem 9 (IRX9), I9H/IRX9-L, IRX14, and I14H/IRX14-L) and GT47 (IRX10 and IRX10-L) enzymes (Jensen et al. 2014; Urbanowicz et al. 2014). A candidate gene found for xylose in our study was a kinase (LOC_Os09g22410, Table 1). Recently, a kinase was shown to alter the distribution of heteroxylans and mixed-linkage glucans (MLGs) in elongating sorghum internodes (Oliver et al. 2021).
Mannose is a naturally abundant monosaccharide found in mannan polysaccharides (a minor component of cereal cell walls) and glycoproteins (Kranjčec et al. 2014). Phosphomannose isomerase, phosphomannomutase, and GDP-mannose pyrophosphorylase are involved in mannose synthesis (Pourceau et al. 2009). Mannose can be used as a dietary supplement to prevent urinary tract infection (Kranjčec et al. 2014). Mannitol, a reduced form of mannose (Bhatt et al. 2013), is an osmolyte or metabolic store and a powerful scavenger of reactive oxygen species (Meena et al. 2015). Mannose content was 6 µg/mg on average in RWGs, with hb2 = 0.81. One of the candidate genes for mannose content was LOC_Os09g02410, an S-domain receptor-like kinase (SDRLK; Table 1), a type of kinase originally identified in self-incompatibility responses in Brassica (Zou et al. 2015). Rice has 144 SDRLK members with a variety of functions in biotic and abiotic stress responses and in development (Chen et al. 2013; Fan et al. 2018; Naithani et al. 2021). Transgenic rice plants carrying an antisense construct for one SDRLK member (OsESG1) developed fewer crown roots and shorter shoots at the seedling stage than the wild type (Pan et al. 2020). In contrast, overexpression of another, truncated SDRLK homologue improved plant height and yield components, including primary branches per panicle and grains per primary branch, and eventually total yield (Zou et al. 2015). Despite these recent attempts to understand the function of SDRLKs, no reports have addressed their possible roles in monosaccharide contents, cell wall structure, or grain composition. When the significance threshold was lowered [-log10(p) < 4], marginal SNPs associated with mannose content were detected (data not shown). One is id3015638 on chromosome 3 (at position 32470502 bp). A candidate gene near this SNP that may be associated with mannose content is arabinogalactan lysine-rich protein (AGP 19).
The cell wall-associated glycopeptides (AGP 17, 18, 19; glycosylphosphatidylinositol (GPI)-plasma membrane anchored proteins) have been shown to be involved in sexual reproduction in flowering plants and to act as wall plasticizers for cell expansion, growth, and development (Zhang et al. 2011; Verdugo-Perales et al. 2018). In Arabidopsis, agp19 mutants showed fewer siliques and less seed production (Yang et al. 2007). Although the reason for the lower seed and silique numbers compared with the wild type was not clear at the time of publication, we speculate that monosaccharide homeostasis between source and sink might be responsible. More detailed and directed analyses of seed monosaccharide contents in mutants and overexpression lines are needed to establish any relevance of AGPs to these traits.
Galactose is mainly found in arabinogalactan proteins and pectin. In our rice population, galactose content averaged 5 µg/mg in RWG, with hb2 = 0.92. Candidate genes associated with galactose content were mainly involved in carbohydrate metabolism. Glycoside hydrolase family 16 (GH16) (Sakamoto and Ishimaru 2013; Kalomoiri et al. 2019) was among the candidate genes for galactose content. GH16 family members may target AGPs during germination, but no such relationship has yet been established. Another candidate gene was a member of the GRAS family of plant transcription factors (LOC_Os03g51330), which mediates gibberellin (GA) signaling as a GA-derepressible repressor, abolishing α-amylase activity in rice aleurone (Fu et al. 2001). The role of GRAS family members in determining grain monosaccharide contents, and galactose content in particular, has not been reported. Glutathione S-transferase (GST, LOC_Os01g27340) was another candidate gene for galactose content in our study. During development and germination, cereal seeds undergo an orchestrated disappearance of cells via programmed cell death (PCD; for details see Domínguez and Cejudo 2014). Expression of GST and some other genes, including AGPs and extensins, in addition to their role in cell wall composition, is hypothesized to be a specific feature of PCD (Gao and Showalter 1999; Betekhtin et al. 2018). We speculate that, during PCD and via an indirect process, GST may affect AGPs and thereby change the galactose content of the rice grain; this remains to be investigated. The other candidate gene for galactose content was LOC_Os03g24510, a member of glycosyltransferase family 8 (GT8), which contains galacturonosyltransferase, galacturonosyltransferase-like, GATL-related, galactinol synthase, and plant glycogenin-like starch initiation proteins (PGSIP-A, B, and C). The functions of GT8 family members have not been described in detail in cereals.
However, they have been found to be cell wall-related and stress-inducible by RNA-seq and microarray analysis (Kong et al. 2019). By lowering the significance threshold and considering marginally associated SNPs (data not shown), we identified a Polycomb group (PcG) protein as a candidate gene for galactose content on chromosome 8 (tagged to SNP marker ud8001306 at position 15267417 bp); PcG proteins are involved in the regulation of different aspects of plant growth and development (Liu et al. 2016; Paul et al. 2020). Comparative transcriptome profiling of a rice PcG mutant (osfie2-1) and the wild type identified differentially expressed genes, including genes involved in endosperm starch synthesis and cell division/expansion, among others (Liu et al. 2016). Both are indicative of a role for PcG in carbohydrate metabolism and possibly in cell wall assembly. Similarly, rice knockout mutants and overexpression lines suggested roles for two other PcGs in carbon metabolism, early seed development, and seed size and quality (Paul et al. 2020). Furthermore, GWAS detected another marginal SNP, on chromosome 1 (id1009852 at position 14905421 bp), that tagged wall-associated kinases (WAKs). Computational optical-sectioning microscopy has previously shown that AGPs and WAKs have a regulatory role in the plasmalemmal reticulum, the interface between the cytoplasm and the cell wall (Gens et al. 2000). AGPs, with their polyhedral arrays, and WAKs, by interconnecting the membrane and the cell wall, tighten the structure so that it can withstand mechanical stress (Gens et al. 2000). WAKs have been reported to be involved in cell elongation and in the modulation of sugar metabolism (Kohorn et al. 2006). However, the exact mechanism of the latter in the grain is unknown.
Fucose is the most popular hexose in the food industry (Shintani 2019). This monosaccharide is a component of sucrose and is highly sweet, leading to applications in the food and beverage industries with health benefits due to low glycemic response (Gueniche and Castiel-Higounenc 2017). Our data showed that rice grains contain fucose at an average of 14 μg/mg (hb2 = 0.97). Previously, gibberellin 20 oxidase, a candidate gene for fucose content in this study (LOC_Os03g63970), was reported to correspond to QTLs for seed dormancy and pre-harvest sprouting in rice, wheat, and barley (Li et al. 2004). In a study beyond seeds, overexpression of the homologous gene from Pinus densiflora in a hybrid poplar and in Arabidopsis led to a significant increase in glucose and xylose contents, but not in other monosaccharides (Jeon et al. 2016). In a transgenic line of switchgrass (Panicum virgatum L.) overexpressing a Knotted1-like transcription factor, increased expression of gibberellin 20 oxidase was noted, suggesting a regulatory effect of GA on the transcription factor (Wuddineh et al. 2016). In that transgenic line, the expression of cellulose and hemicellulose biosynthetic genes was altered and sugar release was improved relative to the wild type (by 13% on average for glucose and xylose), a step towards improved lignocellulosic feedstock. To date, there is no report on the effect of gibberellin 20 oxidases on fucose content.
Rhamnose, an important constituent of pectic polysaccharides (mainly rhamnogalacturonans), is a deoxy monosaccharide that is widely distributed in bacteria and plants but is rare in animals (Wang et al. 2009; Jiang et al. 2020). In plants, rhamnose biosynthesis results from the catalytic action of rhamnose synthase using UDP-glucose as a substrate (Jiang et al. 2020). Palmer et al. (2015) showed the presence of pectin in rice grains via immunofluorescence microscopy. Here, for the first time in cereals, we show that RWG contains an average of 3 μg/mg of rhamnose, with hb2 = 0.95. One candidate gene for rhamnose content was a small auxin-up RNA (SAUR; LOC_Os03g18050, Table 1). Rice has 58 SAURs, which are early auxin-responsive genes with a negative regulatory effect on auxin synthesis and transport (Jain et al. 2006; Kant et al. 2009). Constitutive induction of a rice homologue reduced growth and seed yield because of lower auxin levels and increased sugar contents, among other changes (Kant et al. 2009). However, no relationship with specific monosaccharide contents has yet been established. Another candidate gene for rhamnose content was an invertase (LOC_Os02g34560, Supplementary Table S3). Invertases hydrolyze sucrose, and the resulting hexoses can later be sensed by sugar transporters to determine monosaccharide utilization and direct growth and development (Sherson et al. 2003). In a transcript analysis of hybrid aspen, an invertase was among the genes with significantly high expression values that were possibly associated with lignocellulose content in developing xylem tissue (Nakahama et al. 2018). The other candidate gene in our study was a COBRA-like protein, LOC_Os07g41320 (COBL; Table 1).
In Arabidopsis, analysis of seed coat epidermal cells (also known as mucilage secretory cells) has shown that COBL is involved in mucilage (rhamnogalacturonan I) deposition in addition to crystalline cellulose deposition (Ben-Tov et al. 2015, 2018). Although little to nothing is known about rice seed mucilage, there are indications of the presence of pectin in grains (Palmer et al. 2015). Therefore, further compositional and structural investigation, together with functional analysis of COBL, appears to be required in rice grain. Interestingly, CesA5 was among the co-expressors of the candidate genes involved in rhamnose content (Table 1). CesA5, like COBL, belongs to a class of proteins reported to be involved in the redistribution of the pectic components of seed mucilage (Harpaz-Saad et al. 2011; Mendu et al. 2011; Ben-Tov et al. 2018). Mutant analysis of cobl2, cesa5, and their double mutant (cobl2cesa5) revealed their role in mucilage ray length (Ben-Tov et al. 2018). Another putative candidate gene for rhamnose content is Squamosa promoter-binding-like protein 11 (LOC_Os06g45310, Table 1). The SQUAMOSA PROMOTER BINDING PROTEIN LIKE box (SPL) gene family is a family of plant-specific zinc finger protein genes that encode putative transcription factors (Li et al. 2020; Agarwal and Lahiri 2020). SPLs play a role in grain quantity and yield, floral period, and stress resistance (Tong et al. 2019). So far, 19 OsSPLs have been identified in rice (Yang et al. 2008). ZmSPL11 has been shown to play a role in growth and development in maize grains (Wang et al. 2005). In Arabidopsis, this gene controls the morphological changes associated with root maturation and the reproductive phase (Shikata et al. 2009). SPL11 in rice is involved in the regulation of cell death and flowering (Liu et al. 2015).
Another study has shown that mutations in SPL11 increase both disease resistance and plant resistance to reactive oxygen species (ROS) (Kojo et al. 2006). The relationship between this gene and the amount of monosaccharides in the grain has not been determined. However, given the effect of this gene on grain yield (Li et al. 2020), it is likely that this transcription factor affects the amount of rhamnose in the grain.