In this paper we explored the previously undocumented genetic architecture of partial fertility in cytoplasmic male sterile (CMS) lines of sorghum. High temperatures around flowering can induce partial fertility in CMS lines. Partial fertility in a commercial seed production field is a major problem for seed companies because it can render hybrid seed unsaleable. While the genetic control of fertility restoration is well understood and involves a relatively small number of major genes (Jordan et al. 2010, 2011), the control of partial fertility is poorly understood. In this study, we used GWAS on a panel of 2049 sorghum lines grown in six environments to identify 43 QTL for partial fertility, to explore the genetic networks underlying the trait, and to assess the likely role of the trait in constraining genetic gain in commercial breeding programs.
A set of 2049 cytoplasmic male sterile lines was grown in six field trials at Emerald, QLD, timed so that flowering coincided with the hottest part of the year; average maximum and average minimum temperatures around the flowering period were 35.4 °C and 21 °C, respectively. Freshly flowered segments of heads were rated on 36 separate days, generating ~20,000 plot × day observations. Considerable variation in partial fertility was observed among CMS lines. However, pairwise correlations of sterility scores across days were high, as was the broad-sense heritability of the trait across trials (0.8). GWAS conducted on across-site BLUEs of sterility identified 43 significant marker-trait associations, indicating that partial fertility is a quantitative trait of moderate complexity. One potential explanation of partial fertility is that it is caused by sub-functional Rf-like PPR genes. Our evidence indicates that this is unlikely and that the situation is more complex. The number of QTL detected for this trait (43) is much larger than the number of known fertility restoration genes, and only one of the four major fertility restoration genes, Rf5 (Jordan et al. 2011), was located within 2 cM of a GWAS hit. Furthermore, the regions surrounding the GWAS hits were not enriched for Rf-like PPR genes, as would be expected if partial fertility were the result of sub-functional Rf genes.
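The enrichment argument above can be illustrated with a minimal Python sketch. It computes a hypergeometric upper-tail probability, which is one common way to ask whether a gene family is over-represented in a set of genomic windows. This is an assumption made for illustration: the test actually used in this study is not described in this section, and the function name and all counts below are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(total: int, special: int, drawn: int, observed: int) -> float:
    """Return P(X >= observed) for a hypergeometric draw.

    total:    annotated genes in the genome (hypothetical input)
    special:  genes in the family of interest, e.g. Rf-like PPR genes
    drawn:    genes falling inside the windows around GWAS hits
    observed: family members found inside those windows
    """
    upper = min(special, drawn)
    favourable = sum(
        comb(special, k) * comb(total - special, drawn - k)
        for k in range(observed, upper + 1)
    )
    return favourable / comb(total, drawn)


# Hypothetical counts, chosen only to show the call; they are not from the study.
p_value = hypergeom_upper_tail(total=1000, special=20, drawn=100, observed=2)
print(f"P(X >= 2) = {p_value:.3f}")
```

Under this sketch, a large tail probability would be consistent with no enrichment of the gene family around the GWAS hits, while a small one would suggest over-representation.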
Candidate gene analysis suggests a range of gene networks are important in partial fertility
Dhaka et al. (2020) conducted an expression analysis of sorghum anthers at different developmental stages and combined this with information from rice to identify a list of candidate genes for engineering male fertility in sorghum. Given that partial fertility occurs in response to environmental conditions at flowering, it seems likely that some of the genes that are differentially expressed in anthers and associated with fertility in rice will also be associated with the partial fertility phenotype. Comparing the locations of the significant SNPs from GWAS with the locations of the genes identified by Dhaka et al. (2020) revealed ~25% overlap, with 15 of the 63 genes located within a 2 cM window of significant SNPs from this study (Table 2). The functions of these genes may provide some indication of the gene networks involved in partial fertility.
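The co-location comparison described above can be sketched in a few lines of Python. This is a minimal illustration rather than the pipeline used in the study: the function name, the gene and SNP labels, and all map positions are hypothetical, and the "2 cM window" is assumed here to mean a distance of at most 2 cM on either side of a SNP.

```python
from collections import defaultdict


def genes_near_snps(genes, snps, window_cm=2.0):
    """Return names of genes within window_cm of any SNP on the same chromosome.

    genes: iterable of (name, chromosome, position_cM)
    snps:  iterable of (chromosome, position_cM)
    """
    snps_by_chrom = defaultdict(list)
    for chrom, pos in snps:
        snps_by_chrom[chrom].append(pos)

    hits = []
    for name, chrom, pos in genes:
        # A gene counts as co-located if at least one SNP is close enough.
        if any(abs(pos - snp_pos) <= window_cm for snp_pos in snps_by_chrom.get(chrom, [])):
            hits.append(name)
    return hits


# Hypothetical candidate genes and significant SNPs (positions in cM).
genes = [
    ("gene_a", "chr1", 10.0),
    ("gene_b", "chr1", 55.0),
    ("gene_c", "chr2", 10.5),
    ("gene_d", "chr3", 30.0),
]
snps = [("chr1", 11.5), ("chr2", 40.0), ("chr3", 28.0)]

co_located = genes_near_snps(genes, snps)
print(co_located, len(co_located) / len(genes))
```

Applied to the real data, the same proportion is simply the number of co-located candidates divided by the number of candidates, which for the figures reported here is 15 / 63, or about 0.24.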
One of the genes identified by Dhaka et al. (2020) was Sobic.006G159000, an orthologue of rice OsPLIM2b, which is directly implicated in cytoplasmic male sterility in rice. In rice, OsPLIM2b was shown to interact with the cytoplasmic male sterility-related protein kinase OsNek3, and the transcripts of both genes were found to be preferentially expressed in anthers containing bi- to tri-cellular pollen (Fujii et al. 2009). Although OsNek3 was not close to a significant SNP from GWAS in this study, the closely related OsNEK5 was less than 1 cM from a significant SNP.
Three candidate genes for sterility, Sobic.003G192200 (rice OsMST8), Sobic.001G099700 (rice OsINV4) and Sobic.004G100900 (rice OsPCBP), which are implicated in carbohydrate metabolism in the anthers, fall within 2 cM of significant SNPs from GWAS. In rice, OsINV4, an anther-specific cell wall acid invertase gene, and OsMST8, an anther-specific monosaccharide transporter, are downregulated by cold, resulting in pollen sterility due to interference with starch storage (Mamun et al. 2006). In addition, OsPCBP is a pollen-expressed gene in rice that encodes a calmodulin-binding protein involved in calcium signalling and localized to amyloplasts (Zhang et al. 2012). Transformation experiments indicate that disruption of this gene causes failure of pollen development, likely through disruption of starch accumulation (Zhang et al. 2012).
A further four candidate genes for sterility, based on function in rice and Arabidopsis, Sobic.001G415800 (rice ZEP1), Sobic.004G101500 (rice HEI10), Sobic.002G353500 (rice OIP30) and Sobic.009G012600 (rice RPA1c), are implicated in meiosis and DNA replication. In rice, ZEP1 is critical for controlling crossovers during meiosis (Wang et al. 2010). Its function is closely linked to that of HEI10, whose immunolocalization signals always overlap with ZEP1 signals (Wang et al. 2012). RPA1c is involved in regulating crossover formation and DNA repair in rice. It is one of the subunits of Replication protein A (RPA), a heterotrimeric protein complex that binds single-stranded DNA. In plants, multiple genes encode the three RPA subunits (RPA1, RPA2 and RPA3), and in Arabidopsis mutation of RPA1c in combination with the partially sterile RPA1a mutant has been shown to result in sterility (Aklilu et al. 2013). Finally, OIP30 is a helicase that may be a substrate for the pollen-predominant OsCPK25/26 in rice (Wang et al. 2011).
A further two candidate genes from the set identified by Dhaka et al. (2020) that fell within 2 cM of a significant GWAS hit in the current study are related to other physiological functions in rice anthers. Sobic.003G365600 (rice MID1/OsARM1) is a transcriptional regulator that promotes rice male development under drought by modulating the expression of drought-related and anther developmental genes (Wang et al. 2017). Sobic.006G185600 (rice HTH1) is highly expressed in the anther epidermis in rice, where it is involved in anther cutin biosynthesis and is required for pollen fertility. Its reduced expression results in pollen abortion due to a collapsed anther wall (Xu et al. 2017).
Partial fertility, rather than the frequency of restorer genes, imposes constraints on the genetic diversity of female parents in hybrid breeding programs
The genetic diversity of the female (A/B line) populations of hybrid breeding programs is low relative to that of the male (R line) parents (Crozier et al. 2020), as demonstrated by the breeding populations in this study. Linkage disequilibrium (LD) decays much more slowly in female parents than in male parents from germplasm set 2 and, as expected, both decay more slowly than in a sample of global sorghum diversity (Tao et al. 2020). In the diversity set, R² declines to zero at ~250 kb, while at the same distance R² in the female population is ~0.2, double that of the male population (~0.1). The extent of LD in a population results from the complex interplay of factors such as selection, admixture, linkage and genetic drift. Typically, populations with a small effective population size (Ne) experience more genetic drift than larger populations. LD between closely linked loci reflects Ne in the historical past, while LD between loosely linked loci reflects Ne in the immediate past (Hayes et al. 2003; Hill and Robertson 1966). The divergence in LD between the parental populations, at both closely and loosely linked loci, suggests that Ne has been low in female (B) lines compared with male (R) lines in both the recent and historical past.
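To make the link between LD and Ne concrete, the following Python sketch applies the widely used approximation E[r²] ≈ 1 / (1 + 4·Ne·c), usually attributed to Sved (1971), where c is the recombination distance in Morgans. This is an illustrative assumption and not necessarily the estimator used in this study; it ignores mutation, sample-size corrections and selection. The value of c is hypothetical, because the physical distance quoted above (~250 kb) would need a recombination map to convert into Morgans. The LD values are the approximate figures quoted above (~0.2 and ~0.1).

```python
def sved_ne(r2: float, c_morgans: float) -> float:
    """Estimate Ne by inverting E[r^2] = 1 / (1 + 4 * Ne * c)."""
    if not 0.0 < r2 < 1.0:
        raise ValueError("r2 must lie strictly between 0 and 1")
    if c_morgans <= 0.0:
        raise ValueError("recombination distance must be positive")
    return (1.0 / r2 - 1.0) / (4.0 * c_morgans)


# Hypothetical recombination distance; the source gives physical distance only.
c = 0.001

ne_female = sved_ne(0.2, c)  # LD of ~0.2 in the female (B line) population
ne_male = sved_ne(0.1, c)    # LD of ~0.1 in the male (R line) population

# The ratio does not depend on c, because c cancels when both
# populations are compared at the same distance.
ratio = ne_female / ne_male
print(f"Ne(female) / Ne(male) = {ratio:.2f}")
```

Because c cancels, the ratio depends only on the two LD values: (1/0.2 − 1) / (1/0.1 − 1) = 4/9. Under these simplifying assumptions, the female pool would have somewhat less than half the effective population size of the male pool at the time depth corresponding to that distance.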
It is often stated that the major reason for the low genetic diversity within female parent lines is that most sorghum landraces and germplasm lines are restorers of cytoplasmic male sterility (Menz et al. 2004). However, given that restoration of A1 cytoplasm is under the control of a small number of major genes (Jordan et al. 2010, 2011), it should be relatively easy to remove these genes via phenotypic selection in test crosses or, more recently, by selection with molecular markers. This conjecture is further strengthened by the observation that the genetic diversity of B line material relative to R lines was not significantly lower on linkage groups containing restorer genes than on linkage groups without them. It therefore seems unlikely that the frequency of restorer genes is sufficient to explain the observed differences in genetic diversity between the male and female parental pools. We propose that the difference in genetic diversity between the pools is primarily driven by the genetic architecture of partial fertility. The large number of loci influencing partial fertility identified in this study, coupled with their environment- and dosage-dependent expression, would make selection against partial fertility difficult. At the same time, the financial consequences of fertility breakdown are large, leading commercial breeders to be conservative in their crossing decisions when developing new female parent lines. This in turn has resulted in low diversity and high LD in female parent lines.