Peanut is an economically significant oilseed crop that is widely grown in tropical, subtropical, and warm temperate regions (Bertioli et al. 2016; Stalker 2017). It originated in South America and subsequently spread worldwide, adapting to a variety of agro-ecological environments through phenotypic and genotypic evolution (Pattee and Stalker 1995; Stalker 2017). China is the largest producer and exporter of peanuts globally (Stalker 1995; Stalker 2017; Yu 2011). Based on genotype and phenotype data obtained from https://arachispheno.peanutbase.org/phenotypes/, we conducted a genome-wide association analysis to identify candidate genes. This analysis will serve as a reference resource for future fine mapping and functional analysis of key genes related to PS and PM.
Germplasm resources are the essential materials for genetic and breeding research in peanut. Breeding novel peanut varieties fundamentally depends on exploring and utilizing the outstanding traits and genes present in these resources. Advances in high-throughput sequencing technology have steadily reduced the cost of obtaining molecular markers, which has accelerated the dissection of the genetic basis of quantitative traits in peanut (Brown et al. 2021). Zhang et al. (2017) genotyped 158 peanut accessions using the SLAF-seq technique, obtained 17,338 high-quality single nucleotide polymorphisms (SNPs), and then used a genome-wide association study (GWAS) to analyze the genetic basis of 11 domestication-related traits. Wang et al. (2019) genotyped 195 peanut accessions using genotyping-by-sequencing (GBS) and performed GWAS on seven yield-related traits. Zou et al. (2020) genotyped 384 peanut germplasm resources using a 58K SNP array to assess the evolutionary relationships among the materials and to identify genetic loci linked to the seed length-width ratio. Liu et al. (2022) resequenced 203 peanut accessions to investigate the genetic basis of seed traits using GWAS and transgenic experiments. Notably, the resequencing depth per accession reached 14.16×, the highest sequencing depth reported for peanut thus far (Liu et al. 2022).
In this study, the GWAS of peanut provides a valuable reference resource for future fine mapping and functional analysis of key genes related to peanut pod size.
Previous studies have demonstrated that pod-related traits are significant contributors to peanut yield and that they are quantitatively inherited, controlled by multiple genes, and susceptible to environmental influences (Gomes and de Almeida Lopes 2005; Gomez Selvaraj et al. 2009). Extensive phenotypic variation was observed in two pod-related traits, namely pod shattering and days to PM, with broad-sense heritability estimates of 0.49 and 0.81, respectively. Previous studies indicate that quantitative trait loci (QTLs) for pod-related traits in peanut have mostly been identified by linkage mapping and are predominantly located on chromosomes A02, A05, A06, A07, B04, B05, B06, and B07 (Chen et al. 2016; Luo et al. 2018; Chavarro et al. 2020; Gangurde et al. 2020; Zhou et al. 2021). However, linkage mapping is typically restricted to populations derived from two parental lines, which limits the range of QTLs and genetic variation that can be detected (Flint-Garcia et al. 2003). In contrast, GWAS exploits the large number of recombination events in natural populations to scan the whole genome for markers associated with target traits, with higher detection accuracy (Visscher et al. 2017). In this study, GWAS using the FarmCPU model identified 19 single nucleotide polymorphisms (SNPs) that were significantly associated with the two pod-related traits in peanut.
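As an illustration of how a broad-sense heritability estimate such as those quoted above can be obtained, the following is a minimal sketch that assumes a balanced design (every genotype measured in the same number of replicates or environments) and a one-way ANOVA decomposition. The estimation method actually used for the values of 0.49 and 0.81 is not stated here, so the function name `broad_sense_heritability`, the entry-mean formula, and the toy data are illustrative assumptions only.

```python
from statistics import mean


def broad_sense_heritability(table):
    """Entry-mean broad-sense heritability from a balanced table.

    table: list of rows, one row per genotype, one value per replicate.
    Assumes a balanced one-way ANOVA layout (hypothetical design).
    """
    n_geno = len(table)
    n_rep = len(table[0])
    geno_means = [mean(row) for row in table]
    # In a balanced design the mean of genotype means equals the grand mean.
    grand_mean = mean(geno_means)

    # Mean square among genotypes and residual (within-genotype) mean square.
    ms_geno = n_rep * sum((m - grand_mean) ** 2 for m in geno_means) / (n_geno - 1)
    ms_err = sum(
        (value - m) ** 2 for row, m in zip(table, geno_means) for value in row
    ) / (n_geno * (n_rep - 1))

    # Genotypic variance component, truncated at zero if the estimate is negative.
    var_g = max((ms_geno - ms_err) / n_rep, 0.0)
    denom = var_g + ms_err / n_rep
    return var_g / denom if denom > 0 else 0.0


# Toy data: three genotypes, two replicates each (not real peanut phenotypes).
toy = [[10, 12], [20, 22], [30, 32]]
h2 = broad_sense_heritability(toy)
```

With unbalanced or multi-environment data, variance components would more commonly be estimated with a mixed model rather than this simple ANOVA decomposition.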
The delimitation of candidate genomic intervals depends on the decay distance of linkage disequilibrium (LD), which is influenced by many factors, including the reproductive mode of the species, population size, and artificial selection (Li et al. 2018). Furthermore, the precision of GWAS is strongly influenced by population size and marker density (Atwell et al. 2010; Li et al. 2013). In theory, larger populations and higher marker densities yield more reliable GWAS results; in practice, however, populations cannot be expanded indefinitely. Although the association mapping population used in this study was not large, the use of high-density SNPs partially compensates for this limitation. Based on the association mapping results, the 19 SNPs associated with pod-related traits were used to identify 95 candidate genes, of which 56 and 39 were associated with pod splitting degree and maturity, respectively.
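The link between LD and candidate intervals can be sketched as follows. This is a minimal illustration, not the pipeline used in this study: `ld_r2` computes the squared correlation between two SNPs coded as 0/1/2 allele counts, and `genes_in_window` collects annotated genes that overlap a window around a significant SNP. The window size, gene identifiers, and coordinates below are hypothetical placeholders; in practice the window would be set from the LD decay distance estimated for the population.

```python
def ld_r2(geno_a, geno_b):
    """Squared Pearson correlation between two SNPs coded as allele counts (0/1/2)."""
    n = len(geno_a)
    mean_a = sum(geno_a) / n
    mean_b = sum(geno_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(geno_a, geno_b))
    var_a = sum((a - mean_a) ** 2 for a in geno_a)
    var_b = sum((b - mean_b) ** 2 for b in geno_b)
    if var_a == 0 or var_b == 0:
        return 0.0  # monomorphic marker: r2 is undefined, reported as 0 here
    return cov * cov / (var_a * var_b)


def genes_in_window(snp_chrom, snp_pos, genes, window_bp):
    """Return IDs of genes overlapping [snp_pos - window_bp, snp_pos + window_bp]."""
    lo, hi = snp_pos - window_bp, snp_pos + window_bp
    return [
        g["id"]
        for g in genes
        if g["chrom"] == snp_chrom and g["end"] >= lo and g["start"] <= hi
    ]


# Hypothetical annotation records and SNP position, for illustration only.
annotation = [
    {"id": "geneA", "chrom": "A05", "start": 900, "end": 1100},
    {"id": "geneB", "chrom": "A05", "start": 5000, "end": 6000},
    {"id": "geneC", "chrom": "B06", "start": 950, "end": 1050},
]
candidates = genes_in_window("A05", 1000, annotation, window_bp=200)
r2_same = ld_r2([0, 1, 2, 0], [0, 1, 2, 0])
```

A wider window captures more genes per SNP but dilutes the candidate list, which is why the LD decay distance is a natural choice for the window size.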
In this study, we found that, based on the most significant loci, the majority of the genes associated with PS degree encoded PPR superfamily proteins and transport-related proteins, including F0IT9C, WWV0ES, BX8F04, AEE9K3, PFB0AA, and M24FW3. Pentatricopeptide repeat (PPR) proteins are key regulators of gene expression in plant organelles. They carry tandem arrays of degenerate 35-amino-acid repeats, each of which folds into a pair of antiparallel α-helices, and these arrays mediate sequence-specific binding to RNA. PPR proteins act on chloroplast and mitochondrial transcripts at the post-transcriptional level, including in RNA splicing and RNA editing. Through these functions, PPR proteins influence plant growth and development, energy metabolism, and the adaptive regulation of organelles (An et al. 2023; Deng et al. 2023; Wang et al. 2023a; Wang et al. 2023b). As a large gene family, PPR proteins have been identified in many sequenced species (Lurin et al. 2004; Ding et al. 2014; Chen et al. 2018a; Chen et al. 2018b; Subburaj et al. 2020; Sugita 2022). In Arabidopsis, eleven AtPPR genes have been shown to respond to biotic or abiotic stresses. The expression of AtPPR96 was induced in response to salt, abscisic acid (ABA), and oxidative stress, and it could alter the transcription levels of several stress-responsive genes under abiotic stress treatments (Liu et al. 2016).
For PM, we identified a MYB transcription factor, H5DQEZ. MYB transcription factors constitute one of the largest gene families in plants and can be divided into three structural types, namely 1R, 2R, and 3R. The size and composition of the MYB family vary across plant species. For instance, Arabidopsis thaliana harbors 193 MYB genes distributed across the 1R, 2R, and 3R types; Oryza sativa possesses 172 MYB genes, predominantly of the 1R and 2R types; and Zea mays has 210 MYB genes, also primarily of the 1R and 2R types. Under environmental adversity, members of the MYB family play a pivotal role in the stress response by modulating the expression of relevant genes. For example, overexpression of AtMYB44 or AtMYB20 in Arabidopsis can reduce the expression of protein phosphatase 2Cs (PP2Cs), which are negative regulators of ABA signaling, thereby enhancing the salt tolerance of transgenic plants (Jung et al. 2008; Cui et al. 2013). AtMYBL regulates the Arabidopsis response to abiotic stress by controlling leaf senescence (Zhang et al. 2011). Overexpression of AmMYB1 from the grey mangrove in tobacco can alleviate wilting and yellowing of leaves and whole plants under salt stress, thereby enhancing the salt tolerance of transgenic tobacco (Ganesan et al. 2012). Furthermore, the MYB family is also involved in processes such as disease resistance and alkaloid synthesis. In summary, MYB transcription factors serve crucial functions in plants (Ku et al. 2020; Wei et al. 2020).