Comparative Analysis of Soybean Genetic Diversity Across Geographies
In comparison to cultivated varieties, wild soybeans exhibit a higher degree of genetic diversity (Zhao et al. 2018; Li et al. 2020). Our study also revealed that the genetic diversity of wild soybeans in China surpassed that of cultivated soybeans in both China and Europe. As the center of origin of soybean, China has abundant soybean resources, and previous research has shown that Chinese local varieties typically possess greater genetic diversity and more rapid LD decay (Dong et al. 2004; Li et al. 2020). In contrast, the breeding foundation for soybeans in Europe is weaker because of the shorter history of soybean cultivation and the smaller cultivation area (Haupt & Schmid 2020). Tavaud-Pirra et al. (2009) were the first to examine 32 European soybean populations from 1950–2000 and found that the genetic diversity of European soybean germplasm was significantly lower than that of North American and Asian soybean germplasm.
Previous analyses of smaller-scale Chinese and European soybean populations often revealed clearer population structure (Saleem et al. 2021; Yao et al. 2023). However, our study found significant gene flow between Chinese and European soybean germplasm (Fig. 2B, C), making it difficult to separate them using conventional population structure analysis. Liu et al. (2020) provided a thorough overview of the global spread of soybeans originating from China. Their study, however, was constrained by the limited variety of European soybean germplasm available, leading them to use soybean varieties from southern Sweden as a proxy for European germplasm. Despite these limitations, they asserted that the European soybean lineage can be traced back to Northeast China and North America, with the North American soybean germplasm itself having roots in Northeast China (Gizlice et al. 1994). In contrast, our research used a broader spectrum of European germplasm, comprising 797 soybean varieties preserved by the Chinese National Soybean GeneBank (CNSGB). This extensive collection allowed a more comprehensive comparison between the soybean germplasm of Northeast China and that of Europe. Our findings indicate that certain European soybean accessions show minimal genetic differences from Northeast Chinese soybean germplasm, thereby providing molecular evidence supporting the Northeast Chinese ancestry of some European soybeans. This insight enhances our understanding of the genetic continuity and diversity resulting from decades of breeding work, underscoring the close genetic ties between European and Northeast Chinese soybeans.
Selective Sweeps in CN and EU collections
During the domestication of soybean, genes associated with certain traits were selected and fixed, resulting in a significant reduction in genetic diversity in the affected regions (Goettel et al. 2022; Qin et al. 2023). Selective sweep analysis can be used to locate these selected regions and, in combination with previously identified loci, can help infer differences in selection pressures between populations or aid in the discovery of new genes (Wen et al. 2015; Kim et al. 2021). In this study, we used three statistics, FST, nucleotide diversity (π), and XP-CLR, to perform selective sweep analysis on Northeast Chinese and European soybean populations. Combining these statistics reduced the risk of false positives and facilitated the mapping and genetic analysis of the identified regions.
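As an illustration of how windowed FST and π can be combined to nominate candidate sweep regions, the following is a minimal Python sketch, not the pipeline used in this study. It assumes biallelic SNPs summarized as alternate-allele counts per population, uses Hudson's FST estimator computed as a ratio of sums within each window (the estimator used here is not specified above, so this choice is an assumption), and omits XP-CLR entirely. The function names, the window size, the 5% cutoff, and the toy data are all hypothetical.

```python
from collections import defaultdict

def site_pi(alt_count, n):
    """Unbiased per-site nucleotide diversity for a biallelic SNP
    observed in n sampled alleles."""
    p = alt_count / n
    return 2.0 * p * (1.0 - p) * n / (n - 1)

def hudson_terms(ac1, n1, ac2, n2):
    """Numerator and denominator of Hudson's FST for a single SNP."""
    p1, p2 = ac1 / n1, ac2 / n2
    num = (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
    den = p1 * (1 - p2) + p2 * (1 - p1)
    return num, den

def window_stats(sites, window=100_000):
    """sites: iterable of (pos, ac1, n1, ac2, n2) on one chromosome.
    Returns {window_start: (fst, pi_pop1, pi_pop2)} for non-overlapping windows."""
    acc = defaultdict(lambda: [0.0, 0.0, 0.0, 0.0])
    for pos, ac1, n1, ac2, n2 in sites:
        start = (pos // window) * window
        num, den = hudson_terms(ac1, n1, ac2, n2)
        a = acc[start]
        a[0] += num
        a[1] += den
        a[2] += site_pi(ac1, n1)
        a[3] += site_pi(ac2, n2)
    stats = {}
    for start, (num, den, s1, s2) in acc.items():
        if den > 0:  # skip windows with no polymorphic site
            stats[start] = (num / den, s1 / window, s2 / window)
    return stats

def top_windows(scores, fraction=0.05):
    """Window starts whose score falls in the top `fraction` of all windows."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    k = max(1, int(len(ranked) * fraction))
    return set(ranked[:k])

def candidate_sweeps(stats, fraction=0.05):
    """Windows that are outliers for both FST and the pi ratio (pop1 / pop2),
    i.e. strongly differentiated AND depleted of diversity in pop2."""
    fst = {w: s[0] for w, s in stats.items()}
    ratio = {w: (s[1] / s[2] if s[2] > 0 else float("inf"))
             for w, s in stats.items()}
    return sorted(top_windows(fst, fraction) & top_windows(ratio, fraction))

# Hypothetical toy data: (position, alt count pop1, alleles pop1,
#                         alt count pop2, alleles pop2)
TOY_SITES = [
    (10, 5, 10, 5, 10), (50, 4, 10, 6, 10),    # window 0: similar frequencies
    (120, 10, 10, 0, 10), (160, 6, 10, 0, 10), # window 100: pop2 nearly fixed
]

if __name__ == "__main__":
    stats = window_stats(TOY_SITES, window=100)
    print(candidate_sweeps(stats, fraction=0.5))
```

Requiring a window to be an outlier for both statistics is what makes the combined approach more conservative than either statistic alone: high FST without reduced diversity, or reduced diversity without differentiation, is not reported.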
Building on previous genome-wide association studies and QTL mapping, we examined the genetic basis of soybean adaptation to diverse climatic conditions. Several QTLs related to flowering and maturity were identified through selective sweep analysis, consistent with the findings of Saleem et al. (2021). Specifically, we detected a region with pronounced selection signals near the E4 locus, in addition to a strong selection region at the E1 locus. This observation is consistent with the inclusion of early-maturing European soybeans in our study and highlights the genetic influence of breeding efforts aimed at adapting to different climatic conditions. Furthermore, we identified a strong selection signal adjacent to PRR genes, which are notable for their functional significance in the PRR3/7 subclade, particularly in the response to long-day conditions (Lu et al. 2020). These findings underscore the profound impact of domestication and breeding on the genetic variation observed within soybean populations across China and Europe, which have been adapted to the distinct latitudinal conditions of their respective environments.
Seed protein content is one of the most important breeding objectives for soybeans (Singer et al. 2023). In our study, we identified six protein-related QTLs that have undergone selection in Chinese germplasm, yet none were identified in European germplasm. Despite this, preliminary findings indicate that European varieties generally have higher protein content. This discrepancy leads us to suspect that European soybeans carry additional protein-related QTLs and genes that remain undiscovered, and it highlights the need for further genetic exploration to fully understand and use the potential of European soybean varieties for improving protein content. In addition, seven QTLs related to seed amino acid content were identified in Chinese germplasm, indicating that Northeast Chinese soybeans may have undergone specific selection for the content of some amino acids.
The Northeast region of China is renowned as a major hub for high-oil soybean production. Breeding efforts in this area have focused on shifting the oil composition of soybeans towards healthier profiles, a trend also reflected in recent European varieties. Our research identified several oil-related QTLs in this region, emphasizing its importance in soybean oil production (Song et al. 2023).
Soybean oil typically comprises 21.5% oleic acid, a beneficial unsaturated fatty acid, and 12.2% palmitic acid, a saturated fatty acid which, while stabilizing the oil, can be detrimental to health if consumed excessively (Abdelghany et al. 2020; Julibert et al. 2019). Given the health implications, increasing the proportion of unsaturated fatty acids is a significant breeding target to enhance the nutritional value of soybean oil (Thelen & Ohlrogge 2002). In line with these objectives, our study identified two QTLs related to seed oleic acid content in Chinese germplasm. Furthermore, we discovered an additional QTL in newly introduced European germplasm, indicating potential genetic avenues for improving oil quality in both regions. This cross-regional genetic investigation could further enhance the quality of soybeans by informing targeted breeding strategies.
We also identified some QTLs related to nutrient use efficiency and accumulation. The regulatory mechanism of nutrient accumulation in soybean tissues is very complex and may vary with the soil and environmental conditions in different regions, leading to different directions of selection (Ray et al. 2015; Dhanapal et al. 2018).
ZDX1 soybean SNP array
The ZDX1 soybean SNP array was developed using the largest soybean core germplasm collection used for this purpose to date, encompassing 2214 soybean accessions from China and worldwide. The array can accurately identify soybean germplasm from both China and abroad. Compared with other soybean arrays such as SoySNP50K, 180K AXIOM®, and NJAU 355K SoySNP, 80% of the SNPs on the ZDX1 array are unique, providing greater coverage of the soybean genome (Sun et al. 2022b). Moreover, its functional markers allow crucial agronomic traits to be identified efficiently and precisely.
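To make the notion of "unique SNPs" concrete, the short Python sketch below shows one way such a proportion could be computed. It assumes each array is represented as a set of (chromosome, position) pairs on the same reference assembly; this is an illustrative assumption, not necessarily how Sun et al. (2022b) performed the comparison, and the function name and coordinates are hypothetical.

```python
def unique_fraction(target, others):
    """Fraction of SNPs in `target` that are absent from every array in `others`.
    Each array is a set of (chromosome, position) tuples; all arrays are assumed
    to use coordinates from the same reference assembly."""
    shared_pool = set().union(*others)
    return len(target - shared_pool) / len(target)

# Hypothetical toy arrays, for illustration only
new_array = {("Chr01", 100), ("Chr01", 200), ("Chr02", 50),
             ("Chr03", 10), ("Chr04", 5)}
older_arrays = [{("Chr01", 100)}, {("Chr09", 1)}]

if __name__ == "__main__":
    print(unique_fraction(new_array, older_arrays))
```

Matching by position only works when all arrays are anchored to the same genome assembly; otherwise coordinates would first need to be lifted over or markers matched by flanking sequence.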
Soybean cyst nematode is a widespread problem that continues to affect soybean production regions across the world, and it is rapidly expanding to other areas (Tylka & Marett 2017). In this study, using the ZDX1 array to examine functional loci, we identified 24 European varieties with resistance to soybean cyst nematode (Table S3). However, we also observed that the diversity and number of sources of soybean cyst nematode resistance in European germplasm may be inadequate, which may hinder the future expansion of soybean cultivation in Europe.