The foundation for crop improvement lies in genetic diversity54,55, which can be assessed by DNA (molecular) markers like SNPs. Analyzing the molecular genetic variation in germplasm provides valuable insights into allelic richness, population structure and diversity parameters. This information helps plant breeders utilize genetic resources more effectively, reducing the need for extensive pre-breeding tasks when developing new cultivars56. In recent years, owing to advances in next-generation sequencing (NGS), GBS has emerged as a promising genomic approach for estimating plant genetic diversity and population structure on a genome-wide scale, and has been successfully employed, inter alia, in Brassica57, Daucus50,58,59, finger millet60, maize61,62, spruce63, wheat64, and watermelon65. However, GBS has not been used so far for genotyping Peruvian races of maize, whose morphological diversity seems to be the largest worldwide7,12. Studies on Peruvian maize have primarily focused on morpho-agronomic characteristics66–76, leaving their molecular composition largely unexplored. Herein, we determined the gene diversity and composition of Peruvian maize races from the Andean highlands by means of SNP markers spanning each chromosome.
Unfortunately, despite the significant diversity within Peruvian maize germplasm, knowledge of its genetic components remains very limited. Catalán et al.77 reported a high level of variability using eight microsatellites in 83 accessions of six races of maize from Cusco. The genetic diversity of the nine races and one subrace of maize assessed in this study is very high, which is concordant with their improvement status (i.e., landraces), as reported for landraces of beans78, peas79, squash80, tarwi30, or wheat81, among others. Our genetic diversity indices align with other studies on maize landraces. For example, Warburton et al.82 examined the genetic diversity of 24 maize landraces from Mexico with 25 simple sequence repeats (SSR) and reported a total gene diversity of 0.61 across all populations. Similarly, Herrera-Saucedo et al.83 determined the genetic variability of 63 native maize accessions from northern Mexico using 31 SSR, reporting an expected heterozygosity of 0.68. A study of 30 maize landrace accessions from the southern Andean region of South America using 22 SSR showed a genetic diversity of 0.7284.
In a more comprehensive study20 employing 96 SSR that encompassed most of the described races in the American continent, 136 accessions of 47 races of Peruvian maize were included, and these together with other maize from Ecuador and Bolivia (total of 235 plants) possessed a total gene diversity of 0.71. Here we determined that the genetic diversity of the Peruvian maize (0.35) from more diverse geographic regions is higher than the value reported for 46 Mexican landraces (161 accessions) of maize using SNPs (0.31185). This suggests that Peru possesses one of the largest genetic diversities of amylaceous maize and that the central Andean region harbours abundant maize genetic variability. A recent study86 found that gene diversity of landraces from seven South American countries, assessed with 23,412 SNPs, was slightly lower (0.323 ± 0.007) than that of landraces from Central America and Mexico (0.328 ± 0.006). The higher gene diversity reported with SSR compared to SNP markers may be due to the multi-allelic nature and higher level of polymorphism of SSR compared to bi-allelic SNPs. However, SNPs are more reliable for inferring genome-wide genetic diversity, as demonstrated by previous work87,88.
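The marker-type effect mentioned above can be illustrated with a minimal sketch, assuming gene diversity is computed per locus as Nei's expected heterozygosity, 1 - Σp_i², and then averaged over loci. The function names and allele frequencies below are hypothetical and chosen only for illustration; they are not taken from our dataset or from the cited studies. The sketch shows that a bi-allelic SNP cannot exceed 0.5, whereas a multi-allelic SSR locus can, which is why SSR-based values such as 0.61 to 0.72 and SNP-based values such as 0.31 to 0.35 are not directly comparable.

```python
def gene_diversity(freqs):
    """Per-locus gene diversity (expected heterozygosity): 1 - sum(p_i^2)."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in freqs)


def max_gene_diversity(n_alleles):
    """Upper bound for a locus with n_alleles equally frequent alleles: 1 - 1/k."""
    if n_alleles < 1:
        raise ValueError("a locus needs at least one allele")
    return 1.0 - 1.0 / n_alleles


# Hypothetical loci (illustrative frequencies, not real data)
snp_locus = [0.5, 0.5]                   # bi-allelic SNP at its most diverse
ssr_locus = [0.2, 0.2, 0.2, 0.2, 0.2]    # SSR with five equally frequent alleles

snp_he = gene_diversity(snp_locus)
ssr_he = gene_diversity(ssr_locus)

print(f"SNP locus: He = {snp_he:.2f} (ceiling {max_gene_diversity(2):.2f})")
print(f"SSR locus: He = {ssr_he:.2f} (ceiling {max_gene_diversity(5):.2f})")
```

Under these assumptions, an average SNP-based value of 0.35 sits relatively close to the bi-allelic ceiling of 0.5, whereas SSR-based averages are measured against a ceiling that depends on the number of alleles per locus.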
Consistent with previous investigations18,82, bred maize is clearly separated from Peruvian maize landraces, which is explained by the intensity of selection applied to bred material. The well-defined grouping of accessions of sub-race Pachia may be explained by its cultivation in a restricted area in southern Peru (Valley of Pachia, Tacna). Although Grobman et al.7 indicated that this sub-race derived from the race Arequipeño, a lately derived race, our molecular data do not support this derivation, as Pachia was not placed within the CR2 clade. Instead, it is more likely related to race Coruca, which also grows in Tacna and is similar to the floury maize landrace Choclero from Chile7. Further research is needed, including maize samples from other southern Peruvian regions (Arequipa, Moquegua, Puno, Tacna), as landraces of maize cultivated in Tacna show tolerance to high levels of boron74,89, a trait of interest for breeding maize for locations with high levels of this element in soil and irrigation water. Landrace Cabanita, widely grown in Arequipa, also shows potential as a source of phenolic compounds with in vitro antioxidant capacity90. Races Cusco Gigante and Cusco tend to group together, as they are mainly cultivated in Cusco. Additionally, races Ancashino and Huayleño, both sympatrically distributed in northern Peru (Ancash), group together and possess a very low FST, suggesting they evolved simultaneously. Hybridization likely plays a role in the grouping of these races, as noted by Grobman et al.7.
Even though the other Peruvian maize races are morphologically distinct, our GBS dataset failed to support them as monophyletic, which agrees with other research that evaluated Peruvian germplasm20,21,86,91,92. Similarly, Mexican maize races do not form distinct clusters85. However, Caldu-Primo et al.93 were able to distinguish Mexican maize races based on a high FST SNP dataset. Population structure analysis clustered maize races from the American continent based on geographic origin, with the Peruvian germplasm contained within a clade named “Andean”20,21,86,91,92. More consistent clustering was observed when Peruvian races of maize were labelled according to their geographical zones of origin, identifying CZ1 and CZ2. This mixing among maize landraces is likely due to extensive gene flow within these zones, explained by their proximity and by frequent seed exchange, which is a common practice in the Peruvian Andes. At the same time, although Peruvian maize farmers in the Andes usually promote seed flow between families and rural communities, they also conduct selection within their populations to maintain the morphological characteristics of their landraces. Our ML tree mostly agrees with the classification of Peruvian maize races based on the chronological origin described by Grobman et al.7, as it was possible to reconstruct the ADPR (CR1) and LDSR (CR2) clades. Even though Chullpi and Paro were considered very closely related7, these ADPR races developed independently from different ancestors. Moreover, the polyphyletic status of race Chullpi reflects a mixed evolutionary origin; that is, more than one ancestor was involved in its development. Thus, the similar traits that race Chullpi exhibits are very likely due to environmental pressures, as the climatic gradients of the Andes are laboratories of constant plant evolution. Similarly, the paraphyletic pattern depicted by Cusco Gigante, Cusco, Pisccorunto, San Gerónimo and San Gerónimo Huancavelicano is a result of the evolution, in the Andes of Peru, of a set of novel or derived traits.
A more detailed morphological evaluation is needed for the races of Peruvian maize to identify morphotypes and determine their phenetic plasticity. It is very likely that two accessions of Ancashino (PM-005, PM-012) and one of Paro (PM-165) contributed to the origin of other maize races evaluated in this work, as these individuals form a sister clade to CR1 and CR2. Both races are considered ADPR, directly derived from the primitive races (PR), as described by Grobman et al.7. Therefore, these three accessions may still possess genetic signatures of PR. However, further research including samples from a wider geographical area is necessary for more conclusive results. The position of purple maize OP cultivars within CR1 is explained by their origin from race Kculli, classified as ADPR by Grobman et al.7. The close relationship between purple (OP) and yellow (hybrid) maize is due to the use of the latter at UNALM to enhance the yield performance of OP lines.
The vast diversity exhibited by maize races in Peru is crucial for research in plant genetic resources95. However, the lack of relevant information on the genetic diversity of conserved plant material hinders the use of accessions preserved in germplasm banks96,97. To address this gap for Peruvian maize, we suggest that genomic tools can facilitate the characterization and utilization of these invaluable plant genetic resources, as emphasized by Mascher et al.98. This approach is particularly important considering that amylaceous maize in Peru consists of landraces, dynamic populations in constant evolution in the Andes. Moreover, Peruvian maize germplasm requires special attention, as a ~5500 cal. BP maize cob from northern Peru99 was the only sample without Zea mays ssp. mexicana ancestry in a recent study100, shedding light on the origin of maize in the Central Andean region. Finally, we expect our study to provide useful guidance to researchers and decision makers for establishing, in the near future, a strong conservation strategy and dynamic utilization of Peruvian maize germplasm.