Effective use of bottle gourd genetic resources for cultivar development and conservation requires genomic tools for marker-assisted breeding. During the last decade, significant progress has been made in developing genomic resources for bottle gourd. Most of these resources provide valuable information about genetic relationships among genotypes for effective selection and use in breeding programs (Xu et al. 2014; Wu et al. 2017; Wang et al. 2018). Despite this progress, the genomic resources available for bottle gourd remain limited, which constrains breeding efforts to develop competitive genotypes for agricultural production and for the nutraceutical and pharmaceutical industries. The present study used a genotyping-by-sequencing (GBS) platform to identify SNP markers distributed across the 11 chromosomes of bottle gourd, used these markers to determine genetic relationships and population structure in a collection of bottle gourd accessions of African, Asian, and South American origin, and subsequently identified SSR loci from the GBS sequences. Among high-throughput sequencing technologies, GBS is considered the most cost-effective tool to identify and genotype a large number of polymorphisms at genome scale (Wu et al. 2017). Here, we used the Elshire-GBS method with Lagenaria siceraria var. USVL1VR-Ls as the reference genome, which resulted in a set of 12,766 filtered SNP markers. A recent GBS study conducted to confirm the varietal status of bottle gourd accessions produced 22,575 SNPs (Konan et al. 2020), more than in the present study. Other high-throughput studies in bottle gourd used restriction site-associated DNA sequencing (RAD-Seq), a form of GBS that generates low-coverage genome sequence data and can be applied when reference genomes are not available (Xu et al. 2014; Wu et al. 2017). In addition, Wu et al. (2017), using RAD-Seq and aligning to the Hangzhou gourd reference genome, detected 19,226 SNPs, similar to the present findings. In contrast, Xu et al. (2014) identified 3,226 SNPs using RAD-Seq genotyping, and Xu et al. (2011) discovered only 3,913 putative SNPs using partial sequencing. These differences between the current study and previous results may be due to the high read-depth variation of RAD-Seq or the high levels of missing data of Elshire-GBS (Scheben et al. 2017), and to the average coverage, which typically varies between these reduced-representation sequencing methods. For instance, whereas RAD-Seq involves sequencing fragments to moderate coverage of between 5x and 15x (Fountain et al. 2016), Elshire-GBS studies tend to reach low coverage of ~1x (Swarts et al. 2014). Despite these differences, the SNP markers and SSR loci generated here are a useful genomic resource for genetic analysis and breeding in bottle gourd for diverse applications. In subsequent studies, however, a final set of SSR loci should be developed and validated before being used in diverse bottle gourd accessions collected from different regions of the world.
In this study, the most abundant classes of SSRs identified from the GBS sequences were dinucleotide and trinucleotide repeats. Similar results have been reported previously for bottle gourd. Xu et al. (2011), for example, found that dinucleotide and trinucleotide repeats were the most abundant, while mononucleotide and pentanucleotide repeats were relatively rare. Moreover, the high frequency of dinucleotide and trinucleotide repeats is consistent with other cucurbit species, including cucumber and watermelon (Ren et al. 2009; Zhu et al. 2016b). Furthermore, similar to our results, AT-rich motifs are the predominant motifs across all repeat classes in the melon, watermelon, cucumber, and bottle gourd genomes (Ren et al. 2009; Zhu et al. 2016a; Zhu et al. 2016b).
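To make the repeat-class comparison concrete, the following minimal sketch shows one way perfect SSRs could be detected in a sequence and tallied by motif length. It is an illustration only and not the pipeline used in this study: the function names `find_ssrs` and `count_by_class` and the minimum repeat thresholds in `MIN_REPEATS` are hypothetical choices made for this example.

```python
import re
from collections import Counter

# Hypothetical minimum repeat counts per motif length (assumed for
# illustration; these are NOT the thresholds used in the study).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for each perfect SSR in seq."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-base motif followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "AA" or "ATAT"), so each SSR is counted in one class.
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return hits


def count_by_class(hits):
    """Count SSRs by motif length (1 = mono-, 2 = di-, 3 = trinucleotide, ...)."""
    return Counter(len(motif) for _, motif, _ in hits)


example = "GGC" + "AT" * 8 + "CCGT" + "AAG" * 6 + "TTC"
ssrs = find_ssrs(example)
classes = count_by_class(ssrs)
```

In this toy sequence the sketch reports one dinucleotide (AT) and one trinucleotide (AAG) repeat. Dedicated SSR-mining tools also handle compound and imperfect repeats, which this sketch ignores.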
In a breeding program, knowledge of the extent of genetic diversity and of population relationships among the germplasm is useful for identifying distantly related parents for hybridization, in order to develop genetically improved bottle gourd genotypes for rootstocks, food, feed and medicinal purposes. For this reason, several studies have been conducted in different regions to determine the genetic diversity of bottle gourd accessions (Gürcan et al. 2015; Mashilo et al. 2016b; Ibrahim, 2021). In this study, the accessions of bottle gourd were collected from Chile, Japan (Philippines, South Korea), and South Africa. Most of the Asian accessions share a similar genetic background with the South African accessions, which have been previously assayed using SSR markers (Mashilo et al. 2017a). In the current study, various genetic parameters were estimated using SNP markers, including Ho, He and PIC, with mean values of 0.18, 0.16 and 0.29, respectively. Gürcan et al. (2015) genotyped thirty-one bottle gourd accessions from the USA, India, Nigeria and Russia using SSR markers and reported mean values of 0.50, 0.13, and 0.50 for He, Ho and PIC, in that order. Also, Mashilo et al. (2016b), using SSR markers, reported high average values of He = 0.657 and PIC = 0.57 among bottle gourd accessions, higher than the values reported in the present study. Botstein et al. (1980) classified PIC values into three categories: (1) if the PIC value of a marker is more than 0.5, the marker is considered highly informative; (2) if the PIC value ranges from 0.25 to 0.5, the marker is moderately informative; and (3) if the PIC value is less than 0.25, the marker is slightly informative. Based on the Botstein classification, the SNP markers generated in the present study are moderately informative. Recent studies indicated that PIC values calculated with SNP markers were lower than those calculated with SSR markers (Singh et al. 2013; Liu et al. 2017). This can be attributed to the bi-allelic nature of SNPs, which restricts PIC values to the range 0.0 to 0.5 (the maximum being reached when the two alleles have identical frequencies), whereas for SSR markers, which are multi-allelic, PIC values can vary between 0.5 and 1.0 (Singh et al. 2013; Eltaher et al. 2018).
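As a worked illustration of these statistics, the sketch below computes expected heterozygosity and PIC from allele frequencies and applies the Botstein et al. (1980) categories described above. The function names are hypothetical, and the code is not the software used in this study. It implements the full Botstein formula, PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2. Under this formula a bi-allelic marker with equal allele frequencies gives He = 0.5 and PIC = 0.375. An upper bound of 0.5 corresponds to the simplified form 1 - sum(p_i^2), which is sometimes reported as PIC.

```python
def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2) for the allele frequencies of one locus."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphism information content (full Botstein et al. 1980 formula)."""
    correction = 0.0
    for i in range(len(freqs)):
        for j in range(i + 1, len(freqs)):
            correction += 2.0 * freqs[i] ** 2 * freqs[j] ** 2
    return expected_heterozygosity(freqs) - correction


def botstein_class(value):
    """Map a PIC value to the three categories described in the text."""
    if value > 0.5:
        return "highly informative"
    if value >= 0.25:
        return "moderately informative"
    return "slightly informative"


snp_he = expected_heterozygosity([0.5, 0.5])   # bi-allelic SNP, equal frequencies
snp_pic = pic([0.5, 0.5])
ssr_pic = pic([0.25, 0.25, 0.25, 0.25])        # four-allele SSR, equal frequencies
study_class = botstein_class(0.29)             # mean PIC reported in this study
```

The four-allele example shows why multi-allelic SSR markers can exceed the values attainable with bi-allelic SNPs: with equal frequencies its PIC is about 0.70, compared with 0.375 for the SNP.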
Expected heterozygosity is usually preferred for assessing genetic diversity because it is less sensitive to sample size than observed heterozygosity (Chesnokov and Artemyeva, 2015). According to Chesnokov and Artemyeva (2015), when Ho and He are similar (i.e., not significantly different), mating in the population is close to random. When Ho < He, the population is inbred, and when Ho > He, random mating dominates over inbreeding in the population. Our results showed that Ho was slightly higher than He, suggesting that random mating dominates over inbreeding in the assessed bottle gourd germplasm. Moreover, population differentiation analysis indicated higher variation within samples, a common characteristic of cross-pollinated plants, in which large gene flow can reduce the loss of genetic diversity. As proposed by Mashilo et al. (2016b), this could be attributed to the highly out-crossing nature of bottle gourd or to long-term selection of the crop by farmers for diverse uses.
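The comparison of Ho and He can also be summarized with the standard fixation index, F = 1 - Ho/He, which is positive when heterozygotes are deficient (Ho < He) and negative when they are in excess (Ho > He). The sketch below is an illustration only: the index was not reported in this study, the function names are hypothetical, and the `tol` threshold is an assumed stand-in for a formal significance test.

```python
def fixation_index(ho, he):
    """F = 1 - Ho/He; requires He > 0."""
    if he <= 0:
        raise ValueError("expected heterozygosity must be positive")
    return 1.0 - ho / he


def interpret(ho, he, tol=0.0):
    """Rough reading of Ho versus He, following the rules in the text.

    tol is an assumed tolerance on |F| standing in for a significance test.
    """
    f = fixation_index(ho, he)
    if abs(f) <= tol:
        return "Ho and He similar: mating close to random"
    if f > 0:
        return "Ho < He: inbreeding"
    return "Ho > He: random mating dominates over inbreeding"


f_study = fixation_index(0.18, 0.16)   # mean Ho and He reported in this study
reading = interpret(0.18, 0.16)
```

With the mean values reported here (Ho = 0.18, He = 0.16), F is about -0.125, in line with the slight heterozygote excess described above.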
Population structure and genetic relatedness are useful for understanding genetic diversity, differentiating populations according to their geographical origin, and conducting association mapping studies. Based on population structure analysis, two genetically differentiated groups were identified: the first included all the accessions originating from South Africa, and the second comprised the Asian and Chilean accessions. These results agree with previous studies conducted in bottle gourd, which reported that clustering of different landraces was independent of geographical location (Yetişir et al. 2008; Sarao et al. 2014; Gürcan et al. 2015; Mashilo et al. 2016b). Another explanation is a founder effect followed by artificial selection based on fruit shape, which tends to generate high genetic similarity (Xu et al. 2011; Yildiz et al. 2015). In crop improvement programs, germplasm collection missions should be based on morphological variation rather than geographical origin (Mashilo et al. 2016b). Heiser (1973) classified bottle gourd into two subspecies, an Asian and an American-African subspecies, and postulated that African wild bottle gourds floated to the shores of America and were independently domesticated there. Studies using various molecular markers have reported differing results on this question. For example, Erickson et al. (2005), using SNP markers within chloroplast DNA, concluded that American bottle gourds were more closely related to Asian than to African gourds, whereas Decker-Walters and Wilkins-Ellert (2004), using RAPD markers, found that American germplasm is distinct and primarily originated from Africa but possesses Asian genetic profiles. Consistent with Erickson et al. (2005), our results indicate that one group is composed only of genotypes from South Africa, while the other corresponds to an admixed group with genotypes from Asia and South America.