KASP marker development
Twenty-three broccoli lines were subjected to next-generation sequencing, with an average sequencing depth of 28.0×. A total of 346.188 G of raw data was initially obtained, and after quality filtering and adapter trimming, 345.577 G of clean data remained for further processing (Additional file 1: Table S1). In total, 2303.9 million paired reads were generated (87.7 to 124.9 million per line), of which 2272.9 million (98.7%) were successfully aligned to the broccoli reference genome [23] (Additional file 1: Table S2). Between 899,926 and 1,908,908 SNPs were detected in each line (Additional file 1: Table S3).
A total of 28,220 SNPs were retained after removing SNPs with a high proportion of missing data (> 20%), a low minor allele frequency (MAF < 0.05), or additional variants within 50 bp upstream or downstream of the locus. The polymorphism information content (PIC) of these SNPs was calculated, and 13,621 SNPs with PIC values between 0.2 and 0.5 were retained. Of these, 8,768 (64.4%) could be successfully designed as KASP markers. According to their physical positions and the genomic structural annotation, 2,515 SNPs were located in exonic regions (including non-synonymous, stop-gain or stop-loss variants), in intergenic regions, or upstream or downstream of functional genes. These SNPs may affect important agronomic traits and could therefore be used for molecular breeding. Ultimately, 500 SNPs uniformly distributed across the nine chromosomes were selected to develop KASP markers.
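The filtering criteria above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: genotypes are encoded as two-letter diploid calls with `None` for missing data, the SNPs are biallelic, and PIC is taken as the simplified form 1 − Σp², which has an upper bound of 0.5 for a biallelic SNP and so fits the retained range, although the text does not state which formula was used. All function names are hypothetical.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Allele frequencies from diploid calls such as 'AA' or 'AG' (None = missing)."""
    counts = Counter()
    for call in genotypes:
        if call is not None:
            counts.update(call)  # adds both alleles of the call
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()} if total else {}


def missing_rate(genotypes):
    """Fraction of samples without a genotype call."""
    return sum(call is None for call in genotypes) / len(genotypes)


def minor_allele_frequency(freqs):
    """MAF of a biallelic SNP; a monomorphic site has MAF 0."""
    return min(freqs.values()) if len(freqs) >= 2 else 0.0


def pic_simplified(freqs):
    """ASSUMPTION: PIC approximated as 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs.values())


def has_neighbor_within(pos, other_positions, window=50):
    """True if another variant lies within `window` bp of `pos`."""
    return any(p != pos and abs(p - pos) <= window for p in other_positions)


def passes_filters(genotypes, max_missing=0.20, min_maf=0.05, pic_range=(0.2, 0.5)):
    """Apply the missing-data, MAF and PIC thresholds quoted in the text."""
    if missing_rate(genotypes) > max_missing:
        return False
    freqs = allele_frequencies(genotypes)
    if minor_allele_frequency(freqs) < min_maf:
        return False
    return pic_range[0] <= pic_simplified(freqs) <= pic_range[1]
```

In a real pipeline these checks would normally be run on a VCF file with dedicated tools; the sketch only makes the thresholds concrete.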
Genotyping of 392 broccoli accessions
To evaluate the quality of the KASP markers, the 23 broccoli lines used for next-generation sequencing (NGS) were genotyped with the 500 selected KASP markers. Of these, 347 (69.4%) could be genotyped successfully, that is, the genotype was clear for most of the accessions (Fig. 1A). However, 54 of them gave genotypes that were inconsistent between the NGS and KASP assays, including some that showed no polymorphism (Fig. 1B). The remaining 293 KASP markers were then used to genotype the 392 broccoli accessions. To ensure reliable results, markers with more than 10% missing values or ambiguous SNP calls were removed (Fig. 1C), and 100 high-quality, evenly distributed markers were used for further analysis (Fig. 2; Additional file 2).
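The comparison between NGS and KASP genotypes can be illustrated with a small sketch. It assumes the same two-letter genotype encoding (with `None` for a missing call) and treats allele order as irrelevant; the text does not give the exact rule used to declare a marker inconsistent, so the functions below only compute a per-marker concordance and a polymorphism check, and their names are hypothetical.

```python
def normalize(call):
    """Sort alleles so that 'GA' and 'AG' compare equal; None stays None."""
    return None if call is None else "".join(sorted(call))


def concordance(ngs_calls, kasp_calls):
    """Fraction of lines with both calls present whose genotypes agree.

    Returns None when no line has a call from both assays.
    """
    pairs = [
        (normalize(n), normalize(k))
        for n, k in zip(ngs_calls, kasp_calls)
        if n is not None and k is not None
    ]
    if not pairs:
        return None
    return sum(n == k for n, k in pairs) / len(pairs)


def is_polymorphic(calls):
    """True if at least two distinct genotypes are observed for the marker."""
    return len({normalize(c) for c in calls if c is not None}) > 1
```

A marker could then be flagged when its concordance falls below a chosen cut-off or when `is_polymorphic` returns `False`; the cut-off itself would be a user decision.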
Across the 392 accessions, the heterozygous marker ratio ranged from 5.5% to 87.2%, and 387 accessions (98.7%) had a heterozygous marker ratio above 30%, indicating that most of these accessions were heterozygous (Fig. 3A). The MAF values of the KASP markers ranged from 0.11 to 0.48, with an average of 0.32, and most markers (88%) had MAF values above 0.2. The PIC values ranged from 0.20 to 0.50, and a relatively high proportion (47%) fell between 0.45 and 0.50 (Fig. 3B; Additional file 2).
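As a minimal sketch, the heterozygous marker ratio of an accession can be computed as the share of called markers with two different alleles. Whether missing calls were excluded from the denominator is not stated in the text, so that choice is an assumption here, and the function names are hypothetical.

```python
def heterozygous_ratio(calls):
    """Share of called markers that are heterozygous (None = missing call).

    ASSUMPTION: missing calls are excluded from the denominator.
    """
    called = [c for c in calls if c is not None]
    if not called:
        return None
    return sum(c[0] != c[1] for c in called) / len(called)


def share_above(ratios, threshold=0.30):
    """Fraction of accessions whose heterozygous ratio exceeds `threshold`."""
    valid = [r for r in ratios if r is not None]
    return sum(r > threshold for r in valid) / len(valid)
```

Applied to the figures reported above, 387 of 392 accessions above the 30% threshold corresponds to 387 / 392 ≈ 98.7%.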
Accession assessment, genetic diversity and population evaluation
Based on the genotyping results of the 100 KASP markers, a phylogenetic tree was constructed by the neighbor-joining method in MEGA software version 10.0.4 with 1,000 bootstrap replications. The 392 broccoli accessions clustered into three groups (Fig. 4). Cluster I contained 103 accessions, most of which were improved accessions. Cluster II contained eight accessions introduced from Japan, including bck2, bck3, bck5 and bck6. Cluster III was the major group, containing 281 accessions, most of which were initial accessions. Most of the improved or introduced accessions showed strong growth potential and high-round flower heads with fine buds of uniform size, whereas the initial accessions were weaker in some of these characteristics.
Principal component analysis (PCA) indicated that cluster II (introduced accessions) fell within cluster III (initial accessions), and that cluster I (improved accessions) overlapped with the other groups (Fig. 5). The first and second axes explained 8.1% and 6.5% of the overall variance, respectively.
The population structure of the 392 broccoli accessions was analyzed with Structure software version 2.3.4. The number of populations K was set from 1 to 10, and five independent runs were performed for each K value. Evanno's ΔK was maximal at K = 2 (Fig. 6A), so the 392 accessions were divided into two groups, POP1 and POP2 (Fig. 6B). POP1 was an improved-type subgroup of 155 accessions, and POP2 was an initial-type subgroup of 237 accessions. The FST value between POP1 and POP2 was 0.0008, which could strongly explain the genetic distance between these two populations. Compared with the phylogenetic tree, 98 of the 103 accessions in cluster I belonged to POP1, whereas all accessions in cluster II and 223 of the 281 accessions in cluster III belonged to POP2 (Additional file 4).
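For readers unfamiliar with the ΔK criterion, the following sketch shows how it can be computed from replicate ln P(D) values. It uses the replicate means for the second-order difference, which is one common reading of Evanno's formula ΔK = |L(K+1) − 2L(K) + L(K−1)| / sd[L(K)]; the function name and the numbers in the example are made up for illustration and are not the values from this study.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Map each interior K to its Delta K.

    lnp_by_k: dict mapping K -> list of replicate ln P(D) values.
    K values without both neighbours, or with zero replicate variance,
    are skipped because Delta K is undefined for them.
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    delta = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue
        sd = stdev(lnp_by_k[k])  # sample standard deviation across replicates
        if sd == 0:
            continue
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        delta[k] = second_diff / sd
    return delta


# Made-up replicate values, for illustration only.
example = {
    1: [-1000.0, -1002.0, -998.0],
    2: [-900.0, -901.0, -899.0],
    3: [-890.0, -892.0, -888.0],
    4: [-885.0, -886.0, -884.0],
}
delta_k = evanno_delta_k(example)
best_k = max(delta_k, key=delta_k.get)
```

Because ΔK needs both neighbours, a scan from K = 1 to 10 yields ΔK only for K = 2 to 9, so K = 1 can never be selected by this criterion.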
Overall, these 100 KASP markers were effective at resolving the population structure of the accessions. However, the different accession types overlapped, probably because breeding activities have led to genetic similarities among them.
Selection of core KASP markers for fingerprinting of broccoli accessions
To establish a rapid and cost-effective method of variety identification, we selected 25 KASP markers from the genotyping database to serve as the fingerprint of each accession. These markers were highly effective at distinguishing among the 392 examined accessions and were evenly distributed across the nine chromosomes (Fig. 2). For each accession, the genotype-based KASP barcode was used to generate a corresponding 2D barcode with an online tool (available at www.cli.im ). Scanning this 2D barcode returns the information used to create it. Figure 7 shows the barcode of a representative broccoli variety used in the present study.
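One way to picture the genotype-based barcode is as a string that concatenates the calls of the core markers in a fixed order. The sketch below builds such a string and reports groups of accessions that a given marker set cannot tell apart. The encoding, the placeholder for missing calls, the function names and the accession names are all assumptions made for illustration; the text does not describe the actual encoding or how the 25 markers were chosen.

```python
def fingerprint(calls, missing="--"):
    """Concatenate genotype calls (in a fixed marker order) into one string."""
    return "".join(missing if c is None else "".join(sorted(c)) for c in calls)


def indistinguishable_groups(genotypes_by_accession):
    """Return lists of accession names that share an identical fingerprint."""
    groups = {}
    for name, calls in genotypes_by_accession.items():
        groups.setdefault(fingerprint(calls), []).append(name)
    return [names for names in groups.values() if len(names) > 1]


# Hypothetical accessions genotyped at two markers.
panel = {
    "acc_a": ["AA", "GA"],
    "acc_b": ["AA", "GG"],
    "acc_c": ["AA", "AG"],
}
clashes = indistinguishable_groups(panel)
```

An empty result from `indistinguishable_groups` would mean that every accession has a unique fingerprint under the chosen markers, and each fingerprint string could then be passed to any 2D barcode generator.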