Genetic distance and clustering analysis for the population
In this study, both SSR and SNP markers were used to estimate the genetic distance (GD) between parents. A total of 198 polymorphic SSR markers were distributed across the 26 chromosomes. These markers detected 557 polymorphic alleles in the 286 parents, ranging from one to ten alleles per marker with an average of 2.81. For the SNP markers, SNPs with a missing rate greater than 30% and those with a minor allele frequency (MAF) less than 5% were eliminated, leaving a total of 76,654 SNPs. These SNPs were distributed across the 26 chromosomes and varied in density among chromosomes and chromosomal locations (Fig. S1).
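The SNP filtering thresholds above (missing rate greater than 30%, MAF less than 5%) can be illustrated with a minimal sketch. It assumes genotypes are coded as alternate-allele counts (0, 1, 2) with NaN for missing calls, and that a SNP is dropped if it fails either threshold. The function name `filter_snps` and the coding scheme are hypothetical and are not taken from the study's pipeline.

```python
import numpy as np

def filter_snps(genotypes, max_missing=0.30, min_maf=0.05):
    """Return a boolean mask marking the SNPs (columns) that pass both filters.

    genotypes: 2-D array-like, rows = lines, columns = SNPs.
    Entries are alternate-allele counts (0, 1, 2); NaN marks a missing call.
    This coding is an assumption made for illustration.
    """
    g = np.asarray(genotypes, dtype=float)
    called = ~np.isnan(g)

    # Fraction of lines with no genotype call at each SNP.
    missing_rate = 1.0 - called.mean(axis=0)

    # Alternate-allele frequency among called diploid genotypes.
    n_called = called.sum(axis=0)
    alt_count = np.nansum(g, axis=0)
    with np.errstate(invalid="ignore", divide="ignore"):
        alt_freq = alt_count / (2.0 * n_called)
        # The minor allele is whichever allele is rarer.
        maf = np.minimum(alt_freq, 1.0 - alt_freq)
        # A SNP with no calls has NaN MAF and fails the comparison.
        keep = (missing_rate <= max_missing) & (maf >= min_maf)
    return keep
```

Applying the returned mask to the genotype matrix (`g[:, keep]`) would leave only the SNPs retained for the distance calculation.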
The GD calculated from SSR markers between the four male parents (Zhong7886, A971, 4133, and SGK9708) and the 282 female parents varied from 0.139 to 0.387, with an average of 0.279 (Table 1, Table S1). The F1 populations derived from the four male parents were named population A (Zhong7886), C (A971), D (4133), and E (SGK9708) according to their male parents. The mean GD assessed by SSR markers in the F1 populations ranked E > C > D > A. The GD based on SNP markers varied from 0.137 to 0.375, with an average of 0.242 (Table 1, Table S1). The mean GD assessed by SNP markers in the F1 populations ranked A > E > D > C. The GD values assessed by SSR and SNP markers were significantly and positively correlated (0.264 ≤ r ≤ 0.375, P < 0.01).
Furthermore, the 1,128 F1 hybrids clustered into five groups based on the GD assessed with SNP markers. These were named groups I, II, III, IV, and V and contained 144, 176, 304, 224, and 280 F1 hybrids, respectively (Fig. 1). Based on SSR markers, the F1 hybrids clustered into three groups, named groups 1, 2, and 3, which contained 536, 468, and 124 F1 hybrids, respectively. The SSR clustering did not perfectly match the SNP clustering. Group 1 of the SSR clustering included the majority of the crosses assigned to groups I and III by SNP, group 2 consisted of crosses assigned to groups III, IV, and V by SNP, and group 3 included the majority of the crosses assigned to group II by SNP. Because the SNP markers far outnumbered the SSR markers, the SNP clustering results were used for further analysis.
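The overlap between the SSR and SNP groupings could be summarized with a simple cross-tabulation of the two cluster labels per hybrid. The sketch below is only an illustration: the function name `cross_tabulate` and the label values in the usage note are hypothetical, and no counts from the study are reproduced.

```python
from collections import Counter

def cross_tabulate(ssr_groups, snp_groups):
    """Count hybrids for each (SSR group, SNP group) pair.

    Both inputs are sequences with one cluster label per hybrid,
    given in the same hybrid order.
    """
    if len(ssr_groups) != len(snp_groups):
        raise ValueError("both label sequences must cover the same hybrids")
    return Counter(zip(ssr_groups, snp_groups))
```

For example, `cross_tabulate([1, 1, 2], ["I", "III", "IV"])[(1, "I")]` counts the hybrids placed in SSR group 1 and SNP group I.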
Performance of F1 hybrids among different population groups
According to their cultivation period and origin, the 286 parents were divided into three groups, named Elite cultivars, Historical cultivars, and Exotic cultivars. Elite cultivars were cultivated in China after 2000, Historical cultivars were cultivated in China before 2000, and Exotic cultivars were collected from countries other than China. This study therefore included three sets of cotton hybrids, termed Elite×Elite, Exotic×Elite, and Historic×Elite. The Elite×Elite hybrids showed significantly lower GD than the other two hybrid sets (Fig. S3). Furthermore, we evaluated the F1 performance of the Elite×Elite, Exotic×Elite, and Historic×Elite hybrids and compared it with parental performance. F1 hybrid performance was significantly higher than that of the parents for the nine traits other than fiber strength (Fig. 2). Lint percentage (LP) decreased significantly from the Elite×Elite to the Historic×Elite and Exotic×Elite hybrids. For fiber length (FL) and spinning consistency index (SCI), the mean value of the Elite×Elite hybrids was significantly higher than that of the Historic×Elite hybrids. For fiber strength (FS), the mean value of the Historic×Elite hybrids was significantly lower than those of the Elite×Elite and Exotic×Elite hybrids. For boll number (BN), micronaire (MIC), fiber uniformity (FU), and fiber elongation rate (FE), the mean value of the Elite×Elite hybrids was significantly higher than those of both the Exotic×Elite and Historic×Elite hybrids. However, no significant differences in plant height (PH) or boll weight (BW) were observed among the three hybrid sets.
The SNP clustering above divided the 1,128 hybrids into five groups according to GD, so we compared the F1 performance of each group with that of the parents (Fig. 3). First, seven traits (all except FL, FE, and FS) showed significantly higher values in all five F1 groups than in the parents. Second, groups II, IV, and V showed significantly higher LP than groups I and III, while group II differed significantly in MIC from groups I and III. Furthermore, the mean values of FL and FE in groups II, III, IV, and V, but not in group I, were significantly higher than those of the parents. For FS, there was no difference between any of the F1 groups and the parents. Finally, there were no significant differences among the F1 groups for SCI, BW, BN, FU, or PH. These results demonstrate that the groups differed in performance for the traits concerned.
Heterosis performance of F1 hybrids
We compared the mid-parent heterosis (MPH) and best-parent heterosis (BPH) of ten traits in the 1,128 F1 hybrids. The MPH values ranged from -18.2% to 75.9%, whereas the BPH values ranged from -31.4% to 47.7%. The trait means of MPH ranged from 0.09% to 14.18%, with an average of 4.36%, and the trait means of BPH ranged from -4.85% to 3.30%, with an average of -0.86%. The mean BPH values were lower than the mean MPH values for all traits, and approximately 80.9% and 41.6% of the crosses had positive MPH and BPH, respectively (Fig. 4). Among the F1 populations, population A (male parent Zhong7886) had higher MPH and BPH values than the other three populations. Compared with the yield-related traits (PH, BW, LP, and BN), the fiber quality traits showed much lower MPH and BPH. MPH (-1.81% to 2.76%) and BPH (-2.38% to 1.70%) were almost negligible for FU, suggesting that this trait is mainly controlled by additive effects.
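For readers unfamiliar with the two heterosis measures, the sketch below uses the conventional definitions, which are assumed here rather than quoted from the study's methods: MPH is the percentage deviation of the F1 from the mean of its two parents, and BPH is the percentage deviation from the better parent. Treating the higher parental value as "better" is a further assumption; the `higher_is_better` flag is a hypothetical parameter for traits where the lower value is preferred.

```python
def mid_parent_heterosis(f1, p1, p2):
    """MPH (%) = (F1 - MP) / MP * 100, where MP is the mean of the two parents."""
    mid_parent = (p1 + p2) / 2.0
    return (f1 - mid_parent) / mid_parent * 100.0

def best_parent_heterosis(f1, p1, p2, higher_is_better=True):
    """BPH (%) = (F1 - BP) / BP * 100, where BP is the better parent.

    higher_is_better is an illustrative switch (assumption): when False,
    the parent with the lower trait value is taken as the better parent.
    """
    best_parent = max(p1, p2) if higher_is_better else min(p1, p2)
    return (f1 - best_parent) / best_parent * 100.0
```

Under these definitions, and for positive trait values where higher is better, the better parent is never below the mid-parent value, so BPH cannot exceed MPH for the same cross. This agrees with the lower mean BPH values reported above.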
Correlation between parent performance, F1 performance and heterosis
Correlations between parental performance and hybrid performance were analyzed to investigate the effect of the parents on the performance of the hybrids. The correlation between parental and F1 performance was significantly positive (r ranging from 0.459 to 0.843) for eight of the ten traits, the exceptions being BW and BN. This result suggests that these eight traits are mainly under additive genetic control and that parental performance can be used to predict hybrid performance for them, but not for BW and BN (Fig. 5).
Parental performance showed a significant negative correlation with the MPH of PH, BN, MIC, and FU (r ranging from -0.127 to -0.670) in all four populations. For BW, FE, and SCI, the correlation between parental performance and MPH was significantly negative only in populations A and E, whereas for FL it was significantly negative in population D. For LP, the correlation was significantly negative in populations A and D but significantly positive in population E. No significant correlation between parental performance and MPH was observed for FS (Fig. 5).
For the correlation between parental performance and BPH, only MIC (0.184) showed significantly positive correlations in all four populations, whereas FS and SCI showed significantly negative correlations in all four populations, ranging from -0.260 to -0.589. For LP, the correlation between parental performance and BPH was significantly positive in populations C, D, and E. For FU, it was significantly negative in populations A, C, and D. For FE, it was significantly positive only in population A, and for FL it was significantly negative only in population D. PH, BW, and BN showed both positive and negative correlations across the four populations (Fig. 5).
Correlation between genetic distance and F1 performance
To understand the effect of genetic distance of the parents on the level of heterosis in hybrids, the correlations between genetic distance and the F1 performance, MPH, and BPH were calculated.
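A per-population correlation of this kind could be computed as in the minimal sketch below. It assumes a Pearson correlation coefficient, since the type of coefficient is not stated in this section. The function name `correlation_by_population` and the input layout (one GD value, one trait value, and one population label per hybrid) are hypothetical.

```python
import numpy as np

def correlation_by_population(gd, trait, population):
    """Pearson r between GD and a trait value, computed within each population.

    gd, trait: one numeric value per hybrid.
    population: one population label per hybrid (e.g. "A", "C", "D", "E").
    The trait value may be F1 performance, MPH, or BPH.
    """
    gd = np.asarray(gd, dtype=float)
    trait = np.asarray(trait, dtype=float)
    population = np.asarray(population)

    result = {}
    for pop in np.unique(population):
        in_pop = population == pop
        # Off-diagonal entry of the 2x2 correlation matrix is Pearson r.
        r = np.corrcoef(gd[in_pop], trait[in_pop])[0, 1]
        result[str(pop)] = float(r)
    return result
```

The sketch returns only the coefficient; a significance test for each r would be needed before drawing the kind of conclusions reported in the tables.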
The GD based on SSR markers (GDSSR) was negatively correlated with the F1 performance of BW, LP, BN, FL, MIC, and FU in at least one F1 population, but was not significantly correlated with PH, FE, or SCI (Table 2). However, GDSSR was positively correlated with FS in population D. The GD based on SNP markers (GDSNP) was negatively correlated with the F1 performance of LP, BN, FL, MIC, and FE in at least one F1 population, but was not significantly correlated with other traits such as PH, BW, and FS (Table 2). The only positive correlation for GDSNP was with SCI in population C.
Overall, most traits were negatively correlated with GDSSR and GDSNP, and only two traits were positively correlated with GD, each in a single population (FS with GDSSR and SCI with GDSNP). Furthermore, GDSNP was more effective than GDSSR.
Relationship between genetic distance and MPH
GDSSR was negatively correlated with the MPH of FL, FS, MIC, FU, and SCI in population E, but positively correlated with the MPH of PH in population E and of BW in population D (Table 3). GDSNP was positively correlated with the MPH of PH, BN, FS, and FU in only one population and with the MPH of BW and SCI in two populations (Table 3). For the MPH of LP, the correlation was positive in population D but negative in population E.
In summary, the correlations of GDSSR and GDSNP with MPH were inconsistent across the four populations, and the correlations in population E were stronger than those in the other populations.
Relationship between genetic distance and BPH
GDSSR was negatively correlated with the BPH of LP, FL, FS, MIC, FU, and SCI but positively correlated with the BPH of PH (Table 4). GDSNP was negatively correlated with the BPH of LP, BN, FL, MIC, and FE and positively correlated with the BPH of PH and BW (Table 4).
In summary, the correlations of GDSSR and GDSNP with the BPH of the ten traits were broadly consistent with each other. They were also consistent with the correlation trends observed for F1 performance, although the correlations were weak.