Performance of fruit firmness
As observed in the field, M2-10 ripened at ~28 DAP, with a fruit firmness of 10.8 ± 1.24, and ZT091 ripened at ~25 DAP, with a fruit firmness of 3.2 ± 0.32. There was a significant difference between the parental lines (Fig. 1). F1 fruit ripened ~30 DAP, with a fruit firmness of 8.7 ± 1.15, and the average fruit firmness values of the F2 population and F3 families were 5.3 ± 1.12 and 5.93 ± 0.89, respectively. The fruit firmness of each population was continuously distributed, reflecting the fact that it is a quantitative trait controlled by multiple genes (Fig. 2).
SLAF-base sequence
In total, 18.53 million reads were obtained from the high-throughput sequencing of the constructed SLAF library. After filtering, 18.39 Gb of clean bases were acquired. The average Q30 values of the parental lines were 93.03% and 92.88%. The GC contents of the sequenced parental lines were 36.94% and 38.03, respectively. The numbers of total raw reads were 42,916,190 and 45,759,537 for M2-10 and ZT091, respectively, and the numbers of clean reads for the male and female parents were 42,680,203 and 45,492,988, respectively (Table 2). In total, 31,658,000 clean reads were obtained for the high-firmness bulk and 29,713,764 for the low-firmness bulk. The average Q30 was 92.45%, and the GC contents of the high-firmness and low-firmness bulks were 37.32% and 37.85%, respectively. After the tags were aligned with the reference genome, more than 93% of the sequences were mapped to the reference genome. The average sequencing depths were 26.5 X and 18.5 X for parental lines and bulks, respectively (Table 2).
Association analyses
SNP-based association analysis
In total, 12,365 high-quality SNP loci were identified. Based on the Euclidean distance method and the use of the median + 3 SD to select the correlation threshold of all loci, 5.66 was finally selected as the threshold, and five candidate regions were shown to be associated with fruit firmness. These regions covered a total of 88.73 Mb in the associated region and included 5,804 genes, within which 321 nonsynonymous SNP loci were detected. However, since the theoretical threshold was not reached (7.06), this region needs further verification.
SNP index analysis is an association analysis method based on the difference in genotypic frequency between bulks. In the present study, the DISTANCE method was used to analyse the ΔSNP index to select an association threshold. When the threshold was 0.99, eighty-nine associated regions containing 8,892 genes, covering 127.62 Mb of the genome and including 469 nonsynonymous SNP loci, were identified. Based on these SNPs, the results obtained for the associated regions via the two association analysis methods overlapped on chr. 2 chr. 5 and chr. 10 (Table 3 and Fig. 3).
InDel-based association analysis
Before the association analysis, 2,741 high-quality InDel loci were identified. Based on the Euclidean distance method, 5.66 was selected as the threshold, and six associated regions, covering 88.73 Mb and including 5,804 genes, were identified. In this region, 107 genes contained mutant InDel loci. Based on the InDel loci, the Δindel-index method was applied with a selected a threshold of 0.99; within fifteen associated regions containing 8,892 genes and covering 143.08 Mb, 117 genes containing mutant InDel loci were identified.
According to the results of the two index calculation methods, two associated regions (ff2.1 and ff5.1) were identified. One of these regions was located on chr. 2 from 18996333 to 23870331, covering 4.87 Mb, and the other was located on chr. 5 from 49380 to 28751254, covering 28.7 Mb.
Narrowing the mapped locations of QTLs ff2.1 and ff5.1
To narrow the associated regions, SSR markers were used to identify polymorphic loci between the parental lines. Eight and nine SSR polymorphic markers were obtained from 144 F2:3 families (each family contained 15 plants) to narrow the associated regions on chr. 2 and chr. 5, respectively. QTL ff2.1 was located on chr. 2 between CmSSR07709 and SNP22228 and explained 28.14% of the phenotypic variation; the LOD value of ff2.1 was 3.8. The candidate region spanned genomic positions 22122424 to 22228642, covering 106 Kb, and it contained 10 candidate genes (Fig. 4).
QTL ff5.1 was located on chr. 5 between CmSSR13509 and CmSSR13911 and explained 38.44% of the phenotypic variation; the LOD value of ff5.1 was 17.44, which was relatively high and indicated that this QTL was a major QTL for fruit firmness. The candidate region spanned genomic positions 10503296 to 18014796, covering 7.5 Mb. Positive additive effects of ff2.1 and ff5.1 were contributed by the alleles of M2-10 (1.82 and 0.85, respectively; Table 5)
Gene annotation in the candidate region of chr. 2
A candidate region spanning 138 Kb with 10 genes was screened out, but only six genes were annotated in the region. These genes encoded an auxin response factor (ARF) (MELO3C029720), movement protein (MELO3C029506), cell division cycle 5-like protein (MELO3C017522), L-allo-threonine aldolase (MELO3C017520), adenyl sulphate kinase (MELO3C017519) and chloride channel protein (MELO3C029513) (Table S1). To identify the candidate gene expression levels, sarcocarp samples from 15, 20 and 25 DAP in parental lines were subjected to qRT-PCR analysis. None of the candidate genes except for MELOC3C029513 showed a significant difference in expression between the parental lines in the different fruit development stages. Although MELO3C017513 and MELO3C017523 showed no difference in expression between M2-10 and ZT091 at 25 DAP (Fig. 5 a and b), the other genes all showed significant differences in expression at this time point. According to the results for most of the genes, 20 DAP was a key point in fruit development. For example, in the hard-fruit line M2-10, the MELOC3C017518, MELOC3C017523, and MELOC3C029507 genes (of unknown function) all showed decreased gene expression at 20 DAP but highly increased gene expression at 25 DAP (Fig. 5 b-d); the same expression trend was observed for MELOC3C029519, MELOC3C029520, MELOC3C029522, and MELOC3C029506 (Fig. 5 e-h). In the soft-fruit line ZT091, these genes showed interesting patterns of expression. For MELOC3C017519, MELOC3C017522, MELOC3C017523, MELOC3C029506, and MELOC3C017507, gene expression was higher at 20 DAP than at 15 DAP but significantly decreased at 25 DAP. Opposite trends were observed in M2-10 and ZT091 for MELO3C017518 and MELO3C017519, as shown in Fig. 5c and e. There were no significant differences detected between 20 and 25 DAP in the soft-fruit parent ZT091 (Fig. 5c and e). For MELOC3C029508, significant gene expression levels were detected at 15, 20 and 25 DAP, and a decreasing trend was found in M2-10(Fig. 5i). For MELOC3C029720, gene expression was increased in M2-10 at 15, 20 and 25 DAP, but both increasing and decreasing trends of expression were detected in ZT091 (Fig. 5j). For MELOC3C029720 (ARF gene), there were no differences at 15, 20 or 25 DAP in the hard-fruit line M2-10, but a significant difference was detected at 25 DAP in ZT091. In all of the different fruit development stages, significant differences were detected between M2-10 and ZT091 (Fig. 5f). For MELOC3C0217519 (adenylyl-sulphate kinase gene), there were significant differences in gene expression levels among the three fruit development stages in each parental lines and between M2-10 and ZT091 at 15, 20 and 25 DAP (Fig. 5e). For MELOC3C0217522 (cell division cycle 5-like protein related gene), there were no significance differences between 15 and 25 DAP in either M2-10 or ZT091, but a significant decrease in M2-10 and a significance increase in ZT091 were found (Fig. 5g). For each fruit development stage, gene expression was higher in ZT091 than in M2-10, and a significant difference was detected. For the Movement protein gene MELO3C029506, significant gene expression levels were detected at 25 DAP in M2-10 and ZT091, and the same was true for the L-allo-threonine aldolase gene (MELO3C017520) (Fig. 5 g and j).