Plant materials and methods
A population consisting of 215 RILs derived from a cross between No. 3379 and No. 2839 was used for mapping. No. 3379 is an early-flowering spring-type B. napus, and No. 2839 is a late-flowering spring-type B. napus. No. 3379 usually blooms approximately 25-30 days sooner than does No. 2839. The RIL population was constructed as follows: No. 2839 was crossed with No. 3379 to obtain F1 plants. The F1 plants were self-pollinated to obtain F2 plants. Single-seed descendants were then propagated from the F2 generation to the F5 generation to obtain a RIL population consisting of 215 lines.
Field trials were conducted during the winter of 2017 at Yunnan and during the summer of 2018 at Qinghai, resulting in a total of 4 environments. The planting environment in winter 2017 included Yuanmou (898 m above sea level , 25°42'E, 101°52'N, Yunnan Province), and those in summer 2018 included Ledu (1978 m above sea level, 36.47°E, 102.43°N, Qinghai province), Xining (2225 m above sea level, 36°34'E, 101°49'N, Qinghai province) and Huzhu (2557 m above sea level, 36°49E, 101°57N, Qinghai province). The RIL population, F1 population and the parents (No. 3379 and No. 2839) were evaluated as part of a randomized complete block design with three replicates in each environment. The experimental plot comprised two rows, each 1.5 m in length and 0.3 m in width. There were ten plants per row, and the spacing was 866 m2. The fields were managed in accordance with local standard agricultural practices.
Phenotypic evaluations
The flowering time was calculated by subtracting the date of seedling emergence from the flowering date. The date of cotyledon flattening was recorded as the seedling emergence date of each plant. The seedling emergence in a single plot was considered the date when 75% of the plants in the plot reached seedling emergence. The date of the first flower per plant was recorded as the flowering date of each individual, and the flowering time in a single plot was considered the date when 25% of the plants in the plot had flowered.
Statistical analysis of phenotypic data
Various modules of the Statistical Analysis System (SPSS 19.0) software package were used to analyse the phenotypic data. Broad-sense heritability (h2) was calculated as σ2g / (σ2g + σ2ge / n + σ2 e/ nr) × 100%, where σ2g is the genetic variance, σ2e is the error variance, σ2ge is the interaction variance between genotypes and environments, n is the number of environments, and r is the number of replications in each environment. The estimates of σ2g, σ2ge and σ2e were obtained from a two-way ANOVA using the general linear model (GLM) procedure in SAS 8.1. In the variance analysis model, the variances of the genotypes, environments and interactions between the genotypes and environments were considered random effects. Correlations were calculated for the flowering time between the four environments by SAS 8.1 software.
Library construction and high-throughput sequencing
The genomic DNA of the 215 RILs and their parents was extracted from fresh leaves (100 mg) using the cetyl-trimethylammonium bromide (CTAB) method. The quality of the DNA samples was measured, and the fragments was randomly interrupted by sonication to a size of 200-500 bp. The sequencing libraries were constructed by terminal repair followed by the addition of a 3’A and a sequencing adapter. The resulting fragments were further purified and amplified by PCR. After quality inspection, the constructed library was subjected to paired-end sequencing by an Illumina HiSeq 2500 system (Illumina, San Diego, CA, USA) following the manufacturer’s instructions. Re-sequencing was performed by the Biomarker Technologies Company (Beijing, China).
Identification of SNPs and genotyping
The original reads were processed in accordance with the following quality control steps: 1) the sequences of the adapter were removed; 2) reads with 10% or more unidentified nucleotides were removed; and 3) reads having more than 50% bases with a phred quality lower than 10% were removed. Burrows Wheeler Aligner (BWA) was subsequently applied to align the clean reads of each material to the B. napus reference genome (http://www.genoscope.cns.fr/brassicanapus/).
Potential PCR duplications were eliminated using the MARK duplicate tool of Picard [44]. The GATK software toolkit 3.8 was then used to detect potential SNPs among the lines [45]; this process included base recalibration, variation calling, and strict filtering of SNPs to obtain a final SNP cluster. SnpEff software was used for variation (SNP and small InDel) annotation and for the prediction of variation effects [46].
To ensure the quality of the genetic map, markers were screened in accordance with the following three criteria: 1) only the aa´bb genotype was kept based on the SNPs of two parents, 2) the SNPs with a depth ≥ 1 in each parent were selected, and 3) the SNPs that were not located on chromosomes were removed.
Linkage map construction
Filling and correction of SNPs were carried out using a sliding window analysis, with 15 SNPs per window and 1 SNP per increment. The genotype of the window was identified as aa when there were 11 or more SNPs with the aa genotype in a sliding window. When the genotype of 11 or more SNPs in the sliding window was bb, the genotype of this window was classified as bb. Except for the above 2 cases, the genotype of a window was otherwise identified as ab.
Bin partitioning was subsequently performed according to the sub-reorganization. Each sample was arranged according to its physical position on the chromosome. When there was a genotype transition in any sample, a recombination breakpoint appeared. The SNP between the recombination breakpoints was included in the bin, and it was supposed that no recombination events occurred in the bin. The bins were subsequently used as mapping markers for genetic map construction [47].
According to the distribution of markers, bins with a length less than 10 kb were removed from LGs A09, C03, C04, and C07, and bins with a length less than 5 kb were removed from the other LGs. The markers showing severe segregation distortion were filtered by specific parameters. The filtration parameter for A01, C03, C04 and C07 is 0.001; the parameter for A03, A04, A05, A10, C01, C05, C06 and C09 is 0.01; the parameter for A06, A07, A08 and C08 is 0.00001; and the markers on the remaining chromosomes were not filtered.
All bin markers of the linkage map corresponded to the physical sequences of the reference genome of B. napus. The collinearity between the genetic location and physical location was determined by mapping the relationship between the genetic marker positions (cM) and the physical positions (Mb).
QTL identification and analysis
QTL mapping was performed by the composite interval mapping (CIM) method in Windows QTL Cartographer 2.5 software [48]. The LOD thresholds for determining significant loci were estimated from 1000 permutations [49]. The LOD threshold was set to 2.5 to detect significant QTLs for flowering time. The confidence interval for each QTL was defined by a LOD change from the peak position – when multiple peaks are close to each other and within a half LOD distance. The identified QTLs were named according to the different environments and chromosome numbers [50, 51]. The consensus QTLs were integrated by a meta-analysis by BioMercator 2.1 software [52]. If a QTL was identified only in one environment, then that QTL was considered an environment-specific QTL. If a QTL was identified simultaneously in the spring and winter ecological conditions or in multiple spring environments, those QTLs were thus located at a designated position on the same chromosome and were treated as environmentally stable consensus QTLs. A QTL explaining more than 10% of the phenotypic variance was considered a major QTL.
Select candidate gene
The candidate interval of the major QTLs with overlapping physical regions in specific environments were selected for subsequent analysis. Based on the positions of the flanking bin markers, all of the genes within the confidence interval were identified as candidates, and the sequences of the candidate regions were subsequently submitted to the B. napus genome resource (http://www.genoscope.cns.fr/brassicanapus/), the BRAD (http://www.brassicadb.org/brad/) and the Arabidopsis Information Resource (TAIR) (https://www.arabidopsis.org/) databases for BLAST analysis. The candidate genes of the target loci regions were predicted by combining the gene annotation of the BnaA02g12130D reference genome and the homologous gene annotation related to the flowering time of A. thaliana.