Phenotypic diversity analysis
In order to evaluate the phenotypic variation of salt tolerance related traits in the association mapping population, we analyzed 3 traits related to salt tolerance in two years. To determine the salt tolerance, the 217 upland cotton cultivars (Additional file 1: Table S1) were treated with 350 mM NaCl and clean water in 2016 and 2017, respectively. Three traits related to salt tolerance were determined, including relative plant height (RPH), relative shoot fresh matter weight (RSFW) and relative shoot dry matter weight (RSDW). The average trait values in each year were used for phenotypic analysis and association mapping in single environment. The mean values, ranges, standard deviations, and coefficients of variation for these salt tolerance related traits in each year are shown in Table 1. Great differences among the variation coefficient (CV) in both years were found for the three traits. Overall, the upland cotton cultivars in this association mapping panel clearly exhibited considerable natural variation in the three traits to salt tolerance and displayed very high genetic diversity.
Table 1
Statistics of various traits related to salt tolerance
Year | Trait | Range | Mean | SD | CV% |
2016 | RPH | 0.189–0.760 | 0.440 | 0.145 | 0.330 |
| RSFW | 0.136–0.762 | 0.411 | 0.150 | 0.365 |
| RSDW | 0.137–0.756 | 0.497 | 0.143 | 0.288 |
2017 | RPH | 0.169–0.780 | 0.445 | 0.143 | 0.321 |
| RSFW | 0.129–0.794 | 0.420 | 0.167 | 0.398 |
| RSDW | 0.172–0.776 | 0.521 | 0.158 | 0.303 |
RPH: relative plant height; RSFW: relative shoot fresh matter weight; RSDW: relative shoot dry matter weight; SD: Standard Deviation; CV: Coefficient of Variation. |
Genetic Diversity Analysis Based On Snp
GBS produced 53,321 high-quality polymorphic SNPs among the 217 upland cotton cultivars, with 41,733 SNPs located in the intergenic intervals and 6,188 SNPs in the coding regions. Of these, 95.8% of the loci (51,060 SNPs) were mapped onto 26 chromosomes of the cotton genome, and were selected for the GWAS analysis (Additional file 2: Table S2). These SNPs were not evenly distributed (Additional file 3: Fig.S1) with an average of 1964 SNPs per chromosome ranging from 863 to 5209. Chromosome A08 had the most SNPs (5209) with the highest SNP density (199 Kb/SNP). D04 had the least SNPs (863), but A02 had the lowest SNP density with 680 Kb/SNP. The genetic diversity of this population varied from 0.224 (A08) to 0.389 (A13). The polymorphism information content (PIC) varied from 0.192 (A08) to 0.305 (A13) (Table 2). The above results indicated a relatively large span in genetic diversity index, and polymorphism information content in the cotton genome.
Table 2
Summary of polymorphic SNPs mapped in 26 chromosomes of Gossypium hirsutum
Chr. | Chr length (Mb) | No. of SNPs | SNP density (Kb/SNP) | Gene diversity | PIC | LD Decay (Kb) r2 = 0.1 r2 = 0.2 |
A01 | 99.9 | 2111 | 473 | 0.294 | 0.235 | 1145 | 990 |
A02 | 83.5 | 1228 | 680 | 0.344 | 0.275 | 1215 | 903 |
A03 | 100.3 | 1490 | 673 | 0.321 | 0.26 | 1385 | 1035 |
A04 | 62.9 | 1138 | 553 | 0.314 | 0.255 | 1104 | 974 |
A05 | 92.1 | 2202 | 418 | 0.346 | 0.277 | 1354 | 1060 |
A06 | 103.2 | 3924 | 263 | 0.232 | 0.199 | 1377 | 1157 |
A07 | 78.3 | 1897 | 413 | 0.319 | 0.259 | 1258 | 959 |
A08 | 103.7 | 5209 | 199 | 0.224 | 0.192 | 1136 | 891 |
A09 | 75 | 1639 | 458 | 0.312 | 0.252 | 1299 | 1028 |
A10 | 100.9 | 2335 | 432 | 0.308 | 0.254 | 1191 | 957 |
A11 | 93.3 | 2027 | 460 | 0.257 | 0.215 | 1194 | 970 |
A12 | 87.5 | 1498 | 584 | 0.306 | 0.247 | 1183 | 915 |
A13 | 80 | 2996 | 267 | 0.389 | 0.305 | 1358 | 1058 |
D01 | 61.5 | 2385 | 258 | 0.318 | 0.261 | 880 | 435 |
D02 | 67.3 | 1914 | 352 | 0.321 | 0.26 | 712 | 599 |
D03 | 46.7 | 934 | 500 | 0.27 | 0.225 | 1046 | 803 |
D04 | 51.5 | 863 | 597 | 0.331 | 0.266 | 953 | 844 |
D05 | 61.9 | 1206 | 513 | 0.308 | 0.25 | 1204 | 816 |
D06 | 64.3 | 2454 | 262 | 0.281 | 0.235 | 1275 | 917 |
D07 | 55.3 | 2293 | 241 | 0.324 | 0.264 | 1124 | 817 |
D08 | 61.9 | 2463 | 251 | 0.306 | 0.248 | 909 | 523 |
D09 | 51 | 1643 | 310 | 0.286 | 0.239 | 1002 | 732 |
D10 | 63.4 | 1517 | 418 | 0.294 | 0.241 | 733 | 663 |
D11 | 66.1 | 1080 | 612 | 0.317 | 0.257 | 846 | 801 |
D12 | 59.1 | 1407 | 420 | 0.276 | 0.228 | 913 | 754 |
D13 | 60.5 | 1207 | 501 | 0.287 | 0.236 | 1325 | 988 |
Total | 1931.1 | 51060 | 427 | 0.303 | 0.248 | 1120 | 869 |
Chr.: Chromosome; PIC: Polymorphism information content. |
Population Structure And Linkage Disequilibrium Analysis
It is important for genome-wide association mapping to control the effect of population structure, because population stratification could eliminate spurious associations between genotypes and phenotypes [26, 27]. STRUCTURE software was used to calculate the Bayesian clustering from K = 1 to 10 for five repetitions. LnP (D) value continued to increase from K = 1 to K = 10 without a significant inflection point (Fig. 1a). However, there was an obvious spike at the value of ΔK = 3 (Fig. 1b), suggesting that the population could be divided into 3 subgroups (Fig. 1c). Taking the corresponding Q matrix at k = 3 as the covariate could reasonably eliminate spurious association effects and improve the association mapping accuracy.
The LD distribution among chromosomes of the 217 upland cotton cultivars was shown in Table 2. The LD decay distance was 869 Kb and 1120 Kb when the r2 dropped to 0.2 and 0.1, respectively. The LD decay distance was not evenly distributed among chromosomes, ranged from 435 Kb (D01) to 1157 Kb (A06) when r2 was set at 0.2 and ranged from 712 Kb (D02) to 1377 Kb (A06) when r2 was set at 0.1, the overall LD decay in the At subgenome was significantly higher than that in the Dt subgenome.
Association Analysis
In order to explore the genetic factors underlying salt tolerance, the mixed linear models (MLMs) was performed by simultaneously accounting for population structure and relative kinship matrix to conduct a GWAS. A total of 25 significant associations (-log10p > 4) with 27 significant SNPs located on chromosomes A05, A07, A08, A09, A10, A11, A12, A13, D02, D03, D06, and D09 were detected for the three salt tolerance related traits in the 2016 and 2017 dataset. Eleven associations with 12 SNPs were detected in 2016 (Fig. 2) and nine associations with 9 SNPs were detected in 2017 (Fig. 3). In addition, five associations with 6 SNPs were detected in both 2016 and 2017 dataset. The phenotypic variance explained (PVE) by individual QTL ranged from 1.29–7.00% (Table 3).
Table 3
Summary of SNPs significantly associated with salt tolerance related traits
Traits | Chr. | Site | Allele | MAF | -Log10(P) | PVE | Environment |
RPH | A10 | 84786908 | C/T | 0.15(T) | 4.27 | 5.67 | 2016 |
A10 | 84851927 | G/A | 0.20(A) | 4.10 | 4.31 | 2016 |
A13 | 1869056 | C/G | 0.33(C) | 4.22, 4.36 | 4.41,4.93 | 2016, 2017 |
A07 | 90682411 | C/T | 0.12(T) | 4.35 | 4.58 | 2017 |
A09 | 62356818 | A/G | 0.19(A) | 4.13 | 7.00 | 2017 |
A12 | 74098493 | G/A | 0.49(A) | 4.58 | 5.51 | 2017 |
D02 | 38449756 | T/A | 0.47(T) | 4.01 | 1.50 | 2017 |
D03 | 18961034 | T/G | 0.11(G) | 4.18 | 2.54 | 2017 |
D08 | 49014753 | C/T | 0.13(T) | 4.34, 4.16 | 4.35,3.83 | 2016, 2017 |
D08 | 49080865 | G/A | 0.13(A) | 4.34, 4.16 | 4.35,3.83 | 2016, 2017 |
RSFW | A08 | 5515194 | C/T | 0.10(T) | 4.79 | 4.23 | 2016 |
A10 | 84786908 | C/T | 0.15(T) | 4.19 | 2.95 | 2016 |
D06 | 19966743 | C/G | 0.5(G) | 4.09 | 3.56 | 2016 |
A07 | 90532061 | C/G | 0.13(G) | 4.73 | 5.23 | 2017 |
A07 | 90682411 | C/T | 0.12(T) | 4.13, 4.73 | 2.40, 5.25 | 2016, 2017 |
A08 | 41804418 | T/A | 0.09(T) | 4.23 | 2.74 | 2017 |
A08 | 53916411 | G/A | 0.088(G) | 4.15 | 2.35 | 2017 |
| A11 | 117938337 | G/A | 0.32(A) | 4.09 | 3.08 | 2017 |
RSDW | A05 | 94406259 | C/T | 0.33(T) | 4.17 | 2.92 | 2016 |
A08 | 5589045 | G/A | 0.19(A) | 4.67 | 4.85 | 2016 |
A08 | 6254890 | T/G | 0.22(G) | 4.36,4.66 | 3.52,4.33 | 2016,2017 |
A13 | 42992196 | A/T | 0.48(T) | 4.27,4.28 | 2.79,3.03 | 2016,2017 |
D08 | 47703450 | C/T | 0.096(T) | 4.35 | 1.93 | 2016 |
D08 | 48527413 | C/A | 0.12(A) | 4.23 | 1.29 | 2016 |
D08 | 48911757 | C/G | 0.13(G) | 4.44 | 2.62 | 2016 |
| D08 | 49014753 | C/T | 0.13(T) | 5.20 | 3.88 | 2016 |
| D08 | 49080865 | G/A | 0.13(A) | 5.20 | 3.88 | 2016 |
Chr.: Chromosome; RPH: relative plant height; RSFW: relative shoot fresh matter weight; RSDW: relative shoot dry matter weight; MAF: Minor Allele Frequency; PVE: phenotypic variance explained. |
For RPH, nine significant associations with 10 SNPs were identified on chromosomes A07, A09, A10, A12, A13, D02, D03, and D08. The PVE ranged from 1.50–7.00%. Moreover, the association with one SNPs on chromosome A13 and the association with two SNPs on chromosome D08 were detected in both two years.
For RSDW, eight significant associations with 9 SNPs on A05, A08, A13, and D08 were detected. The PVE ranged from 1.29–4.85%. The association with one SNPs on chromosome A08 and the association with one SNP on chromosome A13 were detected in both two years.
For RSFW, eight associations with 8 significant SNPs on A07, A08, A10, A11 and D06 were detected in 2016 and 2017. The PVE ranged from 2.35–5.25%. The association with three SNPs on chromosome A07 was detected in both two years.
Pleiotropy was also found in our association analysis. For example, the significant SNP A07_90682411 was simultaneously detected associated with RPH in 2017 and RSFW in both two environments. The significant SNP A10_84786908 was simultaneously detected associated with RPH in 2016 and RSFW in 2016. The significant SNP D08_49014753 and D08_49080865 simultaneously detected associated with RPH in 2016, 2017, and RSDW in 2016. In addition, the associations on A08 for RSFW and RSDW in 2016 were only 70 kb apart. Their confidence interval overlapped for each other.
Identification And Preliminary Function Verification Of Candidate Genes
In general, the LD decay could be used as confidence interval to identify candidate genes. Because the cotton genome has a large LD decay [28, 29], we extracted potential candidate genes within 100 Kb of flanking significant markers on the basis of the published upland cotton genome sequencing database [30]. A total of 156 genes were identified in these intervals (Additional file 4: Table S3). Gene ontology (GO) enrichment analysis indicated that “dioxygenase activity”, “oxidoreductase activity, acting on single donors with incorporation of molecular oxygen, incorporation of two atoms of oxygen” and “oxidoreductase activity, acting on single donors with incorporation of molecular oxygen” were significantly enriched using an FDR adjusted P-value of ≤ 0.05 as the cutoff. Of the 156 genes, 12 were differentially expressed between salt tolerance variety Miscott7913-83 and salt sensitive variety Su 12 according to the previous transcriptome sequencing results [31] (Additional file 4: Table S3). Some of these genes may be associated with salt stress, such as GH_A08G0488 and GH_A10G1620 encoding protein kinase. Protein kinase has been proved to play an important role in salt tolerance in cotton [32]. Another gene GH_A13G0171, which encodes aquaporin, was also likely to regulate the salt stress response [33]. The confidence interval contains GH_A13G0171 was simultaneously detected in both years. Through a t test, we found that the salt tolerance of upland cotton cultivars with the G haplotype was significantly higher than that of cultivars with the C haplotype in both years (Fig. 4).
The three promising genes (GH_A08G0488, GH_A10G1620 and GH_A13G0171) were selected to conduct preliminary function verification of salt tolerance in cotton. Analysis of gene expression patterns could provide important clues for gene function determination. Quantitative RT-PCR was performed to analyze the expression levels of GH_A08G0488, GH_A10G1620 and GH_A13G0171 in roots and leaves under salt stress treatment in salt tolerance variety Miscott7913-83 and salt sensitive variety Su 12. As shown in Fig. 5, the three genes were induced by salt stress and displayed distinct expression patterns in response to salt stress in salt tolerance variety Miscott7913-83. The three genes had a much higher expression level in roots than in leaves. The gene GH_A13G0171 exhibited a significantly down-regulated expression in both root and leaf tissues after salt stress. The gene GH_A08G0488 exhibited a significantly up-regulated expression in both root and leaf tissues. The expression level of GH_A10G1620 showed an increase in leaf and no significant changes in the root tissues. As shown in Fig. 6, the three genes were also displayed distinct expression patterns in response to salt stress in sensitive variety Su 12. The gene GH_A13G0171 exhibited an identical expression pattern in Miscott7913-83 and Su 12. The expression levels of GH_A10G1620 and GH_A08G0488 were not significant different.
To confirm the functional roles of GH_A08G0488, GH_A10G1620 and GH_A13G0171 genes under salt stress, virus-induced gene silencing (VIGS) assay was used to repress expression of these genes in salt tolerance variety Miscott2111 plants. The inoculated seedlings were grown in three light incubators at 23 °C under a 16-h light and 8-h dark cycle as three biological replicates. At the developmental period when two leaves had formed, the pTRV2 :: GH_A08G0488, pTRV2 :: GH_A10G1620, pTRV2:: GH_A13G0171 and pTRV2::00 inoculated plants were treated with 350 mM NaCl. After 15 days, the plant height, fresh and dry shoot matter weight were determined and the corresponding relative values were calculated. The transcripts of the three genes in the VIGS leaves were significantly reduced compared to the untreated Miscott2111 plants, indicating that they were effectively silenced in VIGS plants (Additional file 5: Fig.S2). Compared with the control pTRV2::00, no significance effect on their phenotypes was observed in pTRV2 :: GH_A08G0488 and pTRV2::GH_A10G1620 inoculated plants, and the plant height, fresh and dry shoot matter weigh of GH_A13G0171-silenced plants were significantly higher than that of pTRV2::00 inoculated plants (Table 4; Fig. 7), indicating that GH_A13G0171 could reduce seedling tolerance to salt stress. The corresponding markers SNP_A13_1869056 could be applied for improvement of cotton salt tolerance.
Table 4
Statistics of traits related to salt tolerance for the VIGS lines and its control at the seedling stage
| RPH | RSFW | RSDW |
pTRV2::00 | 0.512 ± 0.033 | 0.497 ± 0.027 | 0.534 ± 0.032 |
pTRV2::GH_A13G0171 | 0.591 ± 0.022* | 0.614 ± 0.032* | 0.657 ± 0.029* |
pTRV2::GH_A08G0488 | 0.481 ± 0.029 | 0.474 ± 0.033 | 0.506 ± 0.036 |
pTRV2::GH_A10G1620 | 0.473 ± 0.026 | 0.494 ± 0.024 | 0.515 ± 0.030 |
The data are mean ± SE of three biological replicates. *indicate statistical significance at the 0.05 probability level. |
pTRV2::00: Negative control; pTRV2::GH_A13G0171: GH_A13G0171 knockout seedlings; pTRV2::GH_A08G0488: GH_A08G0488 knockout seedlings; pTRV2::GH_A10G1620: GH_A10G1620 knockout seedlings. |