Responses of the seven rice lines to salt stress
The seven rice lines were further screened with three salt concentrations (i.e., 50, 100, and 150 mM) along with controls. All seven lines were evaluated as tolerant or sensitive based on the modified Standard Evaluation System (SES) score (Gregorio et al. 1997) by comparing the visual response of leaves of these genotypes with control checks under 150 mM NaCl (Tariq et al. 2019). Under this salt stress, SAL-9 (GSR IR1-8-S9-D1-Y1 = HHZ8-SAL9-DT1-Y1) and SAL-10 (GSR IR1-5-S10-D2-D1 = HHZ5-SAL10-DT2-DT1) were evaluated as tolerant because the leaves showed no sign of salt injury. In contrast, SAL-11 (GSR IR1-5-S10-D1-D1 = HHZ5-SAL10-DT1-DT1) and SAL-12 (GSR IR1-11-S6-Y1-Y1 = HHZ11-SAL6-Y1-Y1) were evaluated as sensitive because signs of salt injury appeared on the leaves and the seedlings died (Table 1).
FL478 has been shown to be highly tolerant of high-saline environments (Thomson et al. 2010), while IR64 is sensitive to saline conditions (Palao et al. 2013). We obtained similar results under the 150 mM NaCl stress, which further confirmed the responses of the rice lines to salt stress. The contrasting responses of the seven rice lines to salt stress prompted us to examine their DNA polymorphism at the genome level.
Genome sequencing and sequence read mapping
A total of 7,980,092 high-quality reads were obtained for the seven rice lines, with an average of 1.14 million high-quality reads per line and an average read length of 93 bp (Fig. 1). A quality check showed that 95.71% of the reads had a quality of ≥ Q20. Because both flanking sequences of the BamHI sites were sequenced and there is an estimated one BamHI site per approximately 8.5 kb of the 400-Mb rice genome, the sequence reads for a line had an average coverage of 12.1x. The average coverage for SAL-9, SAL-10, SAL-11, SAL-12, FL478, Pokkali, and IR64 was 13.0x, 11.5x, 11.5x, 13.0x, 11.5x, 10.5x, and 13.0x, respectively (Table 2). On average, approximately 83% of the reads from each line were mapped to the rice Nipponbare reference genome, while the remaining 17% were unmapped. The percentage of mapped reads ranged from 66% for Pokkali to 87% for SAL-9 (Fig. 1).
Table 2
Statistics of genotyping-by-sequencing results and SNP and InDel variant discovery
Rice line | SAL-9 | SAL-10 | SAL-11 | SAL-12 | FL478 | Pokkali | IR64 | Total | Average |
Read length (bp) | 95 | 92 | 95 | 93 | 93 | 90 | 94 | 652 | 93 |
Mapped reads (%) | 87 | 83 | 85 | 86 | 86 | 66 | 86 | 579 | 83 |
Read depth (x) | 13.0 | 11.5 | 11.5 | 13.0 | 11.5 | 10.5 | 13.0 | 84 | 12 |
Raw variants | 32,908 | 27,263 | 36,587 | 32,747 | 36,259 | 22,347 | 37,111 | 225,222 | 32,175 |
Filtered SNPs | 11,417 | 10,532 | 13,803 | 12,038 | 10,151 | 6,757 | 15,461 | 80,159 | 11,451 |
Filtered InDels | 857 | 865 | 1,115 | 997 | 778 | 533 | 1,258 | 6,403 | 915 |
Sequence in Mb | 227.6 | 256.4 | 294.4 | 246.9 | 213.7 | 210.4 | 323.7 | 1,773 | 253.2 |
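The average coverage of 12.1x quoted above can be reproduced with a short calculation. The sketch below is only one plausible reading of how the figure follows from the numbers in the text (one BamHI site per ~8.5 kb, a 400-Mb genome, both flanks of each site sequenced); the function name and the per-tag interpretation of coverage are assumptions, not a description of the pipeline actually used.

```python
# Back-of-the-envelope GBS coverage estimate from figures quoted in the text.
GENOME_SIZE_BP = 400_000_000   # approximate rice genome size (from the text)
SITE_SPACING_BP = 8_500        # estimated: one BamHI site per ~8.5 kb (from the text)
TOTAL_READS = 7_980_092        # high-quality reads across all seven lines
N_LINES = 7


def estimate_tag_coverage(reads_per_line: float,
                          genome_size: int = GENOME_SIZE_BP,
                          site_spacing: int = SITE_SPACING_BP) -> float:
    """Average number of reads per sequenced tag (hypothetical helper).

    Assumes each restriction site yields two tags, one per flank.
    """
    n_sites = genome_size / site_spacing
    n_tags = 2 * n_sites
    return reads_per_line / n_tags


reads_per_line = TOTAL_READS / N_LINES
coverage = estimate_tag_coverage(reads_per_line)
print(f"{reads_per_line / 1e6:.2f} M reads per line, ~{coverage:.1f}x per tag")
# prints: 1.14 M reads per line, ~12.1x per tag
```

Under these assumptions the estimate depends only on the number of reads per line and the expected number of BamHI sites, so the per-line values in Table 2 would vary with each line's read count.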
Distribution of SNPs and InDels in the rice genome
A total of 225,222 raw variants were identified in the seven rice lines when the rice Nipponbare genome was used as the reference. The raw variants were filtered with a minimum read depth of 10x and a mapping quality of ≥ 30. Consequently, we obtained a total of 80,159 high-quality SNPs and 6,403 high-quality InDels (Table 2). The largest number of variants was found on chromosome 1 (9,852) and the smallest on chromosome 9 (5,328). The largest numbers of SNPs and InDels were present in IR64 (15,461 and 1,258) and SAL-11 (13,803 and 1,115), while the smallest numbers were present in Pokkali (6,757 and 533) and FL478 (10,151 and 778) (Fig. 2). The average number of variants per rice line was 12,366, with one variant for every 30,183 bases (Table S1). Only 186 of the SNPs were common between FL478 and IR64, and 38 were common among FL478, IR64, and Pokkali. IR64/SAL-9 and IR64/SAL-10 had 193 and 143 common SNPs, respectively, while FL478/SAL-9 and FL478/SAL-10 had 213 and 140 common SNPs, respectively. IR64/SAL-11 and IR64/SAL-12 had 228 and 193 common SNPs, respectively, whereas FL478/SAL-11 and FL478/SAL-12 had 188 and 223 common SNPs, respectively (Fig. 3).
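The filtering step described above (minimum read depth of 10x and mapping quality of ≥ 30) can be illustrated with a minimal sketch. The `Variant` record, its field names, and the example records are hypothetical and serve only to show the thresholds in action; they are not taken from the variant-calling software or data used in this study.

```python
from dataclasses import dataclass

MIN_DEPTH = 10   # minimum read depth, as stated in the text
MIN_MAPQ = 30    # minimum mapping quality, as stated in the text


@dataclass(frozen=True)
class Variant:
    """Hypothetical minimal variant record (field names are assumptions)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    depth: int
    mapq: float


def passes_filters(v: Variant) -> bool:
    """Keep a variant only if both thresholds are met."""
    return v.depth >= MIN_DEPTH and v.mapq >= MIN_MAPQ


def is_snp(v: Variant) -> bool:
    """Single-base substitution; anything else is treated as an InDel here."""
    return len(v.ref) == 1 and len(v.alt) == 1


# Illustrative records only, not real calls from the seven lines.
raw = [
    Variant("chr1", 1_000, "A", "G", depth=14, mapq=60),   # kept, SNP
    Variant("chr1", 2_000, "C", "T", depth=6, mapq=60),    # dropped: depth < 10
    Variant("chr2", 3_000, "G", "GA", depth=22, mapq=42),  # kept, InDel
    Variant("chr2", 4_000, "T", "C", depth=30, mapq=12),   # dropped: MAPQ < 30
]

kept = [v for v in raw if passes_filters(v)]
snps = [v for v in kept if is_snp(v)]
indels = [v for v in kept if not is_snp(v)]
print(len(kept), len(snps), len(indels))
```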
The average SNP density across the seven rice lines was 3.1/100 kb. Chromosome 10 had the highest average SNP density (3.73/100 kb), while chromosome 4 had the lowest (2.39/100 kb). SAL-11 was the only line in which the lowest SNP density occurred on chromosome 12 (2.84/100 kb) rather than on chromosome 4. The highest average SNP densities were observed in IR64 (4.17/100 kb) and SAL-11 (3.71/100 kb), while the lowest were observed in Pokkali (1.82/100 kb) and FL478 (2.74/100 kb) (Table 3).
Table 3
SNP density among the seven rice lines and their chromosomes
Chrom. | SAL-9 SNPs | SAL-9 SNPs/100 kb | SAL-10 SNPs | SAL-10 SNPs/100 kb | SAL-11 SNPs | SAL-11 SNPs/100 kb | SAL-12 SNPs | SAL-12 SNPs/100 kb | FL478 SNPs | FL478 SNPs/100 kb | Pokkali SNPs | Pokkali SNPs/100 kb | IR64 SNPs | IR64 SNPs/100 kb | Average SNPs/100 kb |
1 | 1405 | 3.25 | 1261 | 2.91 | 1431 | 3.31 | 1347 | 3.11 | 1140 | 2.63 | 710 | 1.64 | 1719 | 3.97 | 2.97 |
2 | 1146 | 3.19 | 1134 | 3.16 | 1483 | 4.13 | 1173 | 3.26 | 971 | 2.70 | 709 | 1.97 | 1544 | 4.30 | 3.2 |
3 | 1208 | 3.32 | 1081 | 2.97 | 1441 | 3.96 | 1289 | 3.54 | 1066 | 2.93 | 679 | 1.86 | 1621 | 4.45 | 3.29 |
4 | 789 | 2.22 | 819 | 2.31 | 1073 | 3.02 | 910 | 2.56 | 690 | 1.94 | 473 | 1.33 | 1205 | 3.39 | 2.39 |
5 | 831 | 2.77 | 751 | 2.51 | 1121 | 3.74 | 905 | 3.02 | 756 | 2.52 | 594 | 1.98 | 1059 | 3.53 | 2.86 |
6 | 1088 | 3.48 | 991 | 3.17 | 1305 | 4.18 | 1117 | 3.57 | 968 | 3.10 | 638 | 2.04 | 1416 | 4.53 | 3.43 |
7 | 876 | 2.95 | 840 | 2.83 | 1126 | 3.79 | 994 | 3.35 | 831 | 2.80 | 572 | 1.93 | 1216 | 4.09 | 3.1 |
8 | 819 | 2.88 | 747 | 2.63 | 1085 | 3.81 | 863 | 3.03 | 826 | 2.90 | 468 | 1.65 | 1202 | 4.23 | 3.01 |
9 | 706 | 3.07 | 590 | 2.56 | 809 | 3.52 | 801 | 3.48 | 675 | 2.93 | 420 | 1.83 | 942 | 4.09 | 3.06 |
10 | 883 | 3.80 | 818 | 3.52 | 1021 | 4.40 | 895 | 3.86 | 726 | 3.13 | 512 | 2.21 | 1208 | 5.21 | 3.73 |
11 | 930 | 3.20 | 850 | 2.93 | 1125 | 3.88 | 971 | 3.35 | 847 | 2.92 | 596 | 2.05 | 1263 | 4.35 | 3.24 |
12 | 736 | 2.67 | 650 | 2.36 | 783 | 2.84 | 773 | 2.81 | 655 | 2.38 | 386 | 1.40 | 1066 | 3.87 | 2.61 |
Total/ average | 11,417 | 3.07 | 10,532 | 2.82 | 13,803 | 3.71 | 12,038 | 3.25 | 10,151 | 2.74 | 6,757 | 1.82 | 15,461 | 4.17 | 3.1 |
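The SNP densities in Table 3 are SNP counts normalized by chromosome length. As a minimal sketch, the chromosome 1 row can be recomputed from its SNP counts; the chromosome length used here (about 43.27 Mb for Nipponbare chromosome 1) is an assumption introduced for illustration and is not stated in this section.

```python
# Assumption: approximate length of Nipponbare chromosome 1 (not given in the text).
CHR1_LENGTH_BP = 43_270_000

# SNP counts for chromosome 1, copied from Table 3.
chr1_snps = {
    "SAL-9": 1405, "SAL-10": 1261, "SAL-11": 1431, "SAL-12": 1347,
    "FL478": 1140, "Pokkali": 710, "IR64": 1719,
}


def snps_per_100kb(n_snps: int, length_bp: int) -> float:
    """SNP density expressed as SNPs per 100 kb of sequence."""
    return n_snps / (length_bp / 100_000)


density = {line: round(snps_per_100kb(n, CHR1_LENGTH_BP), 2)
           for line, n in chr1_snps.items()}
mean_density = round(sum(density.values()) / len(density), 2)

for line, d in density.items():
    print(f"{line:8s} {d:.2f}")
print(f"average  {mean_density:.2f}")
```

With this assumed length, the computed values agree with the chromosome 1 row of Table 3 (for example, 3.25 for SAL-9 and an average of 2.97).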
Annotation of variants
Of the SNP and InDel variants, 35,846 were found in genic regions, while the remaining 50,140 variants were found in intergenic regions. Of the 35,846 variants in genic regions, 12,850 were present in introns; 2,228 and 2,106 were present in 5’UTRs and 3’UTRs, respectively; and 18,662 were present in coding sequences (CDSs). The SNP variants in the CDSs were divided into synonymous and non-synonymous. We found that 10,990 of the SNPs in the CDSs were non-synonymous (nsSNPs) and resulted in amino acid changes. The number of nsSNPs was higher than the number of synonymous SNPs (7,672 in total) in all seven rice lines analyzed (Table 4). There were 2,101 nsSNPs and InDels common between FL478 and IR64 and 1,372 nsSNPs polymorphic between the two. A total of 566 nsSNPs in the seven rice lines had deleterious effects (SIFT score < 0.05) on protein function. The transition (Ts) frequency (112,686) was much higher than that of transversions (Tv) (47,632). The Ts/Tv ratio of the SNPs ranged from 2.2 for SAL-12 to 2.5 for IR64, with an overall ratio of approximately 2.4 based on the Ts and Tv counts above (Fig. 4).
Table 4
Distribution of SNP and InDel variants in the coding and non-coding regions of the genomes of the seven rice lines
Variant class | SAL-9 | SAL-10 | SAL-11 | SAL-12 | FL478 | Pokkali | IR64 | Total |
5’UTR | 340 | 307 | 372 | 328 | 297 | 197 | 387 | 2,228 |
3’UTR | 291 | 295 | 365 | 346 | 283 | 175 | 351 | 2,106 |
Intron | 1,760 | 1,735 | 2,263 | 1,989 | 1,531 | 1,175 | 2,397 | 12,850 |
Non-synonymous | 1,604 | 1,440 | 1,869 | 1,649 | 1,411 | 992 | 2,025 | 10,990 |
Synonymous | 1,136 | 1,036 | 1,294 | 1,140 | 980 | 695 | 1,391 | 7,672 |
Intergenic | 6,987 | 6,444 | 8,551 | 7,388 | 6,267 | 4,547 | 9,956 | 50,140 |
Total | 12,118 | 11,257 | 14,714 | 12,840 | 10,769 | 7,781 | 16,507 | 85,986 |
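The transition/transversion classification reported in this section follows the standard definition: a substitution within the purines (A, G) or within the pyrimidines (C, T) is a transition, and any purine-pyrimidine exchange is a transversion. The sketch below illustrates that definition and recomputes the overall ratio from the Ts and Tv counts given in the text; the function names are illustrative and do not correspond to the annotation software used here.

```python
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")
BASES = PURINES | PYRIMIDINES


def classify_snp(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a single-base substitution."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES or ref == alt:
        raise ValueError(f"not a valid SNP: {ref!r} -> {alt!r}")
    # Same chemical class (both purines or both pyrimidines) means transition.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


def ts_tv_ratio(n_transitions: int, n_transversions: int) -> float:
    """Ratio of transition to transversion counts."""
    return n_transitions / n_transversions


print(classify_snp("A", "G"))   # transition
print(classify_snp("T", "C"))   # transition
print(classify_snp("G", "T"))   # transversion

# Counts quoted in the text for all seven lines combined.
overall = ts_tv_ratio(112_686, 47_632)
print(f"overall Ts/Tv = {overall:.2f}")   # overall Ts/Tv = 2.37
```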
Effects of nsSNPs on protein functions of abiotic stress-related loci
We identified 116 abiotic stress-related gene loci (between FL478 and IR64) containing a total of 138 nsSNPs in their CDSs, with 1 to 3 nsSNPs per locus, having effects on protein functions (Table S2). Among the 138 nsSNPs, 15 were highly deleterious (SIFT score = 0.0) for protein functions.
Genotyping matrices were generated for the nsSNPs. Three nsSNPs were identified in the coding region of LOC_Os02g12820 (helix-loop-helix DNA binding domain-containing protein) on chromosome 2; however, only one of them had a deleterious effect on protein function. At position 6749649, this locus carried ‘A’ in all three salt stress-sensitive lines (IR64, SAL-11, and SAL-12) and ‘G’ in all four salt stress-tolerant lines (FL478, Pokkali, SAL-9, and SAL-10); the same ‘G’ nucleotide was present in the reference genome (Fig. 5-a). The T/C nsSNP at position 2927171 in the “flowering locus T” gene (LOC_Os06g06300), which distinguished the tolerant lines (FL478, Pokkali, SAL-9, and SAL-10) from the sensitive lines (IR64, SAL-11, and SAL-12), was predicted to be deleterious, i.e., the resulting amino acid change was predicted to affect protein function (Fig. 5-b). Further, a T/C nsSNP at position 4835996 in LOC_Os10g08940 was present in all three salt stress-sensitive lines (Fig. 5-d). Moreover, we detected a G/A nsSNP in FL478 and Pokkali (salt stress-tolerant lines) at position 31016149 in the potassium channel protein gene (LOC_Os03g54100), whereas the other five rice lines had no nsSNP at this position (Fig. 5-c). These two lines (FL478 and Pokkali) had similar phenotypic expression under salt stress (Table 1). In addition, we observed a G/A nsSNP in the “white-brown complex homolog protein 11” gene (LOC_Os12g22284) only in the salt stress-sensitive line IR64; no SNP was identified at this position in the other six lines. This gene may play a role in transport and the response to abiotic stress.
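The comparison described above amounts to asking whether the alleles at a given nsSNP separate the tolerant lines from the sensitive lines. A minimal sketch of such a check on one row of a genotyping matrix is given below. The alleles at position 6749649 are those reported in the text; the second site is a hypothetical example added only to show a non-segregating pattern, and the function name is illustrative.

```python
TOLERANT = ("FL478", "Pokkali", "SAL-9", "SAL-10")
SENSITIVE = ("IR64", "SAL-11", "SAL-12")


def segregates_by_phenotype(genotypes: dict) -> bool:
    """True if each phenotype group is fixed for one allele and the two differ."""
    tolerant_alleles = {genotypes[line] for line in TOLERANT}
    sensitive_alleles = {genotypes[line] for line in SENSITIVE}
    return (len(tolerant_alleles) == 1
            and len(sensitive_alleles) == 1
            and tolerant_alleles != sensitive_alleles)


# LOC_Os02g12820, position 6749649: alleles as reported in the text.
site_6749649 = {
    "FL478": "G", "Pokkali": "G", "SAL-9": "G", "SAL-10": "G",
    "IR64": "A", "SAL-11": "A", "SAL-12": "A",
}

# Hypothetical site (alleles invented for illustration): only one line differs.
toy_site = {
    "FL478": "G", "Pokkali": "G", "SAL-9": "G", "SAL-10": "G",
    "IR64": "A", "SAL-11": "G", "SAL-12": "G",
}

print(segregates_by_phenotype(site_6749649))  # True
print(segregates_by_phenotype(toy_site))      # False
```

A site private to a single line, like the toy example, fails the check because the sensitive group is not fixed for one allele.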
Genetic diversity among the seven rice lines
A phylogenetic tree was constructed using the nsSNPs present in the coding regions. Figure 6 shows the genetic relationships among the seven rice lines studied. Two distinct groups were identified that corresponded to the tolerant and sensitive lines, respectively. The tolerant lines SAL-9 and SAL-10 were closely related to FL478 and Pokkali, while the sensitive lines SAL-11 and SAL-12 were closely related to the sensitive line IR64 (Fig. 6).