Characterization of tannin phenotypes in sorghum grains
We assessed the presence of tannins in a diverse panel of 421 sorghum accessions from Asia and North America (Supplementary Table S1). Of these accessions, 236 had tannins in the grain, as indicated by black-red staining after treatment with bleach solution. Among the tannin-containing accessions, tannin levels ranged from 0.11 to 2.46 mg/100 mg, with a mean of 0.86 mg/100 mg (Supplementary Table S1).
To unveil the genetic basis of tannin presence/absence in this large population, we first developed KASP assays based on six previously identified recessive alleles (tan1-a, tan1-b, tan1-c, tan2-a, tan2-b, and tan2-c) (Fig. 1A; Supplementary Table S3). The KASP assays were tested on two tannin accessions (P898012 and ShanQuiRed) and five nontannin accessions (Tx430, Tx623, CK60, IS-6882C and SC51) that together contain all six recessive alleles (Supplementary Table S2). The KASP assays clearly separated tannin and nontannin accessions (Fig. 1B).
We then screened the 421 accessions with the six KASP assays (Supplementary Table S4). Five of the six recessive alleles, all except tan2-c, were segregating in the panel. As expected, all 236 tannin accessions were wild-type for all five segregating alleles. We found that 150 nontannin accessions contained a recessive allele in either Tan1 or Tan2, while 9 accessions carried recessive alleles at both genes. In this panel, tan1-b (34.6%) and tan1-c (29.7%) were the most frequent alleles. A total of 26 nontannin accessions were wild-type at all six sites in Tan1 and Tan2, which suggested that additional recessive alleles in these two genes may be responsible for the absence of tannins (Supplementary Fig. 1).
Novel recessive alleles in Tannin1 and Tannin2
To identify these potential new causal variants, we sequenced the coding regions of Tan1 and Tan2 across the 26 nontannin accessions. In total, we identified three novel recessive alleles: tan1-d, tan1-e, and tan2-d. The causal mutation at tan1-d, identified only in GanShuaZao, is a 12-bp (TCGTCTACGAGA) deletion at position 659 nt (220 aa), which results in a four amino acid deletion between the second and third WD40 repeat domains of Tan1. The tan1-e allele, identified in two accessions, is a 10-bp (CGACATACGT) deletion at position 771 nt (257 aa), which causes a frameshift at the end of the third WD40 repeat domain that introduces a premature stop codon in the fourth WD40 repeat domain and truncates the C-terminal region by 58 amino acids (Fig. 2A). The predicted protein structures of both tan1-d and tan1-e indicate a disruption of the WD40 protein structure compared with wild-type Tan1 (Fig. 2A). The tan2-d allele, present in the remaining 23 accessions, contains a C-to-T transition at position 7923 nt (456 aa) of the Tan2 gene. This transition introduces a premature stop codon just before the bHLH domain, which results in the loss of 222 amino acid residues, including the whole bHLH domain (Fig. 2B).
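The contrasting consequences of the two Tan1 deletions follow from codon arithmetic, which can be sketched in Python. This is a minimal illustration, not part of the analysis pipeline: the helper names `codon_index` and `deletion_effect` are hypothetical, and the sketch assumes that the reported nucleotide positions are 1-based coordinates in the coding sequence, which the text does not state explicitly.

```python
import math

def codon_index(nt_position):
    """Return the 1-based codon number that contains a 1-based CDS nucleotide."""
    return math.ceil(nt_position / 3)

def deletion_effect(length_bp):
    """Classify a coding deletion by whether it preserves the reading frame.

    Returns (effect, net_residues_lost); the residue count is only defined
    for in-frame deletions.
    """
    if length_bp % 3 == 0:
        return "in-frame", length_bp // 3
    return "frameshift", None

# Deleted sequences and positions as reported for the two novel Tan1 alleles.
deletions = {
    "tan1-d": ("TCGTCTACGAGA", 659),
    "tan1-e": ("CGACATACGT", 771),
}

for allele, (seq, pos) in deletions.items():
    effect, residues_lost = deletion_effect(len(seq))
    print(allele, f"{len(seq)} bp", f"codon {codon_index(pos)}", effect, residues_lost)
```

Under these assumptions, the 12-bp deletion is a multiple of three and removes a net four residues without shifting the frame, whereas the 10-bp deletion is not a multiple of three and shifts the downstream reading frame.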
To facilitate the screening of these novel alleles in marker-assisted selection and breeding, we developed KASP assays for tan1-d, tan1-e, and tan2-d. As in our previous KASP analyses (Fig. 1B), wild-type and mutant alleles were clearly separated in all three assays (Fig. 1A&B).
Haplotype analysis of a natural germplasm panel
Because Tan1 and Tan2 show duplicate recessive epistasis for tannin presence, we genotyped the whole panel for the three new recessive alleles and then surveyed the geographic distribution of tannin presence and the Tan1;Tan2 haplotypes (Fig. 3). All 236 tannin accessions, most of which were collected from China, had dominant alleles at both Tan1 and Tan2. Among the 185 nontannin accessions, 13 different allelic combinations of Tan1 and Tan2 were identified. A total of 151 lines, distributed across North America and Asia, contained a recessive tan1 allele and a dominant Tan2 allele (32 tan1-a;Tan2, 63 tan1-b;Tan2, 53 tan1-c;Tan2, 1 tan1-d;Tan2, and 2 tan1-e;Tan2). Another 25 lines, mainly from Asia, contained a recessive tan2 allele and a dominant Tan1 allele (1 tan2-a;Tan1, 1 tan2-b;Tan1, and 23 tan2-d;Tan1). The remaining 9 accessions were mutant at both Tan1 and Tan2 (5 tan1-a;tan2-a, 1 tan1-a;tan2-d, 1 tan1-b;tan2-d, and 2 tan1-c;tan2-a; Table 1 and Fig. 3).
Table 1
Haplotypes of Tan1 and Tan2 in 421 sorghum accessions
Alleles | Tan2 | tan2-a | tan2-b | tan2-c | tan2-d |
Tan1 | 236 | 1 | 1 | 0 | 23 |
tan1-a | 32 | 5 | 0 | 0 | 1 |
tan1-b | 63 | 0 | 0 | 0 | 1 |
tan1-c | 53 | 2 | 0 | 0 | 0 |
tan1-d | 1 | 0 | 0 | 0 | 0 |
tan1-e | 2 | 0 | 0 | 0 | 0 |
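The class totals quoted in the text can be recovered directly from Table 1. The short Python sketch below tallies the table; variable names are illustrative. It also shows that the reported frequencies of tan1-b and tan1-c are reproduced when the 185 nontannin accessions are used as the denominator, which is an inference from the numbers rather than something stated in the text.

```python
# Counts copied from Table 1: rows are Tan1 alleles, columns are Tan2 alleles
# in the order Tan2, tan2-a, tan2-b, tan2-c, tan2-d.
table = {
    "Tan1":   [236, 1, 1, 0, 23],
    "tan1-a": [32, 5, 0, 0, 1],
    "tan1-b": [63, 0, 0, 0, 1],
    "tan1-c": [53, 2, 0, 0, 0],
    "tan1-d": [1, 0, 0, 0, 0],
    "tan1-e": [2, 0, 0, 0, 0],
}

total = sum(sum(row) for row in table.values())
wild_type = table["Tan1"][0]            # dominant at both genes (tannin)
nontannin = total - wild_type

# Recessive tan1 with dominant Tan2: first column of every mutant tan1 row.
tan1_only = sum(row[0] for name, row in table.items() if name != "Tan1")
# Dominant Tan1 with recessive tan2: rest of the Tan1 row.
tan2_only = sum(table["Tan1"][1:])
both = nontannin - tan1_only - tan2_only

# Frequency of each recessive tan1 allele among nontannin accessions (assumed denominator).
tan1_freq = {
    name: round(100 * sum(row) / nontannin, 1)
    for name, row in table.items()
    if name != "Tan1"
}

print(total, nontannin, tan1_only, tan2_only, both)
print(tan1_freq)
```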
Marker-assisted selection using KASP assays for Tannin1 and Tannin2 in F2 populations
We next evaluated the KASP assays on three biparental F2 populations derived from the crosses Tx615B×Jin5-0, GA1×DaLuoChui, and M-6693×AiJiaoNuo. Tx615B, GA1, and M-6693 are nontannin inbred lines, while Jin5-0, DaLuoChui, and AiJiaoNuo are tannin inbred lines. The segregation of tannin and nontannin plants in all three F2 populations fit a 3:1 ratio (chi-square tests: Tx615B×Jin5-0, P = 0.49; GA1×DaLuoChui, P = 0.68; M-6693×AiJiaoNuo, P = 0.83). The KASP assays indicated that Tx615B, GA1, and M-6693 carry the tan1-b, tan1-a, and tan2-d alleles, respectively, whereas Jin5-0, DaLuoChui, and AiJiaoNuo were all wild-type at both Tan1 and Tan2. We next genotyped 123 (Tx615B×Jin5-0), 125 (GA1×DaLuoChui), and 120 (M-6693×AiJiaoNuo) progeny using KASP-tan1-b, KASP-tan1-a, and KASP-tan2-d, respectively. All three populations segregated 1:2:1, and heterozygous individuals were clearly distinguishable from both homozygous wild-type and homozygous mutant samples (Fig. 4). These results indicate that the Tan1 and Tan2 KASP assays can be used for marker-assisted selection of tannin or nontannin accessions at early breeding stages.
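For readers who wish to repeat this kind of segregation test, a chi-square goodness-of-fit calculation can be sketched in Python with the standard library alone. The function name `chi_square_gof` is illustrative, and the counts in the usage lines are hypothetical placeholders, not the observed F2 data, which are not listed in this section. The p-values use the closed-form chi-square survival functions for 1 and 2 degrees of freedom, which cover the 3:1 and 1:2:1 tests.

```python
import math

def chi_square_gof(observed, expected_ratio):
    """Chi-square goodness-of-fit test of observed counts against a Mendelian ratio.

    Returns (statistic, p_value). Closed-form p-values are implemented
    only for 1 or 2 degrees of freedom.
    """
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * r / ratio_sum for r in expected_ratio]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    if df == 1:
        p_value = math.erfc(math.sqrt(stat / 2))   # chi-square survival function, df = 1
    elif df == 2:
        p_value = math.exp(-stat / 2)              # chi-square survival function, df = 2
    else:
        raise ValueError("closed-form p-value implemented only for df = 1 or 2")
    return stat, p_value

# HYPOTHETICAL counts for illustration only (not data from this study).
phenotype_counts = [92, 28]        # tannin : nontannin, tested against 3:1
genotype_counts = [32, 60, 28]     # wild-type : heterozygous : mutant, tested against 1:2:1

print(chi_square_gof(phenotype_counts, [3, 1]))
print(chi_square_gof(genotype_counts, [1, 2, 1]))
```

A P value above the chosen significance threshold, as reported for all three populations, means that the observed counts do not deviate significantly from the expected ratio.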