The RNA-seq of circRNAs in B. rapa
Total 15 RNA-seq libraries (five samples, ck-0d, Pb4-8d, Pb4-23d, PbE-8d, PbE-23d, with three biological replicate, SUB9677054) were constructed and sequenced. A total of 2.07 ×1011 raw reads were obtained. After removing low-quality reads, adaptor reads, and rRNA reads, a total of 1.3 billion clean reads were generated and mapped to B. rapa reference genome. The ratios of uniquely mapped reads range from 58.73 % to 77.91 % (Table 2) and only high-quality clean reads were used further circRNA analysis.
Table 2. Detailed information of sequenced data for each sample
Sample
|
Total Reads
|
Unmapped Reads
|
Unique Mapped Reads
|
Multiple Mapped reads
|
Mapping Ratio
|
ck-0d-1
|
94830544
|
20944005 (22.09%)
|
73261939 (77.26%)
|
624600 (0.66%)
|
77.91%
|
ck-0d-2
|
83307918
|
19608143 (23.54%)
|
63175699 (75.83%)
|
524076 (0.63%)
|
76.46%
|
ck-0d-3
|
87178800
|
21098697 (24.20%)
|
65551669 (75.19%)
|
528434 (0.61%)
|
75.8%
|
Pb4-8d-1
|
79369378
|
18691738 (23.55%)
|
60148368 (75.78%)
|
529272 (0.67%)
|
76.45%
|
Pb4-8d-2
|
87104958
|
29049743 (33.35%)
|
57582187 (66.11%)
|
473028 (0.54%)
|
66.65%
|
Pb4-8d-3
|
91092828
|
25243306 (27.71%)
|
65297978 (71.68%)
|
551544 (0.61%)
|
72.29%
|
PbE-8d-1
|
92293366
|
25751295 (27.90%)
|
65994329 (71.50%)
|
547742 (0.59%)
|
72.1%
|
PbE-8d-2
|
90149442
|
24266260 (26.92%)
|
65323398 (72.46%)
|
559784 (0.62%)
|
73.08%
|
PbE-8d-3
|
95240094
|
24798804 (26.04%)
|
69841392 (73.33%)
|
599898 (0.63%)
|
73.96%
|
Pb4-23d-1
|
73820444
|
25455609 (34.48%)
|
48002827 (65.03%)
|
362008 (0.49%)
|
65.52%
|
Pb4-23d-2
|
86986186
|
35568332 (40.89%)
|
51024370 (58.66%)
|
393484 (0.45%)
|
59.11%
|
Pb4-23d-3
|
90911480
|
34229301 (37.65%)
|
56256863 (61.88%)
|
425316 (0.47%)
|
62.35%
|
PbE-23d-1
|
96886026
|
35891103 (37.04%)
|
60538395 (62.48%)
|
456528 (0.47%)
|
62.96%
|
PbE-23d-2
|
88712612
|
35756006 (40.31%)
|
52569290 (59.26%)
|
387316 (0.44%)
|
59.69%
|
PbE-23d-3
|
105084474
|
43371402 (41.27%)
|
61241798 (58.28%)
|
471274 (0.45%)
|
58.73%
|
Identification and validation of circRNAs
Total 1636 novel circRNAs were detected from our circRNA-seq data and named from novel_circ_000001 to novel_circ_001636 after BLAST searches against the circBase database (Glažar er al., 2014; Xiang et al., 2018; Lv et al., 2020).
According to the genome origination, the 1636 circRNAs were classified into six types, including annot_exons, one_exon, intronic, exon_intron, intergenic, and antisense, containing 100, 342, 13, 408, 201, and 572 circRNAs, respectively (Figure 1A). The number of circRNAs encoded by antisense and intronic regions accounting for the most and the least (34.9% and 0.7%), respectively. The annot_exons, one_exon, and exon_intron-type circRNAs containing the exon sequence account for most of the 1636 circRNAs (approximately 51.96%). These results are consistent with the studies in other species, such as Arabidopsis thaliana and Oryza sativa, whose circRNAs originated from exons of a single protein-coding gene, accounting for 50.5% and 85.7%, respectively (Ye et al., 2015). The distribution of these circRNAs on the chromosome of B.rapa ranged from 81 to 255 (Figure 1B). For example, 255 circRNAs from chromosome A06 accounted for the most (15.59%), followed by chromosome A01 and chromosome A09. The length distribution of these circRNAs ranged from 101 to 2,000 bp. Furthermore, the largest number of circRNAs was the length ranged from 201 to 300 bp (Figure 1C).
To validate the circular RNAs detected from RNA sequencing, several circRNA were randomly selected for polymerase chain reaction amplification. A pair of convergent primers were designed to amplify the linear DNA fragments when cDNA or genomic DNA was used as templates in the PCR reactions. Unlike convergent primers, the reverse primers of the divergent primers were located upstream of the forward primers (Figure 2A). The amplification products were not detected in genomic DNA samples with divergent primers (Figure 2B, the original gel image is shown in additional file 1 ). On the contrary, there were no amplification products in the cDNA digested by RnaseR with convergent primers. After confirmation by PCR reaction, sequencing analyses were used to confirm the junction sites of PCR products (Figure 2C). In total, 30 circRNAs were validated by PCR reactions. 28 of the 30 circRNAs (93.3%) were validated in our experiments. Next, we randomly selected 15 circRNAs at a different stage for qRT-PCR to validate the expressing levels. The qRT-PCR results mainly were (12/15) consistent with the RNA-seq data, indicating the high reliability of the RNA profiles (Figure 3, Additional file 2).
Diagram of circular RNA junction site. Arrow direction indicates the direction of divergent primer design. (B)The PCR reactions with divergent and convergent primers using different templates showing the production of novel-circ-001061. (C)Sequencing confirmation of the junction site of novel-circ-001061.
CircRNAs analysis in response to P. brassicae
To explore the expression pattern of circRNA in response to the different pathotypes of P.brassicae in B.rapa, circRNA expression profiles of all the samples were compared by cluster analysis. The expression pattern of circRNA as divided into two different clusters based on infection time (Figure 4). At the later stage of inoculation, the circRNA response to two pathotypes of P.brassicae showed obviously differences. Although in the same cluster, there was significant difference between uninoculated samples and 8-days post inoculation samples, regardless of the pathotypes suggesting that the circRNA pattern of expression was affected by the inoculation time and pathotype of P.brassicae (Additional file 3). A total of 231 significantly differentially expressed circRNAs between the 8 comparisons were detected. There was no significant difference in the number of up/down-regulated circRNAs between the samples inoculated with PbE or Pb4 at 8 dpi (24 versus 27). With the inoculation time increasing, the number of DE circRNAs increased up to 118 in the samples inoculated with PbE, 76 circRNAs were up-regulated, and 42 circRNAs were down-regulated (Figure 4). However, the number of DE circRANs were only 64 inoculated with Pb4 at 23 dpi. The difference in the number of DE circRNAs implied that B. rapa was sensitive to the infection by virulent P. brassicae pathotype. Overall, the number of differentially expressed circRNAs infected with PbE was greater than that of circRNAs infected with Pb4. These results highlight the distinct responses of “BJN 222” to pathogens exhibiting varying degrees of virulence.
Figure 5, shows the result of data-set overlap, here compared with ck, specifically DE circRNAs were 18 and 15 in Pb4 and PbE at the time of 8 dpi, respectively. Furthermore, identified 22 and 76 specifically DE circRNAs at 23 dpi in samples inoculated with Pb4 and PbE, respectively (Figure 5). In addition, 25 circRNAs were co-expressed after inoculated with Pb4 or PbE at the comparison of 8 dpi and 23 dpi. And only 1 circRNA was find continuously expressed after inoculation with PbE at the two-time point (Figure 5). Whether inoculated with Pb4 or PbE, the number of specifically DE circRNAs at 23 dpi is much larger than that at 8 dpi, suggesting that B.rapa responded differently to a different inoculation time. The specifically expressed circRNAs in plants inoculated with PbE were 4 fold than that of Pb4 at 23dpi, suggesting that B.rapa responded differently to different P. brassicae pathotype. Together, these results suggest that circRNAs participate in transcriptome-wide molecular landscapes of B. rapa responses to the P. brassicae stress.
Identification of circRNA parental genes
The annotated genes producing circRNAs are referred to as the parental genes of the circRNAs, while the circRNAs without parental genes were described as ‘NA’. Here a total of 1435 of the identified 1636 circRNAs originated from 1004 parental genes, and 201 intergenic-type circRNAs originated from the fragment between two genes without parental genes (Additional file 4). A total of 826 circRNAs have only one parental gene, and 609 circRNA originated from more than one parental gene. The parental genes produced different circRNAs due to the alternative splicing pattern in B.rapa, which is consistent with previous studies that suggest circRNAs possess an alternative splicing pattern, making it a valuable resource for understanding the complexity of circRNA biogenesis and their potential functions of circRNAs (Zhang et al., 2016; Gao et al., 2016).
CircRNA parental genes functions analysis
The circRNAs play significant roles in transcriptional control by cis-regulation of their host genes (Li et al., 2015). In order to gain insight into the potential circRNAs mediated mechanism of B.rapa response to the infection of P. brassicae, the KEGG pathway enrichment analysis method was employed to analyze the DE circRNA parental genes. The comparison between avirulent Pb4 and control group at 8 dpi showed that the DEGs were enriched in “Sulfur metabolism” (ko00920), “Microbial metabolism in diverse environments” (ko01120), “Citrate cycle (TCA cycle)” (ko00020) (p<0.05) (Figure 6A, Additional file 5). However, no significant pathway enrichment was detected at 23dpi between the two groups. We also compared the virulent PbE and control group at 8 dpi, and one significant pathway enrichment was detected i.e., “Sulfur metabolism” (ko00920) (p<0.05), while the pathway “Vitamin B6 metabolism” (ko00750) were enriched at 23 dpi (Figure 6C, Figure 6D, Table S2). The pathway like “Phenylalanine, tyrosine and tryptophan biosynthesis” (ko00400) was significantly enriched in the comparison of PbE vs. Pb4 at 8 dpi and 23 dpi (Figure 6E, Figure 6F). Tryptophan biosynthesis was involved in SA and camalexin biosynthesis. The plant hormone salicylic acid (SA) plays a critical role in defense against biothrophic pathogens such as Plasmodiophora brassicae (Miao et al., 2019); and camalexin is a sulfur-containing tryptophan-derived secondary metabolite, reported to play defensive functions against several pathogens in Arabidopsis (Ausubel et al., 1995; Glawischnig, 2007).
GO (Gene Ontology) analysis was performed on the parental genes to understand the biological function of the DE circRNAs. The parental genes were classified into three GO categories: biological process, molecular function, and cellular component. The top five subcategories in the biological processes class such as “cellular process”, “single-organism process”, “metabolic process”, “response to stimulus”, and “localization” were enriched (Additional file 6, Additional file 7), implying that the parental genes of circRNAs may function in response to the infection of P.brassicae. The most highly represented molecular function categories were “catalytic activity” and “binding”. The top 3 subcategories in the cellular component class, were “cell”, “cell part”, and “organelle”. Similarly KEGG pathway enrichment analysis results, “metabolic process” and “response to stimulus” were overrepresented at every in all the comparison groups, and the number of DEGs was significantly higher at 23 dpi than 8 dpi.
Together, many parental genes responded to rhizobium stress. Also, the amount of DE circRNA varied considerably over time, but there were most pathways of enrichment associated with metabolic processes, which may be due to plant life activities rather than stimulation of P. brassicae. Therefore, we focus on the circRNAs and related genes of significant enrichment pathways at the early stage infected with P. brassicae (Table 3). Furthermore, the parental genes and their annotations in each comparison was shown as Additional file 8. By predicting the function of parental genes, the circRNAs origined from Bra012389 and Bra019293 were selected for further study.
Table 3. The circRNAs and their parental genes in each comparison
Group
|
CircRNA ID
|
Soure gene ID
|
log2(FC)
|
P value
|
chr
|
strand
|
Type
|
ck-0d-vs-Pb4-8d
|
novel_circ_000064
|
Bra013579*
|
4.246827502
|
0.004744418
|
A01
|
-
|
exon_intron
|
novel_circ_000079
|
Bra013579*
|
2.893516888
|
0.009588829
|
A01
|
-
|
exon_intron
|
novel_circ_000086
|
Bra013579*
|
-2.84065774
|
0.028360819
|
A01
|
+
|
antisense
|
novel_circ_000264
|
Bra039746*
|
-19.12803312
|
0.04701868
|
A02
|
+
|
exon_intron
|
ck-0d-vs-PbE-8d
|
novel_circ_000074
|
Bra013579*
|
19.51979677
|
0.011219066
|
A01
|
-
|
exon_intron
|
novel_circ_000086
|
Bra013579*
|
-3.947713015
|
0.012980797
|
A01
|
+
|
antisense
|
PbE-8d-vs-Pb4-8d
|
novel_circ_001061
|
Bra012389*
|
-4.222569626
|
0.038710775
|
A07
|
-
|
exon_intron
|
novel_circ_000495
|
Bra019293*
|
4.995712702
|
0.001038162
|
A03
|
-
|
antisense
|
Note: *indicate that it is a significant difference at 0.05 level.
CircRNAs acting as miRNA sponges
To determine whether circRNAs in B.rapa could affect post-transcriptional regulation by binding to miRNAs and preventing them from regulating their target mRNAs, we identified miRNA target sites of circRNAs in B.rapa, and found that 257 of 1636 (15.7%) circRNAs had putative miRNA-binding sites (Additional file 9). Of this 257 circRNAs, 116 had more than one different miRNA-binding site, and the greatest number of miRNA-binding sites (39) was found in novel_circ_000502. Like novel_circ_000769 only had one miRNA-binding site, and novel_circ_001061 and novel_circ_000495 had 2, 3 miRNA-binding sites, respectively (Figure 7). The miR5021-x and miR2275-x account for most of the circRNAs (55, 38, respectively). These results indicated that circRNAs in B.rapa have many potential miRNA binding sites and probably affected the expression of disease-resistant genes through miRNA.
Schema diagrams show the pairing of each circRNA sequence and the sequence of its target region.
Construction of circRNA-miRNA-mRNA network
To better understand the gene regulatory network during the infection of Brassica rapa, the DE circRNAs and their parental genes in the PbE-8d-vs-Pb4-8d group were selected for the circRNA-miRNA-mRNA network analysis with Cytoscape software (Additional file 10, Additional file 11) (Liang et al., 2019). By predicting the miRNAs and their target genes of circRNAs, many miRNA-targeted mRNAs were annotated to be stress-associated. 20 target genes were annotated with “receptor-like protein kinase”, “zinc finger protein”, “LRR-repeat protein” and “hormone-related” functions (Additional file 12). The relative expression level of these candidate target genes showed diverse expression patterns (Figure 8). Here, 5 target genes, Bra011339 (zinc finger protein), Bra013568 (zinc finger protein), Bra026508 (hormone-related), Bra036269 (LRR-repeat protein) were consistent with the expression pattern of novel_circ_000495 at 8 dpi (Figure 9A). However, only Bra026508 showed significantly different expression between the inoculation of Pb4 and PbE (Figure 9B). Therefore, the circRNA novel_circ_000495 probably plays an important role in clubroot resistance through post-transcriptional control of its target genes which are involved in biotic stress. Furthermore, more results are needed to confirm the function Bra026508 in the P. brassicae resistance mechanism.