Global identification of lncRNAs in Oryza sativa L
To globally identity lncRNAs associated with response to RBSDV and RSV infection in rice, we used RBSDV and RSV to inoculate rice respectively for 30 days and then performed RNA-seq using diseased rice. There were 9 cDNA libraries constructed from rice treated with RBSDV, RSV, and CK (control) (3 biological replicates for each sample). We performed RT-PCR and found that RBSDV and RSV were successfully colonized in corresponding samples (Additional file 1: Fig. S1; Additional file 5: Table S1). We used Illumina HiSeq 4000 platform to sequence all of libraries and obtained a total of 0.62 billion raw reads. There were 0.56 billion clean reads generated after removing adaptor sequences and low-quality reads. The average percentage of clean reads and GC content were 90.48% and 48.51%, respectively (Additional file 6: Table S2). Then, approximately 78.46%-96.08% of clean reads were mapped to Oryza sativa L. reference genome (IRGSP-1.0) by performing HISAT2 software (Additional file 7: Table S3). Total 46,020 transcripts (38,819 genes) including 24,893 known transcripts and 21,127 novel transcripts were generated. We calculated the Transcripts Per Million (TPM) represented the expression level of all transcripts by using Salmon software. We considered 41,992 transcripts (20,240 mRNAs) with TPM > 1 at least one sample to be expression (Additional file 8: Table S4). We used the novel transcripts with TPM > 1 at least one sample to identify lncRNAs by the filtering pipeline (Fig. 1a). The transcripts with length < 200bp, number of exon = 1, and ORF > 300bp were filtered. Subsequently, CPC software, and Swiss-Prot database were used to evaluate protein-coding potential. Finally, we obtained a total of 1,925 lncRNAs including 1,112 genic lncRNAs, and 813 intergenic lncRNAs (Fig. 1b; Additional file 9: Table S5). In these lncRNAs, there were 752, 418, 395, and 360 sense-genic, sense-intergenic, antisense-intergenic, and antisense-genic lncRNAs, respectively (Fig. 1c). Sequence lengths between 200 and 2000 bp of lncRNAs and mRNAs were most frequent and their percentages reached 72.49% and 70.75%, respectively (Fig. 1d).
Characterization of rice lncRNAs
To analysis the distribution on rice chromosomes, we performed the R package Circlize() for visualization of the location of lncRNAs and mRNAs on rice chromosomes. The result indicated that lncRNAs were evenly distributed on 12 rice chromosomes, while higher densities of mRNA was located in the chromosome “arms” of most rice chromosomes than pericentromeric regions (Fig. 2a). Expression analysis showed that the expression level of identified lncRNAs was lower than mRNAs in rice (Fig. 2b). The different distribution tendencies of exon number were observed and approximately 2,731 (14.43%) mRNAs and 770 (40%) lncRNAs were composed of two exons (Fig. 2c). AS events, a way to generate mature transcripts by excising introns, was globally identified by performing AStalavista software. We obtained total 23,898 AS events involved in 5,774 genes including 724 lncRNAs and 4,424 mRNAs (Fig. 2d; Additional file 10: Table S6). Subsequently, we customized a user-friendly program to identify the five major AS events, AA, AD, IR, ES, and MX. IR was the most abundant (6,486, 27.14%) of all AS events and AA was followed (5,028, 21.04%). In the lncRNAs derived from AS events, 269, 159, 136, 296, and 2 lncRNAs were generated from AA, AD, ES, IR, and MX events, respectively (Fig. 2d).
Predicting the target genes and calculating differentially expression of lncRNAs
To investigate the potential functional roles of identified rice lncRNAs, we predicted the cis target genes by search for mRNA within 100 Kb upstream and downstream sites of these lncRNAs. Finally, total 2,735 mRNAs associated with 2,383 genes may be regulated by identified lncRNAs (Additional file 11: Table S7). Compared with CK group, the differentially expression lncRNAs (DELs) of rice infected with RBSDV (RBSDV vs CK) and RSV (RSV vs CK) were identified based on fold change ≥ 2 and P-value < 0.05 by performing R package DEseq2. There were 344 DELs including 172 up-regulated and 172 down-regulated lncRNAs identified in RBSDV vs CK (Fig. 3a), 176 DELs including 61 up-regulated and 115 down-regulated lncRNAs identified in RSV vs CK (Fig. 3d; Additional file 12: Table S8). To further investigate the possible function of DELs in different groups, we performed KEGG pathway and GO term enrichment analysis of targets of DELs by using KOBAS and topGO, respectively (Additional file 13: Table S9). In RBSDV vs CK group, the DELs were associated with 78 KEGG pathways, including ‘Carbon metabolism’, ‘Plant hormone signal transduction’, and ‘Phenylpropanoid biosynthesis’ (Fig. 3b) and 272 GO terms of biological process (BP), including cellular process, biological regulation, and response to stimulus (Fig. 3c). In RSV vs CK group, we found that there were 8, 7, and 5 targets encoding ‘Spliceosome’, ‘Starch and sucrose metabolism’, and ‘Plant-pathogen interaction’, respectively, in enriched KEGG pathways (Fig. 3e) and for GO terms of biological process, major categories were cellular nitrogen compound metabolic process, cellular aromatic compound metabolic process, and organic cyclic compound metabolic process (Fig. 3f).
Characterization of share DELs
To further investigate the share lncRNAs associated with rice response to RBSDV and RSV, we performed distribution analysis of all DELs in both groups. The venn diagram showed that a total of 294 and 126 specific DELs of RBSDV vs CK and RSV vs CK, respectively, and 50 share DELs of both groups (Additional file 2: Fig. S2). We used R package TCseq to cluster the share DELs based on expression patterns and 4 clusters were generated (Fig. 4a). There were 7, and 15 up-regulated lncRNAs from cluster2, and cluster4, respectively, and 19 down-regulated lncRNAs from cluster3 in both comparisons. The targets of 22 up-regulated lncRNAs from cluster2, and 4, were subjected to KEGG pathway and GO term enrichment analysis and multiple KEGG pathways were significantly enriched (P < 0.05), such as ‘Sesquiterpenoid and triterpenoid biosynthesis’, ‘Carotenoid biosynthesis’, and ‘Proteasome’. Meanwhile, several GO terms associated with biosynthetic process were significantly enriched (P < 0.05), such as carbohydrate biosynthetic process, polysaccharide biosynthetic process, and glucan biosynthetic process (Fig. 4b).
Verifying accuracy of the RNA-seq by qRT-PCR
To verify the RNA-seq data, total 10 co-upregulated lncRNAs from cluster2 and 4 were randomly selected to perform qRT-PCR. The result showed that there were 8 significantly up-regulated lncRNAs in rice infected with RBSDV and RSV compared with CK (Fig. 5). The expression tendencies of most of selected lncRNAs were consistent with the sequencing results, indicating that the RNA-seq data was accuracy and the function of these lncRNAs in rice response to RBSDV and RSV were supported.
Construction of regulatory network
To globally identify transcription factors (TFs) in rice, we uploaded sequences of all identified transcripts into Plant Transcription Factor Database and finally obtained 1,535 transcripts encoding 55 TFs (Additional file 14: Table S10). In these TFs, bZIP and WRKY were related to most transcripts (112), followed by bHLH (107) (Additional file 3: Fig. S3). To understand the regulatory patterns between TF, lncRNA, and targets, the directed networks were constructed by using R package GENIE3. We first predicted 17,885 transcripts including 15,969 mRNA, and 1,916 lncRNAs targeted by TFs. Subsequently, we calculated the regulated relationship between the 15,969 mRNAs (target genes) and 1,916 lncRNAs (regulatory factors) by using GENIE3. Finally, total 1,145 co-upregulated transcripts including 60 genes encoding TFs (23 TFs) (regulatory factors), 21 lncRNAs (regulatory factors or targets), and 1,064 mRNAs (targets) were generated (Additional file 15: Table S11). In these transcripts, each lncRNA regulated by corresponding TF can regulate mRNA targeted by the TF, suggesting that the TF may regulated the mRNA via the corresponding lncRNA. The 1,064 mRNAs were subjected to KEGG pathway analysis and 10 pathways were significantly enriched (P < 0.05) (Additional file 4: Fig. S4; Additional file 16: Table S12). There were 17, 16, and 12 mRNAs associated with ‘Plant hormone signal transduction’, ‘Phenylpropanoid biosynthesis’, and ‘Plant-pathogen interaction’ (Additional file 4: Fig. S4; Additional file 16: Table S12). We used Cytoscape software to construct directed regulatory network between 60 TFs, 21 lncRNAs, and significantly enriched pathways associated (Fig. 6a). In the network, lncRNAs (OS_LNC0494, OS_LNC1144, OS_LNC1812) regulated by Os01t0730700-01 encoding WRKY14 may target ABA responsive element binding factor ABF (Os01t0867300-01) involved in ABA signaling pathway. Performing qRT-PCR showed that the expression levels of the three lncRNAs were significantly up-regulated in RBSDV and RSV infected samples (Fig. 6b). We measured the expression of WRKY14, ABF, and the three lncRNAs in rice treated with ABA for 12h and 24h and found that WRKY14, and ABF were significantly increased compared with control (Fig. 6c). In addition, OS_LNC1812 were also up-regulated in treated samples compared with control (Fig. 6c). The result implied that OS_LNC1812 regulated by WRKY14 may be associated with rice response to RBSDV and RSV infection via ABA signaling pathway.