Transcriptome assembly and LncRNAs identification
To profile the LncRNA transcripts in response to salt stress, we performed the time-series dynamic analysis of RNA-sequencing on quinoa roots exposed to high salinity conditions (Table S1). In total, 464 million raw reads were generated, and 430 million clean reads were obtained after cleaning. Approximately 85.30% (367 million) of the clean reads were mapped to the quinoa reference genome and assembled into transcripts. Cleaned data (~30 Gb) was first mapped to the reference genome, and then alignments were assembled into transcripts in each sample (Table S2). Transcripts from all samples were merged to form a unified set of transcripts, which consisted of 188,663 genes and 234,387 transcripts. There were 146,440 transcripts (62.5% of the predicted transcripts) with only one exon. There were 1.24 transcripts per gene and 3.15 exons in a transcript.
Coding potential ability was analyzed by CPC2, from which there were 153,751 noncoding LncRNAs identified (Table S3). A genome location study showed these LncRNAs were intensely distributed in 18 chromosomes of C. quinoa, and the distribution intensity in the two ends of each chromosome was higher than in other parts (Figure 1A). According to the relationship between a transcript and the closest reference transcript, LncRNAs were grouped into different classes represented by characters and symbols by GffCompare software (Figure 1B). “u” was the most abundant type (72.8%), showing that most LncRNAs came from the intergenic region. The second most abundant class was the “x” type (12.1%), followed by “i” (6.7%) and “p”(3.3%). These results indicated that LncRNAs seldom overlapped with reference transcripts. As shown in Figure 1C and Figure 1D, the length of LncRNAs was generally much shorter than the coding RNAs, and LncRNA had less exon contained than the coding ones. The difference in expression pattern between coding RNAs and LncRNAs was measured statistically using a two-tailed Mann-Whitney U-Test (Figure 1E). The expression pattern of LncRNAs was significantly different from that of the coding genes. The overall expression level of LncRNA was significantly lower than that of coding RNAs at 0h, 0.5h and 2h of -salinity treatment (p < 2.2e-16). However, at 24h of treatment, the overall expression level of LncRNAs increased sharply and was more significant than that of coding RNAs. The expression level of coding RNAs was relatively stable within 24 hours of treatment.
Identification of differentially expressed LncRNAs in response to salt stress
Differentially expressed transcripts, including coding RNAs and LncRNAs, were then identified by DESeq2 (Table S4). We identified 4,460 DE-LncRNAs, of which 214, 1731 and 3,102 were identified at 0.5h, 2h and 24h of treatment (Figure 2A). Only 54 LncRNAs were differentially expressed at all the three-time points, occupying 1.2% of total DE-LncRNAs (4460) (Figure 2B). Totally 6,791 DE-coding RNAs were also identified, 104 (1.5%) differentially expressed at three-time points (Figure 2B). At 0.5h of treatment 75% (161) DE-LncRNAs and 77% (237) DE-coding RNAs were upregulated, while at 2h of treatment 48% (831) DE-LncRNAs and 56% (2197) DE-coding RNAs were upregulated. At 24h of treatment, as high as 81% (2,513) DE-LncRNAs were upregulated, whereas only 39% (1,462) DE-coding RNAs were upregulated (Figure 2C and 2D, Table S2).
Construction of gene co-expression network and analysis of salinity responsive modules
To infer the potential biological functions of the LncRNAs, a weighted gene co-expression network consisting of both LncRNAs and coding RNAs based on expression profiles was constructed by the WGCNA program [41]. The soft-thresholding power was predicted and defined as 7, as it was the lowest power for which the scale-free topology fit index reached 0.90 (Figure 3A, 3B). There were finally 36 modules (Figure 3C) generated, and they were named from M1 to M36. The relationship between modules and salinity treatment was calculated, of which seven modules were highly relevant to salinity treatment (Figure 3D). M30 (r = 0.93, p = 1.57E-05) and M12 (r = 0.99, p = 9.12E-10) were upregulated significantly at 0.5h and 2h respectively; M17 (r = 0.90, p = 5.48E-05) and M32 (r = -0.99, p = 4.71E-09) modules upregulated and downregulated at 24h, respectively; M13 (r = 0.90, p = 5.53E-05) upregulated at 0.5 and 2h both; M11 (r = 0.90, p = 6.35E-05) upregulated at both 2h and 24h, while M33 (r = 0.90, p = 6.35E-05) downregulated at the same time. No module was significantly regulated at all the three-time points. The percentage of LncRNAs in salinity responsive modules ranged from 20% to 40%. The genes with the highest connectivity in each module were selected as hub genes. Totally, 210 hub genes were identified from these seven salinity responsive modules listed above, which constituted a subnetwork (Figure 3E, Table S6). The hub genes of M30 included both TF genes and LncRNAs, and so did M13, M12 and M11 modules. This implied that LncRNAs within these modules might interact with transcript factors and their role in salinity response. However, TF genes were not included in the hub genes of M17 and M32 modules. Instead, more than half of the hub genes were LncRNAs in them. In M17 module as high as 23 LncRNAs were at the hub position(Table S6).
GO enrichment analysis of salinity-responsive modules
Gene ontology analysis of coding RNAs was performed to predict the function of the LncRNAs within the same module (Table S5). The most representative GO terms of high-salinity responsive modules are shown in Figure 4. A large part of the GO terms was enriched in all the seven high-salinity responsive modules. They were related to various critical biological processes, including biological regulation (GO:0065007), regulation of gene expression (GO:0010468), response to stress (GO:0006950), regulation of metabolic process (GO:0019222), transcription (GO:0006350) and developmental process (GO:0032502). Transport (GO:0006810) and metabolic process (GO:0008152) were enriched in six salinity responsive modules. A small part of the GO terms was enriched in only two or three modules such as cell cycle (GO:0007049). Within these modules, there were several genes directly responding to salt stress. For instance, in M11, a transcript encoding a salt tolerance zinc finger (C2H2 type) which is highly homologous to Arabidopsis gene (AT1G27730.1); in M12, there is a transcript encoding calcineurin B-like protein 1, which is highly homologous to Arabidopsis gene (AT4G17615.1), it might function as a positive regulator of salt and drought responses and as a negative regulator of cold response, and mediates the activation of AKT1 by CIPK proteins (CIPK6, CIPK16, and CIPK23) in response to low potassium conditions; In M13, one transcript homologous to Arabidopsis gene AT5G12010 was included, which might respond to salt stress and function in ABA-activated signaling pathway.