Identification and characterization of lncRNAs
Based on the Illumina HiSeq 2500 platform, a minimum of 89,496,872 raw reads were obtained from each library, with a clean base ranging from 12.75 Gb to 16.66 Gb and an error rate of 0.01 or 0.02 (Table S1). To generate a complete annotation of the noncoding transcriptome of the Gushi chicken breast muscle tissue beyond the currently annotated transcriptome, we first used Cuffmerge [20] to combine and then screen the transcripts from each sample. In this study, a total of 20,438 transcripts were identified, 16,342 of which were mRNAs. In addition, a total of 1,376 lncRNAs had been previously annotated, 1,252 novel lncRNAs were identified (Figure 1A, B), and 1,468 transcripts of uncertain coding potential (TUCPs) were screened. Only two types of lncRNAs were identified: an overwhelming majority of long intergenic noncoding RNAs (lincRNA) (79.7%) and a minority of antisense lncRNAs (20.3%) (Figure 1C). LncRNAs in breast muscle tissue had a lower total transcript length, fewer exons, and fewer open reading frame (ORF) numbers than mRNAs (Figure 1D-F) and a lower average transcript abundance (Figure 1G).
Characteristics of differentially expressed lncRNAs
To gain insight into the key lncRNAs involved in chicken breast muscle development, we analyzed the differentially expressed lncRNAs (DE-lncRNAs) (|fold change, FC|≥1.7, q-value < 0.05) at four different developmental stages, namely, 6 weeks (W6), 14 weeks (W14), 22 weeks (W22) and 30 weeks (W30). Among the six different comparison groups, there were 53, 61, 50, 153, 117, and 78 DE-lncRNAs in the W14 vs. W6, W22 vs. W14, W22 vs. W6, W30 vs. W6, W30 vs. W14, and W30 vs. W22 comparison groups, respectively. Venn diagram analysis showed that there were no common DE-lncRNAs among the six comparison groups (Figure 2). Only LNC_000920 was commonly found in the following five comparison groups: W14 vs. W6, W22 vs. W14, W30 vs. W6, W30 vs. W14, and W30 vs. W22. Moreover, LNC_000255 appeared in the following comparison groups: W14 vs. W6, W22 vs. W14, W22 vs. W6, W30 vs. W6, and W30 vs. W14. Then, the DE-lncRNAs were identified by a DEGseq (differentially expressed gene, DEG) analysis, and DE-lncRNAs were clustered based on their expression profiles (Figure S1). The clustering results showed that the intragroup repeats of each group were clustered together, indicating that the intragroup differences were smaller than the intergroup differences, which proved that the data were reliable. In addition, we found that 22 weeks clustered close to 14 weeks, followed by 6 weeks, and the farthest distance was from 30 weeks. In addition, we also found that the number of common DE-lncRNAs between W22 and W14 was the lowest (Figure 2). It is possible that from 14 weeks - 22 weeks, many of the same lncRNAs played a common role, which also led to the reduction in DE-lncRNA that was commonly seen in the six comparison groups. Moreover, after identifying DE-lncRNAs, we analyzed the chromosome distribution information of the DE-lncRNAs and found that DE-lncRNAs were distributed in almost all chromosomes but not in chromosomes 22 and 27, and the largest number was found in chromosome 1 (Figure S2). In addition, we selected several lncRNAs for data validation. The lncRNA expression level was determined and showed a similar pattern to that of the RNA-seq data (Figure 3), indicating that the RNA-seq data were authentic.
To investigate the possible functions of the DE-lncRNAs in breast muscle between the different developmental stages, we conducted Gene Ontology (GO, http://www.geneontology.org/) enrichment analysis to uncover the enriched biological process terms associated with DE-lncRNA-targeted DEGs for each comparison group. Only the top 20 GO terms for the W14 vs. W6, W22 vs. W14, and W30 vs. W22 comparison groups are shown in Figure 4 and Figure S3. The cis-targets of all lncRNAs were predicted with a 100-kb upstream and downstream range. The GO enrichment analysis of the cis-targets of lncRNAs showed that only one growth- and development-related GO term, called positive regulation of embryonic development, was found in the W22 vs. W14 comparison group (Figure 4A). In addition, we predicted the regulation in trans between lncRNAs and genes by the Pearson correlation coefficient r > 0.95. In the GO analysis of the trans-targets of lncRNAs, we found several muscle development-related GO terms only in the W22 vs. W14 comparison group (Figure 4B), such as positive regulation of skeletal muscle tissue development, positive regulation of striated muscle tissue development, positive regulation of muscle organ development, positive regulation of muscle tissue development, and positive regulation of striated muscle cell differentiation.
To further understand how DE-lncRNA-targeted DEGs play roles in regulating chicken muscle development, we performed Kyoto Encyclopedia of Genes and Genomes (KEGG, http://www.genome.jp/kegg/) pathway analysis for each comparison. In the KEGG pathway analysis of the cis-targets of lncRNAs, the phagosome pathway was identified as the most significantly enriched pathway for the W14 vs. W6 comparison group (Figure S4A). Furthermore, for the W22 vs. W14 comparison group, the endocytosis pathway was identified as the most significantly enriched pathway (Figure S4B). Additionally, the focal adhesion and cytokine-cytokine receptor interaction pathways were the top two pathways for the W30 vs. W22 comparison group (Figure S4C). Moreover, the KEGG pathway analysis of the trans-targets of lncRNAs showed that the propanoate metabolism pathway and fatty acid metabolism pathway were the top two pathways for the W14 vs. W6 comparison group (Figure S4D). In the W22 vs. W14 comparison group, the two top pathways were arginine and proline metabolism and the MAPK signaling pathway (Figure 5A). In addition, for the W30 vs. W22 comparison group, there were two top pathway terms, the MAPK signaling pathway and the regulation of actin cytoskeleton pathway (Figure 5B). We found that the MAPK signaling pathway was one of the most frequently enriched pathways in both the W22 vs. W14 and W30 vs. W22 comparison groups.
Interactions between lncRNAs and mRNAs during breast muscle development
To explore how lncRNAs interact with their target genes to regulate chicken muscle development and to identify key molecular players in the process, we first predicted the cis- and trans-targets of DE-lncRNAs and then constructed the regulatory networks between DE-lncRNAs and their target genes. A total of 13,460 cis-regulatory interaction relationships were detected between 2,309 lncRNAs and 7,783 mRNAs (Table S2). In addition, 13,343 trans-regulatory interaction relationships were detected between 733 lncRNAs and 2,190 mRNAs (Table S3). Moreover, we constructed the lncRNA-mRNA networks of cis- and trans-targets for muscle development related to the top 20 GO terms, including positive regulation of embryonic development, positive regulation of skeletal muscle tissue development, positive regulation of striated muscle tissue development, positive regulation of muscle organ development, positive regulation of muscle tissue development and positive regulation of striated muscle cell differentiation. In the networks of cis-target DEGs of DE-lncRNAs of the muscle development-related GO terms, we found a total of 10 interaction relationships between 3 genes and 10 lncRNAs (Figure 6A). In addition, in the networks of trans-target DEGs of DE-lncRNAs of the muscle development-related GO terms, we found 7 interaction networks between 3 genes and 7 lncRNAs (Figure 6B). Furthermore, we also generated the lncRNA-mRNA networks of the frequently enriched MAPK signaling pathway, which had a total of 25 interaction relationships between 11 genes and 25 cis-regulating lncRNAs and 27 interaction relationships between 8 genes and 17 trans-regulating lncRNAs (Figure 6C). Interestingly, we found that the networks containing MEF2C and its targeting lncRNAs (ALDBGALT0000008862, ALDBGALT0000008865, LNC_001247) were not only in the muscle development-related GO terms but also in the MAPK signaling pathway.
LncRNA-miRNA interactions
In our previous study [19], we found 388 known miRNAs and 31 novel miRNAs. To explore the interactions between lncRNAs and miRNAs, we predicted the target relationship between lncRNAs and miRNAs in different comparison groups (Table S4). Twelve common target DE-miRNAs of DE-lncRNAs were found in different comparison groups. It is important that some of them were muscle-specific miRNAs, such as gga-miR-206, gga-miR-1a-3p, and miR-133a-3p. Only the lncRNAs (FPKM>1) of these miRNA targets are shown in Figure 7. Then, we predicted the pre-miRNAs with homology to lncRNAs. Unexpectedly, the precursors of four newly identified miRNAs were found to be homologous to lncRNAs, and the precursors were temporarily named gga-miR-N1, gga-miR-N2, gga-miR-N3 and gga-miR-N4 (Figure 8, Table S5). For example, the pre-miRNA of gga-miR-N1 exactly matches ALDBGALT0000008009 at its position from 309 to 374, and the pre-miRNA of gga-miR-N2 exactly matches lnc_000010 at its position from 1527 to 1587, and it also matches lnc_000011 from 1101 to 1161. These lncRNAs may form miRNA precursors through intracellular shearing and then could be processed to generate specific miRNAs that regulate the expression of target genes.
LncRNA-miRNA-mRNA regulatory networks
To identify potential ceRNA networks in the development of chicken breast muscle, we constructed ceRNA networks of DEGs, differentially expressed miRNAs (DEMs), and DE-lncRNAs (q-value <0.05 and log2|FC|≧1) by Cytoscape (Figure S5), and we found some ceRNA networks associated with muscle development-related GO terms (Figure 9). For example, 445 ceRNA networks were found in the lncRNA-miRNA-mRNA network in the W14 vs. W6 comparison group (Figure S5A). Among these networks, ankyrin repeat domain 1 (ANKRD1) is related to skeletal muscle cell differentiation, and it was involved in 13 ceRNA networks containing two miRNAs (miR-148a-3p and miR-10b-5p) and 12 lncRNAs (LNC_000846, ALDBGALT0000001052, LNC_000453, LNC_001182, ALDBGALT0000006695, ALDBGALT0000002546, ALDBGALT0000006376, LNC_000255, LNC_000938, ALDBGALT0000006015, LNC_001012, and ALDBGALT0000000480) (Figure 9A). In addition, dynein light chain 2 (DYNLL2) is related to the myosin complex GO term, which was involved in 19 ceRNA networks with two miRNAs (gga-miR-148a-3p and gga-miR-130b-3p) and 12 lncRNAs (LNC_000846, ALDBGALT0000001052, LNC_000453, LNC_001182, ALDBGALT0000006695, ALDBGALT0000002546, ALDBGALT0000006376, LNC_000255, LNC_001012, LNC_000938, ALDBGALT0000006015, and ALDBGALT0000003517) (Figure 9A). In the W22 vs. W14 comparison group, there were 76 ceRNA networks (Figure S5B). Among them, myosin heavy polypeptide 11 (MYH11) is related to muscle cell differentiation, which was involved in 8 ceRNA networks containing gga-miR-194 and 8 lncRNAs (LNC_000668, LNC_000569, LNC_001009, ALDBGALT0000000938, LNC_001086, LNC_000373, LNC_000920, and LNC_001140) (Figure 9B). Moreover, there were 450 ceRNA networks in the W30 vs. W22 comparison group (Figure S5C). The skeletal muscle fiber development-related gene regulators of calcineurin 1 (RCAN1) and ANKRD1 were involved in 8 ceRNA networks containing gga-miR-92-3p and 8 lncRNAs (LNC_000920, LNC_000704, ALDBGALT0000001001, ALDBGALT0000005521, LNC_000618, ALDBGALT0000000349, LNC_000204, and ALDBGALT0000003603) (Figure 9C).
Protein-protein interaction (PPI) network of DE-lncRNA target genes
The PPI network was constructed by Cytoscape software using the predicted protein-protein interaction networks from the STRING database (Figure 10). The PPI network from DE-lncRNA cis-target genes of the W14 vs. W6 comparison group contained 13 protein-protein pairs, such as IGF-I-EGF. Moreover, in the W22 vs. W14 comparison group, there were 6 protein-protein pairs, for example, FZD6-WNT11. Furthermore, the DEGs from the W30 vs. W22 comparison group included 23 protein-protein pairs, including AR-PPAR. However, no PPI network was found in the DE-lncRNA trans-target genes of the W14 vs. W6 and W22 vs. W14 comparison groups. In addition, the DE-lncRNA trans-targets from the W30 vs. W22 comparison group included 36 protein-protein pairs, including CDK8-CCNC.