Transcriptome, sRNA, and degradome sequencing
After 30 days of growth, all IBA-treated groups (100–500 mg/L IBA) had developed ARs, while only a small amount of AR development was observed in the control group (Fig. 1). Importantly, substantially more ARs were developed by the cuttings treated with 300 mg/L IBA; the ARs of these cuttings were also of more consistent length and more densely packed than those formed on the cuttings treated with other concentrations of IBA (Fig. 1). Cuttings treated with 300 mg/L IBA treatment had significantly more roots than any other cuttings. Therefore, the ARs from the cuttings treated with 300 mg/L IBA (henceforth referred to as IBA300), as well as those formed by the control cuttings, were selected for transcriptome profiling.
Raw data generated by Illumina HiSeq sequencing were: 200.4 million raw reads from the control samples and 213.29 million raw reads from the IBA300 samples. After quality filtration, 55.48 G of clean data (Q30≥93.69%) remained. The percent of GC content was greater than 38.16%, the AT and GC ratios were within a reasonable range, and there was no separation. We identified 82,468 DEGs (fold-change≥2, FDR<0.01) between the control and IBA300. Of these, 69,777 were upregulated and 12,691 were downregulated. Across all DEGs, 31,385 CDS were detected, and 1,417 Unigene-encoding transcription factors were predicted. The sequencing data were submitted to the SRA database with the accession numbers SRR13808891, SRR13808890, SRR13808889 and SRR13808888.
In total, approximate 51.22-52.42 million high-quality clean reads from the control and IBA300 groups were obtained using sRNA-seq. Most of the sRNAs were 21–24 nt long, with 24 nt sequences being the most common across all samples (Fig. S1). The reads remaining after the removal of tRNAs, rRNAs, snRNAs, snoRNAs, and degraded mRNAs were considered endogenous sRNAs and used in subsequent analyses. The sequencing data were submitted to the SRA database with the accession numbers SRR13808887, SRR13808886, SRR13808885 and SRR13808884.
Analysis of sRNA-seq data showed that 48 known (Table S1) and 95 novel miRNAs were identified. The length of mature miRNA varies from 20 to 30nt. Due to the recognition and cleavage of DCL1 enzyme, the first base at the 5' end of each miRNA was biased to U and resistant to G. Across all datasets, the first nucleotide of the novel miRNA was primarily biased towards U, followed by A (Fig. S2). 1,744 target genes cleaved by known miRNAs and 1,956 miRNA-mRNA pairs were predicted using Targetfinder.
To generate a miRNA-cleaved target library (degradome) from A. rubrum, we first identified the mRNA transcripts targeted by miRNAs in the total RNA samples using high-throughput sequencing. We obtained 34,373,253 short reads from the control RNA samples and 28,061,482 short reads from the IBA300 RNA samples, representing the 5’ ends of uncapped, poly-adenylated RNAs. After initial processing, equal numbers of 20 and 21nt sequence reads remained. In total, 47.66% of the unique reads were successfully mapped to the A. rubrum transcriptome. We identified 172 mRNAs targeted by the known miRNAs and 243 miRNA-mRNA pairs via CleaveLand (Table S2). The sequencing data were submitted to the SRA database with the accession numbers SRR13808883 and SRR13808882. Across the 172 mRNAs targeted by the known miRNAs, most were associated with plant hormone signal transduction. Of these, 80 mRNAs were either transcription factors or related to transcription factors, including growth-regulating factors (GRFs), ARFs, SQUAMOSA promoter-binding protein-like (SPL) proteins, MYBs, and ethylene response factors (AP2/ERF). Consistent with the RNA-seq results, many genes were associated with the plant hormone signal transduction pathway. However, only 8.8% of the target genes were identified by degradome sequencing. This suggested that most of the predicting miRNA target genes were false positives.
The T-Plot of five conserved miRNAs and one predicted novel miRNA was shown in Fig. 2. Three of the validated differentially expressed miRNAs (Ar-miR160a-5, Ar-miR171d-1, and Ar-miR156f) had previously been shown to target transcription factors.
Pathways and functions associated with the DEGs in A. rubrum
Among all the DEGs, GO analysis indicated that 14,451 DEGs were significantly enriched in 30 GO terms between the control and IBA300 groups (27% cellular component terms, 23% molecular function terms, and 50% biological process terms; Fig. S3). In particular, the biological process terms “regulation of biological process”, “biological regulation”, “cellular process”, “metabolic process”, “signaling”, and “response to stimulus”; the cellular component terms “macromolecular complex”, “cell junction”, “nucleoid”, “organelle”, “cell part”, and “membrane part”; and the molecular function terms “binding”, “catalytic activity”, “transcription regulator activity”, and “signal transducer activity” were overrepresented in the DEGs (Fig. S3).
Of the 14,451 DEGs, 9,507 were successfully mapped to the KEGG database (P-value < 0.05). KEGG mapping indicated that 59 terms and 12 metabolic pathways were significantly enriched in these DEGs, with the largest numbers of DEGs associated with the MAPK signaling pathway and the plant hormone signal transduction pathways (Fig. S4). The log-normalized FPKM values for the significantly enriched metabolic pathways were used for hierarchical clustering analysis (Fig. 3). The significantly upregulated pathways were ubiquitin mediated proteolysis and hormone signal transduction (Table S4).
Many of the DEGs associated with the plant hormone signaling pathway were involved in auxin signaling transduction. For example, numerous genes in the families associated with auxin signaling (e.g., AUX, IAA, GH3, ARF, and SAUR) were significantly differentially expressed between the control group and the IBA300 group, including Aux/IAA5 (Unigene3732_All), SAUR11 (Unigene73966_All), Aux/IAA4 (Unigene5767_All), and ARF18-1 (Unigene5190_All). The expression of 24 genes (i.e., the IAAs, GH3s, ARFs, and SAURs) were upregulated in IBA300 as compared to the control (Fig. 4a). Our results indicated that 42 DEGs were involved in the ubiquitin-mediated proteolysis pathway (Fig. 4b). Some of these DEGs had been shown to be associated with the mediation of Aux/IAA protein ubiquitination, including S phase kinase-associated protein 1 (SKP1) (Unigene7049_All), cullin 1 (CUL1) (Unigene20934_All), and F-box (SCF) E3 (Unigene17129_All), were significantly upregulated. The patterns of upregulation we observed suggested that the plant hormone signaling pathway and the ubiquitin mediated proteolysis pathway jointly regulate A. rubrum root development.
We also performed GO and KEGG analyses of the miRNA target genes. We found that the GO term “DNA binding pathway (GO: 0003677)” was significantly enriched in the target genes (p = 0.003258). KEGG pathway analysis also showed that 10 of the targets of the 220 differentially expressed miRNAs that played a significant regulatory role in the plant hormone signaling pathway belonged to three gene families (ARF, SCL and TGA family) (Fig. S5).
qRT‑PCR validation of the miRNAs and their targets
Nine miRNAs were selected for qRT-PCR validation. All nine miRNAs were significantly downregulated in the IBA300-treated samples as compared to the control (Fig. S6), which was consistent with our sequencing data. Thus, these nine miRNAs might play important regulatory roles in A. rubrum growth, especially during rooting.
Next, to reveal the expression patterns of miRNA-mRNA pairs associated with plant hormone signal transduction in A. rubrum roots, we quantified the expression levels of six miRNAs and six target genes that were verified to interact by degradome sequencing, and where the target gene was also differentially expressed in the hormone transduction pathway. After IBA treatment, these six miRNAs showed inverse expression patters to the corresponding six target genes (Fig. 5). This was consistent with our RNA and degradome sequencing results. Therefore, we speculated that the Ar-miR160-ArARF10 interaction might be critical for the regulation of AR development in A. rubrum. The primers used in all quantitative RT-PCR experiments were listed in Table S3.
Comparative phylogenetic analysis of the ArARF protein family
Putative ARF genes in A. rubrum were identified in the RNA-seq data using BLAST. After the removal of redundant sequences and alternative splice forms, 18 ARF proteins were identified as potentially encoded by the ArARF genes (Table S5). In this study, we designated these proteins ArARF1–ArARF10 and ArARF16–ArARF19. Because the ArARF10 and ArARF19 proteins were encoded by more than one homologous gene, we named the encoding genes as ArARF1–ArARF10, ArARF16–ArARF19, ArARF19-2, and ArARF19-3 (Table S5). The ORFs of the ArARF genes varied from 1,836 bp (ArARF10) to 3,510 bp (ArARF19-3). These genes encoded polypeptides of 612–1,170 amino acids, with predicted molecular masses of 67.17–131.19 kDa and theoretical pIs of 5.30–8.39 (Table S5).
Our Neighbor-joining (NJ) phylogenetic tree, which included 18 ArARFs, 22 AtARFs, 16 DiARFs, and 18 CsARFs, suggested that ArARF homologs were more common in Dimocarpus longan and citrus reticulate than in A. thaliana (Fig. S7A). The ARF genes were clustered into three major, well-supported clades (I–Ⅲ; BS>74; Fig. S7A). Seven ArARF genes fell into clade Ia, four into clade Ib, one into clade IIa, three into clade IIb, and two into class III (Fig. S7A, highlighted in red). Alignment of the ARF proteins indicated that most of the ArARF proteins harbored three characteristic regions (Fig. S7C). Unsurprisingly, most of the close homologs in our phylogenetic tree shared common motifs. The three domains of ArARF proteins were comprised of a total of eight different motifs: motif 1, 2, and 3 constituted the DNA-binding domains; motif 6, 8, and 9 constituted the ARF domain; and motif 5 and 10 constituted the C-terminal Aux/IAA domains. Motif 1, 2, 3, 6, 8, and 9 were found in all 18 ArARF proteins (Fig. S7B).
Construction of three miRNA-mRNA regulatory networks
The minimum free energy (MFE) values for the precursors of our three focal miRNAs (Ar-miR160a-5p, Ar-miR171d-1, and Ar-miR156f) were −87.10 kcal/mol, −72.90 kcal/mol, and −34.70 kcal/mol, respectively. The precursor sequences had typical stem-loop structures (Fig. S8). High-throughput degradome sequencing indicated that Ar-miR160a targeted ArARF10 (CL3897.Contig1_All) and ArARF18-1 (Unigene5190_All); Ar-miR171d-1 targeted ArSCL6-1 (Unigene23148_All) and ArSCL6-2 (CL2444.Contig4_All); and Ar-miR156f targeted ArAPL-2 (CL2023.Contig1_All) (Fig. 6).
Identification of ArARF10
Expression of fluorescent-tagged ArARF10 in N. benthamiana seedlings showed that, while the green fluorescent protein (GFP) control was dispersed throughout the cell (Fig. 7A-C), GFP-tagged ArARF10 proteins were located in the nucleus (Fig. 7D-F), consistent with their putative function as transcription factors. In addition, ArARF10 fused to the GAL4 DNA-binding domain activated the expression of the His-3 reporter gene in yeast (Fig. S9), indicating that this gene was a transcriptional activator.
Overexpression of Ar-miR160a and ArARF10 in A. thaliana
Transgenic plants overexpressing ArARF10 had much more ARs than WT plants (Fig. 8 A, B, E, F), while transgenic plants overexpressing Ar-miR160a had fewer roots than WT plants (Fig. 8A, C-F). GUS staining indicated that the transgenic A. thaliana seedlings successfully overexpressed Ar-miR160a or ArARF10 (Fig. S10). Consistent with this, Ar-miR160a and ArARF10 were significantly more upregulated in the transgenic lines than in the WT lines (A. thaliana Columbia-0) (Fig. 8G). However, none of these genes was significantly differentially expressed in the transgenic plants except AtGH3.6, which was significantly upregulated in the transgenic plant overexpressing ArARF10 as compared to the WT (Fig. 8H).