High UV dosage does not degrade RNA but yields fewer RBPs
Although RIC has been successfully applied to Arabidopsis cell suspensions, etiolated plants, seedlings and leaf mesophyll protoplasts, the number of identified proteins is lower than in mammalian cells [24]. This could reflect a difference in crosslinking efficiency between plant and mammalian cells. Plant tissues have a waxy cuticle and light acceptors in the chloroplast that can absorb short-wavelength UV light [25], which could lower the crosslinking efficiency. In addition, the leaf anatomy of the model plant Arabidopsis differs from that of monocots such as wheat and rice [26–28], so UV crosslinking efficiency may differ between species.
We therefore set out to optimize the UV dosage in the monocot model Brachypodium to increase the efficiency of RIC in monocot plants (Fig. 1A). We used Brachypodium leaf mesophyll protoplasts and 2-week-old seedlings as starting materials. For leaf mesophyll protoplasts, we applied the method previously used for Arabidopsis leaf mesophyll protoplasts [18]. For seedlings, we tested several UV treatments and dosages: 0.9 J/cm2, two treatments of 0.9 J/cm2, continuous 10 J/cm2, two treatments of 5 J/cm2 with the plants turned over, and four treatments of 5 J/cm2 (Fig. 1B). Based on the RIN values obtained by Bioanalyzer analysis, none of the UV treatments or dosages caused RNA degradation: the average RIN of three independent biological replicates was above 9 for all conditions (Fig. 1C). After RIC was applied to all seedling samples, silver-stained SDS-PAGE gels showed that the CL samples were enriched in proteins compared to the noCL samples (Fig. 1B). In addition, treatments with higher UV intensity gave more pronounced protein bands on silver staining, which suggests a higher protein concentration (Fig. 1B).
After mass spectrometry and protein analysis, we identified 106 RBPs under two exposures to 0.5 J/cm2 UV and 55 RBPs under four exposures to 5 J/cm2 UV, of which 39 were shared between the two conditions (Fig. 1D). Of these overlapping proteins, 85% contain RNA-binding domains or motifs (Table S2). Surprisingly, more RBPs were identified in the lower UV condition than in the higher UV condition, which is not consistent with the staining intensity on the SDS-PAGE gel (Fig. 1B). One possibility is that UV overexposure activates a repair mechanism for transcripts that degrades RNA-protein complexes [29–31]. In that case, only abundant RNA-protein complexes would be captured by RIC, which is consistent with the overlapping proteins containing RNA-binding domains or motifs.
In addition, we identified 128 RBPs in leaf mesophyll protoplasts. We pooled the RBPs from shoots across the different UV dosage conditions (123 RBPs in total) and compared them with the RBPs from leaf mesophyll protoplasts (Fig. 1D). Shoots and protoplasts share 47 RBPs, 85% of which contain RNA-binding domains or motifs. For further analysis, we pooled the RBPs identified from the different samples and UV treatments, giving 203 RBPs in total in Brachypodium (Table S2).
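The overlap and pooling steps described above amount to set operations on protein identifiers. The following is a minimal sketch, assuming each condition's RBPs are available as a list of identifiers; the function name and the identifiers are hypothetical placeholders, not proteins from Table S2.

```python
def compare_rbp_sets(ids_a, ids_b):
    """Return shared, condition-specific and pooled identifiers for two RBP lists."""
    a, b = set(ids_a), set(ids_b)
    return {
        "shared": a & b,   # detected in both conditions
        "only_a": a - b,   # detected only in the first condition
        "only_b": b - a,   # detected only in the second condition
        "pooled": a | b,   # union used for downstream analysis
    }

# Hypothetical identifiers for illustration only
low_uv = ["P1", "P2", "P3", "P4"]
high_uv = ["P3", "P4", "P5"]

result = compare_rbp_sets(low_uv, high_uv)
print(len(result["shared"]), len(result["pooled"]))
```

Pooling by set union counts each protein once, so the pooled total is smaller than the sum of the per-condition totals whenever the conditions overlap.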
Conserved RNA binding domains distinguish classic from candidate RBPs
Of the 203 RBPs identified in Brachypodium, 112 proteins (55%) with known RBDs were grouped as classic RBPs. Within the classic RBP group, RRM-domain proteins form the largest subgroup, followed by ribosomal proteins. The other classic RBPs contain DEAD-helicase, cold shock, zf-CCCH, YTH, S1, KOW, KH, zf-RNABP or telomerase RNA-binding domains (Fig. 2). The remaining 91 proteins (45%), which lack these classic domains or motifs, were defined as candidate RBPs. Within the candidate RBP group, the largest subgroup comprises proteins involved in photosynthesis, such as photosystem I reaction center subunit D and chlorophyll a/b-binding proteins (Fig. 2). However, whether these proteins are involved in RNA metabolism is unclear.
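The classic/candidate split can be expressed as a simple rule: a protein is classic if at least one of its annotated domains is a known RBD. The sketch below assumes a mapping from protein identifier to domain labels; the label strings are only a subset of the domains named in the text, and the protein identifiers are hypothetical. The source does not specify the annotation tool or its label format.

```python
# Subset of the RBDs named in the text; the exact label strings are assumptions
CLASSIC_DOMAINS = {"RRM", "KH", "YTH", "zf-CCCH", "S1", "KOW"}

def classify_rbps(domain_map, classic_domains=CLASSIC_DOMAINS):
    """Split proteins into classic (has a known RBD) and candidate (has none)."""
    classic, candidate = [], []
    for protein_id, domains in domain_map.items():
        if set(domains) & classic_domains:
            classic.append(protein_id)
        else:
            candidate.append(protein_id)
    return classic, candidate

# Hypothetical annotations for illustration only
annotations = {
    "protA": ["RRM"],
    "protB": ["DUF"],
    "protC": ["KH", "ATPase"],
    "protD": [],
}
classic, candidate = classify_rbps(annotations)
print(classic, candidate)

# Percentages reported in the text: 112 and 91 out of 203 proteins
print(round(100 * 112 / 203), round(100 * 91 / 203))
```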
RBPs with conserved RBDs were identified in the classic RBP group and novel RBPs were found in the candidate RBP group
The RBDs found in the classic and candidate RBP groups are listed in detail in Fig. 2 and Table S2. In the classic RBP group (Fig. 3), proteins containing RRM domains formed the largest subgroup (41). RRM-domain proteins were also the largest group of canonical RBPs identified by RIC in Arabidopsis [14, 15]. We identified classic RNA-binding zinc finger proteins, such as the Znf-CCCH and Znf-RanBP2 types (Fig. 3). The Znf-RanBP2 protein, which was also captured in Arabidopsis etiolated seedlings [15], has been reported to bind single-stranded RNA in humans [32]. Other emerging RBPs in plants were identified in this research, such as two YTH-domain proteins and three Alba proteins, which were also found in Arabidopsis [14, 15]. YTH-domain proteins (ECT2/3) in Arabidopsis have been shown to recognize m6A by binding to the 3'UTR of target RNAs, and to regulate trichome morphogenesis and leaf development by facilitating the degradation of ECT2-bound transcripts [33, 34]. The Alba domain is closely related to the ancient RNA-binding IF3-C fold [35], and Alba proteins are regarded as multifunctional proteins that participate in genome organization, translation and RNA metabolism [36]. Twenty-eight proteins containing a YTH domain are predicted in Brachypodium by InterPro (http://www.ebi.ac.uk/interpro/entry/IPR007275/proteins-matched?taxonomy=15368), but their functions have not been reported in Brachypodium.
Apart from the classic RBPs, 45% of the identified RBPs are novel RNA-binding proteins in Brachypodium for which no RNA-binding function has been proposed previously. Eleven proteins involved in photosystems were strongly enriched in the candidate RBP group. In addition, proteins containing DUF, ATPase and UspA domains were identified. Actin family proteins were also identified in Brachypodium, as they were in Arabidopsis etiolated seedlings [15].
As in RIC studies of other species, enzymes were also captured in Brachypodium, such as serine fructose-bisphosphate aldolase (FBPase) and transketolase, which are involved in the Calvin cycle for carbon fixation. Together with the photosystem proteins, this suggests that chloroplast proteins are well represented among the proteins with RNA-binding activity. Glutamine synthetase, glycosyltransferase, pectinesterase, transferase and cellulose synthase were also captured in Brachypodium. It will be interesting to determine whether the enzymes captured in this research genuinely bind RNA.
GO and KEGG enrichment analysis
According to the GO molecular function analysis, the proteins enriched among the classic RBPs were annotated as having RNA-binding activity (p < 0.05), including mRNA binding and structural constituent of ribosome (Fig. 3A and Table S3). The subcategories of mRNA binding were RNA binding, nucleic acid binding, organic cyclic compound binding and heterocyclic compound binding (Fig. 2A). The enrichment of RNA-binding functions among the classic RBPs demonstrates the high efficiency of this interactome capture and the accuracy of the domain search method used for the analysis. For the candidate RBPs, the most enriched molecular functions were molecular function, catalytic activity and chlorophyll binding (Fig. 2B). However, the GO enrichment was much stronger in the classic group than in the candidate group.
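The source does not state which statistical test produced the enrichment p-values. A one-sided hypergeometric test is a common choice for GO term enrichment and is assumed in the sketch below; the function name and the counts are illustrative, not values from Table S3.

```python
from math import comb

def hypergeom_enrichment_p(k, n, K, N):
    """One-sided p-value P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: annotated proteins in the background
    K: background proteins carrying the GO term
    n: proteins in the query list (e.g. the classic RBPs)
    k: query proteins carrying the GO term
    """
    total = comb(N, n)
    # Sum the upper tail; comb returns 0 for impossible combinations
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return tail / total

# Illustrative counts only: 3 of 4 query proteins carry a term found in 5 of 20
p_value = hypergeom_enrichment_p(k=3, n=4, K=5, N=20)
print(p_value < 0.05)
```

In practice many GO terms are tested at once, so a multiple-testing correction would normally be applied to the resulting p-values; whether the original analysis did so is not stated.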
In addition, KEGG pathway enrichment was analyzed for the classic and candidate RBPs (Table S3). For the classic RBPs, apart from ribosomal proteins, the most enriched KEGG pathways were the mRNA surveillance pathway, spliceosome, RNA transport and RNA degradation. As with the GO enrichment analysis, this indicates that classifying RBPs as classic or candidate based on domain information is efficient and accurate. For the candidate RBPs, the most enriched KEGG pathways were metabolic pathways, carbon metabolism, photosynthesis, glyoxylate and dicarboxylate metabolism, fructose and mannose metabolism, biosynthesis of secondary metabolites and biosynthesis of amino acids. More enzymes therefore appear to be detected among the candidate RBPs. However, the candidate RBPs require validation to test their RNA-binding ability.
A core RBPome identifies RBPs in plant-specific processes
In plants, RIC was first applied to the model plant Arabidopsis, where RBPs were identified in leaf mesophyll protoplasts (325) [18], cell suspension (913) and leaves (236) [14], and etiolated seedlings (746) [15] (summarized in Fig. 4A). The different criteria and methods used to identify RBPs in each article make it difficult to combine these data. Among all RBPs identified in Arabidopsis, the largest overlap is between leaf mesophyll protoplasts and etiolated seedlings (129 RBPs). Since shoot tissue and leaf mesophyll protoplasts were used to identify RBPs in Brachypodium, it is meaningful to compare them with RBPs identified in the same tissues of Arabidopsis. We therefore pooled the RBP data from Arabidopsis leaves and leaf mesophyll protoplasts for comparison with the RBPs identified in Brachypodium.
Using InParanoid to compare orthologs between Arabidopsis and Brachypodium, we extracted 10689 clusters from the two species as the reference database for further analysis. The Arabidopsis RBPs from leaves and leaf mesophyll protoplasts matched 343 protein clusters in this database, while the Brachypodium RBPs matched 130 protein clusters. Comparing the orthologs in these two subsets yielded 57 shared clusters, which include 82 RBPs from Arabidopsis and 62 RBPs from Brachypodium. We termed these the core RBPs (Fig. 4B and Table S4).
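The core RBP definition can be read as follows: an ortholog cluster is core when it contains at least one identified RBP from each species. The sketch below assumes the ortholog clusters have already been parsed into a dictionary; the data structure, function name and identifiers are hypothetical and do not reflect the actual InParanoid output format.

```python
def find_core_clusters(clusters, rbps_a, rbps_b):
    """Keep clusters containing at least one RBP from each species.

    clusters: dict mapping cluster id -> (set of species-A members, set of species-B members)
    Returns a dict mapping cluster id -> (species-A RBPs, species-B RBPs) in that cluster.
    """
    core = {}
    for cluster_id, (members_a, members_b) in clusters.items():
        hits_a = members_a & rbps_a
        hits_b = members_b & rbps_b
        if hits_a and hits_b:
            core[cluster_id] = (hits_a, hits_b)
    return core

# Hypothetical clusters and RBP lists for illustration only
clusters = {
    "c1": ({"At1"}, {"Bd1", "Bd2"}),
    "c2": ({"At2"}, {"Bd3"}),
    "c3": ({"At3", "At4"}, {"Bd4"}),
}
arabidopsis_rbps = {"At1", "At3", "At4"}
brachypodium_rbps = {"Bd1", "Bd2", "Bd3", "Bd4"}

core = find_core_clusters(clusters, arabidopsis_rbps, brachypodium_rbps)
n_core_a = sum(len(hits_a) for hits_a, _ in core.values())
n_core_b = sum(len(hits_b) for _, hits_b in core.values())
print(len(core), n_core_a, n_core_b)
```

Because a cluster can hold several in-paralogs per species, the number of core RBPs in each species can exceed the number of core clusters, as with the 57 clusters covering 82 and 62 RBPs reported above.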
KEGG analysis was used to compare the core RBPs in Brachypodium and Arabidopsis (Fig. 4C and Table S4). As expected, the largest number of core RBPs belong to ribosomes, followed by carbon fixation in photosynthetic organisms and carbon metabolism. These enzymes are highly conserved between Arabidopsis and Brachypodium. Proteins involved in the mRNA surveillance pathway, RNA transport and the spliceosome are also highly conserved in both species. It will be interesting to explore the roles of these conserved core RBPs in plants. Notably, RNA degradation is absent from the Arabidopsis core RBPs, which may indicate that RIC does not capture all RBPs in plants.
Validation methods for plant RBPs need further optimization
To validate the candidate RBPs identified in this research, we applied silica-based solid-phase extraction followed by western blotting [28]. First, we transformed the selected RBPs fused to a green fluorescent protein (GFP) tag (35S:RBP:GFP) into Brachypodium leaf mesophyll protoplasts and applied UV treatment after incubation. Silica matrix-based columns were then used for nucleic acid purification, in which crosslinked nucleic acid-protein complexes are retained. The RNA-protein interaction can be visualized by western blotting with a GFP antibody to detect the GFP tag. To confirm that the silica columns capture RNA-protein complexes, we visualized the proteins isolated from CL and noCL leaf mesophyll protoplasts by silver staining (Fig. 5A). The CL samples were enriched in proteins compared to the noCL samples, which indicates that the combination of silica columns and UV crosslinking can be used for validation. We randomly selected classic RBPs (Bradi1g20440.1 (RRM-domain protein), Bradi2g61835 (LSM-domain) and Bradi5g17360.1 (PPR-domain)) as positive controls, together with candidate RBPs (Bradi4g08800.1 (carbon fixation)). As seen on the western blot (Fig. 5B), the GFP-tagged proteins were well expressed in protoplasts, but the target RBPs could not be detected after silica purification of the RNA-protein complexes from crosslinked samples. Possible reasons are discussed below.