The novel editing factor dsn3PLS-DYW was designed to bind specifically to the region upstream of the A. thaliana chloroplast rpoA-78691 editing site recognised by the natural editing factor CLB19. The REMSA results (Fig. 2) and the observed RNA editing in plants and bacteria expressing dsn3PLS-DYW (Figs. 3 and 5) show that this aim was achieved. CLB19 and dsn3PLS-DYW differ considerably (only 45% sequence identity in the motifs aligned to the same nucleotides), and notably most of the residues known to be implicated in sequence recognition differ between the two proteins. CLB19 recognises one other major site in Arabidopsis chloroplasts, the clpP1-69942 site, originally reported in 31. This site is not detectably bound by dsn3PLS-DYW in vitro (Fig. 2) and is not edited in plants or bacteria expressing dsn3PLS-DYW (Figs. 3 and 5). Hence, by rational design, we have succeeded in engineering an RNA editing factor that is more specific than its natural counterpart, whereas our previous attempts to reach the same goal by modifying the natural protein produced a far less dramatic shift in specificity 34. The fact that complementation of only the rpoA editing defect was sufficient to almost fully restore normal growth and chloroplast gene expression, at least under relatively low light conditions (Fig. 3), indicates that it is the loss of this editing event that primarily contributes to the strong phenotype of clb19 mutants. Under higher light conditions, growth defects become apparent even in plants expressing dsn3PLS-DYW (in a clb19 background) (Fig. 3); we cannot be certain whether this is due to the lower degree of editing of the rpoA-78691 site compared with wild-type plants or to the complete lack of editing of the clpP1-69942 site.
Interestingly, clb19 plants expressing the inactivated dsn3PLS-DYW(E70A) construct were phenotypically distinguishable from untransformed clb19 (Fig. 3), suggesting that they were partially complemented, and this was confirmed by the RNA-seq data indicating a low level (0.89%) of editing of the rpoA-78691 site (Figs. 3 and 4). As this construct was not active in E. coli (Fig. 5), we presume that this low-level editing is due to a weak association between the dsn3PLS-DYW(E70A) protein and one of the DYW ‘donor’ proteins known to form complexes with other editing factors, such as DYW2, which forms a complex with CLB19 32,33. From an evolutionary point of view, it is noteworthy that less than 1% editing at a single site is sufficient to give rise to phenotypic differences that could provide a selective advantage. This might hint at how new editing events may arise and become selected for.
The high-coverage RNA-seq that detected editing of rpoA-78691 in the dsn3PLS-DYW(E70A) samples also detected ‘off-target’ editing at numerous sites in plants expressing dsn3PLS-DYW and CLB19 (Fig. 4). To our knowledge, only one of these sites has previously been reported as an editing site 48: the ycf3-43350 site in intron 2 of the ycf3 transcript. We do not think that this event has any functional significance, as we did not detect any significant difference in ycf3 intron 2 splicing (Supplementary Figure S4) between plants expressing CLB19 (where the ycf3-43350 site is edited) and plants expressing dsn3PLS-DYW (where the ycf3-43350 site is not edited). This site is specifically edited by CLB19 and, according to our binding predictions (Fig. 4), is an even better match to CLB19 than either rpoA-78691 or clpP1-69942, and yet it is edited to a much lower extent. This may be due to a shorter half-life of the intron RNA, or to the A at position −1 relative to the editing site; purines at this position are known to have an inhibitory effect on editing 18,50,51 and are rare in our collection of off-target events. Other off-target events were detected at much lower levels, below 2% (Fig. 4). Perhaps surprisingly, no off-target events were detected in bacteria, despite the much greater sequence complexity of the transcriptome and therefore the higher probability of close matches to the target site occurring by chance. We suspect that there may be two explanations for this: firstly, the read coverage of the chloroplast transcripts is generally much higher than for E. coli transcripts, allowing the detection of lower rates of editing; and secondly, whereas the rpoA target is a low-abundance transcript in chloroplasts, the target transcript in the E. coli experiments is extremely highly expressed and may sequester a large fraction of the editing factor.
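The link between read coverage and the lowest detectable editing rate can be illustrated with a simple binomial sketch. This is not the statistical test used in this study: it assumes independent reads and ignores sequencing error, and the function names, the example editing rate and the two coverage values are purely illustrative.

```python
def editing_fraction(edited_reads: int, total_reads: int) -> float:
    """Fraction of reads carrying the edited base at one site."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return edited_reads / total_reads


def prob_at_least_one_edited_read(rate: float, coverage: int) -> float:
    """Probability of sampling at least one edited read at a site.

    Assumes each read is edited independently with probability `rate`
    (a simplifying assumption, not the authors' model).
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must lie between 0 and 1")
    return 1.0 - (1.0 - rate) ** coverage


# Illustrative values only: a 0.1% editing rate at low and high coverage.
for coverage in (100, 10_000):
    p = prob_at_least_one_edited_read(0.001, coverage)
    print(f"coverage={coverage}: P(>=1 edited read) = {p:.4f}")
```

Under these assumptions a rare editing event is unlikely to be sampled at all at low coverage but is almost certain to be sampled at high coverage, which is the first of the two explanations proposed above.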
We believe that the putative off-target events in chloroplasts are true editing events catalysed by the introduced editing factors because of the statistically significant difference in the transformed lines with respect to the controls, and because they are generally consistent with expectations based on predicted binding by dsn3PLS-DYW and CLB19. The putative off-target sites for CLB19 are generally consistent with the in vitro analysis of the contribution of individual CLB19 motifs to target binding, notably the major contribution of the 2nd P1 motif (5/35 combination TD, recognising G) 34. Promisingly for applications of synthetic editing factors, editing at off-target sites of dsn3PLS-DYW did not exceed 1.5% at any of the nine sites we detected. Although eight of these events lead to non-synonymous changes in coding sequences (Supplementary Table S1), such low amounts of editing are unlikely to be significant through any loss-of-function effect on the encoded protein. These off-target events are informative for the design of future synthetic editing factors as they provide information on the specificity of recognition (or the lack of it) of individual motifs. For example, the S2 motif aligns with an A in all 21 sites recognised by either dsn3PLS-DYW or CLB19 (Fig. 4), suggesting a hitherto unrecognised importance of this motif in determining site specificity. Other motifs proved to be less specific than expected; for example, the 5/35 combination NS, thought from previous data 26,27 to be relatively specific for C over U, did not prove to be so among the dsn3PLS-DYW off-target events, where in 12/18 cases NS motifs aligned with a U in the target (Fig. 4). This type of data will be helpful for optimising future designs.
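The kind of per-motif tally described above (which nucleotide a given motif aligns with across a set of edited sites) can be sketched as follows. The data structure, the function names and the toy alignments are hypothetical and are not the alignments shown in Fig. 4.

```python
from collections import Counter
from typing import Dict, List


def aligned_base_counts(sites: List[Dict[str, str]], motif: str) -> Counter:
    """Count which nucleotide `motif` aligns with across all sites.

    Each site is a dict mapping a motif label to the aligned nucleotide.
    Sites lacking the motif are skipped.
    """
    return Counter(site[motif] for site in sites if motif in site)


def fraction_aligned(sites: List[Dict[str, str]], motif: str, base: str) -> float:
    """Fraction of sites where `motif` aligns with `base`."""
    counts = aligned_base_counts(sites, motif)
    total = sum(counts.values())
    return counts[base] / total if total else 0.0


# Toy alignments (made up for illustration; motif labels are placeholders).
toy_sites = [
    {"S2": "A", "motif_x": "U"},
    {"S2": "A", "motif_x": "C"},
    {"S2": "A", "motif_x": "U"},
]

print(aligned_base_counts(toy_sites, "S2"))
print(fraction_aligned(toy_sites, "motif_x", "U"))
```

A motif that aligns with the same nucleotide at every site, as in the toy S2 column, would be a candidate specificity determinant, whereas a mixed column would suggest relaxed recognition.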
Binding of dsn3PLS-DYW to its target sequence in vitro and editing in vivo were strongly (but not completely) dependent on the presence of MORF proteins (Figs. 2 and 5). MORF2 and MORF9 were equally able to promote editing by dsn3PLS-DYW (Fig. 5), and the evidence suggests that the PPR-MORF-RNA complex that is formed contains multiple copies of the MORF protein (Fig. 2 and Figure S3). These results are entirely consistent with those obtained with a different synthetic protein based on consensus P-L-S motifs 41. Hopefully this confirmation helps remove some of the confusion concerning the role of MORF/RIP proteins in RNA editing.
The currently favoured biotechnological tools for RNA editing, the REPAIR and RESCUE systems composed of a base editor coupled to a deactivated Cas13, achieve a high degree of specificity via the short guide RNA complementary to the target RNA but, once bound, show significant promiscuity, resulting in undesirable off-target editing of any deaminable bases within a window of at least four nucleotides on either side of the editing site 8,9. In this study, we have demonstrated the potential of a new type of designer editing factor that can edit with high specificity. Natural PPR editing factors are extremely precise: editing almost always occurs at the 4th nucleotide 3’ of the nucleotide aligned with the S2 motif 25–27. This is also the case for dsn3PLS-DYW, as all the off-target events observed are consistent with this positioning of the editing factor relative to the edited C; indeed, no editing was detected at adjacent C residues where this would have been possible (sites 49646 and 69992). Thus it is reasonable to imagine that the specificity of ‘designer’ editing factors based on PPR-DYW scaffolds could ultimately exceed that of Cas13-ADAR fusions. On the other hand, designing the specificity of the PPR array remains complex due to the uncertain contributions of MORF cofactors and the C-terminal S2-E1-E2 motifs. Given that the Physcomitrella editing factors PPR56 and PPR65 can edit without MORF proteins present 17,18, it should be possible to design a synthetic editing factor that does not require cofactors for optimal specificity and editing activity. Of particular interest in this context are the monotypic S motif arrays found in putative editing factors in lycophytes 11, likely to be MORF-independent 6.
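The positional contrast drawn above can be made concrete with a minimal sketch. It encodes only the two rules stated in the text: a fixed offset of four nucleotides 3’ of the S2-aligned nucleotide for PPR editing factors, and a window of four nucleotides on either side of the target for the Cas13-based systems. The toy sequence, indices and function names are hypothetical, and the window function is a deliberate simplification rather than a model of REPAIR or RESCUE.

```python
from typing import List

# Edited C lies at the 4th nucleotide 3' of the nucleotide aligned with S2.
S2_TO_EDIT_OFFSET = 4


def predicted_edit_index(s2_index: int) -> int:
    """0-based index of the nucleotide a PPR editing factor is expected to edit."""
    return s2_index + S2_TO_EDIT_OFFSET


def ppr_predicted_edit(rna: str, s2_index: int) -> List[int]:
    """Return the single predicted edit index, or [] if it is not a C."""
    idx = predicted_edit_index(s2_index)
    if 0 <= idx < len(rna) and rna[idx].upper() == "C":
        return [idx]
    return []


def window_candidates(rna: str, centre: int, base: str = "C",
                      half_width: int = 4) -> List[int]:
    """All occurrences of `base` within +/- half_width of `centre` (simplified)."""
    lo = max(0, centre - half_width)
    hi = min(len(rna), centre + half_width + 1)
    return [i for i in range(lo, hi) if rna[i].upper() == base.upper()]


# Toy RNA (made up): S2 aligns with the A at index 2; three adjacent Cs follow.
TOY_RNA = "GUAUUCCCAU"
print(ppr_predicted_edit(TOY_RNA, 2))
print(window_candidates(TOY_RNA, predicted_edit_index(2)))
```

In this toy case the fixed-offset rule selects one of three adjacent C residues, whereas the window rule flags all three, mirroring the absence of editing at neighbouring C residues noted above.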
In conclusion, this work demonstrates the successful use of a synthetic PPR protein as an RNA editing factor, and lays the foundation for detailed structural and mechanistic studies of RNA editing. Designer PPR proteins represent an attractive multipurpose scaffold for targeted RNA binding, particularly as programmable RNA editing factors.