We aimed to develop an easy-to-perform and quick gain-of-signal fluorescent assay to monitor base editing activity in a plasmid-based format that accommodates a large number of sequences and can be easily adapted to various cell types. The assay should report exclusively on the efficiency of base editing without being sensitive to potential indels generated by base editors. BEAR, the assay we designed to meet this demand, is based on a split GFP coding sequence separated by the last intron of the mouse Vim gene. The sequence of the functional 5’ splice site (5’ss) is altered in such a way that it abrogates splicing and thus GFP fluorescence, but both splicing and GFP fluorescence can be restored by applying base editors (Fig. 1).
This rationale could not be implemented simply by disrupting the canonical ‘GT’ 5’ splice site, either at the first position from ‘G’ to ‘A’ to be compatible with ABEs, or at the second position from ‘T’ to ‘C’ for CBEs. Both ‘AT’ and ‘GC’ are known to function as very rare, non-canonical splice sites in the human genome 31. We confirmed this by transfecting plasmids with the canonical (‘GT’) and non-canonical (‘AT’ or ‘GC’) 5’ splice sites into both N2a and HEK293T cells and measuring the number of GFP positive cells afterwards (Supplementary Fig. 1).
The flanking sequences of 5’ss modulate the efficiency of the splicing process 32. This exon-intron junction contains the 5’ NNG and 3’ RAGT flanking consensus sequences (Fig. 2c, Supplementary Fig. 2a), which have been reported to best enhance splicing 32. In order to find appropriate disrupted and edited sequence pairs which fully diminish and support splicing, respectively, we have systematically modified the other, non-targeted nucleotide of 5’ss to ‘AN’ and ‘GN’ for ABEs (Fig. 2a) and to ‘NC’ and ‘NT’ for CBEs (Fig. 2b) and/or the 5’ or 3’ flanking sequences in both the disrupted and in the pre-edited plasmids (here and throughout the manuscript pre-edited plasmids are the positive controls generated by molecular cloning to represent the maximum fluorescence that can be reached by editing). Constructs were transfected into both HEK293T and N2a cells, and the cells were analyzed by flow cytometry (Fig. 2a, b; Supplementary Fig. 2b, c; respectively). Altering only one of the bases of the 5’ss to any of the three other bases while keeping the flanking region intact was found to preserve fluorescence. Altering both bases of 5’ss or one base and either the 3’ or the 5’ flanking consensus sequence generally abrogated the fluorescent signals. When both flanking regions were altered, even the canonical ‘GT’ 5’ss sequence could be insufficient to efficiently splice and to recover GFP fluorescence. These experiments have revealed a few candidate combinations for which no detectable fluorescent signal is apparent with the disrupted splice site sequence, but it is present in case of the corresponding pre-edited sequences (Fig. 2).
Next, we have tested whether base editors can indeed recover fluorescence, using some of the best candidate constructs identified in Figure 2 and in Supplementary Figure 2. Throughout the study, the adenine and cytosine base editors used are the codon optimized ABERA (shortened as ABE) and FNLS-CBE (shortened as CBE) variants, respectively, described by Zafra et al. 23, unless indicated otherwise. The five selected plasmids (p1, p9, p14, p15 and p24 in Fig. 2a, b) were co-transfected with ABE or CBE into both HEK293T and N2a cells, and the number of GFP positive cells was measured. With all selected constructs, ABE and CBE successfully recovered fluorescence, with values ranging from 31% to 91% in HEK293T (Fig. 2c) and from 45% to 75% in N2a cells (Fig. 2d). Interestingly, both ABE and CBE can correct the disrupted 5’ss in p9 and restore GFP fluorescence, converting ‘AC’ to either ‘GC’ or ‘AT’, respectively. Since both ABE and CBE reach the same levels on p9 as detected on the other best constructs (Fig. 2c, d), we could further examine both of these base editors on this common disrupted plasmid, named BEAR-GFP (Fig. 1).
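The dual compatibility of the ‘AC’ site can be illustrated with a minimal sketch. It assumes, as a simplification, that ABE acts only on an ‘A’ at the first 5’ss position and CBE only on a ‘C’ at the second; the function names and the set of splice-competent dinucleotides are illustrative labels, and the influence of the flanking sequences described above is ignored.

```python
# Dinucleotides the text reports as able to support splicing:
# canonical 'GT' plus the rare non-canonical 'GC' and 'AT'.
# Assumption: flanking-sequence effects are ignored in this sketch.
SPLICE_COMPETENT = {"GT", "GC", "AT"}


def abe_edit(ss: str) -> str:
    """Illustrative A-to-G conversion at the first 5'ss position."""
    return ("G" if ss[0] == "A" else ss[0]) + ss[1]


def cbe_edit(ss: str) -> str:
    """Illustrative C-to-T conversion at the second 5'ss position."""
    return ss[0] + ("T" if ss[1] == "C" else ss[1])


def restores_splicing(disrupted: str, editor) -> bool:
    """True if the site is non-functional before and functional after editing."""
    edited = editor(disrupted)
    return disrupted not in SPLICE_COMPETENT and edited in SPLICE_COMPETENT


for name, editor in (("ABE", abe_edit), ("CBE", cbe_edit)):
    print(name, "AC ->", editor("AC"), restores_splicing("AC", editor))
```

Under these assumptions ‘AC’ is the only disrupted dinucleotide that either editor alone can turn into a splice-competent site, which is consistent with choosing it for the common BEAR-GFP plasmid.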
Figure 2e shows that fluorescence is not recovered when this construct is targeted by a single nickase or a nuclease SpCas9, supporting that the method exclusively informs about base editing. We have also found that the nuclease inactive (dead) base editor variants dABE and dCBE are also capable of correcting the 5’ss, however, with lower efficiency, as indicated by the recovered fluorescence signals of 36% and 18% for dABE and dCBE, respectively (Fig. 2e).
As an advantage, our method is not restricted to only a few target sequences. The intronic sequence between the PAM and the 3’ flanking consensus site can be varied without restrictions. This also allows the PAM sequence, and thus the editing window, to be moved with respect to the base position to be edited (Supplementary Fig. 3a). Furthermore, the exonic part of the target sequence can also be altered by applying different fluorescent proteins with BEAR (Supplementary Fig. 3b) or by moving the intron’s position within the protein coding sequence (Supplementary Fig. 3c and d). Thus, even when the seven-nucleotide-long consensus flanking part of the target sequence is preserved unaltered, some tens of millions of different target sequences can be examined using BEAR. Since either the non-edited nucleotide of the 5’ss or one of the flanking consensus sequences may also be varied (Fig. 2a, b), our method allows the targeted base to be examined in almost any sequence context.
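The order of magnitude quoted above can be checked with a short calculation. This is a sketch under one stated assumption that is not given in the text: a 20-nucleotide target in which only the seven consensus positions are fixed and every other position is free to take any of the four bases.

```python
TARGET_LENGTH = 20    # assumption: 20-nt target sequence for SpCas9
FIXED_CONSENSUS = 7   # consensus flanking positions kept unaltered (from the text)
BASES = 4             # A, C, G, T

free_positions = TARGET_LENGTH - FIXED_CONSENSUS
n_targets = BASES ** free_positions

print(f"{free_positions} free positions -> {n_targets:,} possible targets")
```

With 13 free positions the count is 4^13, which falls in the tens of millions.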
To see whether the efficiency of base editing of target sequences in a plasmid or in a genomic context is governed by the same factors, we have generated stable HEK293T cell lines harboring either a disrupted GFP or a disrupted mScarlet protein, containing exactly the same exons, introns and target sequences as the BEAR plasmids have. When these cell lines were targeted by ABE and the corresponding sgRNA, fluorescence was efficiently recovered (Fig. 3a). We have compared the BEAR-GFP plasmid with the BEAR-GFP cell line, regarding their effects on the extent of fluorescence recovery, using 32 sgRNAs containing no, one or two consecutive mismatching nucleotides at different positions (Fig. 3b). The assays on the cell line and on the plasmid yielded highly similar outcomes (r=0.89), indicating that the plasmid-based assay properly mirrors the activities of ABEs on sequences in a genomic context.
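A comparison of this kind reduces to a Pearson correlation between two paired activity profiles. The sketch below shows the calculation in plain Python; the function name and the five value pairs are hypothetical and are not the measurements behind Figure 3b.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(ss_x * ss_y)


# Hypothetical % GFP-positive cells for the same five sgRNAs
# measured on the plasmid and on the stable cell line.
plasmid = [80.0, 62.0, 35.0, 12.0, 3.0]
cell_line = [71.0, 66.0, 28.0, 15.0, 2.0]

print(f"r = {pearson_r(plasmid, cell_line):.2f}")
```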
To examine whether fluorescence recovery definitely results from successful base editing, we have employed one matched and one, two or three base-mismatched sgRNAs with ABE (Fig. 3c) or CBE (Fig. 3d) on the BEAR-GFP cell line, and monitored base editing activity by measuring the number of GFP positive cells, as well as by quantifying editing using EditR 33. The measured fluorescence intensity has been found to be proportional to the level of actual base editing (r=0.98). Sequencing has also revealed that in case of ABE not only the 5’ss sequence, but also a bystander adenine has been edited to a certain extent (Fig. 3e). Constructing and testing the corresponding disrupted and pre-edited plasmids has proved that editing the second, bystander ‘A’ with or without the adenine of the 5’ss sequence does not decrease or increase GFP fluorescence, respectively (Fig. 3f). In case of CBE, no bystander nucleotides have been edited, but the targeted cytosine has been converted to guanine, although to a smaller extent (Fig. 3g), as it has also been reported in case of several target sequences 22, 30, 34. By constructing the corresponding pre-edited plasmids, we have verified that the increase in fluorescence is derived from the intended editing of ‘AC’ to ‘AT’ only, without a contribution from ‘AC’ to ‘AG’ editing of 5’ss (Fig. 3h). Taken together, these data support that the BEAR method gives a faithful account of the activities of a base editor.
Increased base-editing without nicking the target DNA
Since BEAR is sensitive enough to detect the activity of nuclease inactive base editors (Fig. 2e), we have tested whether it could be used as a marker for those cells in which efficient base editing occurs, in order to increase the efficiency of base editing without intentionally nicking the DNA. We have co-transfected the BEAR-GFP plasmid with dABE and the corresponding sgRNAs into the BEAR-mScarlet cell line, and we have found that dABE has restored mScarlet fluorescence in 20% of the cells. Thirty-one percent of the cells in the transfected population exhibited mScarlet fluorescence, and 51% of the cells showed fluorescence for both mScarlet and GFP, indicating that the cells being active in processing the A-to-G base conversion on the plasmid are also efficient on the genomic DNA (Fig. 4a). We have also co-transfected the BEAR-mScarlet plasmid with dABE into the BEAR-GFP cell line. In this experiment BEAR-enrichment has increased the percentage of edited cells from 22% to 45%, highly exceeding the enrichment that we measured in the transfected population (30%; Fig. 4a).
Employing dABE and dCBE on endogenous genomic targets, namely on FANCF site 2 (Fig. 4b, c), VEGFA site 3 (Fig. 4d), and HEK site 4 (Fig. 4e), we have further tested the potential of BEAR to increase the efficiency of base editing without intentionally nicking the DNA. Using BEAR as a marker for increasing the efficiency of base editing (achieved by cell sorting of BEAR-positive cells) we have revealed an up to 12-fold enrichment for dCBE (FANCF site 2) and an up to 30-fold enrichment for dABE editing (VEGFA site 3) (Fig. 4c, d). In these experiments base editing activities have reached a maximum of 43% efficiency with dABE on VEGFA site 3 and 41% efficiency with dCBE on HEK site 4 (Fig. 4d, e). For comparison, nABE and nCBE (containing nickase Cas9) editing was monitored (without enrichment) on the same target sites (Fig. 4b-e). These experiments have indicated that BEAR facilitates base editing on genomic targets by dABE and dCBE without intentionally nicking the DNA, with efficiencies equal to or greater than that of nABE and nCBE.
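For orientation, fold enrichment can be expressed as a simple ratio. The sketch below assumes it is defined as the editing efficiency in the sorted BEAR-positive cells divided by that in the unsorted population; this definition, the function name and the example values are assumptions for illustration and are not taken from Figure 4.

```python
def fold_enrichment(sorted_pct: float, unsorted_pct: float) -> float:
    """Ratio of editing efficiency in sorted cells to that in unsorted cells."""
    if unsorted_pct <= 0:
        raise ValueError("unsorted efficiency must be positive")
    return sorted_pct / unsorted_pct


# Hypothetical editing efficiencies (% edited alleles).
print(fold_enrichment(sorted_pct=30.0, unsorted_pct=2.5))
```

A ratio of this kind grows quickly when the unsorted efficiency is low, so large fold values are best read alongside the absolute efficiencies reported above.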
On-target activity of base editors with increased fidelity SpCas9 variants
Several studies have reported that CBEs show higher or similar mismatch tolerance compared to ABEs, which results in various Cas9-dependent off-target effects 10, 29, 35, 36. Applying increased fidelity variants may seem to be a plausible approach to decrease the Cas9-dependent off-target effects of base editors; however, only a few attempts at combining an increased fidelity variant with a base editor are reported in the literature 21, 29, 37, 38, 39, 40. To get a more comprehensive understanding of these effects, we have used BEAR to compare the activity and mismatch tolerance of CBE and ABE containing six increased fidelity SpCas9 variants: eSpCas9, SpCas9-HF1, HypaSpCas9, Hypa-R661ASpCas9 (i.e. HypaSpCas9 which also contains the R661A mutation), evoSpCas9 and HeFSpCas9 41, 42, 43, 44, 45. Given that the ‘AC’ 5’ss sequence can be edited by both ABEs and CBEs in the BEAR-GFP plasmid (Fig. 1), the two editors can be compared on the same targets, using the same sgRNAs. Accordingly, we have compared their on-target base editing activities on 34 targets in which the 5’ss and flanking regions, as well as their distance from the PAM sequence, were kept fixed and only the PAM proximal 10 nucleotides were varied. Thus, for both base editors, the sequences in their editing windows and the bases surrounding the edited bases were kept unaltered. Neighboring (+/-1) nucleotides can strongly influence the efficiency of base editing; ‘GAC’ and ‘ACA’, employed here for ABE and CBE, respectively, have been shown to be associated with medium level activities for both base editors 46.
Lacking data suggesting the opposite, we have expected that the differences in the 34 target sequences (in the PAM proximal 10 nucleotides) should primarily affect the interactions between the fused SpCas9 nuclease partner of the base editors and the targets, thus this experimental design was specifically suited to study how the binding and cleavage propensities of SpCas9 variants affect the base editors’ activities.
The results illustrated in Figure 5a indicate that nABE is highly active on all 34 targets with 73% mean activity (its efficiency ranges between 62% and 89%). dABE was found to be less active with 24% mean activity. In theory, the activity profile of dABE is influenced by the sequence specificity of both the TadA deaminase and the binding of SpCas9. In contrast, the activity profile of nABE is also influenced by the nicking activity of SpCas9, which aims to bias the repair system in order to correct the mismatching bases of the unedited strand, and thus, to increase editing efficiency 7. The activity profile of dABE and nABE shown in Figure 5a indicates a weak correlation (r=0.29; Supplementary Fig. 4a), suggesting that the nicking activity of SpCas9 in nABE substantially alters the relative efficiency of nABE compared to dABE on these target sequences.
Former studies of increased fidelity SpCas9 nuclease variants have shown that these nucleases have a trade-off between efficiency and fidelity, and can be ranked according to their average activities, with evo- and HeFSpCas9 showing much lower average activities than the rest of the increased fidelity variants 45, 47, 48. In our experiments, the increased fidelity variants of ABE exhibited gradually decreasing activity from ABE to HeF-ABE, the latter showing minimal activity (4% on average), twice the background activity of nickase Cas9 (Fig. 5a), which parallels the effect seen with increased fidelity nucleases. Furthermore, the activity profiles of three nuclease variants, SpCas9-HF1, HypaSpCas9 and evoSpCas9, are reported to show low correlation with that of the WT-SpCas9 nuclease, while the activity profile of evoSpCas9 shows higher correlation with SpCas9-HF1 and HypaSpCas9 than with eSpCas9 or the WT nuclease 47. The increased fidelity ABE variants demonstrate a similar pattern, as shown in Figure 5a. The activity profiles of HF1-, Hypa- and evo-ABE show weaker correlations with nABE (r=0.46–0.54), while evo-ABE shows higher correlations with HF1- and Hypa-ABE (r=0.86 and 0.93, respectively) than with e-ABE or ABE (r=0.66 and 0.54, respectively). These findings support that the activity profiles of increased fidelity ABE variants are primarily determined by the sequence specificities of the partner SpCas9 variants (Supplementary Fig. 4a). Supplementary Figure 4c shows that of the two codon-optimized adenine base editors, ABEmax 22 has higher activity than nABE (ABERA 23), their average activities being 83% and 73%, respectively.
nCBE is less active on these 34 targets (its average editing activity is 50%), and shows higher sensitivity for sequence variations: its efficiency ranges between 26% and 69% (Fig. 5b). dCBE is considerably less active (its average activity is 12%) and its efficiency varies from 5% to 21% (Fig. 5b). Their activity profiles correlate (r=0.51, Supplementary Fig. 4b) more than the activity profiles of the ABEs, suggesting that the nicking activity has a weaker relative influence on CBE’s sequence dependence than on that of ABEs.
A decreasing effect of increased fidelity mutations from e- to evo- and HeF-CBE variants on the average activities of CBE is also evident, although this decrease is much less prominent than in the case of the ABE variants: their average activity decreases from 50% to 22%, compared to the 73% to 4% decrease seen with ABEs (Fig. 5b). The activity profile of CBE strongly correlates with those of the increased fidelity HF1-, Hypa-, and evo-CBE variants (r=0.69–0.86, Supplementary Fig. 4b), and the correlations of evo-CBE with HF1-, Hypa-, e-CBE and CBE (r=0.73–0.84) are more similar to one another, which is also in contrast with the activity profiles characterizing the increased fidelity ABE (Supplementary Fig. 4a) and nuclease 47 variants. These data suggest that, in strong contrast to ABEs, the activity profiles of the CBE variants are determined more by factors other than the properties of the increased fidelity SpCas9 nucleases, presumably by the sequence specificity of the deaminase partner and the subsequent repair process. The differing activities of the CBE variants across the 34 targets also suggest that for CBE, more than for ABE, these latter factors are affected by the sequence features of the PAM proximal 10 nucleotides.
We have also examined the three CBEs whose sequences have been codon optimized by two independent research groups 22, 23, and found that CBE (FNLS-CBE) exhibits higher average activities than BE4max or AncBE4max (50% vs. 40% and 39% respectively) (Supplementary Fig. 4d).
Mismatch tolerance of ABE, CBE and their increased fidelity variants
We have compared the mismatch tolerance of ABE (Fig. 6a) and CBE (Fig. 6b) with that of their increased fidelity variants, employing 50 mismatching sgRNAs (Target 1 from Fig. 5) in which the positions of one to five consecutive mismatches were systematically varied along the full length of each sgRNA. Examining ABE, we have found that it tolerates sgRNAs containing one or two mismatches in all the positions examined, with an average of 71% and 37% normalized activity, respectively (Supplementary Fig. 5a). dABE exhibits slightly higher fidelity, which is more apparent with the sgRNAs containing two mismatching positions (normalized average activity: 15%). The mismatching profiles of nABE and dABE show a strong correlation (r=0.88, Supplementary Fig. 5b), which is interesting since the off-target effects of the active and inactive forms of SpCas9 have been reported to differ considerably 49. Based on this consideration, we expected a weaker correlation, similar to that between the on-target activities of nABE and dABE.
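A mismatch panel of this type can be generated programmatically. The following sketch tiles non-overlapping runs of one to five consecutive mismatches along a spacer; the spacer sequence, the choice of the complementary base as the mismatch, and the tiling scheme are all assumptions for illustration, so the output does not reproduce the 50 sgRNAs used here.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def mismatch_guide(spacer: str, start: int, run_length: int) -> str:
    """Swap `run_length` consecutive bases (0-based `start`) for their complements."""
    if run_length < 1 or start < 0 or start + run_length > len(spacer):
        raise ValueError("mismatch run falls outside the spacer")
    run = "".join(COMPLEMENT[b] for b in spacer[start:start + run_length])
    return spacer[:start] + run + spacer[start + run_length:]


def tiled_mismatch_guides(spacer: str, run_length: int) -> list:
    """Non-overlapping mismatch runs tiled from the first spacer position."""
    last_start = len(spacer) - run_length
    return [mismatch_guide(spacer, s, run_length)
            for s in range(0, last_start + 1, run_length)]


spacer = "GACGTTACCGGATCAGTCAA"  # hypothetical 20-nt spacer, not from the study
for n in range(1, 6):
    guides = tiled_mismatch_guides(spacer, n)
    print(f"{n} consecutive mismatch(es): {len(guides)} guides")
```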
Regarding the increased fidelity ABE variants, five of them have been tested on the BEAR-GFP plasmid (Target 1 from Fig. 5), employing the same 50 mismatching sgRNAs. HeF-ABE was excluded from these experiments due to its low on-target activity. Increased fidelity mutations were found to decrease the mismatch tolerance of ABE (Fig. 6a). The fidelity of the same SpCas9 nuclease variants has been reported to increase to a great extent from eSpCas9 to evo- and HeFSpCas9 45, 47, 48. Remarkably, these fidelity increases are also evident in the mismatch tolerance of the ABE variants when sgRNAs mismatching in one position are employed (Fig. 6a, Supplementary Fig. 5a). In contrast, with almost all sgRNAs containing two or more mismatches, each increased fidelity ABE variant has been found to exhibit only background-level activities. Interestingly, increased fidelity ABE variants exhibit higher specificities on this target than dABE (Fig. 6a).
Regarding the mismatch tolerance of the CBE variants, tested using the same 50 mismatching sgRNAs (Fig. 6b), we have found that nCBE tolerates one or two mismatches in all the positions examined, with an average normalized activity of 100% and 61% when the sgRNAs include mismatches in one or two positions, respectively (Supplementary Fig. 5c). In turn, dCBE exhibits slightly higher fidelity, which is more apparent with the sgRNAs containing two mismatching positions (normalized average activity: 44%). The mismatching profiles of nCBE and dCBE show a strong correlation (r=0.87, Supplementary Fig. 5d) which is similar to that seen with nABE and dABE (Supplementary Fig. 5b).
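The normalized activities quoted in this section can be read as activities relative to the fully matched sgRNA. The sketch below assumes that definition (mismatched divided by matched, times 100), which is not spelled out in the text; the function name and all values are hypothetical.

```python
from statistics import mean


def normalized_activity(mismatched_pct: float, matched_pct: float) -> float:
    """Activity with a mismatched sgRNA as a percentage of the matched sgRNA."""
    if matched_pct <= 0:
        raise ValueError("matched activity must be positive")
    return 100.0 * mismatched_pct / matched_pct


matched = 60.0                       # hypothetical % GFP-positive, matched sgRNA
one_mismatch = [58.0, 61.0, 45.0]    # hypothetical values, one mismatch each
two_mismatches = [40.0, 22.0, 9.0]   # hypothetical values, two mismatches each

for label, values in (("1 mismatch", one_mismatch),
                      ("2 mismatches", two_mismatches)):
    avg = mean(normalized_activity(v, matched) for v in values)
    print(f"{label}: {avg:.0f}% normalized average activity")
```

Under this definition individual values can exceed 100% when a mismatched sgRNA happens to outperform the matched one, so an average of 100% does not require every guide to be fully tolerated.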
Regarding the increased fidelity CBE variants, all six of them have reached sufficiently high on-target activity on the BEAR-GFP plasmid; thus, all six have been investigated with the previous set of 50 mismatching sgRNAs. Interestingly, although an overall increase in specificity towards the highest fidelity evo- and HeF-CBE has been evident (Fig. 6b), this effect is much less prominent than in the case of the increased fidelity ABE variants (Fig. 6a). Compared to ABEs, increased fidelity CBE variants exhibit lower specificity. Specifically, while the ABE variants show target specificity resembling the background, characterized by 4–6% and 2–5% of normalized average activity with sgRNAs mismatching in two or three positions, respectively (Supplementary Fig. 5a), the CBE variants exhibit 16–27% and 6–12% of normalized average activity with the respective mismatching sgRNAs (Supplementary Fig. 5c).
Next, we have tested the mismatch tolerance of ABEmax and xABE (which contains the nickase version of xSpCas9 50) along with the previously used increased fidelity ABE variants on target 7 with the same set of 50 mismatching sgRNAs (Supplementary Fig. 6a). Compared to ABE, ABEmax has been found to show lower specificity, but their mismatch profiles are nearly identical (r=0.96, Supplementary Fig. 6b). xABE, which is effective on targets with loosened NG-like PAMs and is also reported to possess increased fidelity, has been found to exhibit slightly higher specificity than nABE, but lower specificity than any of the other variants examined on this target, including dABE. Its mismatch profile seems to be different from that of the increased fidelity ABE variants (r=0.14–0.38, Supplementary Fig. 6b), while all four increased fidelity ABE variants show strong correlations with each other in their mismatch tolerance profiles (r=0.84–0.93, Supplementary Fig. 6b). A similar difference between the activities of xSpCas9 and the other increased fidelity nuclease variants has recently been reported in two studies 47, 48.
To see whether these observations are specific to the target examined or are more general characteristics of these base editor variants, we have investigated the mismatch tolerance of ABE and CBE variants on another three targets (targets 2, 6 and 17; Supplementary Fig. 7a-c) using the same approach. These experiments have confirmed the conclusions drawn from our previous findings shown in Figure 6 and Supplementary Figure 6. Namely, (i) CBE is more tolerant to mismatches than ABE is, although ABE also shows a considerable target-dependent mismatch tolerance. (ii) Their activity profiles investigated using the same set of mismatching sgRNAs show strong correlations (r=0.93–0.96), arguing that their mismatch tolerances are primarily influenced by the specificities of SpCas9 cleavage activity as seen also with target 1 in Figure 6 earlier. (iii) The effect of variants from higher positions of fidelity ranking of increased fidelity SpCas9s is more prominent in case of the increased fidelity ABE variants than in case of the CBE variants, indicating that increased fidelity mutations decrease the mismatch tolerance of ABE more effectively than that of CBE.