Computational screening of SRA associated with the first report of APP gencDNA.
To confirm the presence of APP gencDNA and to characterize it comprehensively, we designed probe sequences based on the APP mRNA sequence and computationally screened publicly available sequence data in the Sequence Read Archive (SRA). We created a total of 184,506 probe sequences assuming two-base homologous recombination (Fig. 1, Supplementary Table 1) (see "Construction of probe sequences and computational screening of SRAs" in Materials and Methods). The constructed probe sequences were then used to screen two runs, SRR7905478 and SRR7905479, of BioProject PRJNA493258 (see "Analyzed SRAs in this study" in Materials and Methods). These runs were obtained by PacBio sequencing of nested PCR amplicons of APP from postmortem human brain and are associated with the first paper reporting the presence of APP gencDNA in postmortem human brain38.
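As a rough illustration of what "assuming two-base homologous recombination" could mean computationally, the sketch below enumerates junction sequences that would arise if two identical dinucleotides in a transcript recombined and the sequence between them were deleted. This is only one possible reading of the design. The function name `junction_probes`, the flank length, and the toy sequence are hypothetical, and the actual procedure is the one described in Materials and Methods.

```python
def junction_probes(seq, flank=10, homology=2):
    """Enumerate candidate junction probes for a toy transcript.

    Assumption (not taken from the paper): a recombinant joins the sequence
    up to and including a short homology at position i with the sequence
    that follows an identical homology at a later position j. The probe is
    `flank` bases on each side of that junction.
    """
    probes = []
    last_start = len(seq) - homology
    for i in range(last_start + 1):
        for j in range(i + 1, last_start + 1):
            if seq[i:i + homology] != seq[j:j + homology]:
                continue
            left_end = i + homology      # junction on the upstream side
            right_start = j + homology   # junction on the downstream side
            # Skip junctions that lack a full flank on either side.
            if left_end < flank or right_start + flank > len(seq):
                continue
            probe = seq[left_end - flank:left_end] + seq[right_start:right_start + flank]
            probes.append((i, j, probe))
    return probes


# Toy sequence (hypothetical): "GT" occurs at positions 2 and 6.
for i, j, probe in junction_probes("AAGTCCGTAAC", flank=3):
    print(i, j, probe)
```

Applied to the full APP mRNA, an enumeration of this kind yields a large number of candidate junctions, which is why the resulting probes are screened against reads rather than inspected by hand.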
The probe sequences observed in SRR7905478 and SRR7905479 are shown in Supplementary Table 2. Thirty-eight probe sequences were positive, and various intra-exonic junction sequences were detected. These results indicate that the constructed probe sequences work well for screening intra-exonic recombinants. The screening results are summarized in Table 1. The number of probe sequence-positive reads was 190,934 out of 254,351 total reads for SRR7905479 (SAD cases) and 82,346 out of 360,290 total reads for SRR7905478 (NCI cases). Because these SRAs were constructed after nested PCR of APP amplifying between exons 1 and 18, the usual normalization with housekeeping genes was not possible, so the positive read counts were normalized by the total read count instead. The resulting fractions were 0.751 for SAD cases and 0.229 for NCI cases. These results indicate that APP intra-exonic recombination occurs more frequently in SAD cases than in NCI cases and are consistent with the previous report38.
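The normalization by total read count can be checked directly from the read counts quoted above. The following minimal sketch only reproduces that arithmetic; the function name `positive_fraction` is illustrative.

```python
def positive_fraction(positive_reads, total_reads):
    """Fraction of reads in a run that match at least one probe sequence."""
    return positive_reads / total_reads


# Read counts as reported in the text for the two nested-PCR runs.
sad = positive_fraction(190_934, 254_351)  # SRR7905479, SAD cases
nci = positive_fraction(82_346, 360_290)   # SRR7905478, NCI cases

print(round(sad, 3), round(nci, 3))
```

Rounded to three decimals, the two fractions match the values of 0.751 and 0.229 given in the text.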
Computational screening of SRAs constructed from genomic DNA and RNA of postmortem brains and plasma cf-mRNA.
Having confirmed that the constructed probe sequences could detect APP gencDNAs, we next analyzed six SRAs constructed from genomic DNA or mRNA obtained from postmortem brains (Table 2). In SRAs constructed from genomic DNA using exon capture instead of nested PCR, APP gencDNA was hardly detected. However, APP gencDNA was indeed detected in SRAs constructed from mRNA. These results indicate that APP gencDNAs are certainly present and are transcribed in the brain, although their abundance is low.
APP gencDNAs and their transcripts in the brain should be released extracellularly upon apoptosis or necrosis, the causes of brain atrophy, and the released molecules should then appear in peripheral blood. Therefore, we next analyzed the SRAs of PRJNA574438, constructed from cell-free mRNA (cf-mRNA), a class of circulating nucleic acids (CNA), in blood plasma (Table 3). In total, 331 probe sequences were detected in PRJNA574438 (Supplementary Table 3), two of which were identical to those observed in SRR7905480 of BioProject PRJNA493258. Although many probe sequence-positive reads were detected in SRR7905478 and SRR7905479, which were constructed from nested PCR amplicons and are associated with the paper in which APP gencDNA was first reported, none of their positive probe sequences were shared with PRJNA574438. On the other hand, many positive probes were shared with the SRAs constructed from mRNA of the postmortem brain (Supplementary Table 4). The number of cases with positive probe sequences in PRJNA574438 was 125 of 127 for SAD and 96 of 115 for NCI.
Comparison of the number of APP gencDNA reads in an SRA from plasma cf-mRNA.
When the APP gencDNA read count of each case was normalized by dividing it by the read count of the housekeeping gene GAPDH, a significant difference was observed between SAD and NCI: the p-value by the Mann-Whitney U test was 5.14 × 10⁻⁶ (Fig. 2a). With respect to Aβ translation, no frameshift occurred in 202 of the 331 probe sequences (Supplementary Table 3). Reads positive for these probe sequences were considered capable of producing Aβ; however, apart from reads positive for probe sequence proseqff178928, the numbers of reads positive for the other probe sequences were very small. Table 4 shows the top 10 APP gencDNA-positive read counts, and Supplementary Table 3 shows all reads. Focusing on Aβ-producible reads, the average read count was 52/case for SAD and 29/case for NCI.
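The comparison described above (per-case normalization by GAPDH followed by a Mann-Whitney U test) can be sketched as follows. This is a self-contained illustration and not the software used in this study: the test is implemented with the normal approximation, tie-corrected and without continuity correction, and all read counts in the example are hypothetical. A statistics package with an exact or continuity-corrected implementation may return slightly different p-values for small samples.

```python
import math


def rank_with_ties(values):
    """Return 1-based average ranks and the sizes of the tied groups."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    tie_sizes = []
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1  # mean of the 1-based ranks pos+1..end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg_rank
        tie_sizes.append(end - pos + 1)
        pos = end + 1
    return ranks, tie_sizes


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test using the normal approximation.

    Returns (U statistic of x, p-value). Assumes the pooled values are not
    all identical (otherwise the variance is zero).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    ranks, tie_sizes = rank_with_ties(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    tie_term = sum(t ** 3 - t for t in tie_sizes) / (n * (n - 1))
    sd_u = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term))
    z = (u1 - n1 * n2 / 2) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u1, p_value


# Hypothetical (gencDNA-positive reads, GAPDH reads) pairs, one per case.
sad_counts = [(60, 1200), (45, 900), (80, 1000), (70, 1100)]
nci_counts = [(20, 1100), (30, 1500), (10, 800), (25, 1000)]

sad_norm = [pos / gapdh for pos, gapdh in sad_counts]
nci_norm = [pos / gapdh for pos, gapdh in nci_counts]

u_stat, p_value = mann_whitney_u(sad_norm, nci_norm)
print(u_stat, p_value)
```

A rank-based test is a natural choice here because normalized read counts are skewed and dominated by a few highly represented probe sequences.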
The most frequent probe sequence was proseqff178928, constructed as a recombinant between the two-base homology regions at the end of exon 14 and at the end of exon 15; it accounted for about 89% of the probe sequence-positive reads. Its sequence was found to be identical to the junction sequence of exons 14 and 16 of the APP mRNA lacking exon 15, an APP isoform named L-APP mRNA42. L-APP mRNA is expressed in microglia and astrocytes42; it is not neuron-specific. Therefore, we divided the probe sequence-positive APP gencDNA reads into proseqff178928-positive reads (L-APP) and the rest, and conducted the Mann-Whitney U test on each group (Figs. 2b and 2c). Both groups showed significant differences between SAD and NCI: the p-value was 5.54 × 10⁻⁶ for L-APP and 8.81 × 10⁻⁵ for APP gencDNAs excluding L-APP. In addition, when the Mann-Whitney U test was restricted to reads capable of producing amyloid-β, the differences between SAD and NCI remained significant (Figs. 2d and 2e): the p-value was 6.19 × 10⁻⁶ for APP gencDNAs including L-APP and 1.04 × 10⁻³ for those excluding L-APP.
APP mRNA with exon 8 spliced out is neuron-specific43. We therefore compared the GAPDH-normalized number of reads containing the exon 7 and exon 9 junction sequence between SAD and NCI. This junction sequence is not included in the probe sequences we constructed because it does not contain a homologous region. The p-value was 2.73 × 10⁻³, which is significant but larger than the p-values obtained for APP gencDNA.
NGS analysis of circulating nucleic acids in blood plasma and comparison with other SRAs.
To confirm the presence and detectability of APP gencDNA in plasma, we purified CNA from our plasma samples and performed nanopore sequencing of PCR products amplified with a primer set located in exon 1 and exon 18 of the APP gene (Supplementary Table 5). Although APP gencDNA could not be detected in some samples, a variety of APP gencDNAs were detected in many samples (Supplementary Table 6). This analysis of our plasma samples also detected seven probe sequences identical to those detected in the SRAs constructed from plasma cf-mRNA (CNA); 37 probe sequences were shared between the SRA constructed from plasma cf-mRNA and the SRAs constructed from mRNA from postmortem brain (Supplementary Table 4). The fact that many probe sequences were found in common suggests that APP gencDNA formation may not occur randomly.
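Counting probe sequences shared between datasets, as in the comparisons above, amounts to a set intersection over the positive probe identifiers of each dataset. The sketch below shows the idea; the function name `shared_probes` and the probe identifiers and dataset labels are hypothetical placeholders, not values from the Supplementary Tables.

```python
def shared_probes(*datasets):
    """Return the sorted probe identifiers that are positive in every dataset."""
    id_sets = [set(ids) for ids in datasets]
    return sorted(set.intersection(*id_sets))


# Hypothetical positive-probe lists for three kinds of dataset.
own_plasma = ["probe_A", "probe_B", "probe_C"]
plasma_cf_mrna = ["probe_B", "probe_C", "probe_D"]
brain_mrna = ["probe_C", "probe_D", "probe_E"]

print(shared_probes(own_plasma, plasma_cf_mrna))              # pairwise overlap
print(shared_probes(own_plasma, plasma_cf_mrna, brain_mrna))  # overlap of all three
```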