Our recent study of 102 parents with candidate mosaic variants validated using amplicon NGS, ddPCR, or blocker displacement amplification (BDA) [31] revealed 27 (26.4%) as low-level mosaic (VAF between 1% and 10%) or very low-level mosaic (VAF <1%) [29]. Here, we have sought to expand the sample size of tissues from parents with suspected low-level mosaic clinically relevant SNVs or indels to determine whether whole peripheral blood is the optimal tissue for assessing low-level parental somatic mosaicism. Using a customized bioinformatics pipeline, we have queried the ES database and found that approximately 3.4% of clinically relevant variants diagnosed as apparent de novo events in fact reflect low-level parental somatic mosaicism. This study is unique in that it is restricted to clinically relevant variants identified in a large ES dataset that meet the ACMG criteria for pathogenic, likely pathogenic, or variant of uncertain significance [32].
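The VAF categories defined above can be written as a small classifier. This is a minimal sketch, not part of the study's pipeline: the function name, the handling of values exactly at 1% and 10%, and the labels for a VAF of 0% or above 10% are assumptions made for illustration.

```python
def classify_mosaic_vaf(vaf_percent):
    """Return the mosaicism category for a VAF given in percent.

    Thresholds follow the text: very low-level is VAF < 1%,
    low-level is VAF between 1% and 10%. Treating exactly 1% and
    exactly 10% as low-level is an assumption of this sketch.
    """
    if not 0 <= vaf_percent <= 100:
        raise ValueError("VAF must be between 0 and 100 percent")
    if vaf_percent == 0:
        return "not detected"  # hypothetical label, not from the text
    if vaf_percent < 1:
        return "very low-level mosaic"
    if vaf_percent <= 10:
        return "low-level mosaic"
    return "above low-level range"  # hypothetical label, not from the text


if __name__ == "__main__":
    for vaf in (0.3, 2.8, 25.0):
        print(f"{vaf}% -> {classify_mosaic_vaf(vaf)}")
```

Passing the VAF as a percentage rather than a fraction matches how the values are reported in the text and tables.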
To date, most of the somatic mosaic variants that result in a single damaging event with a large phenotypic effect have been reported to be more common in neurodevelopmental disorders with an AD inheritance pattern [14]. Consistent with these findings, we have observed that neurodevelopmental disorders due to variants in AD trait genes, including cerebral cortical malformations, autism spectrum disorder [19], and epileptic encephalopathy [3], accounted for a large proportion of the study cohort (Suppl. Table 2). However, this apparent enrichment may reflect the fact that these phenotypes are the ones primarily referred for trio ES testing at BG. Mosaicism in AR traits is rare and requires both that a variant allele be inherited from one parent and that a de novo event occur [33, 34].
There was no marked difference between the proportions of clinically relevant low-level mosaic variants inherited paternally (42%) and maternally (57%). A ratio close to 1:1 has been reported in previous studies of somatic mosaicism [35] and contrasts with gonadal mosaicism, which is skewed toward paternal inheritance due to the high number of cell divisions occurring during spermatogenesis [36].
In some disorders, it is necessary to sample for mosaicism in tissues other than blood. For example, in patients with Pallister-Killian syndrome, patch-like patterns occurring in skin may need to be sampled for tetrasomy of isochromosome 12p [37] because mosaicism is limited to that site. In addition, clonal expansion of peripheral blood leukocytes may lead to an erroneous conclusion of an increased level of mosaicism over time. Therefore, using more sensitive and precise molecular techniques, we have also measured variation in the level of mosaicism across different somatic tissues. Analysis of low-level mosaic clinically relevant variants in five families revealed variation in VAFs across blood, buccal, saliva, urine, hair, and nails. Fluctuations in VAF percentages across all tissue samples were observed in only one mother (M1). The c.238A>T likely pathogenic variant in the USP7 gene was present at the highest VAF in samples representing the mesoderm germ layer (blood, saliva). Samples taken from tissues of the other germ layers had more variable VAFs, with observable variation in samples taken from the hair and nails, which represent the ectoderm germ layer. Hair and nail tissue samples had the most outlying VAFs, which has been observed previously [29]. Hair is composed of 95% protein and yields a small amount of DNA template, which could lead to a variable assessment of somatic mosaicism [38]. Extraction of high-quality genomic DNA from nails can be hindered when DNA is fragmented during the keratinization process that occurs during cellular growth [39]. In three parents with low-level mosaicism, the clinically relevant variant of interest was detected in urine. Urinary sediment can contain trace amounts of leukocytes, erythrocytes, and urinary epithelial cells [40].
Assessment of the degree of variation in VAF for clinically relevant low-level mosaic variants across different tissues can help clinicians determine at what stage of embryogenesis the variant arose. This, in turn, may help to determine whether the variant might be present in the germline and transmitted to progeny.
Use of NGS (at an average coverage depth of 621,899x) enabled the detection of mosaic variants with VAFs that would have been missed using standard clinical methods. Of note, sequencing at a depth of over 2000x read coverage does not provide additional information, even on recent NGS platforms such as Illumina NovaSeq. The error rate in amplicon NGS depends on several factors, e.g., the DNA polymerase, the NGS workflow used, sample handling, and the type of PCR enrichment performed, which does not allow for substantial improvement in sensitivity. Gambin et al. [29] observed that VAFs below 1% detected by NGS could not be verified using ddPCR. ddPCR has been reported to have a theoretical VAF sensitivity of 0.001%, and our previous ddPCR experiments have been able to detect somatic mosaic variants in the FOXF1 gene at a cutoff sensitivity VAF of 0.1% [41–43]. We have utilized ddPCR in seven parents. A discrepancy between these methods was observed in four cases. VAFs below 2% detected using NGS in peripheral blood leukocytes were not identified using ddPCR. Moreover, the same variant, c.923A>G in PTPN11, was identified as 0.3% mosaic in amplicon NGS studies in three unrelated parents (Table 1) but was not verified by ddPCR across different tissues in parents M2 and M3 (Tables 2, 3), indicating that it may represent a technical artifact. In parent M5, 0.2% mosaicism for the c.694G>C variant in CACNA1C was detected using ddPCR in the saliva and urine samples, but it was not found using amplicon NGS. These results illustrate the value of validating clinically relevant low-level mosaic variants using multiple sensitive molecular techniques such as ddPCR and amplicon NGS.
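The depth argument in the preceding paragraph can be illustrated with a simple binomial read-sampling model. The sketch below is not the pipeline used in this study. It assumes that reads are independent draws, and the per-base error rate of 0.1% and the five-read calling threshold are hypothetical placeholders rather than measured values. The function names are illustrative.

```python
import math


def prob_at_least(depth, vaf, min_alt_reads):
    """P(X >= min_alt_reads) for X ~ Binomial(depth, vaf).

    Terms are computed in log space so that very large depths
    do not overflow.
    """
    if min_alt_reads <= 0:
        return 1.0
    if vaf <= 0:
        return 0.0
    if vaf >= 1:
        return 1.0 if depth >= min_alt_reads else 0.0
    log_p, log_q = math.log(vaf), math.log1p(-vaf)
    below = 0.0
    # Sum P(X = k) for every k below the threshold.
    for k in range(min(min_alt_reads, depth + 1)):
        log_term = (
            math.lgamma(depth + 1)
            - math.lgamma(k + 1)
            - math.lgamma(depth - k + 1)
            + k * log_p
            + (depth - k) * log_q
        )
        below += math.exp(log_term)
    return max(0.0, 1.0 - below)


def expected_alt_reads(depth, rate):
    """Expected number of reads carrying the alternate base."""
    return depth * rate


if __name__ == "__main__":
    ASSUMED_ERROR_RATE = 0.001  # hypothetical placeholder, not measured
    MIN_ALT_READS = 5           # hypothetical calling threshold
    VAF = 0.003                 # 0.3% expressed as a fraction
    for depth in (500, 2000, 600000):
        true_reads = expected_alt_reads(depth, VAF)
        error_reads = expected_alt_reads(depth, ASSUMED_ERROR_RATE)
        p_detect = prob_at_least(depth, VAF, MIN_ALT_READS)
        print(depth, round(true_reads, 1), round(error_reads, 1),
              round(p_detect, 4))
```

Under this model the ratio of expected true variant reads to expected error reads equals the VAF divided by the error rate at every depth. Once sampling noise is small, additional depth therefore does not help separate a very low-level variant from background error.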
Recently, novel techniques for detecting and precisely measuring low-level somatic mosaicism, even below 0.1%, have been described. BDA can reliably detect VAFs below 0.1% [44]. MIPP-Seq, which utilizes unique molecular identifiers to increase assay sensitivity, can be used for measuring VAFs for SNVs and indels as low as 0.025% [45]. The use of a low number of PCR cycles, along with multiple independent primers that cover the variant region, leads to less allelic dropout. However, these techniques can be cost-prohibitive to implement.
The vast majority of the variants identified here as low-level mosaic were SNVs; only a few were indels. This bias may result from the challenge that low-level mosaic indels present, both for bioinformatic analysis of NGS data and for the design of FAM and HEX probes for ddPCR. First, reads containing indels are often filtered out during sequence alignment, which may lead to erroneous indel calling. Second, overlapping reads are more difficult to align and may consequently be mapped with incorporated mismatches [46]. In a mother (M5) with a COL11A1 c.3816+2dupT pathogenic variant, the mosaic insertion could be detected only in blood, using the GATK HaplotypeCaller, and was not found in other tissues. This finding was unexpected, as the insertion was observed in 2.8% of reads generated from ES. Previous studies have found that indels occurring in the human genome are missed at a rate of 10-35% [47]. The miss rate for low-level somatic mosaic indels, however, could exceed 35%.