Complex rearrangements are still an underestimated cause of genetic diseases, and in some loci they constitute up to 30% of the pathogenic CNVs (Schuy et al. 2022). The sensitivity of the available SV detection methods is especially limited for complex SVs involving multiple chromosomal segments. This study confirms the importance of a multiomics approach that combines techniques such as CMA, FISH, WGS, OGM and RNA-seq to fully dissect a complex chromosomal rearrangement. CMA revealed the duplications, whereas WGS and OGM refined the breakpoints, revealed the presence of an inversion, phased the multiple rearrangements in cis, and provided a framework for the proposed genomic structure. Although OGM revealed the complex nature of the 9p24 SV, confirming breakpoints already detected by WGS and identifying a new one, it did not call the duplicated segments, which exposes a limitation of the system. FISH was crucial to show that the duplicated segments mapped to 9p24 and to support the proposed structure of the rearrangement, an inversion associated with duplications. Finally, RNA-seq provided experimental evidence of chimeric KANK1/DMRT1 transcripts, and in silico AI-based predictive tools assisted in the analysis of the chimeric transcript structure.
Duplications/deletions restricted to the 9p24.3 cytoband, including DOCK8 and KANK1, have been reported across multiple neurodevelopmental/psychiatric phenotypes (Capkova et al. 2021; Glessner et al. 2017). Biallelic DOCK8 mutations cause a recessive condition (https://omim.org/entry/243700); heterozygous disruption of the gene was identified in a few patients with intellectual disability and/or seizures (Griggs et al. 2008), who were not further evaluated by exome analysis for the presence of additional pathogenic variants. The same is true of several reports of 9p24.3 CNV cases, and current data can only support a possible contribution to neurodevelopmental/psychiatric phenotypes in a multifactorial model. Therefore, an association of heterozygous 9p24.3 CNVs with clinical findings, as major variants with high impact, is still controversial. CNVs encompassing DOCK8 or KANK1 are detected in the general population at a relatively high frequency, and a possible contribution to a rare congenital phenotype should be evaluated with caution. The absence of a neurodevelopmental phenotype associated with the DOCK8/KANK1 duplication (dup1) disclosed in our family is therefore not surprising.
Haploinsufficiency of DMRT1, 2 and 3, mainly due to 9p24.3 deletions, has already been associated with disorders of sex development, such as ambiguous external genitalia in males and gonadal dysgenesis (OMIM #154230, 46,XY sex reversal 4; Muroya et al. 2000; Shan et al. 2000; Livadas et al. 2003; Quinonez et al. 2013). In the current case, only the DMRT1 gene is involved (dup2), and similar phenotypes are not present in the 9p24 SV carriers reported here. In association with the duplications and inversion, we detected a SINE insertion at one of the breakpoints that is absent from both the GRCh38 and T2T references and disrupts one of the copies of the DMRT1 gene. SINEs are transposable elements whose mobilization has long been associated with evolution and human diseases (Akrami and Habibi 2014; Pfaff, Singleton, and Kõks 2022). In several reported cases, SINE-VNTR-Alu rearrangements induce aberrant splicing patterns, and we cannot exclude the possibility that this insertion alters the DMRT1 expression pattern.
Copy number variants overlapping the short arm of chromosome 9 have already been associated with CHD (ref), implicating one or more loci in this genomic region. The genetic landscape of CHD is complex, and an interesting emerging feature is that CHD mutations often alter gene/protein dosage (Fahed et al. 2013; Simmons and Brueckner 2017; Yasuhara and Garg 2021). Several genes mapped to the short arm of chromosome 9 have been associated with CHD, such as KANK1 (Nguyen and Lee 2022; Botos et al. 2023; Hensley et al. 2016), SMARCA2 (Lim, Foo, and Chen 2021; Wang et al. 2022), IFT74 (Bakey et al. 2023), PIGO (Krawitz et al. 2012), DNAI1 (Kennedy et al. 2007; Nakhleh et al. 2012), and NFIB (Rao and Goel 2020; Schanze et al. 2018). KANK1 and SMARCA2 are involved in the studied SV, in dup1 and in the inversion, respectively.
SMARCA2 is not disrupted by the rearrangement, but it is included in the inverted segment. Haploinsufficiency of SMARCA2 causes two dominant developmental conditions, namely blepharophimosis-impaired intellectual development syndrome (OMIM #619293) and Nicolaides-Baraitser syndrome (OMIM #601358), with other clinical signs including CHD. However, as both conditions are associated with severe syndromic intellectual disability, and no neurodevelopmental phenotype is present in this family, it is unlikely that SMARCA2 expression is disrupted by the rearrangement.
Regarding KANK1, deletion of the paternal allele was reported in a single family to cause the condition named cerebral palsy, spastic quadriplegic 2 (OMIM #612900); however, no subsequent studies have supported this association. Indeed, chromosome 9 uniparental disomy is not related to imprinting syndromes (Elbracht et al. 2020), and clinical findings in UPD(9) are commonly attributed to homozygous variants in genes related to recessive conditions or to residual mosaic trisomy. Currently, there is no clinical evidence for haploinsufficiency or triplosensitivity of KANK1 (KANK1 curation results for Dosage Sensitivity). Nevertheless, the literature provides evidence for a role of KANK1 in cardiac development (Nguyen and Lee 2022; Botos et al. 2023). KANK genes encode scaffold proteins that bridge microtubules to focal adhesion sites (Botos et al. 2023; Pan et al. 2018). The Kank1 protein was shown to be widely expressed in murine tissues, with relatively high levels in cardiac muscle (Nguyen and Lee 2022). In humans, the longest transcript (NM_015158) shows tissue-specific expression, predominantly in heart and kidney. In addition, the gene was found in an injury-specific gene regulatory network in a transcriptome analysis of cardiac regeneration in the zebrafish (Botos et al. 2023).
It is not clear how this complex SV, which involves three DNA segments, six breakpoints (two in each CNV) and three breakpoint junctions, was formed. On both sides of the dup2-dup3 breakpoint, we observed microhomologies of simple repeats composed of polyA/T sequences. However, the insertion of a non-reference SINE element between dup2 and dup3 argues against non-allelic recombination mediated by the homology of these polyA/T sequences. Alternatively, the SINE insertion might have been present in the ancestral chromosome on which the rearrangement took place, or it might represent an additional event that occurred after the SV had formed.
It is interesting to note that the transcriptome analysis detected a chimeric transcript encompassing KANK1 and DMRT1 exons, possibly reinforcing a modified KANK1 product as a candidate for the phenotype. The role of chimeric transcripts as a cause of congenital defects is poorly explored (Zuccherato et al. 2016), in contrast to the fusion transcripts commonly described as somatic events in cancer (Salokas, Dashi, and Varjosalo 2023). Only isolated cases have been reported in which chimeric transcripts (gene fusions) were detected as the underlying molecular cause of developmental/neurological phenotypes (Boone et al. 2014; Ferrari et al. 2017). Recently, two studies used RNA-seq data to detect chimeric transcripts in rare congenital diseases, one of them in individuals with birth defects (Yamada et al. 2021; Oliver et al. 2019), which led to an increased diagnostic rate. However, in silico analysis in the current case predicted a premature stop codon in the fusion transcript, which would probably undergo nonsense-mediated mRNA decay. A possible contribution of this KANK1-DMRT1 fusion to the cardiac phenotype remains to be fully explored.
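The expectation that a transcript carrying a premature stop codon is degraded is often approximated with the so-called 50-nucleotide rule: a stop codon located roughly 50 to 55 nucleotides or more upstream of the last exon-exon junction is predicted to trigger nonsense-mediated decay. The following Python sketch illustrates that heuristic only. The function name, the exon lengths and the stop codon positions are hypothetical, they are not taken from the KANK1-DMRT1 transcript, and this is not necessarily how the in silico tools used in this study reach their prediction.

```python
from itertools import accumulate


def predicts_nmd(exon_lengths, stop_codon_end, threshold=50):
    """Apply the 50-nucleotide rule to a spliced transcript (illustrative heuristic).

    exon_lengths   -- exon lengths in nucleotides, in transcript order
    stop_codon_end -- 1-based transcript position of the last base of the stop codon
    threshold      -- minimum distance (nt) between the stop codon and the last
                      exon-exon junction for decay to be predicted (assumed value)
    """
    if len(exon_lengths) < 2:
        # A single-exon transcript has no exon-exon junction, so the rule does not apply.
        return False
    # Transcript position of the last base before the final exon-exon junction.
    last_junction = list(accumulate(exon_lengths))[-2]
    # Decay is predicted only when the stop codon lies far enough upstream of that junction.
    return last_junction - stop_codon_end >= threshold


# Hypothetical three-exon transcript; the last junction follows position 350.
exons = [200, 150, 300]
early_stop = predicts_nmd(exons, stop_codon_end=120)  # 230 nt upstream of the junction
late_stop = predicts_nmd(exons, stop_codon_end=500)   # stop codon in the last exon
```

Under these assumed values the early stop codon is flagged as a likely decay target and the stop codon in the last exon is not. The rule is a simplification with known exceptions, so such a prediction would still need support from expression data.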
Considering the recent report of ultra-long-range interactions between active regulatory elements (Friman et al. 2023), distant 9p genes with normal copy number could be misregulated by this 9p rearrangement, which makes genotype-phenotype relationships even more difficult to derive. Importantly, the study of this SV was crucial for the genetic counseling and reproductive choices of the family. Even without identification of the precise mechanism underlying the CHD phenotype, this study established the SV as a biomarker that was used to identify embryos at risk and to select for implantation those without the CHD risk. This strategy resulted in healthy offspring for at least one couple.