Evaluation of whole genomes is rapidly emerging as a standard diagnostic test in rare diseases. While small genetic variants such as SNPs and InDels can be identified by molecular tests and whole exome sequencing (WES), these technologies are limited in their ability to detect large variants [1]. Large structural variants can have profound consequences for research on Mendelian and complex diseases but are challenging to resolve [2, 3]. Whole genome sequencing with short or long reads and imaging techniques such as optical genome mapping are increasingly used, either separately or in combination, for genome-wide assessment of structural variants. Although these technologies have improved rapidly in recent years, the absence of a proper reference genome can keep them from reaching their full potential.
The GRCh38 reference genome, while widely used for variant identification, has many unresolved sequences and gaps, contributing to about 150 megabases of genome-wide ambiguity [4]. This ambiguity encompasses regions in and around centromeres, telomeres, and acrocentric p-arms, as well as collapsed and missing sequences. Consequently, GRCh38 can lead to numerous spurious variant calls, potentially hindering the accurate identification of certain variants [5]. In contrast, the T2T-CHM13 reference is haplotype-resolved, gapless, and error-free, and offers substantial improvements in variant detection, particularly within these problematic regions [6].
Optical genome mapping (OGM) is a high-resolution imaging technique capable of detecting structural variants (SVs) larger than 500 base pairs. This technology, dubbed “next-generation cytogenetics”, is gaining traction in clinical laboratories for SV identification in genetic diseases and cancers [7, 8]. However, adoption of the T2T-CHM13 reference genome for variant evaluation in OGM is still nascent in the clinical setting. To address this gap, our study aims to evaluate the utility of the T2T-CHM13 reference for OGM-based SV detection in the context of genetic diseases.